1
auzeonfung 2016-12-01 20:59:22 +08:00 via Android
Ban Google's IPs on the server.
|
2
stamaimer 2016-12-01 21:03:05 +08:00 via iPhone
@auzeonfung Do you know how many IPs Google has?
|
3
xmoiduts 2016-12-01 21:07:11 +08:00
OP, try searching for taobao?
|
4
imcocc 2016-12-01 21:07:27 +08:00 via iPhone
Search for "block junk crawlers".
Block them by matching on the User-Agent. |
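A minimal sketch of the User-Agent matching idea, assuming the check runs in application code rather than in the web server config. The pattern list and the name `is_blocked` are hypothetical and chosen for illustration only; the header is set by the client and can be spoofed, so this only stops crawlers that identify themselves honestly.

```python
import re

# Hypothetical blocklist: these substrings are examples, not a vetted list.
BLOCKED_AGENTS = re.compile(r"googlebot|bingbot|baiduspider", re.IGNORECASE)


def is_blocked(user_agent: str) -> bool:
    """Return True when the User-Agent header matches the blocklist."""
    # A missing header arrives as None or ""; treat it as "not blocked" here.
    return bool(BLOCKED_AGENTS.search(user_agent or ""))


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(is_blocked(ua))
```

A request handler would call `is_blocked` on the incoming header and answer with 403 when it returns True.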
6
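nikoo OP Some findings from my research:
Why do Google search results include pages disallowed in robots.txt? http://webmasters.stackexchange.com/questions/24569/why-do-google-search-results-include-pages-disallowed-in-robots-txt
Does Google ignore robots.txt http://webmasters.stackexchange.com/questions/54879/does-google-ignore-robots-txt
To sum up the conclusions of those two threads: Google really does index pages that robots.txt disallows, and the fix is to add <meta name="robots" content="noindex, nofollow"> to every page. Google's explanation is that a page gets indexed regardless of robots.txt as long as another indexed page links to it.
That doesn't seem right to me: the articles imported into my wiki are not, and cannot be, linked from any other site, so how did they end up indexed, title and URL included? |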
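To verify that a page really carries the robots meta tag mentioned above, something like the following could be used. It is a sketch built on the standard library's `html.parser`; the names `RobotsMetaParser` and `has_noindex` are made up for this example, and it only looks at `<meta name="robots">`, not at crawler-specific variants or HTTP response headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True when the page asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Running `has_noindex` over a sample of rendered wiki pages is a quick way to confirm the tag is actually being emitted by the template.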
7
caiych 2016-12-01 21:19:57 +08:00 1
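While looking up the details of robots.txt I found Google's documentation, which says:
> If you want to keep your pages out of search results, use another method, such as password protection or a noindex tag or directive.
Not sure whether the OP has set this up… https://support.google.com/webmasters/answer/6062608?visit_id=1-636161949805851671-2329679117&hl=zh-Hans&rd=2 |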
8
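nikoo OP @caiych Thanks a lot, that document was very helpful. I think Google's approach here has a flaw:
robots.txt directives cannot prevent other sites from referencing your URLs. Although Google will not crawl or index content that robots.txt disallows, we may still find a disallowed URL elsewhere on the web and index it. As a result, the URL and other publicly available information (such as anchor text in links to the site) can still appear in Google search results. You can keep your URL out of Google search results completely by using other URL blocking methods, such as password-protecting the files on your server or using the noindex meta tag or response header.
So here is the problem: according to "Use meta tags to block search engines from indexing your pages" https://support.google.com/webmasters/answer/93710 , the Google crawler cannot read the "noindex meta tag" on a page that robots.txt keeps it from fetching, so setting a "noindex meta tag" on my own pages is in theory ineffective (because of the robots.txt restriction). |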
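The conflict described here can be checked mechanically: if robots.txt forbids fetching a URL, a compliant crawler never downloads the page and so never sees a noindex tag on it. The sketch below uses Python's `urllib.robotparser`; the function name, the `/wiki/` rule, and the `example.com` URLs are assumptions for illustration, and the standard-library parser may not match Googlebot's own rule handling in every detail.

```python
from urllib.robotparser import RobotFileParser


def crawler_can_read_page(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True when robots.txt lets `agent` fetch `url`.

    Only when this is True can the crawler download the page and
    notice a noindex meta tag or response header on it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical rules that disallow an entire wiki section.
    rules = "User-agent: *\nDisallow: /wiki/\n"
    print(crawler_can_read_page(rules, "http://example.com/wiki/Some_Page"))
```

When the function returns False for a page, a noindex tag on that page stays invisible to the crawler, which is the situation described in this post.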
9
khaki 2016-12-01 21:44:24 +08:00
The documentation here is more detailed: https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt . Could it be a subdomain issue?
|
10
auzeonfung 2016-12-01 21:59:55 +08:00
@stamaimer deny from 104.132.0.0/21
deny from 104.132.12.0/24 deny from 104.132.128.0/24 deny from 104.132.129.0/24 deny from 104.132.13.0/26 deny from 104.132.13.112/28 deny from 104.132.13.128/25 deny from 104.132.13.64/27 deny from 104.132.13.96/28 deny from 104.132.130.0/24 deny from 104.132.131.0/24 deny from 104.132.132.0/24 deny from 104.132.133.0/24 deny from 104.132.134.0/24 deny from 104.132.135.0/24 deny from 104.132.136.0/23 deny from 104.132.138.0/24 deny from 104.132.139.0/24 deny from 104.132.14.0/23 deny from 104.132.140.0/24 deny from 104.132.141.0/26 deny from 104.132.141.112/28 deny from 104.132.141.128/25 deny from 104.132.141.64/27 deny from 104.132.141.96/28 deny from 104.132.142.0/24 deny from 104.132.143.0/24 deny from 104.132.144.0/24 deny from 104.132.145.0/24 deny from 104.132.146.0/24 deny from 104.132.147.0/24 deny from 104.132.148.0/23 deny from 104.132.150.0/24 deny from 104.132.151.0/24 deny from 104.132.152.0/24 deny from 104.132.153.0/24 deny from 104.132.154.0/23 deny from 104.132.156.0/24 deny from 104.132.157.0/24 deny from 104.132.158.0/24 deny from 104.132.159.0/24 deny from 104.132.16.0/24 deny from 104.132.160.0/24 deny from 104.132.161.0/24 deny from 104.132.162.0/24 deny from 104.132.163.0/24 deny from 104.132.164.0/23 deny from 104.132.166.0/24 deny from 104.132.167.0/24 deny from 104.132.168.0/24 deny from 104.132.169.0/24 deny from 104.132.17.0/26 deny from 104.132.17.112/28 deny from 104.132.17.128/25 deny from 104.132.17.64/27 deny from 104.132.17.96/28 deny from 104.132.170.0/24 deny from 104.132.171.0/24 deny from 104.132.172.0/22 deny from 104.132.176.0/23 deny from 104.132.178.0/24 deny from 104.132.179.0/24 deny from 104.132.18.0/24 deny from 104.132.180.0/24 deny from 104.132.181.0/24 deny from 104.132.182.0/24 deny from 104.132.183.0/24 deny from 104.132.184.0/24 deny from 104.132.185.0/24 deny from 104.132.186.0/24 deny from 104.132.187.0/24 deny from 104.132.188.0/24 deny from 104.132.189.0/24 deny from 104.132.19.0/24 deny from 
104.132.190.0/23 deny from 104.132.192.0/22 deny from 104.132.196.0/24 deny from 104.132.197.0/24 deny from 104.132.198.0/23 deny from 104.132.20.0/24 deny from 104.132.200.0/23 deny from 104.132.202.0/24 deny from 104.132.203.0/24 deny from 104.132.204.0/24 deny from 104.132.205.0/24 deny from 104.132.206.0/23 deny from 104.132.208.0/24 deny from 104.132.209.0/24 deny from 104.132.21.0/26 deny from 104.132.21.112/28 deny from 104.132.21.128/25 deny from 104.132.21.64/27 deny from 104.132.21.96/28 deny from 104.132.210.0/23 deny from 104.132.212.0/22 deny from 104.132.216.0/21 deny from 104.132.22.0/24 deny from 104.132.224.0/19 deny from 104.132.23.0/24 deny from 104.132.24.0/26 deny from 104.132.24.128/25 deny from 104.132.24.64/26 deny from 104.132.25.0/24 deny from 104.132.26.0/24 deny from 104.132.27.0/24 deny from 104.132.28.0/24 deny from 104.132.29.0/24 deny from 104.132.30.0/23 deny from 104.132.32.0/24 deny from 104.132.33.0/24 deny from 104.132.34.0/24 deny from 104.132.35.0/24 deny from 104.132.36.0/22 deny from 104.132.40.0/21 deny from 104.132.48.0/22 deny from 104.132.52.0/23 deny from 104.132.54.0/24 deny from 104.132.55.0/24 deny from 104.132.56.0/21 deny from 104.132.64.0/18 deny from 104.132.8.0/22 deny from 104.133.0.0/17 deny from 104.133.128.0/18 deny from 104.133.192.0/19 deny from 104.133.224.0/20 deny from 104.133.240.0/21 deny from 104.133.248.0/24 deny from 104.133.249.0/24 deny from 104.133.250.0/23 deny from 104.133.252.0/22 deny from 104.134.0.0/16 deny from 104.135.0.0/17 deny from 104.135.128.0/18 deny from 104.135.192.0/19 deny from 104.135.224.0/19 deny from 104.154.0.0/15 deny from 104.196.0.0/15 deny from 104.198.0.0/16 deny from 104.199.0.0/17 deny from 104.199.128.0/20 deny from 104.199.144.0/23 deny from 104.199.146.0/24 deny from 104.199.147.0/24 deny from 104.199.148.0/22 deny from 104.199.152.0/21 deny from 104.199.160.0/19 deny from 104.199.192.0/18 deny from 107.167.160.0/19 deny from 107.178.192.0/18 deny from 
108.170.192.0/20 deny from 108.170.208.0/21 deny from 108.170.216.0/24 deny from 108.170.217.0/25 deny from 108.170.217.128/28 deny from 108.170.217.160/27 deny from 108.170.217.192/26 deny from 108.170.218.0/23 deny from 108.170.220.0/22 deny from 108.170.224.0/19 deny from 108.177.0.0/17 deny from 108.59.80.0/24 deny from 108.59.81.0/27 deny from 108.59.82.0/23 deny from 108.59.84.0/22 deny from 108.59.88.0/22 deny from 108.59.92.0/27 deny from 108.59.92.128/26 deny from 108.59.92.192/27 deny from 108.59.92.96/27 deny from 108.59.93.0/27 deny from 108.59.93.192/26 deny from 108.59.93.32/29 deny from 108.59.93.40/31 deny from 108.59.93.43/32 deny from 108.59.93.44/30 deny from 108.59.93.48/28 deny from 108.59.93.64/26 deny from 108.59.94.0/28 deny from 108.59.94.128/26 deny from 108.59.94.16/29 deny from 108.59.94.192/28 deny from 108.59.94.208/29 deny from 108.59.94.240/28 deny from 108.59.94.32/27 deny from 108.59.94.64/26 deny from 108.59.95.0/24 deny from 12.216.80.0/24 deny from 12.234.149.240/29 deny from 125.16.7.72/30 deny from 125.17.82.112/30 deny from 128.177.109.0/26 deny from 128.177.119.128/25 deny from 128.177.163.0/25 deny from 130.211.0.0/16 deny from 142.250.0.0/15 deny from 146.148.0.0/17 deny from 162.216.148.0/22 deny from 162.222.176.0/21 deny from 172.102.8.0/21 deny from 172.217.0.0/16 deny from 172.253.0.0/16 deny from 173.194.0.0/18 deny from 173.194.100.0/22 deny from 173.194.104.0/21 deny from 173.194.112.0/20 deny from 173.194.128.0/17 deny from 173.194.64.0/19 deny from 173.194.96.0/24 deny from 173.194.97.0/24 deny from 173.194.98.0/24 deny from 173.194.99.0/24 deny from 173.255.112.0/22 deny from 173.255.116.0/25 deny from 173.255.116.128/26 deny from 173.255.116.192/27 deny from 173.255.117.128/25 deny from 173.255.117.32/27 deny from 173.255.117.64/26 deny from 173.255.118.0/23 deny from 173.255.120.0/24 deny from 173.255.121.0/25 deny from 173.255.121.128/26 deny from 173.255.122.128/26 deny from 173.255.122.64/26 deny from 
173.255.123.0/24 deny from 173.255.124.0/27 deny from 173.255.124.128/29 deny from 173.255.124.144/28 deny from 173.255.124.160/27 deny from 173.255.124.192/27 deny from 173.255.124.232/29 deny from 173.255.124.240/29 deny from 173.255.124.32/28 deny from 173.255.124.48/29 deny from 173.255.124.64/26 deny from 173.255.125.0/27 deny from 173.255.125.128/25 deny from 173.255.125.72/29 deny from 173.255.125.80/28 deny from 173.255.125.96/27 deny from 173.255.126.0/23 deny from 180.87.33.64/26 deny from 192.104.160.0/23 deny from 192.158.28.0/22 deny from 192.178.0.0/15 deny from 195.16.45.144/29 deny from 198.108.100.192/28 deny from 199.192.112.0/25 deny from 199.192.112.128/26 deny from 199.192.112.192/27 deny from 199.192.112.224/29 deny from 199.192.113.0/25 deny from 199.192.113.128/27 deny from 199.192.113.176/28 deny from 199.192.113.192/26 deny from 199.192.114.0/25 deny from 199.192.114.192/26 deny from 199.192.115.0/28 deny from 199.192.115.128/25 deny from 199.192.115.80/28 deny from 199.192.115.96/27 deny from 199.223.232.0/21 deny from 203.222.167.144/28 deny from 206.160.135.240/28 deny from 207.223.160.0/20 deny from 208.184.125.240/28 deny from 208.21.209.0/28 deny from 208.44.48.240/29 deny from 208.46.199.160/29 deny from 209.185.108.128/25 deny from 209.85.128.0/17 deny from 213.155.151.128/26 deny from 213.200.103.128/26 deny from 213.200.99.192/26 deny from 216.109.75.80/28 deny from 216.136.145.128/27 deny from 216.239.32.0/24 deny from 216.239.33.0/29 deny from 216.239.33.104/29 deny from 216.239.33.112/28 deny from 216.239.33.128/25 deny from 216.239.33.16/28 deny from 216.239.33.32/29 deny from 216.239.33.40/29 deny from 216.239.33.48/28 deny from 216.239.33.64/27 deny from 216.239.33.8/29 deny from 216.239.33.96/29 deny from 216.239.34.0/24 deny from 216.239.35.0/24 deny from 216.239.36.0/23 deny from 216.239.38.0/24 deny from 216.239.39.0/24 deny from 216.239.40.0/22 deny from 216.239.44.0/23 deny from 216.239.46.0/23 deny from 
216.239.48.0/22 deny from 216.239.52.0/23 deny from 216.239.54.0/24 deny from 216.239.55.0/28 deny from 216.239.55.128/27 deny from 216.239.55.16/29 deny from 216.239.55.160/29 deny from 216.239.55.168/29 deny from 216.239.55.176/28 deny from 216.239.55.192/26 deny from 216.239.55.24/29 deny from 216.239.55.32/27 deny from 216.239.55.64/26 deny from 216.239.56.0/21 deny from 216.252.220.0/22 deny from 216.33.229.144/29 deny from 216.33.229.160/29 deny from 216.34.7.176/28 deny from 216.58.192.0/19 deny from 216.74.130.48/28 deny from 216.74.153.0/27 deny from 217.118.234.96/28 deny from 23.236.48.0/20 deny from 23.251.128.0/19 deny from 4.3.2.0/24 deny from 41.206.188.128/26 deny from 61.246.190.124/30 deny from 61.246.224.136/30 deny from 63.158.137.224/29 deny from 63.161.156.0/24 deny from 63.166.17.128/25 deny from 63.226.245.56/29 deny from 63.237.119.112/29 deny from 63.88.22.0/23 deny from 64.124.98.104/29 deny from 64.233.160.0/23 deny from 64.233.162.0/24 deny from 64.233.163.0/24 deny from 64.233.164.0/22 deny from 64.233.168.0/21 deny from 64.233.176.0/20 deny from 64.41.146.208/28 deny from 64.41.221.192/28 deny from 64.68.64.64/26 deny from 64.68.80.0/20 deny from 64.71.148.240/29 deny from 64.9.224.0/19 deny from 65.167.144.64/28 deny from 65.170.13.0/28 deny from 65.171.1.144/28 deny from 65.216.183.0/24 deny from 65.220.13.0/24 deny from 66.102.0.0/21 deny from 66.102.12.0/23 deny from 66.102.14.0/25 deny from 66.102.14.128/30 deny from 66.102.14.132/31 deny from 66.102.14.134/31 deny from 66.102.14.136/29 deny from 66.102.14.144/28 deny from 66.102.14.160/27 deny from 66.102.14.192/26 deny from 66.102.15.0/24 |
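For anyone who wants to test an address against ranges like these outside the web server, here is a small sketch using Python's `ipaddress` module. `DENY_RANGES` holds only four entries copied from the list above, not the whole list, and `is_denied` is a hypothetical helper name.

```python
import ipaddress

# A small subset of the ranges posted above, for illustration only.
DENY_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "104.132.0.0/21",
        "64.233.160.0/23",
        "66.102.0.0/21",
        "216.239.32.0/24",
    )
]


def is_denied(ip: str) -> bool:
    """Return True when `ip` falls inside any of the denied networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DENY_RANGES)


if __name__ == "__main__":
    print(is_denied("66.102.7.1"))
```

A linear scan is fine for a few hundred ranges. Any such list is a snapshot and goes stale whenever the ranges change.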
11
xiaoz 2016-12-01 22:01:01 +08:00 1
Check your site's robots.txt with Google's webmaster tools. I once had Google report an error because my robots.txt contained a BOM.
|
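A byte order mark at the start of robots.txt is easy to detect and remove. The sketch below works on the raw bytes of the file and only handles the UTF-8 BOM; the function names are hypothetical.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True when the content starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + b"User-agent: *\nDisallow: /wiki/\n"
    print(has_utf8_bom(raw))
    print(strip_utf8_bom(raw).decode("utf-8"))
```

Reading the file with `open(path, "rb")` and writing back the stripped bytes would fix an affected robots.txt in place.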
12
Vicer 2016-12-01 23:48:02 +08:00 via Android
Here to learn.
|
13
Showfom 2016-12-02 09:08:13 +08:00
|
14
stamaimer 2016-12-02 10:13:11 +08:00 via iPhone
Learned something, comrades.
|