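My site runs WordPress, which has built-in search. The search URL looks like example.com/?s="search terms"
One day Google Search Console emailed me about a large number of duplicate URLs. It turned out that bots had been mass-requesting example.com/?s="ad text", so Google discovered these URLs, although it did not index them
Bing Webmaster Tools reported no problem, and Bing indexed all of these spam URLs
I searched Bing for the ad text and found that lots of WordPress sites have been hit the same way
My current fix is a CAPTCHA on search. Does anyone here on V2EX have a better solution? Searching for my own site's keywords and getting those "available nationwide" spam ads is really not a good look...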
1
opengps 5 days ago
You can use robots.txt to tell search engines not to index this path
2
olaloong 5 days ago via Android
robots.txt

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/plugins/
    Disallow: /wp-content/themes/
    Disallow: /wp-content/cache/
    Disallow: /wp-json/
    Disallow: /xmlrpc.php
    Disallow: /cgi-bin/
    Disallow: /trackback/
    Disallow: /comments/
    Disallow: /?s=
    Disallow: /author/
    Allow: /wp-content/uploads/
3
id7368 4 days ago via iPhone

    User-agent: *
    Allow: /wp-*/uploads/*
    Allow: /wp-*/themes/*
    Allow: /archives/user/1
    Disallow: /trackback
    Disallow: /wp-*
    Disallow: /\?p=*
    Disallow: /?p=*
    Disallow: /?s=*
    Disallow: /*/attachment/*
    Disallow: /archives/user/*

    User-agent: MJ12bot
    Disallow: /

    User-agent: istellabot
    Disallow: /

    User-agent: SemrushBot
    Disallow: /

    User-agent: SemrushBot-SA
    Disallow: /

    User-agent: Dotbot
    Disallow: /

    User-agent: CriteoBot/0.1
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: AI2Bot
    Disallow: /

    User-agent: Ai2Bot-Dolma
    Disallow: /

    User-agent: Amazonbot
    Disallow: /

    User-agent: anthropic-ai
    Disallow: /

    User-agent: Applebot
    Disallow: /

    User-agent: Applebot-Extended
    Disallow: /

    User-agent: Bytespider
    Disallow: /

    User-agent: CCBot
    Disallow: /

    #User-agent: ChatGPT-User
    #Disallow: /

    User-agent: Claude-Web
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: cohere-ai
    Disallow: /

    User-agent: Diffbot
    Disallow: /

    User-agent: DuckAssistBot
    Disallow: /

    User-agent: FacebookBot
    Disallow: /

    User-agent: facebookexternalhit
    Disallow: /

    User-agent: FriendlyCrawler
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    User-agent: GoogleOther
    Disallow: /

    User-agent: GoogleOther-Image
    Disallow: /

    User-agent: GoogleOther-Video
    Disallow: /

    User-agent: GPTBot
    Disallow: /

    User-agent: iaskspider/2.0
    Disallow: /

    User-agent: ICC-Crawler
    Disallow: /

    User-agent: ImagesiftBot
    Disallow: /

    User-agent: img2dataset
    Disallow: /

    User-agent: ISSCyberRiskCrawler
    Disallow: /

    User-agent: Kangaroo Bot
    Disallow: /

    User-agent: Meta-ExternalAgent
    Disallow: /

    User-agent: Meta-ExternalFetcher
    Disallow: /

    #User-agent: OAI-SearchBot
    #Disallow: /

    User-agent: omgili
    Disallow: /

    User-agent: omgilibot
    Disallow: /

    User-agent: PerplexityBot
    Disallow: /

    User-agent: PetalBot
    Disallow: /

    User-agent: Scrapy
    Disallow: /

    User-agent: Sidetrade indexer bot
    Disallow: /

    User-agent: Timpibot
    Disallow: /

    User-agent: VelenPublicWebCrawler
    Disallow: /

    User-agent: Webzio-Extended
    Disallow: /

    User-agent: YouBot
    Disallow: /
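To see which URLs wildcard rules like these would actually cover, here is a rough Python sketch of a path matcher. It assumes the matching behaviour described in RFC 9309 and in Google's robots.txt documentation (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie); other crawlers may interpret the file differently. The names `rule_to_regex`, `is_allowed` and `RULES` are made up for this illustration, and `RULES` is only a subset of the `User-agent: *` group above.

```python
import re

# Subset of the "User-agent: *" group from the reply above.
RULES = [
    ("Allow", "/wp-*/uploads/*"),
    ("Allow", "/wp-*/themes/*"),
    ("Allow", "/archives/user/1"),
    ("Disallow", "/wp-*"),
    ("Disallow", "/?s=*"),
    ("Disallow", "/archives/user/*"),
]


def rule_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` (including any query string) may be crawled."""
    best = None  # (pattern length, is_allow) of the strongest match so far
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            # Longer pattern wins; on equal length True (Allow) sorts higher.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


for url_path in ["/?s=spam", "/blog/?s=spam", "/wp-admin/",
                 "/wp-content/uploads/a.png", "/archives/user/2"]:
    print(url_path, "allowed" if is_allowed(RULES, url_path) else "blocked")
```

Under these assumptions, `/?s=*` only covers searches issued at the site root, so a search URL under another path would need its own rule.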
4
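id7368 4 days ago via iPhone
If you allow user registration, be sure to block the user directories as well. These spammers also stuff ads into usernames and bios to feed them to crawlers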
7
ysc3839 3 days ago via Android
The search page shouldn't write the user's input into the page. It should show only the results.
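A minimal Python sketch of that idea, not WordPress code: the query is used only to look up results and never reaches the function that builds the HTML, so a spam string in `?s=` has nothing to show up in. `POSTS`, `find_posts`, `render_results` and `search_page` are hypothetical names for this example, and a real theme would do the equivalent in its search template.

```python
from html import escape

# Hypothetical post list standing in for the site's real content.
POSTS = [
    ("Hello world", "/hello-world/"),
    ("Setting up robots.txt", "/robots-txt-setup/"),
]


def find_posts(query, posts):
    """Return (title, url) pairs whose title contains the query."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [(title, url) for title, url in posts if needle in title.lower()]


def render_results(results):
    """Build the page from results only; the query is never passed in."""
    if results:
        items = "".join(
            '<li><a href="{}">{}</a></li>'.format(escape(url, quote=True), escape(title))
            for title, url in results
        )
        body = "<ul>" + items + "</ul>"
    else:
        # Generic message: no "No results for <query>" echo.
        body = "<p>No results found.</p>"
    return (
        "<html><head><title>Search results</title></head>"
        "<body><h1>Search results</h1>" + body + "</body></html>"
    )


def search_page(query, posts=POSTS):
    """Handle one search request and return the HTML for it."""
    return render_results(find_posts(query, posts))


print(search_page("hello"))
```

The page title and the empty-results message stay generic too, since those are the usual places where the query gets echoed back.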