Before you tell me 'what have you tried' and 'test this yourself', I would like to note that robots.txt
updates awfully slowly on search engines for my site (or any site), so if you could answer from theory or experience, that would be appreciated.
For example, is it possible to allow:
http://www.example.com
And block:
http://www.example.com/?foo=foo
I'm not very sure.
Help?
Robots.txt and canonical tags. If you disallow a URL, bots can't read the canonical tag on that page, so they can't follow its instructions. This means that any links the page has acquired no longer pass SEO equity to the source material.
What is a robots.txt file used for? You can use a robots.txt file for web pages (HTML, PDF, or other non-media formats that Google can read) to manage crawling traffic if you think your server will be overwhelmed by requests from Google's crawler, or to avoid crawling unimportant or similar pages on your site.
Disallow directive in robots.txt. You can tell search engines not to access certain files, pages or sections of your website. This is done using the Disallow directive.
According to Wikipedia, "The robots.txt patterns are matched by simple substring comparisons", and since the query string is part of the URL you should be able to just add:
Disallow: /?foo=foo
or something more fancy like
Disallow: /*?*
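to block every URL that has a query string. The asterisk is a wildcard that matches any sequence of characters, including none, so the trailing asterisk is redundant. Wildcards are an extension to plain prefix matching: Google and Bing honor them, but not every crawler does. Either way, a rule only blocks URLs whose path and query string start with the pattern, so http://www.example.com itself stays crawlable.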
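Since you can't test quickly against a live search engine, here is a rough Python sketch for checking rules locally. It assumes the matching rules that Google documents (prefix match, `*` for any run of characters, a trailing `$` to anchor the end of the URL). The function names are made up for illustration, and real crawlers may behave differently, so treat it as a sanity check rather than a definitive implementation.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(pattern):
    """Translate one Disallow pattern into a compiled regex (assumed Google-style rules)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def path_and_query(url):
    """Return the part of the URL that rules are compared against."""
    parts = urlsplit(url)
    target = parts.path or "/"  # a bare host is treated as "/"
    if parts.query:
        target += "?" + parts.query
    return target


def is_blocked(url, disallow_patterns):
    """True if any non-empty Disallow pattern matches the start of the URL's path and query."""
    target = path_and_query(url)
    # re.match anchors at the start of the string, which gives the prefix behaviour.
    return any(rule_to_regex(p).match(target) for p in disallow_patterns if p)


if __name__ == "__main__":
    rules = ["/?foo=foo"]
    for url in ("http://www.example.com", "http://www.example.com/?foo=foo"):
        print(url, "blocked" if is_blocked(url, rules) else "allowed")
```

One thing this makes visible: because matching is by prefix, `/?foo=foo` would also block `/?foo=foobar`. If that matters, `/?foo=foo$` restricts the rule to the exact URL on crawlers that support `$`.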
Example of a robots.txt with dynamic urls.