I have an ASP.NET download page which sends a file to the client, but I want to prevent robots from downloading this file, because the file is large and, as I can see from the logs, a bot has downloaded it about 20 times. This is slowing down the server and consuming bandwidth.
I coded this page to count downloads and detect the client's .NET Framework version, so I can serve a setup file that either includes the .NET Framework or not.
I need some way to prevent Google and other bots from reaching this page.
My download link is like download.aspx?pack=msp
If lots of new content is added to your website, search engine bots may crawl your website more aggressively to index the new content. There could also be a problem with your website that the bots are triggering, causing a resource-intensive operation such as an infinite loop.
One option to reduce server load from bots, spiders, and other crawlers is to create a robots.txt file at the root of your website. This tells search engines what content on your site they should and should not index.
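For example, a minimal robots.txt at the site root that keeps crawlers away from the download page in the question might look like the sketch below. Disallow matches by prefix, so it also covers query strings like ?pack=msp; the Crawl-delay directive is honored by some crawlers (e.g. Bing, Yandex) but ignored by Google.
User-agent: *
Disallow: /download.aspx
Crawl-delay: 10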
Yes, add a robots.txt file to your site. It should contain a list of rules (suggestions, really) about how spiders should behave.
Check out this article for more info. Also, for kicks, this is the robots.txt file used by Google.
You want a robots.txt file. For example:
User-agent: *
Disallow: /download.aspx
This doesn't forcibly block search engines, but most (including Google) will check for a robots.txt file and follow its instructions.
The correct answer, as noted in the other answers, is to create a robots.txt file so that well-behaved robots don't download things.
However, it is important to know that not all robots are well-behaved, and that robots.txt is only advisory. If you have pages which are not publicly linked, do not list them in robots.txt to "protect" them, as some particularly badly-behaved robots actually scan the file to see what interesting URLs there may be that they don't already know about.
Where a robots.txt file isn't possible, you can instead decorate individual pages with a <meta name="robots" content="noindex"> tag.
Again, as Donnie mentioned, this is just a recommendation for bots and there is no requirement to follow it.
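If the page uses a <head runat="server">, one way to emit that tag is from the Web Forms code-behind; this is just a sketch, and the Download class name is illustrative.
using System;
using System.Web.UI;
using System.Web.UI.HtmlControls;

public partial class Download : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Renders <meta name="robots" content="noindex"> into the page's <head>.
        HtmlMeta robotsMeta = new HtmlMeta();
        robotsMeta.Name = "robots";
        robotsMeta.Content = "noindex";
        Header.Controls.Add(robotsMeta);
    }
}
If the page streams a raw file rather than rendering HTML, the same directive can be sent as a response header instead, e.g. Response.AppendHeader("X-Robots-Tag", "noindex");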
Implement a CAPTCHA-backed login mechanism that allows legitimate users to access a protected folder where you keep your largest files.
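A rough sketch of how the download page could enforce such a gate, assuming a hypothetical verify.aspx page that sets a Session["CaptchaPassed"] flag once the visitor solves the CAPTCHA or logs in (the page, flag, and file names are illustrative):
using System;
using System.Web.UI;

public partial class Download : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Only visitors who have passed the CAPTCHA/login step get the file.
        if (Session["CaptchaPassed"] == null || !(bool)Session["CaptchaPassed"])
        {
            Response.Redirect("~/verify.aspx?returnUrl=download.aspx%3Fpack%3Dmsp");
            return;
        }

        // Verified visitor: stream the setup package as before.
        Response.ContentType = "application/octet-stream";
        Response.AppendHeader("Content-Disposition", "attachment; filename=setup.exe");
        Response.TransmitFile(Server.MapPath("~/files/setup.exe"));
        Response.End();
    }
}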
Instead of providing direct links to content that is easily parsed by bots, use JavaScript on your download link to redirect your users. Many bots won't execute JavaScript, though this kind of obfuscation is a moving target.
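From a Web Forms code-behind, one way to sketch this is to register the redirect script on a button click, so the real URL never appears as a plain hyperlink in the rendered markup (the DownloadButton control and the URL are illustrative):
using System;
using System.Web.UI;

public partial class Downloads : Page
{
    // Wired to <asp:Button ID="DownloadButton" runat="server" Text="Download" OnClick="DownloadButton_Click" />.
    protected void DownloadButton_Click(object sender, EventArgs e)
    {
        // The browser navigates via script; most simple crawlers never execute it.
        ClientScript.RegisterStartupScript(
            GetType(),
            "downloadRedirect",
            "window.location.href = 'download.aspx?pack=msp';",
            true); // true = wrap the script in <script> tags
    }
}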