How to stop bots from crawling my AJAX-based URLs?

Tags: javascript, url, asp.net, bots, web-crawler

I've got several pages on my ASP.NET MVC 3 website (not that the technology matters here) where I render certain URLs in a <script> tag on the page, so that my JavaScript (stored in an external file) can perform AJAX calls to the server.

Something like this:

<html>
   ...
   <body>
      ...
      <script type="text/javascript">
         $(function() {
            myapp.paths.someUrl = '/blah/foo'; // not hardcoded in reality, but N/A here
         });
      </script>
   </body>
</html>

Now on the server side, most of these URLs are protected with attributes stating that:

a) They can only be accessed via AJAX (i.e. XMLHttpRequest)

b) They can only be accessed via HTTP POST (since they return JSON; this is a security measure)

The problem is that, for some reason, bots are crawling these URLs and trying to do HTTP GETs on them, resulting in 404s.

I was under the impression that bots don't crawl JavaScript. So how are they getting hold of these URLs?

Is there any way I can prevent them from doing this?

I can't really move these URL variables to an external file because, as the comment in the code above suggests, I render the URLs with server code (which must be done on the actual page).

I've basically had to add a route to my website that returns HTTP 410 (Gone) for these URLs (when the request isn't an AJAX POST), which is really annoying, because it adds yet another route to my already convoluted route table.

Any tips/suggestions?

asked Mar 25 '12 23:03

RPM1984

1 Answer

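Disallow these URLs by their common path prefix in robots.txt. Crawlers that honor robots.txt will then skip any path that starts with that prefix.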
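
A minimal sketch of such a rule, assuming the AJAX-only endpoints all share the /blah/ prefix from the question's example (a hypothetical prefix; substitute whatever prefix the real routes use):

```
# robots.txt, served from the site root (/robots.txt)
# Assumption: every AJAX-only endpoint lives under /blah/
User-agent: *
Disallow: /blah/
```

Note that robots.txt is advisory: only crawlers that choose to respect it will stop requesting these URLs, so the server-side AJAX and POST restrictions still need to stay in place.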

answered Oct 23 '22 13:10

Eugene Retunsky
