I am wondering how I would go about detecting search crawlers. The reason I ask is that I want to suppress certain JavaScript calls if the user agent is a bot.
I have found an example of how to detect a certain browser, but I am unable to find examples of how to detect a search crawler:
/MSIE (\d+\.\d+);/.test(navigator.userAgent); //test for MSIE x.x
Examples of search crawlers I want to block:
Google:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Googlebot/2.1 (+http://www.google.com/bot.html)

Baidu:
Baiduspider+(+http://www.baidu.com/search/spider_jp.html)
Baiduspider+(+http://www.baidu.com/search/spider.htm)
BaiDuSpider
There are two methods for verifying Google's crawlers:
Manually: For one-off lookups, use command line tools. This method is sufficient for most use cases.
Automatically: For large scale lookups, use an automatic solution to match a crawler's IP address against the list of published Googlebot IP addresses.
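The manual reverse/forward DNS check that command line tools perform can also be scripted on the server. The following is only a minimal Node.js sketch: the dns.promises calls are standard Node APIs, but the isGooglebot function name and the googlebot.com / google.com suffix test are assumptions based on Google's published verification guidance, not code from this answer.

const dns = require('dns').promises;

async function isGooglebot(ip) {
  try {
    // Reverse DNS: a genuine Googlebot IP resolves to a googlebot.com or google.com host
    const [hostname] = await dns.reverse(ip);
    if (!/\.(googlebot|google)\.com$/.test(hostname)) return false;

    // Forward DNS: the hostname must resolve back to the original IP
    const { address } = await dns.lookup(hostname);
    return address === ip;
  } catch (err) {
    return false; // lookup failed, treat as unverified
  }
}

// usage (server-side): isGooglebot(requestIp).then(verified => ...)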
As early as 2008, Google was successfully crawling JavaScript, though probably in a limited fashion. Today, it is clear that Google has not only expanded the types of JavaScript it crawls and indexes, but has also made significant strides in rendering complete web pages (especially in the last 12-18 months).
Googlebot processes JavaScript web apps in three main phases: crawling, rendering, and indexing.
This is the regex the Ruby UA library agent_orange uses to test whether a userAgent looks to be a bot. You can narrow it down for specific bots by referencing the bot userAgent list here:
/bot|crawler|spider|crawling/i
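As a quick sanity check, this regex catches the Googlebot user agent string from the question (pasted in below purely for illustration):

/bot|crawler|spider|crawling/i.test('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'); // true, since "Googlebot" contains "bot"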
For example, if you have some object, util.browser, you can store what type of device a user is on:
util.browser = {
  bot: /bot|googlebot|crawler|spider|robot|crawling/i.test(navigator.userAgent),
  mobile: ...,
  desktop: ...
}
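Tying this back to the original question, a minimal sketch of suppressing a JavaScript call for bots could look like the snippet below; loadAnalytics is a hypothetical function standing in for whatever call you want to skip.

if (!util.browser.bot) {
  loadAnalytics(); // hypothetical call; only runs for real visitors, not crawlers
}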