
Getting the number of results of a Google (or other) search programmatically

I am working on a small personal project. Ideally, I would like to run a Google search programmatically and get the number of results. (My goal is to compare result counts across a large number (100,000+) of different phrases.)

Is there a free way to run a web search and compare the popularity of different texts, using Google, Bing, or anything else? (The source is not really important.)

I tried Google, but it seems that I can only make 10 requests per day for free. Bing is more permissive (5,000 free requests per month).

Are there other tools or ways to get the number of results for a particular sentence for free? Thanks in advance.

LastMove asked Jul 31 '16 23:07



1 Answer

There are several things you're going to need if you're seeking to create a simple search engine.

First of all, you should read up on where the field of information retrieval started, with G. Salton's paper, or at least read the wiki page on the vector space model. It will require learning at least some undergraduate linear algebra; I suggest Gilbert Strang's MIT video lectures for this.
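To make the vector space model a bit more concrete, here is a minimal sketch in Python. It is not taken from Salton's paper: it assumes the simplest possible setup, where each text becomes a raw term-count vector and documents are ranked against a query by cosine similarity. The names `tf_vector`, `cosine_similarity` and `rank` are made up for this illustration, and a real system would add proper tokenization and a weighting scheme such as TF-IDF.

```python
import math
from collections import Counter


def tf_vector(text):
    """Turn a text into a sparse term-count vector (term -> count).

    Assumption: whitespace tokenization and lowercasing are enough here.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, best match first."""
    q = tf_vector(query)
    scored = [(cosine_similarity(q, tf_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["cooking pasta at home", "building a web search engine"]
    for score, doc in rank("web search", docs):
        print(round(score, 3), doc)
```

The linear algebra mentioned above shows up in `cosine_similarity`: it is just a dot product divided by the product of the two vector lengths.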

You can then move on to the Brin/Page PageRank paper, which lays out the original concept behind the hyperlink matrix and quickly calculating eigenvectors for ranking, or read the wiki page.
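As a rough illustration of the PageRank idea, here is a simplified power-iteration sketch in Python. It is not the implementation from the paper: the function name `pagerank`, the dict-of-lists graph format, the fixed iteration count and the handling of pages without outgoing links are all assumptions made for this example. The damping factor of 0.85 is a commonly used value.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated redistribution of rank.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

Repeating the update until the scores stop changing is what approximates the dominant eigenvector of the hyperlink matrix; in the toy graph above, page "a" ends up on top because two pages link to it.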

You may also be interested in looking at the code for Apache Lucene.

To get into contemporary search techniques, you need calculus and regression analysis in order to learn machine learning and deep learning, as the current Google search has moved away from PageRank and utilizes these. This is partially due to how link farming enabled people to artificially engineer search results, and partially due to the huge amount of metadata that modern browsers and web servers allow to be collected.

EDIT:

For the web crawler portion only, I'd recommend WebSPHINX. I used it in my senior research in college, in conjunction with Lucene.

Usi answered Sep 19 '22 15:09
