Solr and web site indexing to create a site search

I am trying to build a site search for a simple HTTP site.

I have a site, let's call it www.mycompany.com, that is pure HTML.

Is there an easy way to use Solr to index the entire site and build a full-text search with Solr as the engine?

I googled for a bit and could not find anything specific of the type: Do A, Do B ... profit!

Also let me know if I am a bit off about what Solr is for :P

Thanks in advance.

asked Mar 19 '10 at 22:03 by feniix



1 Answer

Solr is only for indexing and searching text. It does not include a crawler, since crawling is outside the project's scope.

However, take a look at Nutch, which is a crawler and not too hard to set up initially.

Nutch and Solr can be integrated if you need some Solr-specific feature to search the index.
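
To make the split between crawling and indexing concrete, here is a minimal Python sketch of the indexing half only: it turns one already-fetched HTML page into a document and builds a request for Solr's JSON update handler. It is not how Nutch does it. The core name `site`, the field names `title` and `content`, and the base URL `http://localhost:8983/solr` are assumptions for illustration and have to match your own Solr setup and schema. Discovering and fetching the pages is still the crawler's job.

```python
import json
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore JavaScript and CSS
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def page_to_doc(url, html):
    """Build a Solr document (a plain dict) from one page.

    The field names are hypothetical; they must exist in your schema.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"id": url, "title": parser.title, "content": " ".join(parser.chunks)}


def build_update_request(solr_base, core, docs):
    """Prepare (but do not send) a POST to Solr's JSON update handler."""
    endpoint = f"{solr_base.rstrip('/')}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a running Solr instance, something like `urllib.request.urlopen(build_update_request("http://localhost:8983/solr", "site", [doc]))` would send the documents. For anything beyond a handful of pages, a real crawler such as Nutch is the better fit, as it handles link discovery, politeness and re-crawling.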

answered Nov 13 '22 at 04:11 by Mauricio Scheffer