Can a raw Lucene index be loaded by Solr?

Success! With Pascal's suggestion of changes to schema.xml I got it working in no time. Thanks!

Here are my complete steps for anyone interested:

1. Downloaded Solr and copied dist/apache-solr-1.4.0.war to tomcat/webapps
2. Copied example/solr/conf to /usr/local/solr/
3. Copied the pre-existing Lucene index files to /usr/local/solr/data/index
4. Set solr.home to /usr/local/solr
5. In solrconfig.xml, changed dataDir to /usr/local/solr/data (Solr looks for the index directory inside it)
6. Loaded my Lucene index into Luke for browsing (awesome tool)
7. In the example schema.xml, removed all fields and field types except for "string"
8. In the example schema.xml, added 14 field definitions corresponding to the 14 fields shown in Luke. Example: <field name="docId" type="string" indexed="true" stored="true"/>
9. In the example schema.xml, changed uniqueKey to the field in my index that seemed to be a document id
10. In the example schema.xml, changed defaultSearchField to the field in my index that seemed to contain terms
11. Started Tomcat, finally saw no exceptions, and successfully ran some queries at localhost:8080/solr/admin

This is just proof for me that it can work. Obviously there's a lot more configuration to be done.
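
The schema.xml field definitions from the steps above can be generated rather than typed by hand. The following is a minimal sketch, assuming every field maps to the "string" type as in the example; the class SchemaFieldSketch, its methods, and the sample names title and body are hypothetical and not part of Solr or Luke.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Solr: builds schema.xml <field> lines
// from the field names that Luke reports for an existing index.
final class SchemaFieldSketch {

    // One <field> element using the "string" type, as in the example above.
    // Assumes the name needs no XML escaping.
    static String fieldDefinition(String name) {
        return String.format(
            "<field name=\"%s\" type=\"string\" indexed=\"true\" stored=\"true\"/>", name);
    }

    // One definition per line, in the order given.
    static String fieldDefinitions(List<String> names) {
        return names.stream()
            .map(SchemaFieldSketch::fieldDefinition)
            .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Placeholder names; replace them with the fields shown in Luke.
        System.out.println(fieldDefinitions(List.of("docId", "title", "body")));
    }
}
```

Paste the printed lines into the fields section of schema.xml, then adjust the type of any field that should not be treated as a plain string.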

I have never tried this, but you would have to adjust the schema.xml to include all the fields of the documents that are in your Lucene index, because Solr won't allow you to search for a field if it is not defined in schema.xml.

The adjustment to schema.xml should also define query-time analyzers that match how each field was indexed, especially if the fields were indexed using custom analyzers.
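
To see why the query-time analyzer has to match the index-time one, here is a toy model in plain Java. It does not call Lucene or Solr; the class AnalyzerMismatchSketch and its two "analyzers" are hypothetical stand-ins, and matching is simplified to "every query term must be an indexed term".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Toy model of text analysis; it only imitates what real analyzers do.
final class AnalyzerMismatchSketch {

    // Imitates a whitespace analyzer: split on whitespace, keep case.
    static final Function<String, String[]> WHITESPACE =
        text -> text.trim().split("\\s+");

    // Imitates a lowercasing analyzer: lowercase, then split on whitespace.
    static final Function<String, String[]> LOWERCASING =
        text -> text.trim().toLowerCase(Locale.ROOT).split("\\s+");

    // Terms that end up in the index for one field value.
    static Set<String> index(String value, Function<String, String[]> analyzer) {
        return new HashSet<>(Arrays.asList(analyzer.apply(value)));
    }

    // Simplified matching: every analyzed query term must be an indexed term.
    static boolean matches(Set<String> indexedTerms, String query,
                           Function<String, String[]> queryAnalyzer) {
        return indexedTerms.containsAll(Arrays.asList(queryAnalyzer.apply(query)));
    }

    public static void main(String[] args) {
        Set<String> terms = index("Apache Lucene", WHITESPACE);
        System.out.println(matches(terms, "Lucene", WHITESPACE));  // prints true
        System.out.println(matches(terms, "Lucene", LOWERCASING)); // prints false
    }
}
```

The second query fails because the index holds the term "Lucene" while the query analyzer produces "lucene". The same kind of mismatch is what a field type with the wrong analyzer in schema.xml would cause.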

In solrconfig.xml you may have to change settings in the indexDefaults and the mainIndex sections.

But I'd be happy to read answers from people who actually did it.

Three steps in the end:

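1. Change schema.xml (or managed-schema) so that it defines the fields in your index
2. Change <dataDir> in solrconfig.xml to point at the directory that contains the index
3. Restart Solr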

I have my study notes here for those who are new to Solr, like me :)
To generate a Lucene index yourself, you can use the code below.

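import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class LuceneIndex {
    private static Directory directory;

    public static void main(String[] args) throws IOException {
        long startTime = System.currentTimeMillis();

        // Open the index directory; FSDirectory.open picks a suitable
        // implementation for the platform (the original used SimpleFSDirectory).
        Path path = Paths.get("/tmp/myindex/index");
        directory = FSDirectory.open(path);
        IndexWriter writer = getWriter();

        // Every document gets the fields "id" and "manu".
        int documentCount = 10000000;
        List<String> fieldNames = Arrays.asList("id", "manu");

        // Stored, tokenized field that indexes docs and term frequencies, without norms.
        FieldType myFieldType = new FieldType();
        myFieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS);
        myFieldType.setOmitNorms(true);
        myFieldType.setStored(true);
        myFieldType.setTokenized(true);
        myFieldType.freeze();

        for (int i = 0; i < documentCount; i++) {
            Document doc = new Document();
            for (String fieldName : fieldNames) {
                // Values look like "id0", "manu0", "id1", "manu1", ...
                doc.add(new Field(fieldName, fieldName + i, myFieldType));
            }
            writer.addDocument(doc);
        }

        // Closing the writer commits the index; then release the directory.
        writer.close();
        directory.close();
        System.out.println("Finished Indexing");
        long elapsedMillis = System.currentTimeMillis() - startTime;
        System.out.println("Elapsed ms: " + elapsedMillis);
    }

    private static IndexWriter getWriter() throws IOException {
        // WhitespaceAnalyzer splits on whitespace only and keeps case.
        return new IndexWriter(directory, new IndexWriterConfig(new WhitespaceAnalyzer()));
    }
}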
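
The generator above writes values such as manu42 into the manu field, so once Solr is pointed at the index, a field query for that value is a quick sanity check. The sketch below only builds the URL for Solr's /select handler; the class SelectUrlSketch, the host, port, and the core name mycore are assumptions to replace with your own setup (older single-core installs, like the Solr 1.4 one described earlier, have no core name in the path).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a /select query URL for a field:value lookup.
final class SelectUrlSketch {

    // coreUrl is the base URL of the Solr core (or of Solr itself on single-core installs).
    static String selectUrl(String coreUrl, String field, String value) {
        String query = field + ":" + value;
        // URL-encode the query so the colon and any special characters are safe.
        return coreUrl + "/select?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed host, port, and core name; adjust to your installation.
        System.out.println(selectUrl("http://localhost:8983/solr/mycore", "manu", "manu42"));
    }
}
```

Open the printed URL in a browser or pass it to curl. If schema.xml declares manu with a type whose query analysis leaves the term unchanged, the response should contain the document that was indexed with that value.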
