
How to forcefully abort a data import through the Solr DIH HTTP API

Steps to reproduce the problem:

1. Set up a large data set (around 4 GB, or more than 50 million records).
2. Provide a valid data-config.xml for indexing the data from a remote database server.
3. While Solr is indexing the data from SQL Server 2010, unplug the network cable
   about halfway through and check the import status in Solr, e.g.
   localhost:8083/solr/core1/dataimport?command=status
   or
   localhost:8083/solr/core1/dataimport
4. Wait a few seconds, then plug the cable back in.
5. Only the "Time Elapsed" value keeps increasing.
   "Total Rows Fetched" and "Total Documents Processed" stay the same indefinitely.
6. The problem can also be reproduced with a small data set.
7. The only workaround is to restart Solr, which is not a good solution.
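
The symptom in step 5 can be checked from a script instead of by reloading the status page. The following is a minimal Python sketch, not part of DIH itself: it assumes the status handler returns JSON when `wt=json` is passed, with a top-level `status` field and the counters under `statusMessages`. The URL is the one from step 3; the helper names (`fetch_counters`, `parse_counters`, `is_stalled`, `watch`) are hypothetical.

```python
import json
import time
import urllib.request

# Assumed URL, taken from step 3; adjust host, port, and core name.
STATUS_URL = "http://localhost:8083/solr/core1/dataimport?command=status&wt=json"


def parse_counters(body):
    """Extract (status, rows_fetched, docs_processed) from a status response.

    Assumes the counters are reported as strings under "statusMessages".
    """
    msgs = body.get("statusMessages", {})
    rows = int(msgs.get("Total Rows Fetched", 0))
    docs = int(msgs.get("Total Documents Processed", 0))
    return body.get("status"), rows, docs


def fetch_counters(url=STATUS_URL, timeout=10):
    """Call the status URL once and return the parsed counters."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_counters(json.load(resp))


def is_stalled(samples):
    """True if the import stayed 'busy' and no counter changed across samples."""
    if len(samples) < 2:
        return False
    first = samples[0]
    return first[0] == "busy" and all(s == first for s in samples)


def watch(polls=6, interval=10.0):
    """Poll the status URL `polls` times and report whether the import looks stuck."""
    samples = []
    for _ in range(polls):
        samples.append(fetch_counters())
        time.sleep(interval)
    return is_stalled(samples)
```

Calling `watch()` while the import is running returns `True` when the handler stays busy but neither counter moves, which matches the state described above. The poll count and interval are arbitrary and should be tuned to how slowly the database normally delivers rows.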

Note: This is a very important issue, because many organizations avoid this valuable
product just because of this database connection that hangs indefinitely. A possible solution: forcefully abort the data indexing, or provide a mechanism to forcefully abort it. Note that the abort command does not work in this situation either.

Sanket Thakkar asked Nov 10 '22 05:11


1 Answer

From the Solr documentation (http://wiki.apache.org/solr/DataImportHandler):

Abort an ongoing operation by hitting the URL http://<host>:<port>/solr/dataimport?command=abort .

I just checked the source code for DIH, and the abort command is implemented.
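
For reference, the abort command can be sent from a script as well as from a browser. This is a rough Python sketch using only the standard library. The base URL, port, and core name are assumptions copied from the question, and `dih_url` and `abort_import` are hypothetical helper names. Whether the command takes effect while the import is waiting on the database is the point the question disputes, so the sketch only shows how to issue the request.

```python
import urllib.parse
import urllib.request


def dih_url(base, core, command):
    """Build a DataImportHandler URL for the given core and command."""
    query = urllib.parse.urlencode({"command": command, "wt": "json"})
    return f"{base.rstrip('/')}/{core}/dataimport?{query}"


def abort_import(base="http://localhost:8083/solr", core="core1", timeout=10):
    """Send command=abort and return the raw response body as text.

    Host, port, and core name are assumptions from the question.
    """
    url = dih_url(base, core, "abort")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

After calling `abort_import()`, the status URL from the question can be checked again to see whether the handler has left the busy state.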

Greg S answered Nov 15 '22 08:11