Index pdf documents in Solr from C# client

I'm trying to index Word or PDF documents in Solr and found the ExtractingRequestHandler, but I can't figure out how to write C# code that performs the HTTP POST request shown in the Solr wiki: http://wiki.apache.org/solr/ExtractingRequestHandler.

I've installed Solr 3.4 on Tomcat 7 (7.0.22) using the files from the example/solr directory in the Solr zip, and I haven't altered anything. The ExtractingRequestHandler should be configured out of the box in solrconfig.xml and ready to use, right?

Can someone give a C# (HttpWebRequest) example of how to make the HTTP POST request and upload a PDF file, the way it is done with curl in the Solr wiki?

I've looked all over this site and many others for an example or a tutorial on how this is done, but haven't found anything.

EDIT:

I finally managed to get it to work using SolrNet!

For it to work, you need to copy the following from the Solr zip to a lib folder in your Solr installation directory:

the apache-solr-cell-3.4.0.jar file from the dist folder
the contents of the contrib\extraction\lib directory

With SolrNet 0.4.0 beta 2, this code does the job:

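// Needs: using System.IO; using SolrNet; using Microsoft.Practices.ServiceLocation;

// Register SolrNet for the document type against your Solr URL.
Startup.Init<IndexDocument>("YOUR-SOLR-SERVICE-PATH");
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<IndexDocument>>();

using (FileStream fileStream = File.OpenRead("FILE-PATH-FOR-THE-FILE-TO-BE-INDEXED"))
{
    // "doc1" is the id given to the indexed document.
    // ExtractOnly = false indexes the extracted content instead of only returning it.
    var response =
        solr.Extract(
            new ExtractParameters(fileStream, "doc1")
            {
                ExtractFormat = ExtractFormat.Text,
                ExtractOnly = false
            });
}

// Commit so the new document becomes visible to searches.
solr.Commit();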

Sorry for the trouble. I hope others will find this useful, though.

asked Jan 19 '12 23:01

jonasm

1 Answer

I would recommend using the SolrNet client. It supports the ExtractingRequestHandler.

Here is the deprecated repo on code.google.com.
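If you would rather stay with plain HttpWebRequest, as the question asks, here is a minimal sketch rather than a tested implementation. It sends the file as the raw request body, which as far as I know corresponds to the curl --data-binary variant on the wiki, not the multipart -F variant. The class name SolrExtractClient, both method names, and the base URL in the usage line are my own assumptions for illustration; adjust the host, port, and id to your setup.

```csharp
using System;
using System.IO;
using System.Net;

public static class SolrExtractClient
{
    // Builds the /update/extract URL (helper name is hypothetical).
    // literal.id sets the id of the indexed document;
    // commit=true asks Solr to commit right after indexing.
    public static string BuildExtractUrl(string solrBaseUrl, string documentId)
    {
        return solrBaseUrl.TrimEnd('/') + "/update/extract"
            + "?literal.id=" + Uri.EscapeDataString(documentId)
            + "&commit=true";
    }

    // Streams the file as the raw POST body and returns Solr's response text.
    // A non-2xx status surfaces as a WebException from GetResponse().
    public static string PostFile(string solrBaseUrl, string documentId,
                                  string filePath, string contentType)
    {
        var request = (HttpWebRequest)WebRequest.Create(
            BuildExtractUrl(solrBaseUrl, documentId));
        request.Method = "POST";
        request.ContentType = contentType; // e.g. "application/pdf"

        using (FileStream file = File.OpenRead(filePath))
        {
            request.ContentLength = file.Length;
            using (Stream body = request.GetRequestStream())
            {
                file.CopyTo(body); // Stream.CopyTo needs .NET 4.0 or later
            }
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would be something like SolrExtractClient.PostFile("http://localhost:8080/solr", "doc1", "FILE-PATH-FOR-THE-FILE-TO-BE-INDEXED", "application/pdf"), where the base URL assumes a default Tomcat port. The extraction jars still have to be on Solr's classpath as described in the question's edit.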

answered Sep 17 '22 17:09

Paige Cook

Tags: c#, pdf, solr, tomcat, solrnet
