
Read Microsoft Works and/or One Note files from Java

Tags: java, file

I'm looking for a way to read Microsoft Works (.wps) and One Note (.one) files in a Java application. Actually, all I care about is extracting readable text from these files so I can index them.

I've had success using the Apache POI and Tika libraries to extract text from most other Microsoft formats, but these two are still elusive.

Thanks, Frank

Frank LaRosa asked Dec 30 '10 23:12


1 Answer

From what I can tell, the .one (OneNote) file format is proprietary, but there is a COM API: http://msdn.microsoft.com/en-us/library/ms788684(office.12).aspx#Office2007OneNoteWhatsNew_OneNote2007COMAPI. You might be able to write a small converter in another language that uses that API to export the data, and then call it from your Java application.

A couple of Google searches turn up programs that can convert .wps files, but I don't see any Java API or any documentation for the format. It might be doable. I'm not sure how many files you are dealing with, but you might need to use another application to convert the files, or have your users run another app to convert them first.
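If you go the "use another application" route, here is a rough sketch of what calling an external converter from Java might look like. It assumes a LibreOffice install with `soffice` on the PATH and that its Works import filter can handle your .wps files; I have not verified that against your data, and it does not cover .one files. The class and method names (`ExternalConverter`, `buildCommand`, `expectedOutput`, `extractText`) are made up for illustration, and the 60-second timeout is arbitrary.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ExternalConverter {

    // Builds the converter command line.
    // Assumption: "soffice" (LibreOffice) is installed and on the PATH.
    static List<String> buildCommand(Path input, Path outDir) {
        return List.of("soffice", "--headless", "--convert-to", "txt:Text",
                "--outdir", outDir.toString(), input.toString());
    }

    // The converter is expected to write <base name>.txt into outDir.
    static Path expectedOutput(Path input, Path outDir) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".txt");
    }

    // Runs the converter and returns the extracted text for indexing.
    static String extractText(Path input, Path outDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outDir))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        // Arbitrary timeout so one bad file cannot hang the indexer.
        if (!process.waitFor(60, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Converter timed out on " + input);
        }
        if (process.exitValue() != 0) {
            throw new IOException("Converter failed on " + input);
        }
        Path output = expectedOutput(input, outDir);
        if (!Files.exists(output)) {
            throw new IOException("Converter produced no output for " + input);
        }
        // Assumption: the output is UTF-8; the real encoding may depend
        // on the converter's settings and the system locale.
        return new String(Files.readAllBytes(output), StandardCharsets.UTF_8);
    }
}
```

You would call `ExternalConverter.extractText(file, tempDir)` for each file and hand the returned string to your indexer. Starting a process per file is slow, so if you have a lot of files it may be better to convert them in one batch up front.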

jzd answered Oct 01 '22 00:10