Can I open a .doc file and get that file's contents using Ruby?
I recently dealt with this in a project and wanted a lighter-weight library for getting the text out of .doc, .docx and .pdf files. DocRipper uses a combination of the Antiword, grep and Poppler (pdftotext) command-line tools to grab the text contents of a file and return them as a UTF-8 string.
dr = DocRipper::TextRipper.new('/path/to/file')
dr.text
=> "Document's text"
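If you would rather not pull in a gem, the same idea can be sketched by shelling out to the tools yourself. This is only a rough illustration, not DocRipper's actual implementation: the helper names `extraction_command` and `extract_text` are made up for this example, it assumes `antiword` and `pdftotext` are installed and on your PATH, and it leaves out .docx entirely.

```ruby
require 'open3'

# Hypothetical helper (not part of DocRipper): choose the command-line
# tool for a file based on its extension. Returns nil if unsupported.
def extraction_command(path)
  case File.extname(path).downcase
  when '.doc' then ['antiword', path]        # antiword prints the text to stdout
  when '.pdf' then ['pdftotext', path, '-']  # "-" tells pdftotext to write to stdout
  end
end

# Hypothetical helper: run the chosen tool and return its output as a string.
# Assumes the tool is installed; raises if the file type is unsupported
# or the tool exits with a non-zero status.
def extract_text(path)
  cmd = extraction_command(path)
  raise ArgumentError, "unsupported file type: #{path}" unless cmd

  output, status = Open3.capture2(*cmd)
  raise "#{cmd.first} failed for #{path}" unless status.success?

  # Replace any invalid byte sequences so callers get a clean string.
  output.scrub
end
```

Calling `extract_text('/path/to/file.doc')` should then give you the document text as a string. The encoding of Antiword's output depends on how it is configured on your system, so you may need to adjust that if you see odd characters.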