How would you go about reading odt files in PHP? I know you can use QueryPath, but that seems a bit of overkill. I just want to read the file.
ODT files are ZIP-compressed XML.
If all you need is the raw content, just unzip the file and read it like a normal file.
If you need to parse out usable text, that is where QueryPath or some other XML parser or XSLT processor comes in.
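For the "usable text" case, here is a rough sketch using only PHP's bundled DOM extension. The helper name odtXmlToText() is made up for illustration, and it assumes the text you want lives in text:p and text:h elements of content.xml. It ignores tabs, line breaks and repeated-space elements, and a paragraph nested inside another (for example in a text box) would be reported twice.

```php
<?php
// Sketch only: odtXmlToText() is a hypothetical helper, not part of any library.

/**
 * Collect the text of every paragraph and heading in an ODT content.xml string.
 * Returns an empty string if the input is not well-formed XML.
 */
function odtXmlToText(string $contentXml): string
{
    $doc = new DOMDocument();
    // Suppress libxml warnings; loadXML() returns false on malformed input.
    if (!$doc->loadXML($contentXml, LIBXML_NOERROR | LIBXML_NOWARNING)) {
        return '';
    }

    $xpath = new DOMXPath($doc);
    // The OpenDocument "text" namespace, bound here to the prefix "text".
    $xpath->registerNamespace('text', 'urn:oasis:names:tc:opendocument:xmlns:text:1.0');

    $lines = [];
    // Headings and paragraphs come back in document order.
    foreach ($xpath->query('//text:p | //text:h') as $node) {
        $lines[] = $node->textContent;
    }

    return implode("\n", $lines);
}
```

With the zip extension loaded, you should be able to fetch the XML with file_get_contents('zip://Template.odt#content.xml') and pass the result to the helper; Template.odt is just a placeholder file name.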
OpenTBS is able to read and modify OpenDocument files in PHP.
Since OpenDocument files are XML files stored in a zip archive, you can also use the TbsZip class to read the archive in PHP without any other library dependency.
/* Name of the document file */
$document = 'Template.odt';

/**
 * Return the XML of content.xml from an ODT file.
 * Note: despite the name, this returns XML markup, not plain text.
 */
function extracttext($filename) {
    $dataFile = "content.xml";
    // Create a new ZIP archive object
    $zip = new ZipArchive;
    // Open the archive file; open() returns true on success, an error code otherwise
    if (true === $zip->open($filename)) {
        // If successful, search for the data file in the archive
        if (($index = $zip->locateName($dataFile)) !== false) {
            // Index found! Now read it to a string
            $text = $zip->getFromIndex($index);
            // Close the archive now that the data is in memory
            $zip->close();
            // Load XML from a string, ignoring errors and warnings.
            // LIBXML_NOENT is left out: expanding entities from an untrusted file invites XXE.
            $xml = new DOMDocument;
            $xml->loadXML($text, LIBXML_NOERROR | LIBXML_NOWARNING);
            // Return XML
            return $xml->saveXML();
        }
        // Data file not in the archive: close it before falling through
        $zip->close();
    }
    // In case of failure return a message
    return "File not found";
}

echo extracttext($document);