
Reading ODT files in PHP

How would you go about reading ODT files in PHP? I know you can use QueryPath, but that seems like overkill. I just want to read the file.

john mossel asked Nov 01 '10 14:11


3 Answers

ODT files are ZIP archives of XML files; the document body lives in content.xml.

If all you need is the raw content, just unzip the file and read it like a normal file.
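A minimal sketch of that "unzip and read" route, assuming PHP's zip extension is loaded (it provides the zip:// stream wrapper). The helper name readOdtPart is made up for illustration; the only thing taken from the ODT format is that the body is stored under content.xml.

```php
<?php
// Hypothetical helper: return the raw XML of one member of an .odt archive,
// or null if the archive or the member cannot be read.
function readOdtPart(string $odtPath, string $part = 'content.xml'): ?string
{
    // zip://<archive>#<member> reads a single entry straight out of the archive.
    // The @ hides the warning PHP raises when the archive or entry is missing.
    $xml = @file_get_contents('zip://' . $odtPath . '#' . $part);

    return $xml === false ? null : $xml;
}

// Usage (assumes a Template.odt next to the script):
// echo readOdtPart('Template.odt');
```

Nothing is extracted to disk here; the entry is read directly into a string, which is usually enough if you only want to look at the XML.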

If you need to parse out usable text, that is where QueryPath, or some other XML parser or XSLT processor, comes in.
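If plain text is all you are after, PHP's bundled DOM extension may be enough without QueryPath. The following is a rough sketch, not a complete ODT reader: odtToPlainText is a hypothetical name, it only collects paragraphs (text:p) and headings (text:h), and it assumes those are not nested inside one another (text boxes inside a paragraph would be reported twice).

```php
<?php
// Hypothetical helper: return the paragraphs and headings of an .odt file
// as newline-separated plain text, or null on failure.
function odtToPlainText(string $odtPath): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($odtPath) !== true) {
        return null; // not a readable ZIP archive
    }
    $xml = $zip->getFromName('content.xml');
    $zip->close();
    if ($xml === false) {
        return null; // archive has no content.xml
    }

    $dom = new DOMDocument();
    // LIBXML_NONET blocks network access while parsing; errors are suppressed.
    if (!$dom->loadXML($xml, LIBXML_NONET | LIBXML_NOERROR | LIBXML_NOWARNING)) {
        return null;
    }

    $xpath = new DOMXPath($dom);
    // Namespace URI of the OpenDocument "text" vocabulary.
    $xpath->registerNamespace('text', 'urn:oasis:names:tc:opendocument:xmlns:text:1.0');

    $lines = [];
    // The union is returned in document order, so headings stay in place.
    foreach ($xpath->query('//text:p | //text:h') as $node) {
        $lines[] = $node->textContent;
    }

    return implode("\n", $lines);
}

// Usage (assumes a Template.odt next to the script):
// echo odtToPlainText('Template.odt');
```

Tabs, line breaks and list numbering are stored as separate elements in ODT, so they are lost in this simplified version.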

Harmon Wood answered Nov 04 '22 00:11


OpenTBS is able to read and modify OpenDocument files in PHP.

Since OpenDocument files are XML files stored in a ZIP archive, you can also use the TbsZip class to read the archive in PHP without any other library dependency.

Skrol29 answered Nov 04 '22 01:11


/* Name of the document file */
$document = 'Template.odt';

/**
 * Return the XML stored in content.xml of an ODT file,
 * or an error message if it cannot be read.
 */
function extracttext($filename) {

    $dataFile = "content.xml";
    $result = false;

    // Create a new ZIP archive object
    $zip = new ZipArchive;

    // Open the archive file
    if (true === $zip->open($filename)) {
        // If successful, search for the data file in the archive
        if (($index = $zip->locateName($dataFile)) !== false) {
            // Index found! Now read it to a string
            $text = $zip->getFromIndex($index);
            // Load XML from a string, ignoring errors and warnings.
            // LIBXML_NOENT and LIBXML_XINCLUDE are left out on purpose: they
            // enable entity substitution and XInclude, which is risky for
            // files you do not control.
            $xml = new DOMDocument;
            if ($text !== false && $xml->loadXML($text, LIBXML_NOERROR | LIBXML_NOWARNING)) {
                $result = $xml->saveXML();
            }
        }
        // Close the archive file (also reached when the data file was found)
        $zip->close();
    }
    // Return the XML, or a message in case of failure
    return $result !== false ? $result : "File not found";
}

echo extracttext($document);

Nishant Bhatt answered Nov 04 '22 00:11