I have an OFX file downloaded from Citibank. The file has a DTD defined at http://www.ofx.net/DownloadPage/Files/ofx102spec.zip (file OFXBANK.DTD), and the OFX file appears to be valid SGML. I'm trying to parse it with DOMDocument in PHP 5.4.13, but I get several warnings and the file is not parsed. My code is:
$file = "source/ACCT_013.OFX";
$dtd = "source/ofx102spec/OFXBANK.DTD";
$doc = new DomDocument();
$doc->loadHTMLFile($file);
$doc->schemaValidate($dtd);
$dom->validateOnParse = true;
The OFX file starts as:
OFXHEADER:100
DATA:OFXSGML
VERSION:102
SECURITY:NONE
ENCODING:USASCII
CHARSET:1252
COMPRESSION:NONE
OLDFILEUID:NONE
NEWFILEUID:NONE
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<DTSERVER>20130331073401
<LANGUAGE>SPA
</SONRS>
</SIGNONMSGSRSV1>
<BANKMSGSRSV1>
<STMTTRNRS>
<TRNUID>0
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<STMTRS>
<CURDEF>COP
<BANKACCTFROM> ...
I'm open to installing and using any program on the server (CentOS) that can be called from PHP.
PS: This class http://www.phpclasses.org/package/5778-PHP-Parse-and-extract-financial-records-from-OFX-files.html doesn't work for me.
For Mac users, OFX files can be opened by using GnuCash, Intuit Quicken, Reilly Technologies Moneydance and Apple Numbers. For Microsoft Windows users, they can be opened using GnuCash, Sage Accpac, Microsoft Money, Intuit Quicken and Reilly Technologies Moneydance.
An OFX file is a financial data file created in the Open Financial Exchange (OFX) format, an open format for transferring data between vendors, consumers, and financial systems. It contains transactions, statements, and other financial information.
Open Financial Exchange (OFX) is a data-stream format for exchanging financial information that evolved from Microsoft's Open Financial Connectivity (OFC) and Intuit's Open Exchange file formats. Its filename extension is .ofx.
Well, first of all, even though XML is a subset of SGML, a valid SGML file need not be a well-formed XML file. XML is stricter and does not use all the features that SGML offers.
As DOMDocument is XML-based (and not SGML-based), it is not really compatible with this file.
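To illustrate the incompatibility, here is a minimal sketch that feeds a small OFX-style fragment straight to DOMDocument without recover mode. The fragment is a shortened stand-in for the real file, and the exact libxml messages may differ between versions.

```php
<?php
// A tiny OFX-style SGML fragment: <CODE> and <SEVERITY> are never closed.
$sgml = "<STATUS>\n<CODE>0\n<SEVERITY>INFO\n</STATUS>";

// Collect libxml errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadXML($sgml); // strict XML parsing, no recover mode

$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The fragment is not well-formed XML, so loading is expected to fail.
var_dump($loaded);
foreach ($errors as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), "\n";
}
```

The unclosed leaf elements are legal in this SGML dialect but make the input fail XML's well-formedness rules, which is why a conversion step is needed first.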
Besides that problem, please see section 2.2, Open Financial Exchange Headers, in Ofexfin1.doc; it explains that
The contents of an Open Financial Exchange file consist of a simple set of headers followed by contents defined by that header
and further on:
A blank line follows the last header. Then (for type OFXSGML), the SGML-readable data begins with the <OFX> tag.
So locate the first blank line and strip everything up to it. Then load the SGML part into DOMDocument by converting the SGML into XML first:
$source = fopen('file.ofx', 'r');
if (!$source) {
    throw new Exception('Unable to open OFX file.');
}

// read the headers of the OFX file (everything up to the first blank line)
$headers = array();
$charsets = array(
    1252 => 'WINDOWS-1252',
);
while (!feof($source)) {
    $line = trim(fgets($source));
    if ($line === '') {
        break;
    }
    list($header, $value) = explode(':', $line, 2);
    $headers[$header] = $value;
}

// encoding named by the CHARSET header; falling back to WINDOWS-1252 is an assumption
$charset = isset($headers['CHARSET'], $charsets[$headers['CHARSET']])
    ? $charsets[$headers['CHARSET']]
    : 'WINDOWS-1252';

$buffer = '';

// dead-cheap SGML to XML conversion
// see as well http://www.hanselman.com/blog/PostprocessingAutoClosedSGMLTagsWithTheSGMLReader.aspx
while (!feof($source)) {
    $line = trim(fgets($source));
    if ($line === '') {
        continue;
    }
    $line = iconv($charset, 'UTF-8', $line);
    // a line that does not end in ">" is a leaf element: append its closing tag
    if (substr($line, -1, 1) !== '>') {
        list($tag) = explode('>', $line, 2);
        $line .= '</' . substr($tag, 1) . '>';
    }
    $buffer .= $line . "\n";
}
fclose($source);

// use DOMDocument with non-standard recover mode
$doc = new DOMDocument();
$doc->recover = true;
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$save = libxml_use_internal_errors(true);
$doc->loadXML($buffer);
libxml_use_internal_errors($save);

echo $doc->saveXML();
This code example then outputs the following (re-formatted) XML, which also shows that DOMDocument loaded the data properly:
<?xml version="1.0"?>
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0</CODE>
<SEVERITY>INFO</SEVERITY>
</STATUS>
<DTSERVER>20130331073401</DTSERVER>
<LANGUAGE>SPA</LANGUAGE>
</SONRS>
</SIGNONMSGSRSV1>
<BANKMSGSRSV1>
<STMTTRNRS>
<TRNUID>0</TRNUID>
<STATUS>
<CODE>0</CODE>
<SEVERITY>INFO</SEVERITY>
</STATUS>
<STMTRS><CURDEF>COP</CURDEF><BANKACCTFROM> ...</BANKACCTFROM>
</STMTRS>
</STMTTRNRS>
</BANKMSGSRSV1>
</OFX>
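Once the data is in a DOMDocument, individual values can be read with DOMXPath. The following is only a sketch: the inline `$xml` string is a shortened stand-in for the converted document, and the variable names are illustrative.

```php
<?php
// Shortened stand-in for the converted XML shown above.
$xml = '<OFX><SIGNONMSGSRSV1><SONRS>'
     . '<STATUS><CODE>0</CODE><SEVERITY>INFO</SEVERITY></STATUS>'
     . '<DTSERVER>20130331073401</DTSERVER><LANGUAGE>SPA</LANGUAGE>'
     . '</SONRS></SIGNONMSGSRSV1></OFX>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);

// string(...) makes evaluate() return the text of the first matching node.
$code       = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/STATUS/CODE)');
$severity   = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/STATUS/SEVERITY)');
$serverTime = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/DTSERVER)');

echo "Sign-on status: $code ($severity), server time: $serverTime\n";
```

The same pattern should apply to the statement data under BANKMSGSRSV1, with the paths adjusted to the elements present in the actual file.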
I do not know whether this can then be validated against the DTD; maybe it works. Additionally, the conversion assumes that each element's value is on the same line as its opening tag and that each line holds only a single element. If the SGML is not written that way, this fragile conversion will break.
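If the line layout cannot be relied on, one possible alternative is to split the input into tags and text instead of lines. The sketch below uses a hypothetical helper, `ofxSgmlToXml()`, and assumes that the input is already UTF-8, that a tag directly followed by text is a leaf element, and that a tag directly followed by another tag is an aggregate with its own closing tag. It is not a full SGML parser and has not been checked against the OFX DTD.

```php
<?php
/**
 * Hypothetical helper: close OFX 1.x leaf elements regardless of line layout.
 */
function ofxSgmlToXml($sgml)
{
    // Split into tags and text runs, keeping the tags themselves.
    $parts  = preg_split('/(<[^>]+>)/', $sgml, -1, PREG_SPLIT_DELIM_CAPTURE);
    $tokens = array_values(array_filter(array_map('trim', $parts), 'strlen'));

    $xml   = '';
    $count = count($tokens);
    for ($i = 0; $i < $count; $i++) {
        $token  = $tokens[$i];
        $isOpen = preg_match('/^<([^\/>][^>]*)>$/', $token, $m) === 1;
        $next   = $i + 1 < $count ? $tokens[$i + 1] : '';

        if ($isOpen && $next !== '' && $next[0] !== '<') {
            // Leaf element: opening tag followed by text, so add the closing tag.
            $name = $m[1];
            $text = htmlspecialchars($next, ENT_NOQUOTES, 'UTF-8', false);
            $xml .= "<$name>" . $text . "</$name>\n";
            $i++; // the text token has been consumed
            // Skip an explicit closing tag if the file already provides one.
            if ($i + 1 < $count && $tokens[$i + 1] === "</$name>") {
                $i++;
            }
        } else {
            // Aggregate opening tag or any closing tag: copy as-is.
            $xml .= $token . "\n";
        }
    }
    return $xml;
}

// Several elements on one line, which the line-based conversion cannot handle.
$converted = ofxSgmlToXml('<STATUS><CODE>0<SEVERITY>INFO</STATUS>');
echo $converted;

$check = new DOMDocument();
$wellFormed = $check->loadXML($converted);
```

The header block still has to be stripped before calling the helper, exactly as in the line-based version.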