I generate files, lets call them .dwrf files, which contain a significant amount of data. Currently we export those to .CSV and the resulting files are large (2GB+). I would like to cut out the export process and make the contents of a .dwrf file queryable directly from Excel or other applications.
What I would like to do is write a utility/service - lets call it dwrfMiner - to extract data from the file and pass it on as a datasource and link dwrfMiner to .dwrf files in some way so that Excel recognises it as an external data source.
Any ideas?
While writing an ODBC driver for this is probably overkill, if the format of the files you are working with is known in advance and isn't too hard to translate (it sounds like not considering you are already creating CSVs) then using an ODBC DSN sounds like your best bet.
There are a nice selection of ODBC drivers already built in to Windows (.txt, .csv, .mdb, .xl*, .dbf, Paradox .db, etc etc) and you can obtain other drivers from the web for a lot of common formats.
If the size of the existing format you're exporting to is too onerous (CSV) then the logical point to start is a transformation of your data to something more space-conscious that has ODBC support.
Failing that, your last option is the overkill option (Writing an ODBC driver).
Excel can query external data souces, but beware that Excel (all versions) have hard-limits on the number of rows they can display, per work-book. I think in Excel 2003 the limit is ~65k. It's higher in other versions.
See my question: reporting tool/viewer for large datasets (and I had much less than > 2GB).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With