A good rule of thumb is to avoid using SQLite in situations where the same database will be accessed directly (without an intervening application server) and simultaneously from many computers over a network. SQLite will normally work fine as the database backend to a website.
XML is a very useful technology for moving data between different databases or between databases and other programs.
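SQLite disadvantages: SQLite is suited to low to medium traffic HTTP requests. Database size is often said to be restricted to 2GB, but SQLite's documented maximum is far larger (roughly 281 TB), so in practice the limit usually comes from the filesystem or the available storage.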
"You can store XML in a database designed specifically for XML, in a modified object database, or in a relational database."
Man do I have experience with this. I work on a project where we originally stored all of our data using XML, then moved to sqlite. There are many pros and cons to each technology, but it was performance that caused the switchover. Here is what we observed.
For small databases (a few meg or smaller), XML was much faster, and easier to deal with. Our data was naturally in a tree format, which made XML much more attractive, and XPATH allowed us to do many queries in one simple line rather than having to walk down an ancestry tree.
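To illustrate the "one simple line" point, here is a minimal sketch in Python rather than the Win32 DOM library we actually used. The element and attribute names are made up for the example, and note that `xml.etree.ElementTree` only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree-shaped data; the names below are illustrative only.
DOC = """
<library>
  <shelf name="a">
    <book id="1"><title>First</title></book>
    <book id="2"><title>Second</title></book>
  </shelf>
  <shelf name="b">
    <book id="3"><title>Third</title></book>
  </shelf>
</library>
"""

root = ET.fromstring(DOC)

# One path expression finds a node anywhere in the tree by attribute value.
book = root.find(".//book[@id='3']")
title = book.findtext("title")

# Another selects every title under one specific shelf.
shelf_a_titles = [t.text for t in root.findall("./shelf[@name='a']/book/title")]
```

Neither query needs any code that walks the tree level by level, which is what made XPath attractive for this kind of data.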
We were programming in a Win32 environment, and used the standard Microsoft DOM library. We would load all the data into memory, parse it into a DOM tree, and search, add, and modify on the in-memory copy. We would periodically save the data, and needed to rotate copies in case the machine crashed in the middle of a write.
We also needed to build up some "indexes" by hand using C++ tree maps. This, of course, would be trivial to do with SQL.
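The save-and-rotate step can be sketched roughly like this. It is not our original code (that was C++ on Win32); the function name `save_atomically` and the `.bak` suffix are assumptions for illustration. The idea is to write to a temporary file first, keep one previous generation, and only then swap the new file into place.

```python
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET


def save_atomically(tree: ET.ElementTree, path: str) -> None:
    """Write the tree to a temp file, back up the old copy, then swap."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            tree.write(f, encoding="utf-8", xml_declaration=True)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the swap
        if os.path.exists(path):
            # Keep one previous generation in case the new file is bad.
            shutil.copy2(path, path + ".bak")
        # os.replace swaps the file in a single step on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A crash in the middle of the write then damages only the temporary file, never the last good copy.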
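For comparison, this is roughly what the SQL side of that looks like, sketched with Python's `sqlite3` module. The table and column names are hypothetical, not our real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO items (name, category) VALUES (?, ?)",
    [("alpha", "x"), ("beta", "y"), ("gamma", "x")],
)

# A single statement stands in for a hand-maintained lookup map.
conn.execute("CREATE INDEX idx_items_category ON items (category)")

rows = conn.execute(
    "SELECT name FROM items WHERE category = ? ORDER BY name", ("x",)
).fetchall()
```

The database keeps the index up to date on every insert, update, and delete, which is the bookkeeping we had to write by hand for the maps.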
Note that the size of the data on the filesystem was a factor of 2-4 smaller than the "in memory" dom tree.
By the time the data reached 10M-100M in size, we started to have real problems. Interestingly enough, at all data sizes, XML processing was much faster than sqlite turned out to be (because it was in memory, not on the hard drive)! The problem was actually twofold. First, load time really started to get long: we would need to wait a minute or so before the data was in memory and the maps were built. Of course, once loaded, the program was very fast. Second, all of this memory was tied up all the time. On systems with only a few hundred meg of RAM, other apps would become unresponsive even though our program ran very fast.
We actually looked into using a filesystem-based XML database. There are a couple of open source XML databases, and we tried them. I have never tried a commercial XML database, so I can't comment on those. Unfortunately, we could never get the XML databases to work well at all. Even the act of populating the database with hundreds of meg of XML took hours. Perhaps we were using them incorrectly. Another problem was that these databases were pretty heavyweight: they required Java and had a full client-server architecture. We gave up on this idea.
We found sqlite then. It solved our problems, but at a price. When we initially plugged sqlite in, the memory and load time problems were gone. Unfortunately, since all processing was now done on the hard drive, the background processing load went way up: where earlier we never even noticed the CPU load, processor usage was now high. We needed to optimize the code, and still needed to keep some data in memory. We also needed to rewrite many simple XPath queries as complicated multi-query algorithms.
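To give a feel for what replaces a one-line XPath ancestor query, here is a sketch that stores a tree as an adjacency list (each row points at its parent) and walks up to the root. The schema is an assumption, not what we actually used, and it relies on recursive common table expressions, which need SQLite 3.8.3 or later. Without them you end up issuing one query per level, which is the "multi-query" pain described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES node(id),"
    " name TEXT)"
)
conn.executemany(
    "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "branch"), (3, 2, "leaf"), (4, 1, "other")],
)


def ancestors(node_id):
    """Return the names from the given node up to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.name, c.depth + 1
            FROM node AS n JOIN chain AS c ON n.id = c.parent_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (node_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `ancestors(3)` walks leaf, branch, root in that order.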
So here is a summary of what we learned.
For tree data, XML is much easier to query and modify using XPATH.
For small datasets (less than 10M), XML blew away sqlite in performance.
For large datasets (greater than 10M-100M), XML load time and memory usage became a big problem, to the point that some computers became unusable.
We couldn't get any open source XML database to fix the problems associated with large datasets.
sqlite doesn't have the memory problems of the XML DOM, but it is generally slower at processing the data (it is on the hard drive, not in memory). (Note: sqlite tables can be stored in memory, which might make it as fast. We didn't try this because we wanted to get the data out of memory.)
Storing and querying tree data in a table is not enjoyable. However, managing transactions and indexing partially makes up for it.
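On the note about in-memory sqlite tables: we never tried it, so treat the following only as a sketch of how it could be done with Python's `sqlite3` module (the `kv` table is hypothetical). The special name `:memory:` opens a database that lives in RAM, and `Connection.backup` (Python 3.7+) copies an on-disk database into it.

```python
import os
import sqlite3
import tempfile

# Create a small on-disk database to stand in for real data.
path = os.path.join(tempfile.mkdtemp(), "data.db")
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
disk.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
disk.commit()

# ":memory:" opens a database that exists only in RAM for this connection.
mem = sqlite3.connect(":memory:")
# Copy the on-disk contents into the in-memory database.
disk.backup(mem)
disk.close()

value = mem.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()[0]
```

This brings back the memory cost and the load time of copying, so it only makes sense if the working set is small enough.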
I basically agree with Mitchel that this can be highly specific, depending on what you are going to do with XML/sqlite. For your case (a cache), it seems to me that using sqlite (or another embedded db) makes more sense.
First, I don't really think that sqlite will need more overhead than XML, and I mean both development time overhead and runtime overhead. The only problem is that you have a dependency on the sqlite library. But since you would need some library for XML anyway, it doesn't matter (I assume the project is in C/C++).
Advantages of sqlite over xml:
Disadvantages of sqlite:
Other things are on par for both solutions probably.
To sum it up, answers to your questions respectively:
You will not know, unless you test your specific application with both backends. Otherwise it's always just a guess. Basic support for both caches should not be a problem to code. Then benchmark and compare.
Because of the way XML files are organized, sqlite searches should always be faster (barring some corner cases where it doesn't matter anyway because it's blazingly fast). Speeding up searches in XML would require an index database anyway; in your case that would mean having a cache for the cache, which is not a particularly good idea. With sqlite you can have indexing as part of the database.
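A rough sketch of the "code both, then benchmark" advice, in Python for brevity even though the project is presumably C/C++. The cache layout, key names, and dataset size are all made up, and both backends sit in memory here, so the timings say nothing about your real workload; the point is only the shape of the comparison.

```python
import sqlite3
import time
import xml.etree.ElementTree as ET

N = 20_000  # arbitrary size, for illustration only

# Build the same hypothetical cache in both forms.
root = ET.Element("cache")
for i in range(N):
    ET.SubElement(root, "entry", key=f"k{i}").text = f"value {i}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO cache VALUES (?, ?)",
    ((f"k{i}", f"value {i}") for i in range(N)),
)


def xml_lookup(key):
    # With no external index, this scans entries until one matches.
    node = root.find(f"./entry[@key='{key}']")
    return node.text if node is not None else None


def sqlite_lookup(key):
    # The PRIMARY KEY gives sqlite an index on the key column.
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def time_it(fn, key, repeats=50):
    start = time.perf_counter()
    for _ in range(repeats):
        fn(key)
    return time.perf_counter() - start


target = f"k{N - 1}"  # worst case for the linear scan
xml_seconds = time_it(xml_lookup, target)
sqlite_seconds = time_it(sqlite_lookup, target)
print(f"xml: {xml_seconds:.4f}s  sqlite: {sqlite_seconds:.4f}s")
```

For a real decision, run the same harness against your actual data files and access patterns.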
Don't forget that you have a great database at your fingertips: the filesystem!
Lots of programmers forget that a decent directory-file structure is/has:
People are talking about splitting up XML files into multiple XML files... I would consider splitting your XML into multiple directories and multiple plaintext files.
Give it a go. It's refreshingly fast.
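A minimal sketch of what I mean, in Python. The helper names (`path_for`, `put`, `get`) and the two-level hashed directory layout are just one possible choice, not a standard.

```python
import hashlib
import tempfile
from pathlib import Path


def path_for(base, key):
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # Two directory levels keep any single directory from growing too large.
    return Path(base) / digest[:2] / digest[2:4] / (digest + ".txt")


def put(base, key, text):
    p = path_for(base, key)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(text, encoding="utf-8")


def get(base, key):
    p = path_for(base, key)
    return p.read_text(encoding="utf-8") if p.exists() else None


base = tempfile.mkdtemp()
put(base, "http://example.com/feed", "cached body")
```

Each record is one small plaintext file, so a lookup is a path computation plus one file read, and no parser is involved.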
Depends on the kind and the size of the data.
I wouldn't use XML for storing RSS items. A feed reader makes constant updates as it receives data.
With XML, you need to load the data from file first, parse it, then store it for easy search/retrieval/update. Sounds like a database...
Also, what happens if your application crashes? If you use XML, what state is the data in the XML file in, compared with the data in memory? At least with SQLite you get atomicity, so you are assured that your application will start with the same state as when the last database write was made.
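Here is a small sketch of that atomicity using Python's `sqlite3` module. The `item` table is hypothetical, and the raised exception only simulates a failure partway through a batch; a real crash is covered by SQLite's journal rather than by Python's rollback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()


def add_items(titles, fail_midway=False):
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        for i, title in enumerate(titles):
            conn.execute("INSERT INTO item (title) VALUES (?)", (title,))
            if fail_midway and i == 0:
                raise RuntimeError("simulated failure")


add_items(["a", "b"])
try:
    add_items(["c", "d"], fail_midway=True)
except RuntimeError:
    pass

count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
```

The half-finished second batch leaves no trace: the table still holds only the rows from the first, fully committed batch.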
XML is best used as an interchange format when you need to move data from your application to somewhere else or share information between applications. A database should be the preferred method of storage for almost any size application.
When should XML be used for data persistence instead of a database? Almost never. XML is a data transport language. It is slow to parse and awkward to query. Parse the XML (don't shred it!) and convert the resulting data into domain objects. Then persist the domain objects. A major advantage of a database for persistence is SQL, which means ad hoc queries and access to common tools and optimization techniques.
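As a rough sketch of "parse, convert to domain objects, persist", here is a Python version using a tiny RSS-like feed. The `FeedItem` class, the `feed_item` table, and the sample feed are all made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class FeedItem:
    guid: str
    title: str
    link: str


FEED = """<rss version="2.0"><channel>
  <item><guid>1</guid><title>Hello</title><link>http://example.com/1</link></item>
  <item><guid>2</guid><title>World</title><link>http://example.com/2</link></item>
</channel></rss>"""


def parse_items(xml_text):
    """Turn the transport format into plain domain objects."""
    root = ET.fromstring(xml_text)
    return [
        FeedItem(
            guid=node.findtext("guid", default=""),
            title=node.findtext("title", default=""),
            link=node.findtext("link", default=""),
        )
        for node in root.iter("item")
    ]


def persist(conn, items):
    """Store the domain objects; the XML itself is not kept."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feed_item "
        "(guid TEXT PRIMARY KEY, title TEXT, link TEXT)"
    )
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO feed_item (guid, title, link) "
            "VALUES (?, ?, ?)",
            [(i.guid, i.title, i.link) for i in items],
        )


conn = sqlite3.connect(":memory:")
persist(conn, parse_items(FEED))
persist(conn, parse_items(FEED))  # re-importing does not duplicate rows
titles = [t for (t,) in conn.execute("SELECT title FROM feed_item ORDER BY guid")]
```

XML stays at the boundary as the interchange format, and everything after the parse step works with objects and SQL.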