Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 275 Vote(s) - 3.42 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Xml or Sqlite, When to drop Xml for a Database?

#11
XML is good for storing data which is not completely structured and you typically want to exchange it with another application. I prefer to use a SQL database for data. XML is error prone as you can cause subtle errors due to typos or ommissions in the data itself. Some open source application frameworks use too many xml files for configuration, data, etc. I prefer to have it in SQL.

Since you ask for a rule of thumb, I would say that use XML based application data, configuration, etc if you are going to set it up once and not access/search it much. For active searches and updations, its best to go with SQL.

For example, a web server stores application data in a XML file and you dont really need to perform complex search, update the file. The web server starts, reads the xml file and thats that. So XML is perfect here. Suppose you use a framework like Struts. You need to use XML and the action configurations dont change much once the application is developed and deployed. So again, the XML file is a good way. Now if your Struts developed application allows extensive searches and updations, deletions, then SQL is the optimal way.


Offcourse, you will surely meet one or two developers in your organisation who will chant XML or SQL only and proclaim XML or SQL as the only way to go. Beware of such folks and do what 'feels' right for your application. Dont just follow a 'technology religion'.

Think of things like how often you need to update the data, how often you need to search the data. Then you will have your answer on what to use - XML or SQL.
Reply

#12
XML can be stored as text and as a binary file format.

If your primary goal is to let a computer read / write a file format effeciently you should work with a binary file format.

Databases are an easy to use way of storing and maintaining data.
They are not the fastest way to store data that is a binary file format.

What can speed things up is using an in memory database / database type. Sqlite has this option.

And this sounds like the best way to do it for you.
Reply

#13
My opinion is that you should use SQLite (or another appropriate embedded database) anytime you don't need a pure-text file format. Note, this is a pretty big exception. There are a lot of scenarios that require, or are benefited by, pure-text file formats.

As far as overhead goes, SQLite compiles to something like 250 k with normal flags. Many XML parsing libraries are larger than SQLite. You get no concurrency gains using XML. The SQLite binary file format is going to support much more efficient writes (largely because you can't append to the end of a well-formatted XML file). And even reading data, most of which I assume is fairly random access, is going to be faster using SQLite.

And to top it all off, you get access to the benefits of SQL like transactions and indexes.

Edit: Forgot to mention. One benefit of SQLite (as opposed to many databases) is that it allows any type in any row in any column. Basically, with SQLite you get the same freedom you have with XML in terms of datatypes. This also means that you don't have to worry about putting limits on text columns.
Reply

#14
When should XML be used for data persistence instead of a database? Almost never. XML is a data transport language. It is slow to parse and awkward to query. Parse the XML (don't shred it!) and convert the resulting data into domain objects. Then persist the domain objects. A major advantage of a database for persistence is SQL which means unstructured queries and access to common tools and optimization techniques.
Reply

#15
You should note that many large Relational DBs (Oracle and SQLServer) have XML datatypes to store data within a database and use XPath within the SQL statement to gain access to that data.

Also, there are native XML databases which work very much like SQLite in the sense they are one binary file holding a collection of documents (which could roughly be a table) then you can either XPath/XQuery on a single document or the whole collection. So with an XML database you can do things like store the days data as a separate XML document in the collection... so you just need to use that one document when your dealing with the data for today. But write an XQuery to figure out historical data on the collection of documents for that person. Slick.

I've used Berkeley XMLDB (now backed by Oracle). There are others if you search google for "Native XML Database". I've not seen a performance problem with storing/retrieving data in this manner.

XQuery is a different beast (but well worth learning), however you may be able to just use the XPaths you currently use with slight modifications.
Reply

#16
A database is great as part of your program. If quering the data is part of your business logic.
XML is best as a file format, especially if you data format is:

1, Hierarchal <br>
2, Likely to change in the future in ways you can't guess <br>
3, The data is going to live longer than the program
Reply

#17
I have made the switch to SQLite and I feel _much_ better knowing it's in a database.

There are a lot of other benefits from this:

* Adding new items is really simple
* Sorting by multiple columns
* Removing duplicates with a unique index

I've created 2 views, one for unread items and one for all items, not sure if this is the best use of views, but I really wanted to try using them.

I also benchmarked the xml vs sqlite using the **StopWatch** class, and the sqlite is faster, **although it could just be that my way of parsing xml files wasn't the fastest method**.

1. **Small # items and size (25 items, 30kb)**
* ~1.5 ms sqlite
* ~8.0 ms xml
2. **Large # of items (700 items, 350kb)**
* ~20 ms sqlite
* ~25 ms xml
3. **Large file size (850 items, 1024kb)**
* ~45 ms sqlite
* ~60 ms xml
Reply

#18
Man do I have experience with this. I work on a project where we originally stored all of our data using XML, then moved to SQLite. There are many pros and cons to each technology, but it was performance that caused the switchover. Here is what we observed.

For small databases (a few meg or smaller), XML was much faster, and easier to deal with. Our data was naturally in a tree format, which made XML much more attractive, and XPath allowed us to do many queries in one simple line rather than having to walk down an ancestry tree.

We were programming in a Win32 environment, and used the standard Microsoft DOM library. We would load all the data into memory, parse it into a DOM tree and search, add, modify on the in memory copy. We would periodically save the data, and needed to rotate copies in case the machine crashed in the middle of a write.

We also needed to build up some "indexes" by hand using C++ tree maps. This, of course would be trivial to do with SQL.

Note that the size of the data on the filesystem was a factor of 2-4 smaller than the "in memory" DOM tree.

By the time the data got to 10M-100M size, we started to have real problems. Interestingly enough, at all data sizes, XML processing was much faster than SQLite turned out to be (because it was in memory, not on the hard drive)! The problem was actually twofold- first, loadup time really started to get long. We would need to wait a minute or so before the data was in memory and the maps were built. Of course once loaded the program was very fast. The second problem was that all of this memory was tied up all the time. Systems with only a few hundred meg would be unresponsive in other apps even though we ran very fast.

We actually looking into using a filesystem based XML database. There are a couple open sourced versions XML databases, we tried them. I have never tried to use a commercial XML database, so I can't comment on them. Unfortunately, we could never get the XML databases to work well at all. Even the act of populating the database with hundreds of meg of XML took hours.... Perhaps we were using it incorrectly. Another problem was that these databases were pretty heavyweight. They required Java and had full client server architecture. We gave up on this idea.

We found SQLite then. It solved our problems, but at a price. When we initially plugged SQLite in, the memory and load time problems were gone. Unfortunately, since all processing was now done on the harddrive, the background processing load went way up. While earlier we never even noticed the CPU load, now the processor usage was way up. We needed to optimize the code, and still needed to keep some data in memory. We also needed to rewrite many simple XPath queries as complicated multiquery algorithms.

So here is a summary of what we learned.

1. For tree data, XML is much easier to query and modify using XPath.

2. For small datasets (less than 10M), XML blew away SQLite in performance.

3. For large datasets (greater than 10M-100M), XML load time and memory usage became a big problem, to the point that some computers become unusable.

4. We couldn't get any opensource XML database to fix the problems associated with large datasets.

5. SQLite doesn't have the memory problems of XML DOM, but it is generally slower in processing the data (it is on the hard drive, not in memory). (note- SQLite tables can be stored in memory, perhaps this would make it as fast.... We didn't try this because we wanted to get the data out of memory.)

6. Storing and querying tree data in a table is not enjoyable. However, managing transactions and indexing partially makes up for it.
Reply

#19
I basically agree with [Mitchel][1], that this can be highly specific depending on what are you going to do with XML and SQLite. For your case (cache), it seems to me that using SQLite (or other embedded databases) makes more sense.

First I don't really think that SQLite will need more overhead than XML. And I mean both development time overhead and runtime overhead. Only problem is that you have a dependence on SQLite library. But since you would need some library for XML anyway it doesn't matter (I assume project is in C/C++).

**Advantages of SQLite over XML:**

- everything in one file,
- performance loss is lower than XML as cache gets bigger,
- you can keep feed metadata separate from cache itself (other table), but accessible in the same way,
- SQL is probably easier to work with than XPath for most people.

**Disadvantages of SQLite:**

- can be problematic with multiple processes accessing same database (probably not your case),
- you should know at least basic SQL. Unless there will be hundreds of thousands of items in cache, I don't think you will need to optimize it much,
- maybe in some way it can be more dangerous from security standpoint (SQL injection). On the other hand, you are not coding web app, so this should not happen.

Other things are on par for both solutions probably.

To sum it up, answers to your questions respectively:

1. You will not know, unless you test your specific application with both back ends. Otherwise it's always just a guess. Basic support for both caches should not be a problem to code. Then benchmark and compare.

2. Because of the way XML files are organized, SQLite searches should always be faster (barring some corner cases where it doesn't matter anyway because it's blazingly fast). Speeding up searches in XML would require index database anyway, in your case that would mean having cache for cache, not a particularly good idea. But with SQLite you can have indexing as part of database.

[1]:

[To see links please register here]

Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through