Question

(postgre/my)sql/php/html/css/javascript vs xml/xsl/xsd/php/css/javascript

Trying to decide whether to go with an XML-document-based app or with SQL. Each XML document would be about 30k; say 2000 files. Essentially a choice between serving up html/javascript, or serving a 30k xml file (plus xsl/xsd/javascript). Involves some financial (i.e. non-floating-point) math, plus substantial data entry one day per week.

The SQL solution would involve fragmenting/reassembling data using, say, ten separate cross-referenced tables, and tying users into SQL access control systems.

Assuming xml-based solution really is more straightforward to install/maintain, and using money=cents-as-integers is okay, and "other things are equal", my questions are:

1) Is it really a good plan to have the server read/update/save a 30k xml file, say 2000 times over 8 hours once a week, every time data is updated? Or is that just a trivial load? (That depends on what else the server is doing, I guess, and how fast the internet connection is.)

2) How would that scale compared to an SQL-based solution? What would be the limiting factor?

3) Most importantly: what am I overlooking?


Solution

1) Not a good plan. Even if the load is not a problem, you are basically building yourself a database, and that is already a solved problem.

2) SQL is going to scale better, based on what you've told us.

3) You are overlooking NoSQL or XML-based database solutions like BaseX.
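
To make the "solved problem" point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Python and SQLite are used only to keep the example short and self-contained; the question itself mentions PostgreSQL/MySQL and PHP. The table names, column names and helper functions are hypothetical. It stores money as integer cents, as proposed in the question, and lets the database do the aggregation instead of hand-written file handling.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE entry (
        id           INTEGER PRIMARY KEY,
        account_id   INTEGER NOT NULL REFERENCES account(id),
        amount_cents INTEGER NOT NULL  -- money stored as whole cents
    );
""")

conn.execute("INSERT INTO account (id, name) VALUES (1, 'Petty cash')")
conn.executemany(
    "INSERT INTO entry (account_id, amount_cents) VALUES (?, ?)",
    [(1, 1999), (1, 501), (1, -250)],
)
conn.commit()


def account_total_cents(conn, account_id):
    """Sum one account's entries using integer arithmetic only."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM entry WHERE account_id = ?",
        (account_id,),
    ).fetchone()
    return row[0]


def format_cents(cents):
    """Render integer cents as a decimal string for display."""
    sign = "-" if cents < 0 else ""
    units, rem = divmod(abs(cents), 100)
    return f"{sign}{units}.{rem:02d}"


total = account_total_cents(conn, 1)
print(format_cents(total))
```

The amounts stay integers all the way through; only the display helper turns them into a decimal string.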

OTHER TIPS

You want to look at your solution architecture. Where are the XML files coming from, and how do you get hold of them? You also need to look at the navigation you want. How do users navigate to one specific XML file? This navigational data needs to be available. So to answer your questions:

  1. It is not a plan at all :-) - it is a tiny fragment of your solution. The load doesn't look big: 2000 updates over 8 hours is about one every 14 seconds. You need to have a look at your metadata.

  2. It might not be an OR question. Most major SQL systems offer XML column data types today: PostgreSQL, MS-SQL, Oracle, IBM DB2 (including the free community edition). I like DB2 (probably because I work for IBM :-) )

  3. CouchDB and MongoDB (JSON stores), or XML databases as Karl suggested. Most important: caching, caching, caching! If you build in Java, use the Guava library for a cache. Once a file has been transformed (using XSLT) into what you send down to the browser, cache that with a generous expiry and have your load routine invalidate the cache.
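
The caching advice in point 3 could be sketched roughly as follows. This is a plain-Python stand-in, not Guava, and every name in it (RenderCache, render_document, the "doc-42" id) is hypothetical. It assumes the XSLT transformation is the expensive step: the output is cached per document with an expiry, and the entry is dropped whenever the document is saved.

```python
import time


class RenderCache:
    """Tiny in-memory cache with expiry, keyed by document id."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # doc_id -> (expires_at, rendered_html)

    def get(self, doc_id, render):
        """Return cached output, or call render(doc_id) and cache the result."""
        entry = self._entries.get(doc_id)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        html = render(doc_id)
        self._entries[doc_id] = (now + self.ttl, html)
        return html

    def invalidate(self, doc_id):
        """Drop a cached entry; call this whenever the document is saved."""
        self._entries.pop(doc_id, None)


# Hypothetical stand-in for the expensive XSLT transformation.
render_calls = []


def render_document(doc_id):
    render_calls.append(doc_id)
    return f"<html><body>document {doc_id}</body></html>"


cache = RenderCache(ttl_seconds=3600)
cache.get("doc-42", render_document)  # miss: renders and stores
cache.get("doc-42", render_document)  # hit: served from the cache
cache.invalidate("doc-42")            # e.g. after a data-entry save
cache.get("doc-42", render_document)  # miss again: re-renders
print(len(render_calls))
```

With data entry happening one day a week, a long expiry plus invalidation on save means most reads never touch the transformation at all.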

Hope that helps!

Licensed under: CC-BY-SA with attribution