Question

I am planning to ask the users of my web application to submit bulk data in the following format.

BatchID:1
TotalRecords:200

Cust:CustID~FirstName~LastName~Age~Dob~City~State~Country
1~John~Abraham~35~10/10/1974~New York~NY~US

Order:OrderID~CustID~Qty~Amount~DOS
1~1~10~100.00~1/1/2012
2~1~100~1000.00~1/1/2012
1~1~10~100.00~1/1/2012

OrderDet:OrderDetID~OrderID~Reg1~Remarks
1~1~12393A~Testing order
2~1~23123B~tesitng order 1

The above shows just one set of records. One flat file will contain up to 200 records.

Do you think this is the right way to do it? We need to allow batch uploads so that users do not have to enter records one at a time on the website.

If you can think of any other format, that would be a great help.


Solution

XML with a defined schema would be a nice choice.

A couple problems offhand with your format:

  • What happens if the remarks contain a ~, e.g. Test~ing Remarks? How would it be escaped?
  • Are the users of your application all in the same locale? Will they all expect to enter dates as Month/Day/Year? Or will it be Day/Month/Year? Are these UTC dates?
  • Do you need to quickly determine whether the user entered "invalid" data (e.g. missing required fields)?
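To make the first two points concrete, here is a small Python sketch. The record line follows the OrderDet layout from the question, with a ~ added to the remarks for illustration; the date string is made up.

```python
from datetime import datetime

# An OrderDet line in the question's format, with a "~" inside the remarks.
line = "1~1~12393A~Test~ing Remarks"

# Naive splitting yields five fields instead of the expected four,
# so the remarks are silently cut in two.
fields = line.split("~")
print(len(fields), fields)

# The same date string parses to two different dates, depending on
# which convention the reader assumes.
raw = "1/2/2012"
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
print(as_month_first, as_day_first)
```

An unambiguous date format such as ISO 8601 (2012-01-02) sidesteps the second problem whatever container format you end up choosing.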

There are well known XML tools to help with validating what the user is uploading. It's worth considering...

I'm not a huge fan of CSV, but at least there are well-known tools for working with it. However, as I recall, there is some variation among CSV formats. For example, how do you escape a comma? Is it by putting the string field in double quotes (e.g. "My first, second, third"), or by using \,?
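As one illustration of the double-quote convention, here is a minimal sketch using Python's standard csv module with its default dialect. The rows are invented sample data in the shape of the OrderDet records; other tools may use a different quoting convention, which is exactly the variation mentioned above.

```python
import csv
import io

rows = [
    ["OrderDetID", "OrderID", "Reg1", "Remarks"],
    ["1", "1", "12393A", "My first, second, third"],
    ["2", "1", "23123B", 'He said "ok"'],
]

# Write with the default dialect: fields containing a comma or a quote
# are wrapped in double quotes, and embedded quotes are doubled.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()
print(text)

# Reading the text back recovers the original fields unchanged.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed == rows)
```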

OTHER TIPS

I strongly recommend you use XML for a new project. It seems that you have some hierarchy to your data (e.g. Customers have Orders). That strongly supports using XML since you can reflect those relationships naturally.
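A rough sketch of what that could look like, read with Python's standard xml.etree.ElementTree. The element and attribute names (Batch, Customer, Order, Detail and so on) are made up for illustration and are not a proposed schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: orders nest inside their customer, details inside
# their order, so the CustID/OrderID cross-references become implicit.
doc = """\
<Batch id="1">
  <Customer id="1">
    <FirstName>John</FirstName>
    <LastName>Abraham</LastName>
    <Order id="1" qty="10" amount="100.00" date="2012-01-01">
      <Detail id="1" reg="12393A">Testing ~ order</Detail>
    </Order>
    <Order id="2" qty="100" amount="1000.00" date="2012-01-01"/>
  </Customer>
</Batch>
"""

root = ET.fromstring(doc)
for customer in root.findall("Customer"):
    orders = customer.findall("Order")
    print(customer.get("id"), customer.findtext("LastName"), len(orders))
    for order in orders:
        # A "~" in the remarks needs no escaping here.
        remarks = [detail.text for detail in order.findall("Detail")]
        print("  order", order.get("id"), remarks)
```

Note that ElementTree only checks that the document is well formed. Validating against a schema needs a schema-aware tool on top of this.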

If for some reason your users cannot do that, CSV is still very widely used and there's much better tool support than for ~ delimited files (for example, one can save to CSV from Excel).

I can tell you from experience that someone's order description will have a ~ character in it one day. I once worked at a company that used pipe (|) as a delimiter. Worked great until someone thought to name his company Acme ||.

In general:

  • If both sender and receiver use the same technology (for example, bulk import to Microsoft SQL Server) use whatever technology-specific format provides optimal performance.

  • If you want maximum flexibility, then XML is your best choice. Creating a matching XML Schema (XSD), and validating your exports and imports against it, isn't a bad idea either.

  • Otherwise, "whatever works". CSV is a pop favorite, provided your data lends itself to a simple, homogeneous table format.

All things being equal, I'd probably vote "XML".

IMHO .. PSM

I notice nobody said JSON. So if an actual CSV is inapplicable, use JSON. It's tidier than XML and shorter, too. And, if you expect your users to understand your custom format, they'll definitely be able to deal with JSON without an XML-Schema-friendly editor.
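For comparison, a minimal sketch of the same kind of batch as JSON, using Python's standard json module. The key names and the missing_fields helper are hypothetical, chosen only for this example.

```python
import json

# Hypothetical JSON shape for one batch; nesting mirrors the
# customer -> order -> order detail hierarchy.
batch = {
    "batchId": 1,
    "customers": [{
        "custId": 1,
        "firstName": "John",
        "lastName": "Abraham",
        "orders": [{
            "orderId": 1,
            "qty": 10,
            "amount": "100.00",
            "date": "2012-01-01",
            "details": [{
                "orderDetId": 1,
                "reg1": "12393A",
                "remarks": 'Test~ing, "quoted" order',
            }],
        }],
    }],
}

# The encoder handles escaping, so "~", commas and quotes round-trip.
text = json.dumps(batch, indent=2)
loaded = json.loads(text)
print(loaded == batch)


def missing_fields(customer, required=("custId", "firstName", "lastName")):
    """Return the required keys that are absent from a customer record."""
    return [name for name in required if name not in customer]


print(missing_fields({"custId": 2}))
```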

Licensed under: CC-BY-SA with attribution