Pergunta

I am looking to develop a transit app using GTFS static data. One of the constraints I've set to myself is that the app should use minimal mobile data transfers. Therefore, I would like to embed all the data in the app.

My issue is that GTFS data sets are usually quite large (85MB uncompressed for the city of Sydney for example). I've done a bit of reverse engineering on other apps out there and found out that some of them have managed to compress all that data into a much smaller file (I'm talking about a few MB at most).

Using 7zip, I've managed to compress my 85MB data set down to 5MB which is acceptable for me. The next step is for me to use that 7z file into my app and that's where I'm stuck. There's no way I'm going to uncompress it and put it in a SQL database as that will use too much space on the phone. So I was wondering what are my other options.

Thanks

Foi útil?

Solução

First, for embedding, I recommend using the Embedded XZ library (similar to 7zip). I have embedded this in a project and had good luck with it. Just be sure to compress data using 'xz --check=crc32' so it's compatible with Embedded XZ, and remember to initialize the CRC table.

As for a decompression strategy, you may need to segment the data in such a way that you can decompress different parts of it on demand (i.e., a tree of databases). I'm not familiar with your data's characteristics. Will a user need it all loaded at the same time? Or can it easily be compartmentalized?

Also, XZ can be a bit slow, even to decode. Have you evaluated how well regular gzip performs? That tends to be A) very fast; and B) available as a standard part of all embedded and mobile frameworks.

Outras dicas

Use protocol binary format (pbf) formely google and now open source. It is compact and very fast searchable, so no need to decompress it on a device and load it into a database on that device because pbf acts as a database. Just include pbf library in your code to query it. Of course you have to compress it once before distributing the data online.

Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top