Question

In platforms like JavaScript, AS3 and Python, which support and encourage untyped data arrays by default, arrays are usually the simplest and most effective way of storing arbitrary user data in memory (tabular data, data from CSVs, data from JSON, etc.).

.NET, however, likes everything strongly typed; you can't comfortably dump just anything into a List&lt;object&gt;. It works, but it is slower and clumsier to handle (type checks and type casts on every use). So usually you end up defining the data schema as a class, with properties corresponding to the columns, and storing the data in instances of that class.

So what are the recommended methods of storing arbitrary data in memory, especially when the schema keeps changing (as with CSV or JSON) and cannot be hard-coded at development time?

Edit: Such data may include numbers (int/float), strings, dates, times, units, geospatial data, geometric data, embedded files, essentially everything a MySQL database or JSON file could store.

Edit: While in memory, this data can and would be used for every kind of processing; calculations to generate charts, string processing to search for data by substrings, number crunching algorithms for geospatial/3D data, etc, optimization algorithms that validate dirty data and optimize redundant data, etc.


Solution

There are probably several ways to go here.

You could create classes programmatically through reflection:

How to dynamically create a class in C#?

then use generic collections and some LINQ to Objects to query the data.
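Even without emitting classes, generic collections plus LINQ to Objects already go a long way. A minimal sketch (the row layout and column names here are hypothetical, assuming each record has been parsed into a `Dictionary<string, object>`):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LinqOverUntypedRows
{
    static void Main()
    {
        // Hypothetical rows, e.g. parsed from a CSV whose schema
        // is not known at compile time.
        var rows = new List<Dictionary<string, object>>
        {
            new Dictionary<string, object> { ["name"] = "Alice", ["age"] = 30 },
            new Dictionary<string, object> { ["name"] = "Bob",   ["age"] = 25 },
        };

        // LINQ to Objects works on any IEnumerable<T>;
        // cast values back to their real types as needed.
        var adults = rows.Where(r => (int)r["age"] >= 28)
                         .Select(r => (string)r["name"])
                         .ToList();

        Console.WriteLine(string.Join(",", adults)); // prints "Alice"
    }
}
```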

Alternatively, you could build a DataSet or DataTable on the fly from whatever data source you have, typing each column according to your logic; if the source is a database, you can read each column's type from the DataTable itself. You can then query the DataTable with its Select method.
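A short sketch of that idea (the table and column names are made up for illustration): columns are added at run time with whatever types your logic decides, and `DataTable.Select` takes a SQL-like filter expression.

```csharp
using System;
using System.Data;

class DataTableOnTheFly
{
    static void Main()
    {
        // Build a table whose columns are decided at run time.
        var table = new DataTable("people");
        table.Columns.Add("name", typeof(string));
        table.Columns.Add("age", typeof(int));

        table.Rows.Add("Alice", 30);
        table.Rows.Add("Bob", 25);

        // Query with DataTable.Select (a SQL-like filter expression).
        DataRow[] matches = table.Select("age >= 28");

        Console.WriteLine(matches.Length);     // prints "1"
        Console.WriteLine(matches[0]["name"]); // prints "Alice"
    }
}
```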

OTHER TIPS

One possible approach is to use Dictionary&lt;string, object&gt;, or (on .NET 4.0+) dynamic objects, to store individual items. Then put the items into a List, an Array, or another Dictionary.

In either case you may also need some sort of metadata about the objects, i.e. a mapping from property name to {type, restrictions, validation rules, whatever else}.
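One way to sketch such a metadata layer (the schema and validation rule here are assumptions, not a fixed API): keep a map from column name to expected CLR type and check each untyped row against it.

```csharp
using System;
using System.Collections.Generic;

class MetadataSketch
{
    // Hypothetical per-column metadata: property name -> expected CLR type.
    // Real metadata could also carry restrictions and validation rules.
    static readonly Dictionary<string, Type> Schema = new Dictionary<string, Type>
    {
        ["name"] = typeof(string),
        ["age"]  = typeof(int),
    };

    // Validate an untyped row against the schema.
    static bool Validate(Dictionary<string, object> row)
    {
        foreach (var kv in Schema)
        {
            if (!row.TryGetValue(kv.Key, out var value) ||
                !kv.Value.IsInstanceOfType(value))
                return false;
        }
        return true;
    }

    static void Main()
    {
        var good = new Dictionary<string, object> { ["name"] = "Alice", ["age"] = 30 };
        var bad  = new Dictionary<string, object> { ["name"] = "Bob",   ["age"] = "25" }; // age has wrong type

        Console.WriteLine(Validate(good)); // prints "True"
        Console.WriteLine(Validate(bad));  // prints "False"
    }
}
```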

I don't agree with some of the statements in the question.

First, "you can't dump anything in object[]". False! You can dump anything into an object[], because everything is an object. You just have to cast it back to whatever type you need.

Second, "it's slower". False again! Plain old arrays are the fastest collection to access once you know where to look: accessing a specific element of an array is O(1).

If you need to store a CSV in a C# structure you could consider System.Data.DataTable, but the most generic option is definitely object[], where each element can itself be another object[], nested to whatever depth you need.

If you really need random access by key, go for Dictionary&lt;string, object&gt;, as stated by Alexei.

EDIT: here is an example

void ScanCollection(object[] collection)
{
    foreach (object item in collection)
    {
        if (item is string)
        {
            // Treat as string
        }
        else if (item is float)
        {
            // Treat as float
        }
        else if (item is SomethingElse)
        {
            // Treat as SomethingElse class
        }
    }
}

Where have you had an array of object fail in .NET?

Since object itself only exposes a handful of methods (Equals, GetHashCode, GetType, ToString, ...), there is not much meat until you cast back.

Object Members

All the collections support object, as far as I know:

    List<Object> lObjs = new List<object>();
    HashSet<Object> hObjs = new HashSet<object>();
    Dictionary<int, Object> dObjs = new Dictionary<int, object>();
    Object[] aObjs = new Object[10];

    aObjs[0] = "string";
    aObjs[1] = 1d;
    aObjs[2] = new object();
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow