I ran some tests (not conclusive by any means, but quite indicative) to establish the memory footprint of different List<Map<String, Object>> implementations. The baseline is Java's ArrayList<>, with the elements being instances of Guava's ImmutableMap.
The implementations I compared to are the following:
1. An implementation based on a Map<String, List<Object>>, using a HashMap and ArrayLists;
2. An implementation based on a List<Object[]>, using an ArrayList;
3. Guava's HashBasedTable<Integer, String, Object>;
4. Guava's ArrayTable<Integer, String, Object>.
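To make the first layout concrete, here is a minimal sketch of a column-oriented store backed by a Map<String, List<Object>>. This is not the code I measured; the class name ColumnStore and its methods are made up for illustration, and padding missing values with null is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical column-oriented store: one list per column, row index = position in the list. */
public class ColumnStore {
    private final Map<String, List<Object>> columns = new HashMap<>();
    private int rowCount = 0;

    /** Appends a row; columns absent from the row get a null placeholder (an assumption). */
    public void addRow(Map<String, Object> row) {
        // Create columns seen for the first time, back-filled with nulls for earlier rows.
        for (String name : row.keySet()) {
            if (!columns.containsKey(name)) {
                List<Object> column = new ArrayList<>();
                for (int i = 0; i < rowCount; i++) {
                    column.add(null);
                }
                columns.put(name, column);
            }
        }
        // Append this row's value (or null) to every known column.
        for (Map.Entry<String, List<Object>> entry : columns.entrySet()) {
            entry.getValue().add(row.get(entry.getKey()));
        }
        rowCount++;
    }

    /** Returns the value at the given row and column, or null if the column is unknown. */
    public Object get(int rowIndex, String column) {
        List<Object> values = columns.get(column);
        return values == null ? null : values.get(rowIndex);
    }

    public int size() {
        return rowCount;
    }
}
```

The point of this layout is that there is a single map for the whole data set instead of one map object per row; each row only costs one slot in each column list.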
My test consisted of generating n random rows, each having m columns and a "fill factor" of k, where the fill factor is defined as the probability that each row contains values for all the columns. For simplicity, the values are random strings of length l generated using Apache Commons RandomStringUtils.
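For reference, a data generator along these lines could look like the sketch below. It is not the exact test code: it reads the fill factor as the per-cell probability that a value is present (one possible interpretation), it replaces RandomStringUtils with a small JDK-only helper so that it has no dependencies, and the names RowGenerator, generate and the "col" + index column names are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical generator for sparse random rows. */
public class RowGenerator {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    /** JDK-only stand-in for Apache Commons RandomStringUtils. */
    static String randomString(Random rnd, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /**
     * Generates n rows with up to m columns each. Every cell is filled with
     * probability k (assumed reading of the fill factor) with a string of length l.
     */
    static List<Map<String, Object>> generate(int n, int m, int l, double k, long seed) {
        Random rnd = new Random(seed);
        List<Map<String, Object>> rows = new ArrayList<>(n);
        for (int r = 0; r < n; r++) {
            Map<String, Object> row = new HashMap<>();
            for (int c = 0; c < m; c++) {
                // nextDouble() is in [0, 1), so k = 1.0 fills every cell and k = 0.0 fills none.
                if (rnd.nextDouble() < k) {
                    row.put("col" + c, randomString(rnd, l));
                }
            }
            rows.add(row);
        }
        return rows;
    }
}
```

For example, RowGenerator.generate(200000, 50, 10, 0.75, 42L) would produce a data set with the shape of the k = 0.75 run; the seed is only there to make runs repeatable.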
But let's get to the results. With n = 200000, m = 50, l = 10 and k in (1.0, 0.75, 0.5), I got the following memory footprints as a percentage of the baseline:
   | k = 1.0 | k = 0.75 | k = 0.5 |
-----------------------------------
1. |   71 %  |   71 %   |   71 %  |
2. |   71 %  |   72 %   |   73 %  |
3. |  111 %  |  107 %   |  109 %  |
4. |   71 %  |   73 %   |   76 %  |
I tried reducing n to 20000 with about the same results.
I found the results above quite interesting. First of all, it looks like there isn't much room for improvement beyond about 70% of the baseline. Second, I was pleasantly surprised to find that Guava's ArrayTable is about as compact as the two implementations proposed in this question: identical at k = 1.0 and only a few points worse as the data gets sparser. I'll keep digging, but I'm leaning towards solution 1.
Thanks