Why does this exceed the 65,535 byte limit in Java constructors and static initializers?
Question
Disclaimer: I realize I can generate this at runtime in Java; this was needed for a very special case while performance testing some code. I've found a different approach, so now this is more of a curiosity than anything practical.
I've tried the following as a static field, as an instance field, and initialized directly within the constructor. Every time, Eclipse informs me that either "The code of constructor TestData() is exceeding the 65535 bytes limit" or "The code for the static initializer is exceeding the 65535 bytes limit".
There are 10,000 integers. If each int is 4 bytes (32 bits), would that not be 40,000 bytes? Is there really more than 25,000 bytes of overhead, in addition to the data, merely for constructing the array?
The data is generated with this small bit of python:
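#!/usr/bin/env python3
import random

# Emit a Java array literal containing 10,000 random non-negative ints.
print("public final int[] RANDOM_INTEGERS = new int[] {")
for _ in range(10000):  # range(1, 10000) would yield only 9,999 values
    print(str(random.randint(0, 0x7fffffff)) + ",")
print("};")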
Here's a small sample:
public final int[] RANDOM_INTEGERS = new int[] {
963056418, 460816633, 1426956928, 1836901854, 334443802, 721185237, 488810483,
1734703787, 1858674527, 112552804, 1467830977, 1533524842, 1140643114, 1452361499,
716999590, 652029167, 1448309605, 1111915190, 1032718128, 1194366355, 112834025,
419247979, 944166634, 205228045, 1920916263, 1102820742, 1504720637, 757008315,
67604636, 1686232265, 597601176, 1090143513, 205960256, 1611222388, 1997832237,
1429883982, 1693885243, 1987916675, 159802771, 1092244159, 1224816153, 1675311441,
1873372604, 1787757434, 1347615328, 1868311855, 1401477617, 508641277, 1352501377,
1442984254, 1468392589, 1059757519, 1898445041, 1368044543, 513517087, 99625132,
1291863875, 654253390, 169170318, 2117466849, 1711924068, 564675178, 208741732,
1095240821, 1993892374, 87422510, 1651783681, 1536657700, 1039420228, 674134447,
1083424612, 2137469237, 1294104182, 964677542, 1506442822, 1521039575, 64073383,
929517073, 206993014, 466196357, 1139633501, 1692533218, 1934476545, 2066226407,
550646675, 624977767, 1494512072, 1230119126, 1956454185, 1321128794, 2099617717,
// ... up to 10,000 values
Solution
Here is the bytecode for initializing an array with {1000001, 1000002, 1000003}:
5 iconst_3
6 newarray int [10]
8 dup
9 iconst_0
10 ldc <Integer 1000001> [12]
12 iastore
13 dup
14 iconst_1
15 ldc <Integer 1000002> [13]
17 iastore
18 dup
19 iconst_2
20 ldc <Integer 1000003> [14]
22 iastore
23 putfield net.jstuber.test.TestArrayInitializingConstructor.data : int[] [15]
So for this small array each element requires 5 bytes of Java bytecode: dup (1 byte), iconst_<n> (1 byte), ldc (2 bytes) and iastore (1 byte). For your bigger array, both the instruction that pushes the array index (sipush) and the instruction that loads the value from the constant pool (ldc_w) take 3 bytes for most elements, which leads to 8 bytes per array element. So for 10,000 elements you'd have to expect about 80 kB of bytecode.
The code for initializing big arrays with 16 bit indices looks like this:
2016 dup
2017 sipush 298
2020 ldc_w <Integer 100298> [310]
2023 iastore
2024 dup
2025 sipush 299
2028 ldc_w <Integer 100299> [311]
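The per-element arithmetic above can be checked with a rough estimator. This is only a sketch: the class and method names are made up for illustration, it assumes every value is distinct and too large for bipush/sipush (so each one needs its own constant pool entry), it assumes the first value lands at constant pool index 12 as in the listing above, and it ignores the few bytes spent creating the array and storing it in the field.

```java
public class ArrayInitSizeEstimate {

    /** Bytes used by the instruction that pushes array index i. */
    static int indexPushSize(int i) {
        if (i <= 5) return 1;    // iconst_0 .. iconst_5: opcode only
        if (i <= 127) return 2;  // bipush: opcode + 1 byte
        return 3;                // sipush: opcode + 2 bytes (assumes i <= 32767)
    }

    /** Bytes used by the instruction that loads a constant pool entry. */
    static int constantLoadSize(int poolIndex) {
        return poolIndex <= 255 ? 2 : 3;  // ldc has a 1-byte index, ldc_w a 2-byte index
    }

    /**
     * Estimated bytecode size for filling an int[] of n elements.
     * firstPoolIndex is an assumption: the pool index of the first value.
     */
    static int estimate(int n, int firstPoolIndex) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += 1                                    // dup
                   + indexPushSize(i)                     // push array index
                   + constantLoadSize(firstPoolIndex + i) // push value
                   + 1;                                   // iastore
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(estimate(3, 12));      // the three-element example above
        System.out.println(estimate(10_000, 12)); // the array from the question
    }
}
```

Under these assumptions the three-element case comes to 15 bytes, matching offsets 8 through 22 in the first listing, and the 10,000-element case comes to a little under 80,000 bytes, well over the 65,535 byte limit.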
OTHER TIPS
Array literals are translated into the byte code that fills the array with the values, so you need a few more bytes for each number.
Why not move that data out into a resource that you load at class-loading time in a static initializer block? This can easily be done by using MyClass.class.getClassLoader().getResourceAsStream(). It seems that this is where it belongs, anyway.
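A minimal sketch of the resource-loading suggestion follows. The class name TestDataLoader, the helper names, and the file format (raw big-endian 32-bit ints, as written by DataOutputStream.writeInt) are all assumptions made for illustration; the only call taken from the answer is getClassLoader().getResourceAsStream().

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class TestDataLoader {

    /** Reads 'count' big-endian 32-bit ints from the stream. */
    static int[] readInts(InputStream in, int count) {
        try {
            DataInputStream data = new DataInputStream(in);
            int[] values = new int[count];
            for (int i = 0; i < count; i++) {
                values[i] = data.readInt();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads the ints from a classpath resource (the resource name is up to the caller). */
    static int[] loadFromResource(String name, int count) {
        try (InputStream in =
                 TestDataLoader.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalStateException("resource not found: " + name);
            }
            return readInts(new BufferedInputStream(in), count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A field such as `static final int[] RANDOM_INTEGERS = TestDataLoader.loadFromResource("random_integers.bin", 10_000);` (file name hypothetical) would then compile to a handful of bytecode instructions, regardless of how many values the file holds.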
Or better yet, create the random values in the static initializer block using the Java tools available. And if you need repeatable "random" numbers, then just seed the Random instance with a fixed, but randomly chosen, number each time.
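The seeded-generator idea could look roughly like this. The class name, the helper method, and the seed value 42 are placeholders; the bound passed to nextInt is chosen to mirror the range used by the Python script in the question.

```java
import java.util.Random;

public class SeededTestData {

    // Arbitrary fixed seed (placeholder): the same seed yields the same sequence on every run.
    private static final long SEED = 42L;

    public static final int[] RANDOM_INTEGERS = generate(10_000, SEED);

    /** Fills an array with 'count' pseudo-random ints in [0, Integer.MAX_VALUE). */
    static int[] generate(int count, long seed) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt(Integer.MAX_VALUE);
        }
        return values;
    }
}
```

Because the loop body is compiled once rather than once per element, the initializer stays tiny no matter how large the array is.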
Besides the values of the integers, the constructor and the initializer need to contain the JVM instructions for loading the integers into the array.
A much simpler and more practical approach is to store the numbers in a file, either in a binary format or as text.
I don't know why Java initialises arrays this way, but it does not initialise large arrays efficiently.
I think it is the code size in characters that is more than 65,535, not the memory taken by the 10,000 integers.
I think it's possible that this is the amount of memory required to represent those ints alphanumerically. I think this limit might apply to the code itself, so each int, for instance 1494512072, actually takes 10 bytes (one per digit) instead of the 4 bytes used for an int32.