One approach is to store k elements in each node, starting with one parent node which spans the entire collision space. When inserting element k+1, you subdivide the space and place the new element in the correct quadrant.
Additionally you can use this approach to statically allocate the data structure, assuming you know the maximum number of nodes that will be used, and that there will be some maximum density. This requires a fixed array of nodes and elements to be allocated for the life of the application, but it avoids costly dynamic allocations, which should be a speed gain.