Your performance is getting eaten alive by the sheer amount of stl heap allocations caused by repeatedly creating and populating all those Sets. If you profiled the code (say with a quick and easy non-instrumenting tool like Sleepy), I'm certain you'd find that to be the case. You're using Sets to avoid having a given particle added to a bucket more than once - I get that. If Duck's suggestion doesn't give you what you need, I think you could dramatically improve performance by using preallocated arrays or vectors, and getting uniqueness in those containers by adding an "added" flag to the particle that gets set when the item is added. Then just check that flag before adding, and be sure to clear the flags before the next cycle. (If the number of particles is constant, you can do this extremely efficiently with a preallocated array dedicated to storing the flags, then memsetting to 0 at the end of the frame.)
That's the basic idea. If you decide to go this route and get stuck somewhere, I'll help you work out the details.