A bit late, but I ended up developing an appropriate solution to this problem. There were two large issues here:
- Performance. Pulling tens of thousands of records into memory was not a viable way of handling this process.
- Testability.
The solution is made up of multiple parts as well:
- Don't load all the shipments into memory. In my particular scenario, I used a Traversable PDO statement to step through the results one by one. The caveat is that iterating over the results a second time means re-running the query, though there may be other ways to reset the pointer that avoid this.
- Separate the calculations needed to build each junction record from the process that pulls the shipments. By walking across the shipments and distributing amounts with an implementation similar to Martin Fowler's Quantity pattern, we can test that step separately as well.
- Finally, keep persisting the records separate by putting it behind your persistence pattern of choice (Data Mapper, Gateway, etc.). With this final piece in place you can wire everything together and integration test it if needed.
I don't have a practical example here but this is the general idea.
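That said, a minimal sketch of how the three pieces might fit together could look like the following. It is written in Python with `sqlite3` rather than PHP/PDO purely to keep it short, and it is not the code I actually used. The `shipments` table, its `weight` column, the idea of splitting an integer total in proportion to weight, and every function name here are assumptions made for illustration.

```python
import sqlite3
from typing import Callable, Iterable, Iterator, NamedTuple


class Shipment(NamedTuple):
    id: int
    weight: int  # hypothetical column, e.g. grams


class JunctionRecord(NamedTuple):
    shipment_id: int
    amount: int  # e.g. cents allocated to this shipment


def stream_shipments(conn: sqlite3.Connection) -> Iterator[Shipment]:
    """Part 1: yield rows one at a time instead of fetching them all."""
    cursor = conn.execute("SELECT id, weight FROM shipments ORDER BY id")
    for shipment_id, weight in cursor:
        yield Shipment(shipment_id, weight)


def distribute(total: int, total_weight: int,
               shipments: Iterable[Shipment]) -> Iterator[JunctionRecord]:
    """Part 2: pure calculation, no database involved.

    Each shipment gets the difference between consecutive rounded-down
    cumulative targets, so the shares always add up to exactly `total`.
    """
    if total_weight <= 0:
        raise ValueError("total_weight must be positive")
    seen_weight = 0
    allocated = 0
    for shipment in shipments:
        seen_weight += shipment.weight
        target = total * seen_weight // total_weight
        yield JunctionRecord(shipment.id, target - allocated)
        allocated = target


def run(conn: sqlite3.Connection, total: int,
        save: Callable[[JunctionRecord], None]) -> None:
    """Part 3: wire streaming, calculation, and persistence together.

    `save` stands in for whatever Data Mapper or Gateway you use.
    """
    # First pass: only the running sum is kept in memory.
    total_weight = sum(s.weight for s in stream_shipments(conn))
    # Second pass: this re-runs the query, which is the caveat noted above.
    for record in distribute(total, total_weight, stream_shipments(conn)):
        save(record)
```

Because `distribute` only needs an iterable, it can be unit tested with a plain list of `Shipment` tuples, while `run` can be integration tested against an in-memory database with `save` replaced by something like `list.append`.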