Handling large amount of events in event sourcing

Question 1

I think the most common approach is to use snapshots and persistent read models. That is, you don't actually replay your events very often, except when you need to build a new read model or change the way an existing one works. By storing snapshots of your domain objects, you avoid having to replay long streams of events.

One could argue that storing snapshots and persistent read models isn't a whole lot different than just doing CQRS without event-sourcing. But the old events are there in the event that you made a mistake in your read model, or need to derive new information, or have other strict auditing requirements.

In our application, where we have many events that have low business value, we plan to scrub events heavily during execution so that our event logs stay smaller. But I imagine for some objects we will still fall back to snapshots and persistent models.

Question 2

Look at your "active streamset". Are there streams that have a lifecycle where they tend to come into existence, mutate over a relatively short period of time, and then die as they reach their final state? If so, these streams could be moved to cheaper storage (backup). The only reason you'd need them is for replaying purposes, so you may want to either make them still accessible (albeit at a slower response rate) or keep a compressed copy for replay purposes around. In any case, do question if there are streams you can move out of the event store or at least out the active streamset.

Another option is to partition your streams across multiple physical event stores. Maybe there is a geographical boundary that can be used, or maybe there's something that naturally partitions them (the domain you are in usually provides hints). It's the kind of thing where you need to reflect about advantages and disadvantages.

This technique is not restricted to event sourcing. It can equally be applied to state-based models (it's just data afterall).