Data Retrieval Throughput - ETS lookup vs inter-process Messaging

Question 1

Short answer: profile. Try different approaches and verify how your system behaves.

Firstly, I would look at ETS' {read_concurrency, true} option. From the documentation:

{read_concurrency,boolean()} Performance tuning. Default is false. When set to true, the table is optimized for concurrent read operations. When this option is enabled on a runtime system with SMP support, read operations become much cheaper; especially on systems with multiple physical processors. However, switching between read and write operations becomes more expensive. You typically want to enable this option when concurrent read operations are much more frequent than write operations, or when concurrent reads and writes comes in large read and write bursts (i.e., lots of reads not interrupted by writes, and lots of writes not interrupted by reads). You typically do not want to enable this option when the common access pattern is a few read operations interleaved with a few write operations repeatedly. In this case you will get a performance degradation by enabling this option. The read_concurrency option can be combined with the write_concurrency option. You typically want to combine these when large concurrent read bursts and large concurrent write bursts are common.

Secondly, I would look at caching possibilities. Are the processes reading that information only once or multiple times? If they're accessing it multiple times, you could read it once and store it in your process state.

Thirdly, you could try to replicate and distribute that piece of information across your system. Divide et impera.

Question 2

If you use the process approach, in order to avoid having all the read requests serialized on the message queue of the 'server' process you must replicate.

Using an ETS table with read_concurrency feels more natural and it is something that I used when developing the parallel version of Dialyzer. However, ETS access was never a bottleneck in that case.