Pergunta

The hash table data structure can be easily spread across multiple machines with a simple algorithm to distribute the keys:

machine_to_query = item_key % machine_count

When you want to read and write key value pairs, you use the key to work out which machine stores the data, then you talk to that machine. If you want a count of the total number of items, you'd need to request the count from each server and add them up.

What algorithms exist for efficiently managing data structures where the data is partitioned across multiple machines? Distributed algorithms, not parallel algorithms.

How might something like a sorted array work in a distributed fashion? Efficiently.

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição
scroll top