Question

The Apache Flume User Guide says the spooling directory source may duplicate events under certain circumstances. Here is the line from the docs: "Despite the reliability guarantees of this source, there are still cases in which events may be duplicated if certain downstream failures occur."

What are those cases? In particular, if we are using a durable channel such as the file channel, I don't see any reason why events would be duplicated.

Solution

You described the file channel as durable, but durable does not mean exactly-once delivery.

Flume's guarantee is at-least-once delivery. Batches are resent on failure, and this can lead to duplicate events.

Example: node 1 is sending events to node 2. All the events are sent and node 2 acknowledges receipt. However, network conditions are such that the acknowledgement is lost. Node 2 has stored the batch, but node 1 never sees the acknowledgement and resends it. Thus, duplicate events.
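
The scenario above can be sketched as a small simulation. This is not Flume code; the names Receiver, LossyLink and send_at_least_once are hypothetical and only model a sender that retries until it sees an acknowledgement, with a link that delivers the batch but loses the first acknowledgement.

```python
class Receiver:
    """Stands in for node 2: stores every batch it receives."""

    def __init__(self):
        self.stored = []

    def receive(self, batch):
        self.stored.extend(batch)
        return True  # acknowledgement sent back to the sender


class LossyLink:
    """Hypothetical network link: batches always arrive,
    but the first `drop_acks` acknowledgements are lost."""

    def __init__(self, receiver, drop_acks):
        self.receiver = receiver
        self.drop_acks = drop_acks

    def send(self, batch):
        ack = self.receiver.receive(batch)  # node 2 stores the batch
        if self.drop_acks > 0:
            self.drop_acks -= 1
            return False  # the ack never reaches node 1
        return ack


def send_at_least_once(link, batch, max_attempts=5):
    """Stands in for node 1: resend the batch until an ack is seen."""
    for attempt in range(1, max_attempts + 1):
        if link.send(batch):
            return attempt
    raise RuntimeError("batch was never acknowledged")


node2 = Receiver()
link = LossyLink(node2, drop_acks=1)
attempts = send_at_least_once(link, ["e1", "e2"])

print(attempts)      # 2
print(node2.stored)  # ['e1', 'e2', 'e1', 'e2']
```

Node 1 cannot tell a lost batch from a lost acknowledgement, so resending is the only safe choice, and node 2 ends up with the batch twice. A durable channel on either side does not change this, because nothing was lost from storage.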

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow