In general, Flume works better with batches of events. This is because the File channel fsyncs for every batch. Thus, waiting for a period of time is a good trade-off to collect a batch of events.
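To make the trade-off concrete, here is a minimal sketch of collecting a batch within a time window, so that one expensive per-batch step (such as an fsync) covers many events. This is not Flume's code: the class name BatchCollectorSketch, the method takeBatch, and the parameters maxSize and maxWaitMs are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of Flume.
public class BatchCollectorSketch {

    // Collects up to maxSize items, waiting at most maxWaitMs in total.
    // Returns whatever has arrived once the batch is full or the time is up.
    static <T> List<T> takeBatch(BlockingQueue<T> queue, int maxSize, long maxWaitMs)
            throws InterruptedException {
        List<T> batch = new ArrayList<>(maxSize);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        while (batch.size() < maxSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // time window exhausted
            }
            T item = queue.poll(remaining, TimeUnit.NANOSECONDS);
            if (item == null) {
                break; // nothing more arrived before the deadline
            }
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.add("event-" + i);
        }
        // One "commit" per batch instead of one per event.
        System.out.println(takeBatch(queue, 3, 100));
        System.out.println(takeBatch(queue, 3, 100));
    }
}
```

A longer wait tends to produce fuller batches at the cost of added latency, which is the same trade-off a polling delay makes.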
Flume-ng Spool directory source polling directory for new files instead of using native WatchService API
Question
I was digging into the source of Flume-ng's SpoolingDirectorySource and found that it polls the spool directory after the specified POLL_DELAY_MS delay to generate new events. These events are then handled by ReliableSpoolingFileEventReader in a separate thread.
I was wondering why ReliableSpoolingFileEventReader does not use the WatchService API, which is fairly low-level as well as thread-safe. Is there a specific design constraint that favored polling over a watcher?
Thanks.
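For reference, the two approaches being contrasted can be sketched side by side. This is a minimal illustration and not Flume's implementation: the class SpoolWatchSketch and the methods pollOnce and watchOnce are hypothetical names. The only real APIs used are the JDK's java.nio.file classes, including WatchService, which has been available since Java 7.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Flume's code.
public class SpoolWatchSketch {

    // Polling: list the directory and return regular files not seen before.
    static List<Path> pollOnce(Path dir, Set<Path> seen) throws IOException {
        List<Path> fresh = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && seen.add(p)) {
                    fresh.add(p);
                }
            }
        }
        Collections.sort(fresh); // directory iteration order is unspecified
        return fresh;
    }

    // Watching: wait up to timeoutMs for the file system to report events.
    static List<Path> watchOnce(Path dir, WatchService watcher, long timeoutMs)
            throws InterruptedException {
        List<Path> created = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (key == null) {
            return created; // nothing reported within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a real caller would rescan
            }
            created.add(dir.resolve((Path) event.context()));
        }
        key.reset(); // re-arm the key so later events are delivered
        return created;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("spool-sketch");
        Set<Path> seen = new HashSet<>();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve("events-1.log"));
            System.out.println("polling found: " + pollOnce(dir, seen));
            System.out.println("watcher reported: " + watchOnce(dir, watcher, 2000));
        }
    }
}
```

Note that the watcher can report OVERFLOW when events are dropped, and how quickly events are delivered depends on the platform, so a watcher-based reader would still need a fallback directory scan.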
Solution
Licensed under: CC-BY-SA with attribution