Question

I have N files, each of which is sorted internally, and I want to merge them with sort -m (an N-way merge).

The problem is that each of these N files keeps being updated: each file is the output of another program, which writes to it constantly.

For example, at this moment, the first file looks like this:

1
3
5

10 minutes later, it is updated by the program and looks like this:

1
3
5
7
9
11

If the N files were not being updated, I could simply use sort -m, but how do I do it in this situation?

To clarify: in the end, all the content of the N files should be merged into one final file, so whenever the files are updated, the newly added content should be merged as well.

UPDATE

Bash on Linux. Lines in each file are in monotonically ascending order, and there are no duplicates between files.

No correct solution

OTHER TIPS

Since this is Linux, you can rely on the inotifywait utility, from the inotify-tools package:

#!/bin/bash

FILES_TO_WATCH=("file1.txt" "file2.txt")
MERGED_FILE="merged.txt"

log() {
  echo "[$(date -R)] $1" 1>&2
}

merge_files() {
  log "Updating merged file"
  # sort -m compares lines as text by default; add -n if the lines are numbers.
  sort -m "${FILES_TO_WATCH[@]}" > "$MERGED_FILE"
}

wait_for_changes() {
  local changed_file
  changed_file=$(inotifywait -qe modify "${FILES_TO_WATCH[@]}" --format "%w")

  log "File '$changed_file' changed"
}

merge_files

while wait_for_changes; do
  merge_files
done
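
One thing to check with the script above: sort -m compares lines as text unless told otherwise, while the sample data in the question (1, 3, 5, 7, 9, 11) looks numerically sorted. If that is the case for your files, the merge probably needs -n as well. A minimal sketch, assuming numeric lines; the file names and contents below are made up for illustration:

```shell
#!/bin/bash
# Hypothetical sample data: two files, each already sorted numerically.
tmpdir=$(mktemp -d)
printf '%s\n' 1 3 5 11 > "$tmpdir/a.txt"
printf '%s\n' 2 4 10   > "$tmpdir/b.txt"

# -m merges already-sorted inputs; -n compares lines as numbers, not text.
# paste joins the merged lines with spaces so the result fits on one line.
merged_numeric=$(sort -n -m "$tmpdir/a.txt" "$tmpdir/b.txt" | paste -sd' ' -)
echo "$merged_numeric"

rm -rf "$tmpdir"
```

This should print 1 2 3 4 5 10 11. Without -n, 10 and 11 would compare as text and sort before 2, so the inputs would not count as sorted from sort's point of view.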

Breakdown of the inotifywait command:

-q

Be quiet, as opposed to logging status messages to stderr.

-e modify

Listen for the "modify" event. For other events, see man inotifywait.

--format "%w"

Print only the name of the watched file when a modify event occurs.

${FILES_TO_WATCH[@]}

Expands to the file names stored in the FILES_TO_WATCH array.
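
A possible refinement, offered as a sketch rather than a tested drop-in: the loop above starts a new inotifywait for every change, so a write that lands while sort is still running may go unnoticed until the next one. Running inotifywait once in monitor mode (-m) keeps the watch open and queues events in the pipe instead. The sketch also writes the result to a temporary file and renames it, so readers never see a half-written merged file. The function names merge_atomically and watch_and_merge are made up for this example.

```shell
#!/bin/bash
# Sketch only: merge_atomically and watch_and_merge are hypothetical helper names.

# Merge the sorted input files ($2...) into $1, replacing it in a single rename.
merge_atomically() {
  local target=$1; shift
  local tmp
  # Create the temp file next to the target so mv is a rename on the same filesystem.
  # Note: mktemp creates the file with owner-only permissions.
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  # Add -n to sort if the lines are numbers rather than text.
  if sort -m "$@" > "$tmp"; then
    mv -- "$tmp" "$target"
  else
    rm -f -- "$tmp"
    return 1
  fi
}

# Keep one inotifywait process running (-m) and re-merge after every modify event.
watch_and_merge() {
  local target=$1; shift
  merge_atomically "$target" "$@"
  inotifywait -m -q -e modify --format '%w' "$@" |
    while IFS= read -r changed_file; do
      echo "File '$changed_file' changed" >&2
      merge_atomically "$target" "$@"
    done
}

# Start watching only when called with a target and at least one input file.
if (( $# >= 2 )); then
  watch_and_merge "$@"
fi
```

Saved as, say, watch-merge.sh (a hypothetical name), it would be started as ./watch-merge.sh merged.txt file1.txt file2.txt. Every event triggers a full re-merge, which is simple but does more work than strictly needed when many writes arrive in a burst.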

Licensed under: CC-BY-SA with attribution