Question

BEGIN BACKGROUND INFO

I have a program in C++ that is inherently serial and takes anywhere from 10 seconds to 30 minutes to run, depending on the characteristics of the model I pass in. I have automated running the program with bash. Deep in the C++ code, I write some metrics to a file, but at that point I can't access the name of the model. So I had been appending the model name to the same file from bash after each run of the C++ program. My bash file was as follows:

#!/bin/bash
for i in *.run; do
  bash "$i"
  echo "$i" >> output.txt
done

This resulted in an output file essentially in this format:

metrics for model #1 written by C++
name for model #1 written by bash
metrics for model #2 written by C++
name for model #2 written by bash
etc...

With some 300 models, this took 12 hours to run. Since each run is serial, I tried to run multiple instances of my C++ program in the background with this bash file:

#!/bin/bash
run_my_program()
{
  bash "$1"
  echo "$1" >> output.txt
}

for i in *.run; do
  # Throttle: wait while 8 or more background jobs are still listed
  while [ "$(jobs | wc -l)" -ge 8 ]; do
    sleep 5
  done
  run_my_program "$i" &
done

This cut my execution time down to around 2 hours, but it destroyed the structure of my output file, making it hard to identify which metrics belong to which model:

metrics for model #1 written by C++
name for model #1 written by bash
metrics for model # ?? written by C++
metrics for model # ?? written by C++
name for model #2 written by bash
name for model #3 written by bash
etc...

END BACKGROUND INFO

So I'm trying to use an intermediate file for each model, named after the process id, and then concatenate the temp files into the main one.

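#!/bin/bash
run_my_program()
{
  bash "$1" &
  myId=$!                    # pid of the bash process running the .run file
  myFile="Temp_$myId.txt"
  wait "$myId"
  echo "$1" >> "$myFile"
  cat "$myFile" >> output.txt
}

for i in *.run; do
  # Throttle: wait while 8 or more background jobs are still listed
  while [ "$(jobs | wc -l)" -ge 8 ]; do
    sleep 5
  done
  run_my_program "$i" &
done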

My C++ code that opens the temp file for writing is as follows

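// Needs <unistd.h>, <cstdio>, <string> and <fstream>
pid_t myprocess = getpid();
char tmpStr[32];  // roomy enough for any pid; the original 10 bytes could overflow
snprintf(tmpStr, sizeof tmpStr, "%d", static_cast<int>(myprocess));
std::string filename = "Temp_" + std::string(tmpStr) + ".txt";
fout.open(filename.c_str(), std::fstream::out | std::fstream::app);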

Unfortunately the numbers in myprocess from C++ and myId from bash just aren't matching up.

QUESTION: Is there something I'm missing about how $! and getpid() work that's giving me inconsistent numbers? Am I doomed to run my models in serial?


Solution

After

bash $1 &

$! is the pid of the bash child process. Presumably, $1 (something.run) is a script which ends up running the C++ program as a child, but that child process will be yet another pid.
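The mismatch can be reproduced without the C++ program. The following is a minimal sketch, assuming a made-up model.run in which a second bash stands in for the C++ binary; the file names and messages are hypothetical.

```shell
# Hypothetical stand-in for a .run file: a wrapper script that starts
# another process (a second bash playing the role of the C++ program).
workdir=$(mktemp -d)
cat > "$workdir/model.run" <<'EOF'
echo "wrapper pid: $$"
bash -c 'echo "program pid: $$"'
echo "wrapper done"
EOF

# Same pattern as in the question: background the wrapper, capture $!.
bash "$workdir/model.run" > "$workdir/pids.txt" &
driver_view=$!
wait "$driver_view"

wrapper_pid=$(sed -n 's/^wrapper pid: //p' "$workdir/pids.txt")
program_pid=$(sed -n 's/^program pid: //p' "$workdir/pids.txt")
echo "\$! in driver: $driver_view"
echo "wrapper pid:  $wrapper_pid"
echo "program pid:  $program_pid"
rm -r "$workdir"
```

The first two numbers should agree, while the third belongs to the extra child, which is the pid that getpid() would report inside the C++ program.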

You might be able to modify your .run file to exec the C++ program instead of spawning a child, but that will only work if you don't need to do anything in the .run file afterwards.
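As a rough illustration of the exec approach, here is a sketch in which a hypothetical model.run ends with exec; a second bash again stands in for the C++ program, so the names are assumptions rather than your actual files.

```shell
workdir=$(mktemp -d)
# Hypothetical .run file: exec replaces the wrapper shell with the program,
# so no new process is created and the pid stays the same.
cat > "$workdir/model.run" <<'EOF'
exec bash -c 'echo "program pid: $$"'
EOF

bash "$workdir/model.run" > "$workdir/pid.txt" &
driver_view=$!
wait "$driver_view"

program_pid=$(sed -n 's/^program pid: //p' "$workdir/pid.txt")
echo "\$! in driver: $driver_view"
echo "program pid:  $program_pid"
rm -r "$workdir"
```

Anything placed after the exec line in the .run file would never run, which is the limitation mentioned above.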

Another relatively simple solution is to generate a subprocess count in your driver loop, and pass it through the .run file to the C++ program, which can then use it as a tag in log messages. That has the advantage of allowing the log messages to come from various different programs, if that is useful.
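A sketch of the counter idea follows. It assumes the .run files can forward their first argument to the program; fake_program.sh, the model_*.run files, and the tag-based Temp_ file names are all invented for the example, and the concurrency throttle from the question is left out for brevity.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the C++ program: takes the tag as its first
# argument and uses it to name the temp file.
cat > fake_program.sh <<'EOF'
echo "metrics for run $1" >> "Temp_$1.txt"
EOF

# Hypothetical .run files that simply forward the tag to the program.
for name in a b c; do
  printf 'bash ./fake_program.sh "$1"\n' > "model_$name.run"
done

run_my_program() {
  bash "$1" "$2"                 # $2 is the tag handed down by the driver
  echo "$1" >> "Temp_$2.txt"     # model name goes right after its metrics
}

count=0
for i in *.run; do
  count=$((count + 1))
  run_my_program "$i" "$count" &
done
wait

# Concatenate only after every job has finished, so entries cannot interleave.
cat Temp_*.txt > output.txt
first_pair=$(head -n 2 output.txt)
line_count=$(grep -c '' output.txt)
cat output.txt

cd - > /dev/null
rm -r "$workdir"
```

Because the tag is chosen by the driver, it does not matter how many processes sit between the driver and the program.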

Yet another simple solution is to output all logging information from your C++ program to stderr. Then the .run script which actually calls the program can redirect stderr to a log file created using $$ -- the pid of the .run script -- which will be the same as $! in the driver.
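The stderr variant might look like the sketch below. The model.run contents are hypothetical, and a bash one-liner writing to stderr stands in for the C++ program.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical .run file: $$ is expanded by the wrapper shell, so the log
# file is named after the wrapper's pid, not the program's.
cat > model.run <<'EOF'
bash -c 'echo "metrics from program" >&2' 2>> "Temp_$$.txt"
echo "wrapper finished"
EOF

bash model.run > /dev/null &
myId=$!                          # same value as $$ inside model.run
wait "$myId"
echo "model.run" >> "Temp_$myId.txt"

result=$(cat "Temp_$myId.txt")
echo "$result"

cd - > /dev/null
rm -r "$workdir"
```

With this layout the C++ code no longer needs to call getpid() or open a file at all; it only writes to stderr.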

Licensed under: CC-BY-SA with attribution