Вопрос

I would like to run a large number of intensive processes in parallel, where I loop over different parameters with for loops. A number of answers to similar questions mention that running processes in parallel can be done with xargs, but none of them seem to mention if or how this can be done if the parameters change for each command.

As an example (pseudo code):

for paramA in 1 2 3
  for paramB in 1 2 3
    ./intensiveCommand $paramA $paramB
  end
end

I would like to parallellize intensiveCommand

Or is there an easier way then using xargs?

Это было полезно?

Решение

You can use GNU parallel. It has a --load option avoid overloading the computer.

parallel --load 100% ./intensiveCommand ::: 1 2 3 ::: 1 2 3

Другие советы

number_of_cores=4   #<-- number of processorcores, in my case: 4

for paramA in 1 2 3
do
    for paramB in 1 2 3
    do
        #========== automatic load regulator ==================
        sleep 1
        while [  $( pgrep -c "intensiveCommand" ) -ge "$number_of_cores" ]
        do
            kill -SIGSTOP $$
        done
        #======================================vvvvvvvvvvvvvvvv            

        ( ./intensiveCommand $paramA $paramB ; kill -SIGCONT $$ ) &

    done
done

This program will put itself on hold if there are running as much intensiveCommands as there are cores. A finished intensiveCommand will let the program continue (see the kill -SIGCONT $$ ). The program checks again and will launch intensiveCommands until it locks again when the max number of intensiveCommands is reached again.

The sleep is to overcome the latency between the launch of an intensiveCommand and it's appearance in the process table.

Very tight scheduling in "1 slot per core",rock solid and simple.

#/bin/bash

#use the filedescriptor as a kind of queue to fill the processing slots.

exec 3< <(

    for PARAM_A in 1 2 3
    do
        for PARAM_B in 1 2 3
        do
             echo $PARAM_A $PARAM_B
        done
    done
)

#4 seperate processing slots running parallel 
while read -u 3 PARA PARB; do "intensiveCommand $PARA $PARB" ; done &
while read -u 3 PARA PARB; do "intensiveCommand $PARA $PARB" ; done &
while read -u 3 PARA PARB; do "intensiveCommand $PARA $PARB" ; done &
while read -u 3 PARA PARB; do "intensiveCommand $PARA $PARB" ; done &

#only exit when 100% sure that all processes ended
while pgrep "intensiveCommand" &>"/dev/null" ; do wait ; done

I wrote this which works pretty nicely - read the comments at the top to see how it works.

#!/bin/bash
################################################################################
# File: core
# Author: Mark Setchell
# 
# Primitive, but effective tool for managing parallel execution of jobs in the
# shell. Based on, and requiring REDIS.
#
# Usage:
#
# core -i 8 # Initialise to 8 cores, or specify 0 to use all available cores
# for i in {0..63}
# do
#   # Wait for a core, do a process, release core
#   (core -p; process; core -v)&
# done
# wait
################################################################################
function usage {
    echo "Usage: core -i ncores # Initialise with ncores. Use 0 for all cores."
    echo "       core -p        # Wait (forever) for free core."
    echo "       core -v        # Release core."
    exit 1
}

function init {
    # Delete list of cores in REDIS
    echo DEL cores | redis-cli > /dev/null 2>&1
    for i in `seq 1 $NCORES`
    do
       # Add another core to list of cores in REDIS
       echo LPUSH cores 1 | redis-cli > /dev/null 2>&1
    done
    exit 0
}

function WaitForCore {
    # Wait forever for a core to be available
    echo BLPOP cores 0 | redis-cli > /dev/null 2>&1
    exit 0
}

function ReleaseCore {
    # Release or give back a core
    echo LPUSH cores 1 | redis-cli > /dev/null 2>&1
    exit 0
}

################################################################################
# Main
################################################################################
while getopts "i:pv" optname
  do
    case "$optname" in
      "i")
        if [ $OPTARG -lt 1 ]; then
           NCORES=`sysctl -n hw.logicalcpu`;    # May differ if not on OSX, maybe "nproc" on Linux
        else
           NCORES=$OPTARG
        fi
    init $NCORES
        ;;
      "p")
    WaitForCore
        ;;
      "v")
    ReleaseCore
        ;;
      "?")
        echo "Unknown option $OPTARG"
        ;;
    esac
done
usage

By way of example, the following takes 10 seconds (not 80) to do 16 waits of 5 seconds each:

core -i 8 
for i in {0..15}
do
   # Wait for a core, do a process, release core
   (core -p ; sleep 5 ; core -v)&
done
wait
Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top