Running a limited number of child processes in parallel in bash?
I have a large set of files for which some heavy processing needs to be done.
This processing is single-threaded, uses a few hundred MiB of RAM (on the machine used to start the job) and takes a few minutes to run.
My current use case is to start a Hadoop job on the input data, but I've had this same problem in other cases before.
In order to fully utilize the available CPU power I want to be able to run several of those tasks in parallel.
However, a very simple example shell script like this will thrash the system due to excessive load and swapping:
find . -type f | while read name ;
do
    some_heavy_processing_command ${name} &
done
So what I want is essentially similar to what "gmake -j4" does.
I know bash supports the "wait" command, but that only waits until all child processes have completed. In the past I've written scripts that run 'ps' and then grep the child processes out by name (yes, I know ... ugly).
What is the simplest/cleanest/best solution to do what I want?
Edit: Thanks to Fredrik: Yes indeed, this is a duplicate of "How to limit number of threads/sub-processes used in a function in bash".
The "xargs --max-procs=4" works like a charm.
(So I voted to close my own question)
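For readers who want the xargs approach spelled out, here is a minimal sketch. It is not the asker's actual command: the `sh -c` snippet is a hypothetical stand-in for some_heavy_processing_command, and the scratch directory and log file exist only to make the demo self-contained. `-P 4` is the short form of `--max-procs=4`.

```bash
#!/usr/bin/env bash
# Sketch: cap the number of parallel workers at 4 with xargs.
# The sh -c snippet is a hypothetical stand-in for some_heavy_processing_command.

workdir=$(mktemp -d)    # scratch input directory (assumption, demo only)
log=$(mktemp)           # records which files were processed
touch "$workdir/a.dat" "$workdir/b.dat" "$workdir/with space.dat"

# -print0 / -0 : NUL-separated names, so spaces in file names survive
# -n 1         : pass one file per invocation of the command
# -P 4         : run at most 4 invocations at a time
find "$workdir" -type f -print0 |
  xargs -0 -n 1 -P 4 sh -c 'printf "%s\n" "$1" >> "$0"' "$log"

processed=$(( $(wc -l < "$log") ))
echo "processed $processed files"

rm -rf "$workdir" "$log"
```

In real use the `sh -c '...' "$log"` part would simply be replaced by the command to run, and xargs appends each file name as its last argument.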
bash parallel-processing
edited May 23 '17 at 12:00 by Community
asked Jul 6 '11 at 8:29 by Niels Basjes
8
possible duplicate of stackoverflow.com/questions/6511884/… I'd use xargs --max-procs=4 for this...
– Fredrik Pihl
Jul 6 '11 at 8:57
4
it seems like a job for GNU parallel, but I'm not sure it adds extra power to xargs --max-procs, which I didn't know
– larsen
Jul 6 '11 at 10:14
@Niels: I've been using screen for the purpose, though it's a bit messy this way, especially when started from within another screen session ;)
– 0xC0000022L
Jul 6 '11 at 13:38
7 Answers
#!/usr/bin/env bash

set -o monitor
# enable job control: each background job runs in its own process group
trap add_next_job CHLD
# run add_next_job every time a child process exits (SIGCHLD)

todo_array=($(find . -type f)) # places output into an array (splits on whitespace, so file names with spaces break)

index=0
max_jobs=2

function add_next_job {
    # if there are still jobs to do, start one
    if [[ $index -lt ${#todo_array[*]} ]]
    # the hash above is not a comment; ${#todo_array[*]} is bash's syntax for the array length
    then
        echo adding job "${todo_array[$index]}"
        do_job "${todo_array[$index]}" &
        # replace the line above with the command you want; the trailing & runs it
        # in the background, which is what lets several jobs run in parallel
        index=$((index+1))
    fi
}

function do_job {
    echo "starting job $1"
    sleep 2
}

# add initial set of jobs
while [[ $index -lt $max_jobs ]]
do
    add_next_job
done

# wait for all jobs to complete
wait
echo "done"
Having said that Fredrik makes the excellent point that xargs does exactly what you want...
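As a point of comparison, and explicitly not the trap-based method above: on bash 4.3 or newer the same bookkeeping can be sketched with `wait -n`, which blocks until any one background job exits. The `do_job` body and the item names below are hypothetical placeholders, and the bash version requirement is an assumption about your environment.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4.3+ for `wait -n`.
# This is an alternative technique, not the CHLD-trap approach.

max_jobs=2

# Hypothetical stand-in for the real work.
do_job() {
    sleep 0.2
    echo "finished $1"
}

running=0
started=0
for item in one two three four five; do
    if (( running >= max_jobs )); then
        wait -n                     # block until any one background job exits
        running=$((running - 1))
    fi
    do_job "$item" &
    running=$((running + 1))
    started=$((started + 1))
done

wait    # collect the jobs that are still running
echo "started $started jobs, at most $max_jobs at a time"
```

The trade-off is portability: the trap version runs on older bash releases, while this one keeps all the control flow in a single loop.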
answered Jul 6 '11 at 9:54 by Dunes
I now understand the code, but had to think a bit. Especially the part about why these would run in parallel (well, because they are subprocesses) eluded me. I think it would be worthwhile adding comments for that part into the code as well.
– 0xC0000022L
Jul 6 '11 at 18:42
Although my current application works great with the xargs --max-procs I'm still giving you the credit for being "the answer" because your script is usable in more situations. Thanks.
– Niels Basjes
Jul 7 '11 at 20:34
I know I'm late to the party with this answer but I thought I would post an alternative that, IMHO, makes the body of the script cleaner and simpler. (Clearly you can change the values 2 & 5 to be appropriate for your scenario.)
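function max2 {
    while [ `jobs | wc -l` -ge 2 ]
    do
        sleep 5
    done
}

# Feed the loop from process substitution rather than a pipe: with
# "find ... | while", the loop runs in a subshell, so the final wait
# would not see the background jobs started there.
while read -r name ;
do
    max2; some_heavy_processing_command "${name}" &
done < <(find . -type f)
wait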
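A possible generalisation of the same polling idea, offered as a sketch rather than as part of the original answer: `limit_jobs` is a hypothetical name, the limit is passed as an argument instead of being hard-coded, and `jobs -r -p` is used so that only running jobs are counted. The `sleep 0.3` workers and the `peak` counter are there purely to make the demo observable.

```bash
#!/usr/bin/env bash
# Sketch: poll the shell's own job table until a slot is free.

# limit_jobs N : return once fewer than N background jobs are running.
limit_jobs() {
    local max=$1
    # jobs -r restricts the listing to running jobs; -p prints one PID per line
    while (( $(jobs -r -p | wc -l) >= max )); do
        sleep 0.1
    done
}

peak=0
for i in 1 2 3 4 5 6; do
    limit_jobs 2
    sleep 0.3 &                       # stand-in for the heavy command
    now=$(( $(jobs -r -p | wc -l) ))  # running jobs right after starting one
    (( now > peak )) && peak=$now
done

wait
echo "peak concurrency: $peak"
```

A shorter poll interval reacts faster when a job ends, at the cost of a few more wake-ups; for jobs that run for minutes, the original 5 seconds is perfectly reasonable.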
answered Jan 17 '13 at 20:10 by BruceH
2
Dude, this works brilliantly! Thanks! :)
– mkgrunder
Oct 8 '13 at 19:43
This worked for me after changing the while syntax to: while [ $(jobs | wc -l) -ge 2 ]
– Jeffrey Cordero
Jun 23 '17 at 15:07
With GNU Parallel it becomes simpler:
find . -type f | parallel some_heavy_processing_command {}
Learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
answered Feb 3 '13 at 15:20 by Ole Tange
I think I found a more handy solution using make:
#!/usr/bin/make -f

THIS := $(lastword $(MAKEFILE_LIST))
TARGETS := $(shell find . -name '*.sh' -type f)

.PHONY: all $(TARGETS)

all: $(TARGETS)

# the recipe line below must start with a tab character
$(TARGETS):
	some_heavy_processing_command $@

$(THIS): ; # Avoid trying to remake this makefile
Save it as e.g. 'test.mak' and make it executable. If you call ./test.mak, it runs some_heavy_processing_command on the files one by one. If you call ./test.mak -j 4, it runs four subprocesses at once. You can also use it in a more sophisticated way: with ./test.mak -j 5 -l 1.5 it runs at most 5 subprocesses while the system load is under 1.5, and it holds back new ones while the load is at or above 1.5.
It is more flexible than xargs, and make is part of the standard distribution, unlike parallel.
edited Nov 25 '15 at 20:01
answered Jul 16 '13 at 12:38 by TrueY
This code worked quite well for me.
I noticed one issue where the script could not end.
If max_jobs is greater than the number of elements in the array, the script will never quit: the initial loop runs until index reaches max_jobs, but add_next_job stops incrementing index once the array is exhausted.
To prevent this, I've added the following right after the "max_jobs" declaration.
if [ $max_jobs -gt ${#todo_array[*]} ];
then
    # max_jobs exceeds the number of elements in the array, so cap it at the number of array elements
    max_jobs=${#todo_array[*]}
fi
edited Feb 2 '12 at 21:28 by animuson
answered Feb 2 '12 at 17:46 by masseo
Another option:
PARALLEL_MAX=...
function start_job() {
    while [ $(ps --no-headers -o pid --ppid=$$ | wc -l) -gt $PARALLEL_MAX ]; do
        sleep .1  # Wait for background tasks to complete.
    done
    "$@" &
}
start_job some_big_command1
start_job some_big_command2
start_job some_big_command3
start_job some_big_command4
...
answered Nov 18 '14 at 21:31 by Jeff Kaufman
Here is a very good function I have used to control the maximum number of jobs from bash or ksh. NOTE: the - 1 after the pgrep subtracts the wc -l subprocess.
function jobmax
{
    typeset -i MAXJOBS=$1
    sleep .1
    while (( ($(pgrep -P $$ | wc -l) - 1) >= $MAXJOBS ))
    do
        sleep .1
    done
}

nproc=5
for i in {1..100}
do
    sleep 1 &
    jobmax $nproc
done
wait # Wait for the rest
answered Jan 23 '15 at 0:05 by user2709129