Original question: http://stackoverflow.com/questions/15343561/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Bash: Running the same program over multiple cores
Asked by Joe
I have access to a machine on which I can use 10 of the cores -- and I would like to actually use them. What I am used to doing on my own machine is something like this:
for f in *.fa; do
    myProgram (options) "./$f" "./$f.tmp"
done
I have 10 files I'd like to do this on -- let's call them blah00.fa, blah01.fa, ... blah09.fa.
The problem with this approach is that myProgram only uses 1 core at a time, so on the multi-core machine this loop would use 1 core at a time, 10 times in a row, and I wouldn't be using the machine to its full capability.
How could I change my script so that it runs all 10 of my .fa files at the same time? I looked at "Run a looped process in bash across multiple cores", but I couldn't get the command from that question to do exactly what I wanted.
Accepted answer by chepner
You could use
for f in *.fa; do
    myProgram (options) "./$f" "./$f.tmp" &
done
wait
which would start all of your jobs in parallel, then wait until they all complete before moving on. If you have more jobs than cores, this still starts all of them and leaves it to your OS scheduler to swap processes in and out.
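A bare wait only tells you that everything has finished, not whether each job succeeded. The following is a minimal sketch, not part of the original answer, that remembers each PID and waits on them one by one to collect exit statuses. The function process_one and the hard-coded file names are hypothetical stand-ins for myProgram and *.fa so that the sketch can run on its own.

```shell
#!/usr/bin/env bash
# process_one is a hypothetical stand-in for:
#   myProgram (options) "./$f" "./$f.tmp"
process_one() {
    sleep 1
    # Pretend that one particular input fails, to show error reporting.
    [ "$1" != "blah03.fa" ]
}

pids=""
for f in blah00.fa blah01.fa blah02.fa blah03.fa; do
    process_one "$f" &
    pids="$pids $!"          # remember the PID of the job just started
done

failed=0
for pid in $pids; do
    # "wait PID" returns the exit status of that particular job
    wait "$pid" || failed=$((failed + 1))
done
echo "jobs failed: $failed"
```

All four jobs still run concurrently; only the collection of results is sequential.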
One modification is to start 10 jobs at a time:
count=0
for f in *.fa; do
    myProgram (options) "./$f" "./$f.tmp" &
    (( count ++ ))
    # Compare with ==; a single = would assign 10 and always be true
    if (( count == 10 )); then
        wait
        count=0
    fi
done
wait
but this is inferior to using parallel, because you can't start new jobs as old ones finish, and you also can't detect whether an older job finished before you managed to start 10 jobs. wait allows you to wait on a single particular process or on all background processes, but doesn't let you know when any one of an arbitrary set of background processes completes.
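Newer versions of bash (4.3 and later) added wait -n, which returns as soon as any one background job exits, so a new job can be started as each old one finishes. The following is a minimal sketch under that assumption, not part of the original answer; process_one, max_jobs, and the file names are hypothetical placeholders, and the limit is set to 3 only to keep the demonstration short.

```shell
#!/usr/bin/env bash
# Requires bash 4.3 or newer for "wait -n".
max_jobs=3      # hypothetical limit; the question would use 10
running=0
started=0

# process_one is a hypothetical stand-in for:
#   myProgram (options) "./$f" "./$f.tmp"
process_one() { sleep 1; }

for f in blah00.fa blah01.fa blah02.fa blah03.fa blah04.fa; do
    if (( running >= max_jobs )); then
        wait -n                      # block until any one background job exits
        running=$(( running - 1 ))
    fi
    process_one "$f" &
    running=$(( running + 1 ))
    started=$(( started + 1 ))
done
wait                                 # wait for the jobs that are still running
echo "started $started jobs, at most $max_jobs at a time"
```

On older bash versions wait -n is not available, and GNU Parallel remains the simpler option.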
Answer by Ole Tange
With GNU Parallel you can do:
parallel myProgram (options) {} {.}.tmp ::: *.fa
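In that command, {} is replaced by each input argument and {.} by the argument with its extension removed. To illustrate which file names result, here is a small sketch that imitates the two replacement strings with ordinary shell parameter expansion. It does not call GNU Parallel at all, and the file names are just the examples from the question.

```shell
#!/usr/bin/env bash
# Imitate GNU Parallel's {} and {.} with shell parameter expansion,
# purely to show the names the command line would receive.
for f in blah00.fa blah01.fa; do
    input="$f"               # {}  : the argument as given
    output="${f%.*}.tmp"     # {.} : the argument without its last extension, then .tmp
    echo "myProgram (options) $input $output"
done
```

Note that this writes to blah00.tmp, whereas the loop in the question writes to blah00.fa.tmp; using {}.tmp instead of {.}.tmp would keep the original naming. By default GNU Parallel runs one job per CPU core, and -j 10 would cap it at 10 simultaneous jobs.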
From: http://git.savannah.gnu.org/cgit/parallel.git/tree/README
= Full installation =
Full installation of GNU Parallel is as simple as:
./configure && make && make install
If you are not root you can add ~/bin to your path and install in ~/bin and ~/share:
./configure --prefix=$HOME && make && make install
Or if your system lacks 'make' you can simply copy src/parallel src/sem src/niceload src/sql to a dir in your path.
= Minimal installation =
If you just need parallel and do not have 'make' installed (maybe the system is old or Microsoft Windows):
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/
Watch the intro videos to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Answer by Sergey Vyazmitinov
# Wait until fewer than $3 instances of $1 are running,
# then start one more instance in the background and return
function runParallel () {
    cmd=$1
    args=$2
    number=$3
    currNumber="1024"
    while true ; do
        # Count processes whose command name is exactly $cmd
        currNumber=`ps -e | grep -v "grep" | grep " $cmd$" | wc -l`
        if [ $currNumber -lt $number ] ; then
            break
        fi
        sleep 1
    done
    echo "run: $cmd $args"
    $cmd $args &
}

loop=0
# We will run 12 sleep commands for 10 seconds each
# and only five of them will run simultaneously
while [ $loop -ne 12 ] ; do
    runParallel "sleep" 10 5
    loop=`expr $loop + 1`
done
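Counting processes with ps -e matches every process of that name on the machine, including ones started by other users or other scripts. A possible variation, sketched here as an assumption rather than as the answer's method, counts only the current shell's own running background jobs with the bash builtin jobs -rp. The helper name run_limited and the small numbers are hypothetical and chosen to keep the demonstration short.

```shell
#!/usr/bin/env bash
# run_limited MAX CMD [ARGS...]
# Hypothetical helper: waits until fewer than MAX of this shell's
# background jobs are running, then starts CMD in the background.
run_limited() {
    local max=$1
    shift
    # "jobs -rp" prints the PIDs of this shell's running background jobs
    while [ "$(jobs -rp | wc -l)" -ge "$max" ]; do
        sleep 1
    done
    "$@" &
}

launched=0
# Run 4 sleep commands of 1 second each, at most 2 at a time
for i in 1 2 3 4; do
    run_limited 2 sleep 1
    launched=$(( launched + 1 ))
done
wait
echo "launched $launched commands"
```

Like the original function, this polls once per second, so a free slot may sit idle for up to a second before the next command starts.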