Checking in parallel over ssh whether a particular program is still running

Updated: 2023-12-05 09:22:04

The following lets all connections complete before starting any in the next batch, and thus can potentially wait for more than 30 seconds -- but should give you a good idea of how to do what you're looking for:

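# Requires bash 4.0+ (associative arrays).
hosts=( host1 host2 host3 )
user=someuser
script="script you want to run on each remote host"

# Start as if the last batch began 30 seconds ago, so the first batch runs at once.
last_time=$(( SECONDS - 30 ))
# Run immediately if 30+ seconds have passed; otherwise sleep off the remainder.
while (( ( SECONDS - last_time ) >= 30 )) || \
      sleep $(( 30 - (SECONDS - last_time) )); do
  last_time=$SECONDS
  declare -A pids=( )   # maps background PID -> host; reset for every batch
  for host in "${hosts[@]}"; do
    # Launch every connection in the background so the checks run in parallel.
    ssh "${user}@${host}" "$script" & pids[$!]="$host"
  done
  # Collect each exit status; a non-zero status means the check (or ssh itself) failed.
  for pid in "${!pids[@]}"; do
    wait "$pid" || {
      echo "Failure monitoring host ${pids[$pid]} at time $SECONDS" >&2
    }
  done
done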
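
The `script` value above is left as a placeholder. As one hedged illustration of what a remote check could look like, the sketch below assumes the monitored program records its PID in a pidfile; the function name `is_running` and the pidfile convention are hypothetical and not part of the original answer.

```bash
# Hypothetical check: succeed only if the PID stored in a pidfile refers to a
# live process that we are allowed to signal.
is_running() {
  local pidfile=$1 pid
  [[ -r $pidfile ]] || return 1          # no readable pidfile: treat as not running
  pid=$(<"$pidfile")
  [[ $pid =~ ^[0-9]+$ ]] || return 1     # reject empty or malformed contents
  # Signal 0 delivers nothing; it only tests whether the process can be signalled.
  # This also fails if the process exists but belongs to another user.
  kill -0 "$pid" 2>/dev/null
}
```

Assuming the remote login shell is bash, the definition could be shipped along with the call, for example `script="$(declare -f is_running); is_running /path/to/program.pid"`, where the pidfile path is again a placeholder.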



Now, bigger picture: Don't do that.

Almost every operating system has a process supervision framework. Ubuntu has Upstart; Fedora and CentOS 7 have systemd; MacOS X has launchd; runit, daemontools, and others can be installed anywhere (and are very, very easy to use -- look at the run scripts at http://smarden.org/runit/runscripts.html for examples).

Using these tools is the Right Way to monitor a process and ensure that it restarts whenever it exits. Unlike this (very high-overhead) solution, they have almost no overhead at all: they rely on the operating system notifying a process's parent when that process exits, rather than doing the work of polling for the process (and polling only after all the overhead of connecting via SSH, negotiating a pair of session keys, starting a shell to run your script, and so on).
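
To make the "parent is notified on exit" idea concrete, here is a minimal sketch in bash. It is not a substitute for runit, systemd, or launchd; the function name `supervise` and the restart cap are assumptions made only for illustration. The point is that `wait` blocks until the kernel reports the child's exit, so nothing polls.

```bash
# supervise MAX_STARTS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND as a child process and start it again each
# time it exits, up to MAX_STARTS runs in total.
supervise() {
  local max_starts=$1 starts=0 status=0
  shift
  while (( starts < max_starts )); do
    starts=$(( starts + 1 ))
    "$@" &                       # start the supervised program as our child
    status=0
    wait "$!" || status=$?       # sleeps until the child exits; no polling loop
    echo "run $starts of '$1' exited with status $status" >&2
  done
  return "$status"               # exit status of the final run
}
```

A call such as `supervise 5 /path/to/program` (the path is a placeholder) would start the program up to five times. Real supervisors restart indefinitely and typically add restart throttling, logging, and clean shutdown handling, which is one more reason to use them rather than a hand-rolled loop.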

Yes, this may be a small private project. Still, you're making extra complexity (and thus, extra bugs) for yourself -- and if you learn to use the tools to do this right, you'll know how to do things right when you have something that isn't a small private project.