且构网

Celery: WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL)

Updated: 2021-12-31 00:44:54

The SIGKILL your worker received was initiated by another process. Your supervisord config looks fine. The killasgroup setting only affects a kill initiated by supervisor itself (e.g. through supervisorctl or a plugin), and without that setting supervisor would have sent the signal to the dispatcher anyway, not the child.

Most likely you have a memory leak, and the OS's OOM killer is killing your process for using too much memory.

Run grep oom /var/log/messages. If you see OOM killer messages, that's your problem.
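The same check can be scripted if you want to run it repeatedly. The following is only a minimal sketch: the helper names find_oom_lines and scan_log are made up for illustration, and the pattern assumes the kernel log mentions "oom" or "out of memory", which can vary by kernel version and distribution.

```python
import re

# Assumed wording: kernel OOM messages usually mention "oom" (as in
# "oom-killer") or "Out of memory". Like plain `grep oom`, this can also
# match unrelated words that happen to contain "oom".
OOM_PATTERN = re.compile(r"oom|out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines (without trailing newline) that match OOM_PATTERN."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a log file for OOM messages; the default path is the one from the answer."""
    # errors="replace" avoids crashing on bytes that are not valid text.
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling scan_log() and printing each returned line gives roughly the same output as the grep command. Reading /var/log/messages usually requires root privileges.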

If you don't find anything, try running the periodic task manually in a shell:

MyPeriodicTask().run()

Then see what happens. If you don't have good instrumentation like Cacti, Ganglia, etc. for this host, I'd monitor system and process metrics with top in another terminal while the task runs.
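If watching top is not convenient, a rough number can also be collected from inside the shell session. This is a sketch under assumptions, not a replacement for process-level monitoring: run_with_memory_report is a hypothetical helper, and it relies on the standard library's tracemalloc, which only tracks memory allocated through Python's allocator. Memory held by C extensions will not show up, whereas the OOM killer looks at the whole process.

```python
import tracemalloc


def run_with_memory_report(func, *args, **kwargs):
    """Call func and return (result, peak_bytes) of traced Python allocations."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes since start().
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raised.
        tracemalloc.stop()
    return result, peak
```

For the task in the answer, that would look like run_with_memory_report(MyPeriodicTask().run). A peak that keeps growing across repeated runs is a hint that the task is leaking.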