Description
We have an mfserv plugin that responds to HTTP requests using data fetched from a database, and we want to control how and when its database connections are closed.
Normally, this plugin is configured so that circus spawns 8 processes, each lasting 1 hour. The process hierarchy is as follows: circus -> signal_wrapper.py -> bjoern_wrapper.py. When max_age is reached, circus sends a signal to signal_wrapper.py, which forwards it to bjoern_wrapper.py, and eventually to our plugin. As we understand it, SIGTERM is sent first, and then, after a timeout (3s according to the arguments given to signal_wrapper.py), SIGKILL is sent.
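As a rough illustration of the SIGTERM-then-SIGKILL sequence described above, here is a minimal sketch. It is not the actual code of circus or signal_wrapper.py; `stop_with_grace` is a hypothetical helper, and the 3s default only mirrors the timeout mentioned above.

```python
import signal
import subprocess
import sys


def stop_with_grace(proc, timeout=3.0):
    """Ask a child process to stop, then force-kill it if it is too slow.

    Hypothetical helper: sends SIGTERM, waits up to `timeout` seconds,
    and falls back to SIGKILL if the child is still alive.
    Returns the child's return code (negative signal number on POSIX
    when the child was terminated by a signal).
    """
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child did not exit in time: SIGKILL cannot be caught,
        # so any exit handler still running is interrupted here.
        proc.kill()
        return proc.wait()
```

With this pattern, an exit handler only has the grace period to finish; anything still running when it expires is cut off by SIGKILL.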
We have implemented a function in our plugin that is triggered when a signal is received. It simply retrieves the database connection and closes it. In practice, we see that SIGTERM is caught correctly and that the function starts. However, we can tell from the log that the function never runs to completion, and SIGKILL is eventually sent after 3s.
From the log messages, it is clear that the function did not exceed the timeout: the part of the function that was executed lasted ~0.4ms, so it is not clear why it does not run to the end.
@mrechte and I tried to investigate the propagation of signals through the processes, but everything seems to work normally. @matthieumarrast and I tried to dig deeper into the code of the socket_down function in signal_wrapper; one idea would be to keep the socket busy while our exit handler runs, but this remains to be tested, and it would still look like a patch-up job. If anybody has a better understanding of what happens when circus terminates processes, it would be much appreciated.
Steps to reproduce (mfserv 2.2):
- create an mfserv plugin for a python WSGI application,
- implement an exit handler function and use `signal` from the standard library to bind it to signal catch; make the function long enough (e.g. using `time.sleep()`) and print messages in the log file,
- observe the logs; if the function is long enough, not all messages will be in the log, indicating that the function has not been fully executed.
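The reproduction steps above can be sketched as follows. This is a minimal example, not our actual plugin: the names `make_exit_handler`, `install_exit_handler` and `application`, as well as the step count and delay, are assumptions chosen for illustration.

```python
import logging
import signal
import time

logger = logging.getLogger("exit_handler_demo")


def make_exit_handler(steps=5, delay=1.0, emit=logger.warning):
    """Build a deliberately slow exit handler that logs its progress.

    `steps`, `delay` and `emit` are illustrative parameters; with the
    defaults the handler needs about 5s, i.e. longer than a 3s grace period.
    """
    def exit_handler(signum, frame):
        for i in range(1, steps + 1):
            emit("exit handler step %d/%d (signal %d)", i, steps, signum)
            time.sleep(delay)
        # If this line is missing from the log, the handler was cut short.
        emit("exit handler finished")
    return exit_handler


def install_exit_handler():
    """Bind the handler to SIGTERM; must be called from the main thread."""
    signal.signal(signal.SIGTERM, make_exit_handler())


def application(environ, start_response):
    """Trivial WSGI application standing in for the real plugin."""
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

Calling `install_exit_handler()` when the WSGI module is loaded and then letting the process reach max_age should show the first "step" messages in the log but not the final "exit handler finished" line if the handler is interrupted.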