fix: update status label should call after task is started #4886
ningmingxiao wants to merge 1 commit into
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
How to test?

I can add a sleep or use dlv to reproduce it tomorrow. The container uses restart labels; I forgot to mention that.

I created a local branch to reproduce it: nerdctl run --restart=always -d busybox sleep 1000 (nerdctl exited successfully). The containerd log shows that the container exited and was recreated. I also created a PR to record events: containerd/containerd#13324

I’m not a committer, but since this is a strange issue, I took a look at the PR. For example, by adding the following to

[plugins."io.containerd.internal.v1.restart"]
interval = "100ms"

So the race itself does seem to exist. However, with the default interval of …

How did you discover this issue in the first place, and in what situation would not being able to resolve it cause problems? Is there some critical scenario behind it?

Our users create a container with nerdctl and then use nerdctl exec to check something, but the exec failed. It happened several times, so I added some debug logging to the containerd kill API. I also used sysctl kernel.monitor_signals=0x100 to trace signal 9, and found that the container's main process was killed by the shim, not by the OOM killer. I also printed all container events; you can see them in my PR for containerd. @haytok
fix: containerd/containerd#13350
How it happened:
nerdctl run -d --restart=always busybox sleep 10000
Step 1: after taskutil.NewTask, the task is created.
Step 2: containerd finds that desiredStatus is running but the task is not running (the task status is created).
Step 3: nerdctl starts the task successfully.
Step 4: containerd kills and deletes the task (containerd concluded that the task was not running, but by now it actually is running).
Step 5: nerdctl exec fails.
Step 6: containerd recreates the task.
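
The ordering problem in the steps above can be sketched as a small deterministic model. This is only an illustration, not the actual nerdctl or containerd code: the `container` type, `monitorWouldKill`, and `startContainer` are hypothetical names, and the label key is assumed to be the one the restart monitor reads. The model only records the monitor's decision to kill; in the real race the kill itself lands after the task has started.

```go
package main

import "fmt"

// statusLabel is assumed to be the label key the restart monitor reads.
const statusLabel = "containerd.io/restart.status"

const (
	statusCreated = "created"
	statusRunning = "running"
)

// container is a hypothetical stand-in for a container and its task status.
type container struct {
	labels map[string]string
	task   string
}

// monitorWouldKill models the check in step 2: the desired status is
// "running" but the observed task status is something else.
func monitorWouldKill(c *container) bool {
	return c.labels[statusLabel] == statusRunning && c.task != statusRunning
}

// startContainer models the client-side sequence and lets the monitor
// run (tick) between the steps. It reports whether any tick decided to
// kill the task.
func startContainer(labelBeforeStart bool) bool {
	c := &container{labels: map[string]string{}}
	killed := false
	tick := func() {
		if monitorWouldKill(c) {
			killed = true
		}
	}

	c.task = statusCreated // step 1: task is created
	if labelBeforeStart {
		c.labels[statusLabel] = statusRunning // label set too early
	}
	tick() // step 2: monitor runs between create and start

	c.task = statusRunning // step 3: task is started
	if !labelBeforeStart {
		c.labels[statusLabel] = statusRunning // label set after start
	}
	tick() // monitor now sees a consistent state
	return killed
}

func main() {
	fmt.Println("label before start, monitor decides to kill:", startContainer(true))
	fmt.Println("label after start, monitor decides to kill:", startContainer(false))
}
```

In this model the first ordering reports a kill decision and the second does not, which matches the intent stated in the PR title: update the status label only after the task is started.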
@AkihiroSuda @ChengyuZhu6 can you take a look?