On a rare occasion it appears that workers somehow end up processing all of the requests twice and therefore end up locking themselves up. This is the cause for the random worker lock up, but I'm not sure why it happens.
warn{"timestamp":"2025-10-22T09:47:21.604176Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"f6bf3a8df3a1b6bc5d7fe27ae5419ace6900fc3cc414226c29b9e0eb03bc645b-5264\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.604176Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"f6bf3a8df3a1b6bc5d7fe27ae5419ace6900fc3cc414226c29b9e0eb03bc645b-5264\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.614965Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"e928292e301bc3ac4d92adb91ed6c4c0f970dbd1f25e8770dd5bd169f640b75f-13180\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.614965Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"e928292e301bc3ac4d92adb91ed6c4c0f970dbd1f25e8770dd5bd169f640b75f-13180\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.636644Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"4111923225b3a30fda94fdc797ace9d6561bcfb4fe8afac60e8bc4b41eea9173-13249\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.636644Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"4111923225b3a30fda94fdc797ace9d6561bcfb4fe8afac60e8bc4b41eea9173-13249\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.639999Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"a70c0ab328b077af31640a2f880b9da5ee39730626eae7869bc791271a961943-5887\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.639999Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"a70c0ab328b077af31640a2f880b9da5ee39730626eae7869bc791271a961943-5887\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.677947Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"26d6ae6eb8c14431e2e931d35b1ab90f8c369452885c002f4c654770ff2b4742-13242\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.677947Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"26d6ae6eb8c14431e2e931d35b1ab90f8c369452885c002f4c654770ff2b4742-13242\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.678043Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"96f50fb4fba62a463cf13c9586e2c1b953af94263744a8094850ba8569b3752c-5754\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.678043Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"96f50fb4fba62a463cf13c9586e2c1b953af94263744a8094850ba8569b3752c-5754\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.678292Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"6a6b7c3a06506bcdc570529758eee869e5f9d5dc6ac0c77ff584019d88af9d7d-5789\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.678292Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"6a6b7c3a06506bcdc570529758eee869e5f9d5dc6ac0c77ff584019d88af9d7d-5789\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.895726Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"bc9c9905197ec6852f8fde799daaf413aab7fba337861924a23f9aa4aa2d70c8-6052\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
warn{"timestamp":"2025-10-22T09:47:21.895726Z","level":"WARN","fields":{"message":"Running prepare_action for DigestInfo(\"bc9c9905197ec6852f8fde799daaf413aab7fba337861924a23f9aa4aa2d70c8-6052\")"},"target":"nativelink_worker::running_actions_manager","span":{"name":"worker_start_action"},"spans":[{"name":"\"worker_0\"","name":"worker"},{"name":"worker_start_action"}]}
Having looked at the command digests, there's nothing special about them. So it would appear that somehow the worker has managed to register twice with the scheduler or something?
On a rare occasion it appears that workers somehow end up processing all of the requests twice and therefore end up locking themselves up. This is the cause for the random worker lock up, but I'm not sure why it happens.
Here's an example log:
This log came from a worker with this added to the start of
inner_prepare_action:Having looked at the command digests, there's nothing special about them. So it would appear that somehow the worker has managed to register twice with the scheduler or something?