Description
A user in Discord recently posted the following log:

```
Apr 28 19:09:08.447 CRIT Beacon block processing error error: ValidatorPubkeyCacheLockTimeout, service: beacon
Apr 28 19:09:08.447 WARN BlockProcessingFailure outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block.
```
This user was not the first; I've personally seen quite a few users report this same error.
The validator_pubkey_cache is protected by a TimeoutRwLock. This means there can only be lock contention when something holds the write lock.
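To illustrate how the timeout arises, here is a minimal model of a timed write acquisition (an assumption for illustration: Lighthouse's `TimeoutRwLock` is modelled here with a spin-until-deadline loop over `std::sync::RwLock::try_write`; the real type is implemented differently):

```rust
use std::sync::{RwLock, RwLockWriteGuard};
use std::time::{Duration, Instant};

// Hypothetical model of a timed write-lock acquisition. Returns `None` if the
// lock cannot be taken before the deadline, which corresponds to the
// `ValidatorPubkeyCacheLockTimeout` error path in the issue.
fn try_write_for<T>(lock: &RwLock<T>, timeout: Duration) -> Option<RwLockWriteGuard<'_, T>> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking attempt; fails while any other guard is held.
        if let Ok(guard) = lock.try_write() {
            return Some(guard);
        }
        if Instant::now() >= deadline {
            return None;
        }
        std::thread::yield_now();
    }
}
```

With this model, a thread holding any guard on the cache for longer than the timeout causes every concurrent `try_write_for` to return `None`, i.e. the error above.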
I think (it's worth confirming) this is the only place we take a write-lock on the pubkey cache:
lighthouse/beacon_node/beacon_chain/src/beacon_chain.rs
Lines 1633 to 1641 in 3a24ca5
```rust
// If there are new validators in this block, update our pubkey cache.
//
// We perform this _before_ adding the block to fork choice because the pubkey cache is
// used by attestation processing which will only process an attestation if the block is
// known to fork choice. This ordering ensure that the pubkey cache is always up-to-date.
self.validator_pubkey_cache
    .try_write_for(VALIDATOR_PUBKEY_CACHE_LOCK_TIMEOUT)
    .ok_or(Error::ValidatorPubkeyCacheLockTimeout)?
    .import_new_pubkeys(&state)?;
```
I think (also worth confirming) that the bulk of the time holding the write lock is spent inside this function:
lighthouse/beacon_node/beacon_chain/src/validator_pubkey_cache.rs
Lines 124 to 169 in 3a24ca5
```rust
/// Adds zero or more validators to `self`.
fn import<I>(&mut self, validator_keys: I) -> Result<(), BeaconChainError>
where
    I: Iterator<Item = PublicKeyBytes> + ExactSizeIterator,
{
    self.pubkey_bytes.reserve(validator_keys.len());
    self.pubkeys.reserve(validator_keys.len());
    self.indices.reserve(validator_keys.len());

    for pubkey in validator_keys {
        let i = self.pubkeys.len();

        if self.indices.contains_key(&pubkey) {
            return Err(BeaconChainError::DuplicateValidatorPublicKey);
        }

        // The item is written to disk _before_ it is written into
        // the local struct.
        //
        // This means that a pubkey cache read from disk will always be equivalent to or
        // _later than_ the cache that was running in the previous instance of Lighthouse.
        //
        // The motivation behind this ordering is that we do not want to have states that
        // reference a pubkey that is not in our cache. However, it's fine to have pubkeys
        // that are never referenced in a state.
        match &mut self.backing {
            PubkeyCacheBacking::File(persistence_file) => {
                persistence_file.append(i, &pubkey)?;
            }
            PubkeyCacheBacking::Database(store) => {
                store.put_item(&DatabasePubkey::key_for_index(i), &DatabasePubkey(pubkey))?;
            }
        }

        self.pubkeys.push(
            (&pubkey)
                .try_into()
                .map_err(BeaconChainError::InvalidValidatorPubkeyBytes)?,
        );
        self.pubkey_bytes.push(pubkey);
        self.indices.insert(pubkey, i);
    }

    Ok(())
}
```
So, it stands to reason that if we can optimize away some of the time spent in `ValidatorPubkeyCache::import`, we can avoid `ValidatorPubkeyCacheLockTimeout` errors.
I think the first step should be to add metrics to ValidatorPubkeyCache::import and determine what's taking the longest.
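As a rough sketch of what that instrumentation could look like (the phase names and the `HashMap`-based recording here are hypothetical stand-ins; in Lighthouse this would feed the existing metrics machinery, e.g. per-phase histograms):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical, simplified model of `ValidatorPubkeyCache::import`, split into
// two timed phases so we can see which one dominates. The "disk_write" phase
// stands in for appending to the on-disk backing; "deserialize" stands in for
// decompressing the key bytes into usable public keys.
fn import_instrumented(keys: &[Vec<u8>]) -> (usize, HashMap<&'static str, Duration>) {
    let mut timings = HashMap::new();

    // Phase 1: persist each key (simulated with an in-memory buffer).
    let start = Instant::now();
    let mut backing: Vec<u8> = Vec::new();
    for key in keys {
        backing.extend_from_slice(key);
    }
    timings.insert("disk_write", start.elapsed());

    // Phase 2: turn the raw bytes into usable keys (simulated with a copy).
    let start = Instant::now();
    let imported: Vec<Vec<u8>> = keys.to_vec();
    timings.insert("deserialize", start.elapsed());

    (imported.len(), timings)
}
```

Comparing the two recorded durations over real block processing would tell us whether the disk write or the key deserialization is the bottleneck.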
Potential Optimization
I'm guessing that the write-to-disk part is taking the longest.
It's really important that we retain the current behaviour where the on-disk "backing" is updated somewhat-atomically with the internal Vecs and HashMap.
One potential option is to have a "prepare" step where we copy the backing to a new random file and add the new keys to it. Then, we have a "commit" phase where we replace the old backing with the new one (only a single file copy operation) and update the internal structures.
The "prepare" step only needs a read-lock, whilst the "commit" still needs a write-lock (but is hopefully much faster).
Once we determine it is indeed writing to disk that's slow we can expand upon this.
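The prepare/commit split described above could be sketched roughly as follows (all names hypothetical; the on-disk file is modelled as an in-memory `Vec`, and this sketch deliberately ignores the case where another import commits between our prepare and our commit, which a real implementation would need to detect and handle):

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified stand-in for the pubkey cache: `backing` models the on-disk
// file, `indices` the in-memory pubkey -> validator-index map.
struct Cache {
    backing: Vec<[u8; 48]>,
    indices: HashMap<[u8; 48], usize>,
}

// Result of the "prepare" phase: a fully built replacement backing.
struct Prepared {
    new_backing: Vec<[u8; 48]>,
    new_keys: Vec<[u8; 48]>,
}

// Prepare: copy the current backing and append the new keys. This only needs
// a read lock, so concurrent readers (e.g. attestation processing) are not
// stalled during the slow copy-and-append work.
fn prepare(cache: &RwLock<Cache>, new_keys: Vec<[u8; 48]>) -> Prepared {
    let read = cache.read().unwrap();
    let mut new_backing = read.backing.clone();
    new_backing.extend_from_slice(&new_keys);
    Prepared { new_backing, new_keys }
}

// Commit: briefly take the write lock to swap in the new backing and update
// the in-memory indices. This is the only part that can contend with readers.
fn commit(cache: &RwLock<Cache>, prepared: Prepared) {
    let mut write = cache.write().unwrap();
    for key in &prepared.new_keys {
        let i = write.indices.len();
        write.indices.insert(*key, i);
    }
    write.backing = prepared.new_backing;
}
```

The point of the split is that the write lock is now held only for a swap and a few map inserts, rather than for the full duration of the disk writes.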