
Conversation

@kpop-dfinity (Contributor) commented on Jan 30, 2026:

Background

During subnet splitting, when replicas detect that a subnet should be split at a summary height H, they will immediately start producing CUP shares at height H + dkg_interval_length, according to the post-split subnet memberships, i.e.
half of the nodes will produce the shares as members of the original subnet, and the other half as members of the new subnet.

Implement the logic for subnet-splitting CUP share creation, validation, and aggregation.

CUP share creation

When a replica detects that a "subnet splitting" block (denoted by a new field, subnet_splitting_status, in the DkgSummary struct) has been finalized and the state at the block's height has been committed, it will create a CUP share in the following way:

  1. Create a new summary block, on the fly, at the next CUP height, with the DKG transcript taken from the registry;
  2. Create a dummy random beacon at the next CUP height;
  3. Check whether the node belongs to the high-threshold committee, based on the transcript from the newly created summary block; if not, return None;
  4. Create a CUP with the block and random beacon created in steps 1 and 2, respectively, and sign it using the key material from the block.
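
For concreteness, a minimal, self-contained sketch of these four steps follows. Every type and helper in it (SummaryBlock, CupShare, sign_with_block_key, and so on) is a simplified stand-in rather than the actual IC consensus API:

```rust
// Hypothetical sketch of the CUP-share creation steps above; the types and
// helpers are simplified stand-ins for the real IC consensus types.

type NodeId = u64;

#[derive(Clone)]
struct HighThresholdTranscript {
    committee: Vec<NodeId>,
}

#[derive(Clone)]
struct SummaryBlock {
    height: u64,
    transcript: HighThresholdTranscript,
}

struct RandomBeacon {
    height: u64,
}

struct CupShare {
    block: SummaryBlock,
    beacon: RandomBeacon,
    state_hash: Vec<u8>,
    signature: Vec<u8>,
}

fn make_splitting_cup_share(
    me: NodeId,
    next_cup_height: u64,
    transcript_from_registry: HighThresholdTranscript,
    state_hash: Vec<u8>, // hash of the state committed at the CUP height
) -> Option<CupShare> {
    // 1. Recreate the summary block on the fly at the next CUP height,
    //    with the DKG transcript taken from the registry.
    let block = SummaryBlock {
        height: next_cup_height,
        transcript: transcript_from_registry,
    };

    // 2. Create a dummy random beacon at the same height.
    let beacon = RandomBeacon { height: next_cup_height };

    // 3. Nodes outside the high-threshold committee (per the transcript in
    //    the recreated block) do not produce a share.
    if !block.transcript.committee.contains(&me) {
        return None;
    }

    // 4. Sign the share using the key material from the block (stubbed).
    let signature = sign_with_block_key(&block);

    Some(CupShare { block, beacon, state_hash, signature })
}

// Placeholder for a threshold signature over the share content.
fn sign_with_block_key(_block: &SummaryBlock) -> Vec<u8> {
    Vec::new()
}
```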

CUP share validation

Validation works in the following way:

  1. Check that the hash of the block in the CUP share matches the recreated summary block's hash;
  2. Check that the random beacon in the CUP share matches the recreated random beacon;
  3. Check that the signature can be verified using the key material from the recreated summary block;
  4. Check that the state hash matches our local state's hash.
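
And a matching sketch of these four checks, reusing the stand-in types from the creation sketch above (hash and verify_with_block_key are again placeholders, not real IC functions):

```rust
// Hypothetical validation sketch mirroring the four checks above, reusing
// the stand-in types from the creation sketch.

fn validate_splitting_cup_share(
    share: &CupShare,
    recreated_block: &SummaryBlock,
    recreated_beacon: &RandomBeacon,
    local_state_hash: &[u8],
) -> Result<(), String> {
    // 1. The block hash in the share must match the recreated summary block.
    if hash(&share.block) != hash(recreated_block) {
        return Err("block hash mismatch".into());
    }
    // 2. The random beacon in the share must match the recreated beacon.
    if share.beacon.height != recreated_beacon.height {
        return Err("random beacon mismatch".into());
    }
    // 3. The signature must verify against the key material from the
    //    recreated summary block.
    if !verify_with_block_key(recreated_block, &share.signature) {
        return Err("invalid signature".into());
    }
    // 4. The state hash in the share must match our local state's hash.
    if share.state_hash != local_state_hash {
        return Err("state hash mismatch".into());
    }
    Ok(())
}

// Placeholder hashing/verification; the real code hashes the block contents
// and verifies a threshold signature share.
fn hash(block: &SummaryBlock) -> u64 {
    block.height
}

fn verify_with_block_key(_block: &SummaryBlock, _sig: &[u8]) -> bool {
    true
}
```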

Note

As the field subnet_splitting_status is not yet populated, the new code branches are never executed.

@kpop-dfinity added the CI_ALL_BAZEL_TARGETS label (runs all Bazel targets and uploads them to S3) on Jan 30, 2026
@github-actions bot added the feat label on Jan 30, 2026
@kpop-dfinity changed the title from "feat(subnet-splitting): splitting cup share making/validating/aggregating" to "feat(consenus):[CON-1565] splitting cup share making/validating/aggregating" on Feb 2, 2026
@kpop-dfinity changed the title from "feat(consenus):[CON-1565] splitting cup share making/validating/aggregating" to "feat(consenus): [CON-1565] splitting cup share making/validating/aggregating" on Feb 3, 2026
@kpop-dfinity marked this pull request as ready for review on February 3, 2026 at 13:51
@kpop-dfinity requested review from a team as code owners on February 3, 2026 at 13:51
height,
initial_dkg_attempts,
} = self;

@eichhorl (Contributor) commented on Feb 4, 2026:

Don't we need to do something like this?

Suggested change
if let Some(status) = subnet_splitting_status {
status.hash(state);
}

assert!(
!self.records.is_empty(),
"Cannot setup a registry without records."
);
A contributor commented:

I think self.records[0] two lines above would have panicked before reaching this assert.

use std::sync::Arc;

/// CatchUpPackage maker is responsible for creating beacon shares
/// [`CatchUpPackage`] maker is responsible for creating beacon shares
A contributor commented:

Suggested change
/// [`CatchUpPackage`] maker is responsible for creating beacon shares
/// [`CatchUpPackage`] maker is responsible for creating catch up package shares

Comment on lines +683 to +696
SubnetSplittingError(String),
}

/// Reasons for why a dkg payload might be invalid.
#[derive(PartialEq, Debug)]
#[allow(clippy::large_enum_variant)]
pub enum InvalidDkgPayloadReason {
CryptoError(CryptoError),
DkgVerifyDealingError(DkgVerifyDealingError),
MismatchedDkgSummary(DkgSummary, DkgSummary),
MissingDkgConfigForDealing,
DkgStartHeightDoesNotMatchParentBlock,
DkgSummaryAtNonStartHeight(Height),
SubnetSplittingNotEnabled,
A contributor commented:

What does "not enabled" mean? I don't think these variants are used anywhere (yet?).

subnet_id: SubnetId,
registry: &dyn RegistryClient,
registry_version: RegistryVersion,
subnet_splitting_status: Option<SubnetSplittingStatus>,
A contributor commented:

I assume the Option part will disappear eventually? Do I understand correctly that calling this function with SubnetSplittingStatus::Scheduled { .. } doesn't make sense? Maybe we could change the argument to Option<SubnetId> where None corresponds to SubnetSplittingStatus::NotScheduled and Some(id) corresponds to SubnetSplittingStatus::Done { id }?

Or we could use the new CatchUpPackageType enum from below
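
For illustration only, the mapping proposed here might look like the following sketch (variant and field names are guessed from the comment, not taken from the actual definitions):

```rust
// Hypothetical sketch of the suggested argument change: the function only
// distinguishes "no split" from "split done to subnet id", so its argument
// could be an Option<SubnetId>. Variant names follow the comment above.

type SubnetId = u64;

enum SubnetSplittingStatus {
    NotScheduled,
    Scheduled { id: SubnetId },
    Done { id: SubnetId },
}

fn to_argument(status: &SubnetSplittingStatus) -> Option<SubnetId> {
    match status {
        SubnetSplittingStatus::NotScheduled => None,
        // Per the comment, calling the function while a split is merely
        // scheduled doesn't make sense; treat it as "no split" here.
        SubnetSplittingStatus::Scheduled { .. } => None,
        SubnetSplittingStatus::Done { id } => Some(*id),
    }
}
```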


pub struct DependenciesBuilder {
pool_config: ArtifactPoolConfig,
records: Vec<(u64, SubnetId, SubnetRecord)>,
A contributor commented:

Suggested change
records: Vec<(u64, SubnetId, SubnetRecord)>,
subnet_records: Vec<(u64, SubnetId, SubnetRecord)>,

Comment on lines +192 to +198
if let Some(last_version) = last_version {
add_subnet_list_record(
&registry_data_provider,
last_version,
Vec::from_iter(subnet_ids),
);
}
A contributor commented:

Why do we need to do this twice if last_version != version (see line 174)?

cup_block: &Block,
cup_type: CatchUpPackageType,
) -> Result<NiDkgId, String> {
// TODO: can we always take the transcript from the block?
A contributor commented:

Intuitively I would say yes, because the summary block is the first block of the current interval

}

let cup_content =
CatchUpContent::from_share_content(shares[0].content.clone(), block.clone());
A contributor commented:

Is this clone necessary?

Suggested change
CatchUpContent::from_share_content(shares[0].content.clone(), block.clone());
CatchUpContent::from_share_content(shares[0].content.clone(), block);

fn aggregate_catch_up_package_shares_for_summary_block(
&self,
pool: &PoolReader<'_>,
summary_block: &Block,
A contributor commented:

Does this work?

Suggested change
summary_block: &Block,
summary_block: Block,

.get_catch_up_package_shares(block.height())
.collect::<Vec<_>>();

if shares.len() < threshold {
A contributor commented:

I guess despite this check we might still not have enough shares below, because get_catch_up_package_shares above might return shares created by the other half of the subnet?

Or would the signature verification fail when validating those shares?

InvalidArtifactReason::InvalidHeightInSplittingCatchUpPackageShare.into(),
);
}
CatchUpPackageType::PostSplit { .. } | CatchUpPackageType::Normal => {
A contributor commented:

When do we ever expect to see the CatchUpPackageType::PostSplit { .. } case here?
