HDFS-17722. DataNode stuck decommissioning on standby NameNode due to excess replica timing race. #8295

Open
balodesecurity wants to merge 6 commits into apache:trunk from balodesecurity:HDFS-17722

@balodesecurity
Contributor

Problem

On a standby NameNode, a DataNode can get stuck in the DECOMMISSION_INPROGRESS state indefinitely when a timing race causes a replica to be flagged as excess instead of live during decommissioning.

Sequence:

  1. File is written to DN-A, DN-B, DN-C (RF=3).
  2. DN-A is marked for decommission.
  3. The block manager schedules re-replication → copies a new replica to DN-D.
  4. On the standby NN, the block report for DN-D arrives before the decommission state for DN-A is propagated. The standby marks DN-D's replica as excess (it looks like an over-replicated block).
  5. The decommission monitor on the standby calls isSufficient(). DN-A is decommissioning and DN-D's replica is counted as excess, so numLive=2 (DN-B, DN-C), which is below RF=3. The check returns false and decommission stalls.
  6. Meanwhile DN-A is never fully decommissioned because isSufficient() never returns true.

The excess replica on DN-D is a physically present block copy and contributes to durability. Ignoring it in the sufficiency check is what leaves the decommission stuck.
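The sequence can be modelled in a few lines. The sketch below is a simplified, stand-alone illustration, not HDFS code: the class `StandbyRaceSketch`, the three-value `ReplicaState` enum, and the live-only rule in `sufficientBeforeFix` are assumptions chosen to mirror the description above. The real NameNode tracks more replica states, and the real check has more branches.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of how one NameNode classifies each replica of a block. */
final class StandbyRaceSketch {

  /** Simplified replica states (assumption: corrupt, stale, etc. are ignored). */
  enum ReplicaState { LIVE, DECOMMISSIONING, EXCESS }

  /** Counts replicas per state for one NameNode's view of a block. */
  static Map<ReplicaState, Integer> count(Map<String, ReplicaState> view) {
    Map<ReplicaState, Integer> counts = new EnumMap<>(ReplicaState.class);
    for (ReplicaState s : ReplicaState.values()) {
      counts.put(s, 0);
    }
    for (ReplicaState s : view.values()) {
      counts.merge(s, 1, Integer::sum);
    }
    return counts;
  }

  /** Pre-fix rule as described in this PR: only live replicas count toward RF. */
  static boolean sufficientBeforeFix(Map<String, ReplicaState> view, int rf) {
    return count(view).get(ReplicaState.LIVE) >= rf;
  }

  /** Standby view after the race in step 4: DN-D was classified as excess. */
  static Map<String, ReplicaState> standbyView() {
    Map<String, ReplicaState> view = new LinkedHashMap<>();
    view.put("DN-A", ReplicaState.DECOMMISSIONING);
    view.put("DN-B", ReplicaState.LIVE);
    view.put("DN-C", ReplicaState.LIVE);
    view.put("DN-D", ReplicaState.EXCESS);
    return view;
  }

  /** View without the race: DN-D is reported after DN-A's state is known. */
  static Map<String, ReplicaState> expectedView() {
    Map<String, ReplicaState> view = standbyView();
    view.put("DN-D", ReplicaState.LIVE);
    return view;
  }

  public static void main(String[] args) {
    System.out.println("raced standby view: " + sufficientBeforeFix(standbyView(), 3));
    System.out.println("expected ordering:  " + sufficientBeforeFix(expectedView(), 3));
  }
}
```

With RF=3 the raced standby view has two live replicas and one excess replica, so the live-only rule returns false. In the expected ordering DN-D counts as live and the same rule returns true.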

Fix

In DatanodeAdminManager.isSufficient(), count excess replicas alongside live replicas for the sufficiency check on non-under-construction blocks:

final int numLiveAndExcess = numLive + numberReplicas.excessReplicas();
if (numLiveAndExcess >= blockManager.getDefaultStorageNum(block)
    && blockManager.hasMinStorage(block, numLive)) {
  return true;
}

The hasMinStorage guard (checks dfs.namenode.replication.min, default 1; dfs.replication.min is the deprecated name) ensures decommission does not proceed when no live replica exists, because excess-only replicas are not guaranteed durable: an excess replica is a candidate for deletion. After decommission completes, if the excess replica on DN-D is subsequently deleted, the block manager's normal under-replication detection will schedule re-replication.
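To make the two conditions concrete, here is a minimal stand-alone sketch of the patched rule as a pure function. It is an illustration under assumptions, not the HDFS class: `IsSufficientSketch` and its integer parameters are hypothetical, and `hasMinStorage` is reduced to a plain comparison against the minimum replication, which is only meant to model replicated (non-striped) blocks.

```java
/** Hypothetical stand-alone model of the patched sufficiency check. */
final class IsSufficientSketch {

  /** Stand-in for blockManager.hasMinStorage(block, numLive) on replicated blocks. */
  static boolean hasMinStorage(int numLive, int minReplication) {
    return numLive >= minReplication;
  }

  /**
   * Patched rule: live plus excess replicas must reach the expected count,
   * and live replicas alone must still reach the configured minimum.
   */
  static boolean isSufficient(int numLive, int numExcess,
      int expectedReplicas, int minReplication) {
    final int numLiveAndExcess = numLive + numExcess;
    return numLiveAndExcess >= expectedReplicas
        && hasMinStorage(numLive, minReplication);
  }

  public static void main(String[] args) {
    final int rf = 2;   // expected replication used in the PR's unit tests
    final int min = 1;  // default minimum replication
    System.out.println(isSufficient(1, 1, rf, min)); // true:  excess fills the gap
    System.out.println(isSufficient(2, 0, rf, min)); // true:  normal decommission
    System.out.println(isSufficient(0, 2, rf, min)); // false: no live replica
    System.out.println(isSufficient(0, 1, rf, min)); // false: too few copies overall
    System.out.println(isSufficient(1, 2, rf, min)); // true:  excess over-covers RF
  }
}
```

The guard is applied to numLive rather than to the combined count on purpose: in the third case the combined count reaches RF, yet the check still fails because every remaining copy is one the NameNode may delete.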

Testing

Unit tests: TestDatanodeAdminManagerIsSufficient (5 tests, no cluster required):

| Test | Scenario | Expected |
|------|----------|----------|
| testExcessReplicaCountsTowardSufficiency | HDFS-17722 bug: live=1, excess=1, RF=2 | true |
| testNormalDecommissionStillSufficient | Baseline: live=2, excess=0, RF=2 | true |
| testNoLiveReplicaBlocksDecommission | Safety guard: live=0, excess=2, RF=2 | false |
| testInsufficientEvenWithExcess | live=0, excess=1, RF=2 (not enough either way) | false |
| testExcessAboveRFWithMinLive | live=1, excess=2, RF=2 (excess over-covers RF) | true |

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0

Docker integration — 3-DataNode cluster with 1 NameNode and RF=3, 5 scenarios:

  • Scenario 1: Clean decommission (RF=2) — PASS
  • Scenario 2: RF=3→2 creates excess replicas, then decommission DN2 — PASS
  • Scenario 3: Same scenario on DN3 — PASS
  • Scenario 4: Repeated decommission + recommission cycles (3 rounds) — PASS
  • Scenario 5: Data integrity check after decommission — PASS
Results: 0 failure(s) — ALL TESTS PASSED

Related

… excess replica timing race.

In HA mode, a timing race can cause the standby NN to incorrectly mark a
replica as excess before it learns that a DataNode is decommissioning. This
leaves the standby's isSufficient() check permanently returning false
(live=1 < RF=2), so the decommission monitor never calls setDecommissioned()
and logs under-replication warnings indefinitely.

Fix: in isSufficient(), count excess replicas (physically-present block
copies) alongside live replicas when checking decommission sufficiency for
non-UC blocks. A hasMinStorage guard ensures at least dfs.replication.min
live copies exist for durability. If the excess replica is later deleted,
the block manager detects under-replication and schedules re-replication.
…cess replica fix.

Tests cover:
- Bug scenario: live=1 + excess=1 >= RF=2 → decommission allowed (HDFS-17722 fix)
- Normal case: live=2, excess=0 → decommission allowed (not broken by fix)
- Safety guard: live=0, excess=2 → decommission blocked (no durable copy)
- Insufficient even with excess: live=0 + excess=1 < RF=2 → blocked
- Excess above RF with min live: live=1 + excess=2 >= RF=2, live >= min → allowed
@balodesecurity
Contributor Author

Docker Integration Test Results

Tested on a 3-DataNode Docker cluster (1 NameNode + 3 DataNodes, RF=3, balodesecurity/hadoop HDFS-17722 branch):

--- Scenario 1: Clean decommission (RF=2, decom DN2) ---
  [PASS] DN2 decommissioned cleanly (RF=2)

--- Scenario 2: HDFS-17722 — RF=3→2 creates EXCESS, then decom DN2 ---
  [PASS] DN2 decommissioned with EXCESS replicas present (HDFS-17722 FIX VERIFIED!)
  [PASS] All 3 files accessible after decommission

--- Scenario 3: HDFS-17722 on DN3 (variant) ---
  [PASS] DN3 decommissioned with EXCESS replicas (HDFS-17722 fix verified on DN3)

--- Scenario 4: Repeated decom/recommission cycles (3 rounds) ---
  [PASS] Round 1: DN2 decommissioned + recommissioned (Normal)
  [PASS] Round 2: DN2 decommissioned + recommissioned (Normal)
  [PASS] Round 3: DN2 decommissioned + recommissioned (Normal)

--- Scenario 5: Data integrity after decommission ---
  [PASS] DN2 decommissioned
  [PASS] Data integrity OK: content matches

Results: 0 failure(s) — ALL TESTS PASSED

Note on replicating the bug naturally: In a single-NameNode setup the race does not occur naturally (the block manager processes setrep deletions before the decommission check runs in the same thread). The bug is specific to the standby NameNode path. The unit tests in TestDatanodeAdminManagerIsSufficient directly exercise isSufficient() with the exact replica counts that trigger the deadlock. The Docker tests verify no regression in normal decommission behavior.

@balodesecurity
Contributor Author

CI failed due to Jenkins OOM kill (exit code 137) — unrelated to the patch. Requesting retest.

/retest

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 20s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 27m 30s | | trunk passed |
| +1 💚 | compile | 0m 56s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 0m 57s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 0m 58s | | trunk passed |
| +1 💚 | mvnsite | 1m 3s | | trunk passed |
| +1 💚 | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 50s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 2m 30s | | trunk passed |
| +1 💚 | shadedclient | 18m 23s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 49s | | the patch passed |
| +1 💚 | compile | 0m 46s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 0m 46s | | the patch passed |
| +1 💚 | compile | 0m 49s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 0m 49s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 46s | /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 💚 | mvnsite | 0m 53s | | the patch passed |
| +1 💚 | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 38s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 2m 29s | | the patch passed |
| +1 💚 | shadedclient | 17m 59s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 174m 48s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch failed. |
| +1 💚 | asflicense | 0m 30s | | The patch does not generate ASF License warnings. |
| | | 253m 55s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/2/artifact/out/Dockerfile |
| GITHUB PR | #8295 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 086f748bd33c 5.15.0-141-generic #151-Ubuntu SMP Sun May 18 21:35:19 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5b9b0ff |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/2/testReport/ |
| Max. process+thread count | 3556 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/2/console |
| versions | git=2.43.0 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 21s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 27m 5s | | trunk passed |
| +1 💚 | compile | 0m 54s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 0m 56s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 0m 55s | | trunk passed |
| +1 💚 | mvnsite | 1m 0s | | trunk passed |
| +1 💚 | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| -1 ❌ | spotbugs | 1m 5s | /branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in trunk failed. |
| +1 💚 | shadedclient | 21m 8s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 44s | | the patch passed |
| +1 💚 | compile | 0m 40s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 0m 40s | | the patch passed |
| +1 💚 | compile | 0m 41s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 0m 41s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 39s | /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 💚 | mvnsite | 0m 45s | | the patch passed |
| +1 💚 | javadoc | 0m 32s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 34s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| -1 ❌ | spotbugs | 0m 42s | /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch failed. |
| +1 💚 | shadedclient | 21m 5s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 177m 1s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch failed. |
| +1 💚 | asflicense | 0m 25s | | The patch does not generate ASF License warnings. |
| | | 252m 24s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/4/artifact/out/Dockerfile |
| GITHUB PR | #8295 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux ae5715b647a1 5.15.0-173-generic #183-Ubuntu SMP Fri Mar 6 13:29:34 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 2432aa2 |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/4/testReport/ |
| Max. process+thread count | 4328 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/4/console |
| versions | git=2.43.0 maven=3.9.11 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 21s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 27m 10s | | trunk passed |
| +1 💚 | compile | 0m 58s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 0m 57s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 0m 58s | | trunk passed |
| +1 💚 | mvnsite | 1m 2s | | trunk passed |
| +1 💚 | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 2m 24s | | trunk passed |
| +1 💚 | shadedclient | 18m 41s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 49s | | the patch passed |
| +1 💚 | compile | 0m 46s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 0m 46s | | the patch passed |
| +1 💚 | compile | 0m 51s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 0m 51s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 46s | /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 💚 | mvnsite | 0m 52s | | the patch passed |
| +1 💚 | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 39s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 2m 30s | | the patch passed |
| +1 💚 | shadedclient | 18m 17s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 179m 34s | | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 0m 31s | | The patch does not generate ASF License warnings. |
| | | 259m 10s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/5/artifact/out/Dockerfile |
| GITHUB PR | #8295 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 59f205fc8622 5.15.0-171-generic #181-Ubuntu SMP Fri Feb 6 22:44:50 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 24427d3 |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/5/testReport/ |
| Max. process+thread count | 4688 (vs. ulimit of 10000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8295/5/console |
| versions | git=2.43.0 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.
