Commit 874a6b7
[CELEBORN-1838] Interrupt spark task should not report fetch failure
What changes were proposed in this pull request?
Do not trigger a fetch failure when a Spark task attempt is interrupted (for example, when speculation is enabled).
Do not trigger a fetch failure when the getReducerFileGroup RPC times out.
This PR targets the celeborn-0.5 branch.
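The two rules above can be sketched as a small decision function. This is only an illustration of the idea, not the code in this patch: `FetchFailurePolicy`, `hasCause`, and `shouldReportFetchFailure` are hypothetical names, and representing a timed-out RPC as a `java.util.concurrent.TimeoutException` in the cause chain is an assumption.

```scala
import java.util.concurrent.TimeoutException

// Hypothetical helper for illustration; none of these names come from the Celeborn client.
object FetchFailurePolicy {

  /** Walks the cause chain and returns true if any throwable is an instance of `cls`. */
  @annotation.tailrec
  def hasCause(t: Throwable, cls: Class[_ <: Throwable]): Boolean =
    if (t == null) false
    else if (cls.isInstance(t)) true
    else hasCause(t.getCause, cls)

  /**
   * Decides whether a shuffle read error should be surfaced as a fetch failure.
   * `taskInterrupted` stands in for whatever flag the caller uses to detect a killed attempt.
   */
  def shouldReportFetchFailure(error: Throwable, taskInterrupted: Boolean): Boolean = {
    // Rule 1: an interrupted attempt (e.g. a killed speculative task) is not a data-loss signal.
    val interrupted = taskInterrupted || hasCause(error, classOf[InterruptedException])
    // Rule 2 (assumed representation): an RPC timeout surfaces as a TimeoutException.
    val timedOut = hasCause(error, classOf[TimeoutException])
    !interrupted && !timedOut
  }
}
```

With this shape, only errors that are neither interruptions nor timeouts would lead to a fetch failure and a possible stage re-run.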
Why are the changes needed?
Avoid unnecessary fetch failures and stage re-runs.
Does this PR introduce any user-facing change?
NO.
How was this patch tested?
1. GitHub Actions (GA).
2. Manually tested on a cluster with Spark speculation tasks.
Here is the test case:
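```scala
// Assumes `sc` is an active SparkContext (for example, inside spark-shell).
sc.parallelize(1 to 100, 100)
  .flatMap(_ => (1 to 150000).iterator)
  .groupBy(i => i, 100)
  .map { group =>
    // Sleep 15 s for each group whose key is below 5, making those tasks slow.
    if (group._1 < 5) {
      Thread.sleep(15000)
    }
    group
  }
  .repartition(400)
  .count()
```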
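The PR does not list the configuration used for the manual test. The sketch below shows one plausible set of settings for reproducing it, rendered as `--conf` arguments. The chosen values are assumptions, and `SpeculationRepro` and `toConfArgs` are hypothetical names introduced only for this example.

```scala
// Hypothetical helper for illustration; the settings are assumed, not taken from the PR.
object SpeculationRepro {

  // Assumed configuration: Celeborn as the shuffle manager, speculation on,
  // and fetch failures reported to Spark so that stage re-runs can be observed.
  val settings: Seq[(String, String)] = Seq(
    "spark.shuffle.manager" -> "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
    "spark.speculation" -> "true",
    "spark.celeborn.client.spark.fetch.throwsFetchFailure" -> "true"
  )

  /** Renders key/value pairs as `--conf key=value` arguments for spark-shell or spark-submit. */
  def toConfArgs(kvs: Seq[(String, String)]): String =
    kvs.map { case (k, v) => s"--conf $k=$v" }.mkString(" ")
}
```

Calling `SpeculationRepro.toConfArgs(SpeculationRepro.settings)` yields a string that can be appended to a `spark-shell` command line before pasting the test case.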
<img width="1384" alt="Screenshot 2025-01-18 16 16 16" src="https://github.com/user-attachments/assets/adf64857-5773-4081-a7d0-fa3439e751eb" />
<img width="1393" alt="Screenshot 2025-01-18 16 16 22" src="https://github.com/user-attachments/assets/ac9bf172-1ab4-4669-a930-872d009f2530" />
<img width="1258" alt="Screenshot 2025-01-18 16 19 15" src="https://github.com/user-attachments/assets/6a8ff3e1-c1fb-4ef2-84d8-b1fc6eb56fa6" />
<img width="892" alt="Screenshot 2025-01-18 16 17 27" src="https://github.com/user-attachments/assets/f9de3841-f7d4-4445-99a3-873235d4abd0" />
Closes apache#3070 from FMX/branch-0.5-b1838.
Authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
1 parent 39a40dd commit 874a6b7
9 files changed
Lines changed: 292 additions & 27 deletions
File tree
- client-flink/common/src/main/java/org/apache/celeborn/plugin/flink/client
- client-spark/spark-3-4
  - src
    - main/scala/org/apache/spark/shuffle/celeborn
    - test/scala/org/apache/spark/shuffle/celeborn
- client/src
  - main/java/org/apache/celeborn/client
  - test/java/org/apache/celeborn/client
- project
- tests/spark-it/src/test/scala/org/apache/celeborn/tests/spark
Lines changed: 6 additions & 6 deletions
Lines changed: 27 additions & 6 deletions
Lines changed: 95 additions & 0 deletions
Lines changed: 4 additions & 0 deletions
Lines changed: 17 additions & 11 deletions