@wuwenchi (Contributor) commented Jun 5, 2025

What problem does this PR solve?

Problem Summary:

When Iceberg generates a new snapshot, it merges manifests based on the previous snapshot. The merge reads manifest files, and those reads run on Iceberg's global thread pool. However, users may have their own authentication information, so file access must run inside doAs to carry the correct security context. Tasks handed to the shared pool run on worker threads outside that doAs scope, so the thread pool provided by Iceberg cannot be used.
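
To illustrate the problem, here is a minimal sketch, not the code from this PR. It assumes a thread-scoped security context, modeled with a plain ThreadLocal, and a hypothetical `Authenticator.doAs` helper; `DoAsExecutorSketch`, `CURRENT_USER`, and `readManifest` are made-up names, not Doris or Iceberg APIs. It shows that a task submitted to a shared pool from inside doAs still runs without the caller's identity, while a task wrapped so that doAs is entered on the worker thread keeps it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DoAsExecutorSketch {
    // Stand-in for a thread-scoped security context (assumption, not a real Hadoop/Iceberg API).
    static final ThreadLocal<String> CURRENT_USER = ThreadLocal.withInitial(() -> "anonymous");

    // Hypothetical authenticator: runs an action with the user's identity bound to the calling thread.
    static final class Authenticator {
        private final String user;

        Authenticator(String user) {
            this.user = user;
        }

        <T> T doAs(Callable<T> action) throws Exception {
            String previous = CURRENT_USER.get();
            CURRENT_USER.set(user);
            try {
                return action.call();
            } finally {
                // Restore whatever identity the thread had before.
                CURRENT_USER.set(previous);
            }
        }
    }

    // Simulated manifest read that reports which identity it ran under.
    static String readManifest() {
        return "manifest read as " + CURRENT_USER.get();
    }

    // Returns {result without per-task wrapping, result with each task wrapped in doAs}.
    public static String[] demo(String user) throws Exception {
        Authenticator auth = new Authenticator(user);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> task = DoAsExecutorSketch::readManifest;
            // Submitted from inside doAs, but executed on a worker thread outside its scope.
            String plain = auth.doAs(() -> pool.submit(task).get());
            // Wrapped so that doAs is entered on the worker thread itself.
            Callable<String> wrappedTask = () -> auth.doAs(task);
            String wrapped = pool.submit(wrappedTask).get();
            return new String[] {plain, wrapped};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : demo("alice")) {
            System.out.println(line);
        }
    }
}
```

Running `main` prints "manifest read as anonymous" for the unwrapped task and "manifest read as alice" for the wrapped one. Real authentication contexts may propagate differently from a ThreadLocal, so treat this only as a model of the failure mode described above.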

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas (Contributor) commented Jun 5, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@wuwenchi wuwenchi marked this pull request as ready for review June 5, 2025 03:08
@wuwenchi (Contributor Author) commented Jun 5, 2025

run buildall

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Jun 5, 2025
@github-actions bot (Contributor) commented Jun 5, 2025

PR approved by at least one committer and no changes requested.

@github-actions bot (Contributor) commented Jun 5, 2025

PR approved by anyone and no changes requested.

@morningman (Contributor) left a comment

LGTM

@doris-robot

TPC-H: Total hot run time: 33791 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 18ccbe544fddd8157084770d3790dade9db8eca6, data reload: false

------ Round 1 ----------------------------------
q1	26304	5109	5023	5023
q2	1988	286	196	196
q3	10284	1237	685	685
q4	10234	1015	521	521
q5	7548	2339	2378	2339
q6	177	165	136	136
q7	875	726	599	599
q8	9297	1254	1111	1111
q9	6834	5032	5114	5032
q10	6837	2306	1880	1880
q11	493	289	284	284
q12	349	348	213	213
q13	17796	3681	3074	3074
q14	236	226	214	214
q15	557	485	496	485
q16	433	438	373	373
q17	614	868	385	385
q18	7632	7170	7108	7108
q19	1388	938	546	546
q20	335	335	229	229
q21	3855	3209	2391	2391
q22	1085	1043	967	967
Total cold run time: 115151 ms
Total hot run time: 33791 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5119	5082	5070	5070
q2	244	320	222	222
q3	2165	2664	2278	2278
q4	1338	1811	1355	1355
q5	4514	4421	4390	4390
q6	216	173	128	128
q7	1981	1893	1731	1731
q8	2570	2649	2525	2525
q9	7098	7105	7137	7105
q10	2968	3199	2756	2756
q11	561	520	533	520
q12	689	758	641	641
q13	3490	3911	3273	3273
q14	285	301	284	284
q15	518	484	481	481
q16	442	483	433	433
q17	1121	1573	1364	1364
q18	7789	7535	7297	7297
q19	794	774	911	774
q20	1987	2026	1884	1884
q21	4742	4291	4363	4291
q22	1103	1021	1011	1011
Total cold run time: 51734 ms
Total hot run time: 49813 ms

@doris-robot

TPC-DS: Total hot run time: 185994 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 18ccbe544fddd8157084770d3790dade9db8eca6, data reload: false

query1	1006	488	500	488
query2	6593	1827	1824	1824
query3	6743	223	222	222
query4	25645	24160	23040	23040
query5	4298	605	464	464
query6	306	218	189	189
query7	4619	484	284	284
query8	256	218	212	212
query9	8593	2638	2645	2638
query10	471	335	273	273
query11	15196	15153	14870	14870
query12	155	109	112	109
query13	1652	531	412	412
query14	8748	6225	6288	6225
query15	208	199	164	164
query16	7127	660	484	484
query17	954	736	587	587
query18	1973	417	313	313
query19	201	193	168	168
query20	123	121	121	121
query21	218	129	113	113
query22	4229	4267	3957	3957
query23	34004	33177	33100	33100
query24	8487	2367	2368	2367
query25	542	459	429	429
query26	1252	266	149	149
query27	2758	498	336	336
query28	4327	2135	2127	2127
query29	785	557	432	432
query30	283	220	186	186
query31	934	849	767	767
query32	74	69	64	64
query33	562	395	321	321
query34	790	835	527	527
query35	764	802	738	738
query36	946	970	873	873
query37	113	104	79	79
query38	4057	4195	4063	4063
query39	1482	1410	1419	1410
query40	211	118	108	108
query41	65	71	68	68
query42	132	110	106	106
query43	514	500	483	483
query44	1314	825	820	820
query45	180	176	173	173
query46	839	1006	626	626
query47	1735	1774	1711	1711
query48	392	419	303	303
query49	754	516	394	394
query50	635	674	404	404
query51	4108	4252	4214	4214
query52	110	107	103	103
query53	224	256	183	183
query54	572	579	506	506
query55	84	80	86	80
query56	317	320	298	298
query57	1147	1161	1072	1072
query58	284	259	249	249
query59	2512	2637	2605	2605
query60	320	317	298	298
query61	127	128	126	126
query62	807	706	625	625
query63	224	185	187	185
query64	4354	1026	724	724
query65	4222	4160	4189	4160
query66	1145	424	311	311
query67	15751	15440	15480	15440
query68	7934	872	519	519
query69	459	314	269	269
query70	1151	1086	1129	1086
query71	422	340	299	299
query72	5620	4726	4965	4726
query73	653	635	356	356
query74	9117	9162	8911	8911
query75	3552	3225	2685	2685
query76	3431	1195	765	765
query77	779	382	323	323
query78	9971	10096	9340	9340
query79	2014	834	578	578
query80	586	504	437	437
query81	462	259	221	221
query82	260	131	100	100
query83	253	255	239	239
query84	290	107	93	93
query85	906	358	323	323
query86	360	326	290	290
query87	4476	4517	4345	4345
query88	3692	2327	2292	2292
query89	394	311	278	278
query90	1921	214	218	214
query91	147	145	111	111
query92	77	60	63	60
query93	1590	928	575	575
query94	654	380	308	308
query95	383	298	281	281
query96	492	568	286	286
query97	2711	2765	2685	2685
query98	239	211	207	207
query99	1349	1407	1292	1292
Total cold run time: 270898 ms
Total hot run time: 185994 ms

@doris-robot

ClickBench: Total hot run time: 29.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 18ccbe544fddd8157084770d3790dade9db8eca6, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.11
query3	0.25	0.20	0.20
query4	1.59	0.20	0.10
query5	0.45	0.42	0.44
query6	1.13	0.66	0.66
query7	0.03	0.01	0.02
query8	0.05	0.04	0.03
query9	0.60	0.52	0.50
query10	0.57	0.58	0.56
query11	0.16	0.10	0.11
query12	0.15	0.11	0.12
query13	0.62	0.60	0.59
query14	0.78	0.81	0.82
query15	0.89	0.86	0.85
query16	0.38	0.39	0.38
query17	1.04	1.01	1.03
query18	0.23	0.21	0.20
query19	1.93	1.81	1.84
query20	0.01	0.01	0.02
query21	15.40	0.92	0.53
query22	0.75	1.18	0.67
query23	14.97	1.41	0.62
query24	7.16	0.95	0.78
query25	0.48	0.18	0.08
query26	0.72	0.16	0.15
query27	0.05	0.04	0.05
query28	9.34	0.91	0.45
query29	12.54	4.03	3.28
query30	0.25	0.09	0.06
query31	2.82	0.61	0.40
query32	3.24	0.55	0.47
query33	3.09	3.15	3.12
query34	15.78	5.17	4.56
query35	4.55	4.54	4.52
query36	0.68	0.49	0.49
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.32 s
Total hot run time: 29.01 s

@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉
Increment coverage report
Complete coverage report

@wuwenchi (Contributor Author) commented Jun 5, 2025

run feut

@CalvinKirs CalvinKirs merged commit 77d7e08 into apache:master Jun 5, 2025
34 of 35 checks passed
github-actions bot pushed a commit that referenced this pull request Jun 5, 2025
github-actions bot pushed a commit that referenced this pull request Jun 29, 2025
morningman pushed a commit to morningman/doris that referenced this pull request Jul 1, 2025
morrySnow pushed a commit that referenced this pull request Jul 2, 2025
morningman pushed a commit that referenced this pull request Jul 9, 2025
yiguolei pushed a commit that referenced this pull request Jul 10, 2025
…51508 (#51528)

Cherry-picked from #51508

---------

Co-authored-by: wuwenchi <[email protected]>
Co-authored-by: Mingyu Chen (Rayner) <[email protected]>
morningman pushed a commit to morningman/doris that referenced this pull request Jul 13, 2025
dataroaring pushed a commit that referenced this pull request Jul 14, 2025
9 participants