@freemandealer
Contributor

This queue contains a large number of duplicated items. Deduplicate the queue to shrink its memory footprint. Orders are updated in batches, and the relative order of items within a batch is ignored.
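
As a rough illustration of the idea, the sketch below keys pending items by object identity so that enqueueing the same item twice leaves a single entry, and hands items back as an unordered batch. `Block`, `DedupQueue`, `push`, and `take_batch` are hypothetical names chosen for this example; they are not taken from the Doris code, and the real change shards the container across several mutexes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real FileBlock type.
struct Block {
    int id;
};
using BlockSPtr = std::shared_ptr<Block>;

// Pending-update set keyed by object identity: enqueueing a block that is
// already pending overwrites its entry instead of growing the container.
class DedupQueue {
public:
    // Returns true if the block was newly added, false if it was already pending.
    bool push(const BlockSPtr& block) {
        assert(block != nullptr);
        std::lock_guard<std::mutex> lock(_mutex);
        return _entries.insert_or_assign(block.get(), block).second;
    }

    // Moves out everything pending as one batch. The order inside the batch
    // is whatever the hash map yields, i.e. unspecified.
    std::vector<BlockSPtr> take_batch() {
        std::unordered_map<Block*, BlockSPtr> taken;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            taken.swap(_entries);
        }
        std::vector<BlockSPtr> batch;
        batch.reserve(taken.size());
        for (auto& kv : taken) {
            batch.push_back(std::move(kv.second));
        }
        return batch;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _entries.size();
    }

private:
    mutable std::mutex _mutex;
    std::unordered_map<Block*, BlockSPtr> _entries;
};
```

With this shape, memory grows with the number of distinct pending items rather than with the number of enqueue calls, which is the effect the description is after.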

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merges this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Signed-off-by: zhengyu <[email protected]>
@Thearas
Contributor

Thearas commented Dec 10, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@freemandealer
Contributor Author

run buildall

if (drained >= limit) {
    break;
}
std::lock_guard lock(shard.mutex);
Contributor

Why not just drain the entire shard at once instead of draining entries one by one?
We don't actually care about the order here.

Contributor Author

Because we want to release the _mutex for a while if the for loop takes too long.
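
The trade-off discussed here can be sketched as a bounded drain: remove at most `limit` entries per call, holding each shard's mutex only while that shard is touched, so a caller holding a coarser lock can let go of it between calls. This is a minimal sketch under assumptions, not the PR's code: `Block`, `kShards`, and `drain_some` are hypothetical names, and the shard count of 4 is arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <vector>

struct Block { int id; };  // hypothetical stand-in for FileBlock
using BlockSPtr = std::shared_ptr<Block>;

struct Shard {
    std::mutex mutex;
    std::unordered_map<Block*, BlockSPtr> entries;
};

constexpr std::size_t kShards = 4;  // assumed shard count, for illustration only

// Removes at most `limit` entries across all shards and returns them.
// Each shard's mutex is held only while that shard is being emptied, and the
// loop stops as soon as the limit is reached.
std::vector<BlockSPtr> drain_some(std::array<Shard, kShards>& shards, std::size_t limit) {
    assert(limit > 0);
    std::vector<BlockSPtr> out;
    for (auto& shard : shards) {
        if (out.size() >= limit) {
            break;
        }
        std::lock_guard<std::mutex> lock(shard.mutex);
        auto it = shard.entries.begin();
        while (it != shard.entries.end() && out.size() < limit) {
            out.push_back(std::move(it->second));
            it = shard.entries.erase(it);
        }
    }
    return out;
}
```

A caller would invoke `drain_some` repeatedly until it returns an empty vector, releasing and reacquiring any outer lock between calls so other threads are not starved during a long drain.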


struct Shard {
    std::mutex mutex;
    std::unordered_map<FileBlock*, FileBlockSPtr> entries;
Contributor

Do we need a shared_ptr to keep a reference to the file block?

Contributor Author

Yes, we do, for safety reasons.
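
One way to read the safety argument: the raw pointer key only provides identity for deduplication, while the shared_ptr value is what keeps the object alive until the entry is processed. The sketch below illustrates that with a weak_ptr observer. `Block`, `Entries`, and `enqueue` are hypothetical names for this example and do not come from the PR.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

struct Block { int id; };  // hypothetical stand-in for FileBlock
using BlockSPtr = std::shared_ptr<Block>;

// Raw pointer key: identity-based dedup.
// shared_ptr value: keeps the object alive while it waits in the map.
using Entries = std::unordered_map<Block*, BlockSPtr>;

// Enqueues a block and returns a weak observer, so a caller can check whether
// the object is still alive without extending its lifetime.
std::weak_ptr<Block> enqueue(Entries& entries, BlockSPtr block) {
    assert(block != nullptr);
    std::weak_ptr<Block> observer = block;
    Block* key = block.get();  // read the key before moving the owner
    entries.insert_or_assign(key, std::move(block));
    return observer;
}
```

If the map stored only `Block*`, the block could be destroyed by its last external owner while still queued, leaving a dangling key. Holding a shared_ptr in the entry rules that out, at the cost of delaying destruction until the entry is drained.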

@doris-robot

TPC-H: Total hot run time: 35373 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 17a0372435053cbbab4361adcf716bcf40f275a8, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17614	4333	4083	4083
q2	2012	358	246	246
q3	10181	1332	760	760
q4	10216	875	314	314
q5	7504	2157	1932	1932
q6	208	170	137	137
q7	1018	866	733	733
q8	9376	1495	1190	1190
q9	7283	5360	5327	5327
q10	6865	2411	1939	1939
q11	535	318	305	305
q12	696	734	576	576
q13	17783	3692	3051	3051
q14	297	301	268	268
q15	620	519	510	510
q16	924	891	864	864
q17	710	822	539	539
q18	7498	7228	7161	7161
q19	1117	975	627	627
q20	404	361	245	245
q21	4227	4041	3624	3624
q22	1095	991	942	942
Total cold run time: 108183 ms
Total hot run time: 35373 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4115	4143	4068	4068
q2	329	407	308	308
q3	2151	2661	2322	2322
q4	1318	1761	1279	1279
q5	4248	4799	4811	4799
q6	220	172	127	127
q7	2019	1995	1809	1809
q8	2669	2509	2618	2509
q9	7630	7578	7528	7528
q10	3107	3260	2832	2832
q11	590	505	547	505
q12	710	969	574	574
q13	3464	3761	3262	3262
q14	287	297	290	290
q15	554	519	505	505
q16	897	928	906	906
q17	1253	1505	1416	1416
q18	7715	7868	7491	7491
q19	893	900	903	900
q20	2021	2151	1919	1919
q21	4772	4253	4124	4124
q22	1081	1022	978	978
Total cold run time: 52043 ms
Total hot run time: 50451 ms

@doris-robot

TPC-DS: Total hot run time: 180912 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 17a0372435053cbbab4361adcf716bcf40f275a8, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4832	643	460	460
query6	332	231	219	219
query7	4217	460	279	279
query8	290	244	230	230
query9	8754	2573	2582	2573
query10	524	356	321	321
query11	15265	14803	14610	14610
query12	175	123	112	112
query13	1239	480	399	399
query14	5789	3262	2957	2957
query14_1	2847	2863	2876	2863
query15	204	193	183	183
query16	929	489	455	455
query17	1140	718	606	606
query18	2454	474	351	351
query19	245	238	223	223
query20	121	116	112	112
query21	217	142	126	126
query22	3919	4066	3869	3869
query23	16810	16151	15830	15830
query23_1	16071	16124	15966	15966
query24	7363	1669	1245	1245
query24_1	1262	1228	1241	1228
query25	595	503	428	428
query26	1253	283	173	173
query27	2741	486	313	313
query28	4490	2157	2143	2143
query29	810	568	458	458
query30	313	258	220	220
query31	803	717	646	646
query32	92	73	74	73
query33	552	351	286	286
query34	910	901	545	545
query35	789	834	739	739
query36	866	905	834	834
query37	131	97	82	82
query38	3835	3794	3730	3730
query39	740	735	708	708
query39_1	698	694	706	694
query40	228	137	124	124
query41	75	91	71	71
query42	109	107	106	106
query43	438	427	412	412
query44	1320	760	796	760
query45	199	190	186	186
query46	883	974	613	613
query47	1670	1685	1634	1634
query48	324	331	267	267
query49	648	429	364	364
query50	653	292	224	224
query51	3846	3845	3804	3804
query52	107	115	100	100
query53	317	345	295	295
query54	300	272	283	272
query55	83	78	74	74
query56	366	297	296	296
query57	1163	1132	1077	1077
query58	267	250	249	249
query59	2335	2463	2285	2285
query60	309	304	290	290
query61	153	155	157	155
query62	717	666	643	643
query63	332	290	295	290
query64	4989	1306	992	992
query65	4034	3948	3968	3948
query66	1484	451	329	329
query67	15103	14829	14631	14631
query68	8239	1019	743	743
query69	502	347	315	315
query70	1087	1025	995	995
query71	356	307	281	281
query72	6056	5061	5057	5057
query73	697	617	311	311
query74	8766	8878	8600	8600
query75	3556	3526	3170	3170
query76	3878	1153	766	766
query77	541	403	295	295
query78	9551	9551	8883	8883
query79	1727	879	625	625
query80	732	644	566	566
query81	531	267	238	238
query82	432	133	103	103
query83	264	248	240	240
query84	260	116	97	97
query85	921	496	469	469
query86	398	291	292	291
query87	4059	4031	3939	3939
query88	3679	2314	2318	2314
query89	474	431	400	400
query90	2194	162	160	160
query91	172	166	139	139
query92	86	72	70	70
query93	2453	919	575	575
query94	491	290	301	290
query95	587	332	361	332
query96	595	479	215	215
query97	2580	2626	2544	2544
query98	220	193	199	193
query99	1326	1283	1226	1226
Total cold run time: 264610 ms
Total hot run time: 180912 ms

@doris-robot

ClickBench: Total hot run time: 27.17 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 17a0372435053cbbab4361adcf716bcf40f275a8, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.04	0.04
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.12	0.11
query5	0.28	0.24	0.26
query6	1.16	0.62	0.63
query7	0.02	0.02	0.02
query8	0.06	0.04	0.04
query9	0.59	0.50	0.52
query10	0.56	0.56	0.55
query11	0.16	0.11	0.12
query12	0.15	0.11	0.12
query13	0.61	0.61	0.61
query14	1.00	0.99	0.98
query15	0.81	0.79	0.81
query16	0.41	0.42	0.42
query17	1.00	1.02	1.03
query18	0.24	0.22	0.22
query19	1.93	1.88	1.76
query20	0.02	0.01	0.01
query21	15.43	0.27	0.14
query22	4.63	0.05	0.05
query23	15.91	0.29	0.10
query24	0.93	0.24	0.18
query25	0.10	0.06	0.05
query26	0.14	0.13	0.14
query27	0.06	0.06	0.07
query28	3.16	1.22	1.02
query29	12.63	3.97	3.23
query30	0.29	0.14	0.11
query31	2.84	0.62	0.39
query32	3.23	0.54	0.45
query33	2.99	3.06	3.05
query34	17.08	5.17	4.55
query35	4.54	4.56	4.58
query36	0.66	0.49	0.48
query37	0.11	0.06	0.07
query38	0.08	0.04	0.04
query39	0.04	0.03	0.02
query40	0.18	0.15	0.13
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 96.22 s
Total hot run time: 27.17 s

freemandealer and others added 2 commits December 10, 2025 22:18
Signed-off-by: zhengyu <[email protected]>
Signed-off-by: freemandealer <[email protected]>
@freemandealer
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35346 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a932f837c7ca128dcb0464f6a18a2a4183063eb5, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17652	4269	4236	4236
q2	2036	356	260	260
q3	10172	1308	763	763
q4	10227	813	311	311
q5	7578	2141	1886	1886
q6	190	169	140	140
q7	1009	866	719	719
q8	9350	1449	1133	1133
q9	6977	5328	5334	5328
q10	6784	2388	2000	2000
q11	526	322	291	291
q12	650	760	569	569
q13	17807	3651	3017	3017
q14	297	300	271	271
q15	597	516	517	516
q16	932	912	872	872
q17	689	829	511	511
q18	7703	7179	7064	7064
q19	1094	956	604	604
q20	385	364	250	250
q21	4286	4027	3656	3656
q22	1026	1006	949	949
Total cold run time: 107967 ms
Total hot run time: 35346 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4069	3997	4039	3997
q2	331	379	310	310
q3	2151	2669	2228	2228
q4	1332	1754	1305	1305
q5	4214	4696	4708	4696
q6	222	175	132	132
q7	2043	2058	1824	1824
q8	2672	2561	2484	2484
q9	7776	7661	7766	7661
q10	3008	3158	2835	2835
q11	608	516	507	507
q12	701	759	618	618
q13	3573	3941	3262	3262
q14	292	317	277	277
q15	561	583	580	580
q16	890	918	862	862
q17	1173	1450	1429	1429
q18	8023	7626	7541	7541
q19	868	857	890	857
q20	2091	2094	1944	1944
q21	4898	4585	4338	4338
q22	1049	1027	1004	1004
Total cold run time: 52545 ms
Total hot run time: 50691 ms

@doris-robot

TPC-DS: Total hot run time: 181016 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a932f837c7ca128dcb0464f6a18a2a4183063eb5, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4535	652	471	471
query6	338	221	210	210
query7	4231	467	290	290
query8	308	252	250	250
query9	8792	2593	2560	2560
query10	535	398	337	337
query11	15319	15398	14577	14577
query12	191	117	116	116
query13	1273	519	416	416
query14	6468	3587	2914	2914
query14_1	2835	2917	2867	2867
query15	204	190	182	182
query16	886	485	466	466
query17	1161	693	594	594
query18	2652	430	335	335
query19	226	236	223	223
query20	128	113	112	112
query21	223	141	113	113
query22	3970	4142	3953	3953
query23	16618	16241	15947	15947
query23_1	16086	15996	16094	15996
query24	7342	1656	1218	1218
query24_1	1233	1215	1262	1215
query25	592	490	456	456
query26	1242	274	166	166
query27	2763	482	310	310
query28	4484	2181	2170	2170
query29	845	576	475	475
query30	318	250	226	226
query31	835	705	652	652
query32	83	75	72	72
query33	565	350	300	300
query34	910	889	551	551
query35	803	825	734	734
query36	861	908	836	836
query37	134	95	81	81
query38	3920	3872	3723	3723
query39	921	731	737	731
query39_1	711	708	712	708
query40	228	138	127	127
query41	77	69	66	66
query42	108	107	105	105
query43	436	426	390	390
query44	1317	762	764	762
query45	194	197	190	190
query46	880	973	616	616
query47	1685	1714	1630	1630
query48	323	345	258	258
query49	650	458	362	362
query50	667	296	230	230
query51	3907	3797	3834	3797
query52	109	114	106	106
query53	333	362	293	293
query54	307	286	264	264
query55	81	78	77	77
query56	311	317	319	317
query57	1134	1157	1076	1076
query58	284	275	265	265
query59	2294	2463	2375	2375
query60	338	372	290	290
query61	162	153	157	153
query62	697	650	623	623
query63	332	296	301	296
query64	4997	1303	998	998
query65	4059	3965	3949	3949
query66	1397	452	322	322
query67	15268	15091	14805	14805
query68	8363	1019	746	746
query69	497	342	307	307
query70	1082	990	987	987
query71	392	306	286	286
query72	6104	4950	5108	4950
query73	697	622	315	315
query74	8764	8941	8645	8645
query75	3593	3519	3136	3136
query76	3932	1140	759	759
query77	629	399	286	286
query78	9420	9567	8799	8799
query79	1636	871	618	618
query80	679	640	541	541
query81	502	265	233	233
query82	465	131	98	98
query83	263	253	239	239
query84	260	119	96	96
query85	959	519	457	457
query86	339	288	274	274
query87	4100	4051	4001	4001
query88	4053	2272	2254	2254
query89	469	426	386	386
query90	2213	156	155	155
query91	170	162	146	146
query92	84	69	65	65
query93	1228	917	571	571
query94	464	306	280	280
query95	568	333	346	333
query96	604	468	209	209
query97	2626	2642	2586	2586
query98	208	201	194	194
query99	1328	1283	1186	1186
Total cold run time: 265066 ms
Total hot run time: 181016 ms

@doris-robot

ClickBench: Total hot run time: 27.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a932f837c7ca128dcb0464f6a18a2a4183063eb5, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.06
query3	0.26	0.09	0.08
query4	1.61	0.11	0.11
query5	0.29	0.24	0.27
query6	1.18	0.63	0.63
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.57	0.51	0.50
query10	0.57	0.56	0.55
query11	0.16	0.12	0.12
query12	0.15	0.11	0.12
query13	0.62	0.60	0.61
query14	0.99	1.00	1.00
query15	0.81	0.81	0.80
query16	0.42	0.43	0.39
query17	1.05	1.01	1.04
query18	0.24	0.21	0.21
query19	1.99	1.89	1.90
query20	0.01	0.01	0.01
query21	15.44	0.29	0.14
query22	4.87	0.06	0.05
query23	16.02	0.27	0.11
query24	1.48	0.63	0.98
query25	0.08	0.06	0.04
query26	0.13	0.13	0.14
query27	0.06	0.05	0.05
query28	4.93	1.21	1.02
query29	12.57	4.06	3.32
query30	0.28	0.15	0.12
query31	2.82	0.63	0.40
query32	3.22	0.53	0.46
query33	3.06	2.96	3.00
query34	16.84	5.21	4.48
query35	4.56	4.57	4.56
query36	0.68	0.50	0.48
query37	0.10	0.06	0.06
query38	0.07	0.04	0.03
query39	0.04	0.03	0.03
query40	0.17	0.14	0.15
query41	0.09	0.03	0.03
query42	0.04	0.03	0.04
query43	0.04	0.04	0.03
Total cold run time: 98.75 s
Total hot run time: 27.71 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 85.23% (75/88) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.36% (18727/35098)
Line Coverage 39.08% (173223/443239)
Region Coverage 33.76% (134370/398007)
Branch Coverage 34.69% (57758/166508)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.23% (75/88) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.26% (24856/34400)
Line Coverage 59.01% (261284/442763)
Region Coverage 54.11% (217898/402730)
Branch Coverage 55.51% (92911/167383)

@freemandealer
Contributor Author

run cloud_p0

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.23% (75/88) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.26% (24856/34400)
Line Coverage 59.01% (261284/442763)
Region Coverage 54.11% (217898/402730)
Branch Coverage 55.51% (92911/167383)

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Dec 12, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Labels

approved (Indicates a PR has been approved by one committer.), reviewed

5 participants