add support for counting non integer in aggregation#2547

Merged
trinity-1686a merged 2 commits into main from trinity/count-str on Dec 17, 2024

Conversation

@trinity-1686a

Collaborator

@rdettai

rdettai commented Nov 28, 2024

Collaborator

nice! might be worth adding some unit tests!

@PSeitz

PSeitz commented Nov 29, 2024

Collaborator

can you check the performance with `cargo bench agg`?

@trinity-1686a

Collaborator Author

the output of `cargo bench agg` shows no significant change, though I'm not entirely sure any aggregation hits that code path (maybe `avg_and_range_with_avg_sub_agg` does?)

bench results
full
terms_many_with_avg_sub_agg                    Memory: 27.8 MB (+0.00%)    Avg: 232.0448ms (-1.95%)    Median: 231.2331ms (-0.72%)    [222.9071ms .. 245.0322ms]    
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 42.0 MB             Avg: 304.6627ms (-3.89%)    Median: 302.6334ms (-3.40%)    [295.8152ms .. 317.8829ms]    
cardinality_agg                                Memory: 3.5 MB (-0.01%)     Avg: 66.4368ms (-1.17%)     Median: 66.2259ms (-0.78%)     [65.3263ms .. 69.9524ms]      
terms_few_with_cardinality_agg                 Memory: 10.6 MB (+0.00%)    Avg: 274.1488ms (-0.11%)    Median: 272.0091ms (-0.50%)    [268.1587ms .. 332.7399ms]    
range_agg                                      Memory: 15.9 KB (+0.82%)    Avg: 14.4317ms (+1.29%)     Median: 14.1079ms (-0.63%)     [13.9544ms .. 24.2560ms]      
range_agg_with_avg_sub_agg                     Memory: 31.1 KB             Avg: 44.3468ms (-0.83%)     Median: 44.1626ms (-1.11%)     [43.8863ms .. 45.6916ms]      
range_agg_with_term_agg_few                    Memory: 45.5 KB             Avg: 67.0602ms (+1.07%)     Median: 66.6544ms (+0.73%)     [66.0058ms .. 70.0632ms]      
range_agg_with_term_agg_many                   Memory: 6.9 MB              Avg: 108.4251ms (-4.63%)    Median: 107.0003ms (-4.10%)    [105.5204ms .. 121.3335ms]    
histogram_with_avg_sub_agg                     Memory: 48.8 KB             Avg: 61.2483ms (-0.06%)     Median: 61.2472ms (-0.01%)     [60.4866ms .. 62.0333ms]      
avg_and_range_with_avg_sub_agg                 Memory: 27.2 KB             Avg: 56.4581ms (-0.17%)     Median: 56.3653ms (-0.23%)     [55.9478ms .. 57.7496ms]      
dense
terms_many_with_avg_sub_agg                    Memory: 27.8 MB             Avg: 276.5246ms (-0.13%)    Median: 272.7037ms (-1.06%)    [263.9962ms .. 378.4555ms]    
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 42.0 MB             Avg: 336.6864ms (-2.37%)    Median: 335.7333ms (-1.80%)    [322.8848ms .. 352.5364ms]    
cardinality_agg                                Memory: 3.6 MB (+0.01%)     Avg: 79.2084ms (-0.93%)     Median: 78.9963ms (-0.68%)     [78.5911ms .. 83.1381ms]      
terms_few_with_cardinality_agg                 Memory: 10.6 MB (-0.00%)    Avg: 297.3892ms (-0.49%)    Median: 296.8820ms (-0.35%)    [292.5347ms .. 309.6992ms]    
range_agg                                      Memory: 16.1 KB             Avg: 26.0813ms (-4.65%)     Median: 26.0654ms (-3.02%)     [25.9373ms .. 26.4499ms]      
range_agg_with_avg_sub_agg                     Memory: 32.6 KB             Avg: 66.7406ms (-0.44%)     Median: 66.4263ms (-0.57%)     [66.1399ms .. 69.6485ms]      
range_agg_with_term_agg_few                    Memory: 47.1 KB             Avg: 91.7573ms (+0.41%)     Median: 91.7449ms (+0.56%)     [91.3172ms .. 92.3171ms]      
range_agg_with_term_agg_many                   Memory: 6.9 MB              Avg: 132.0438ms (-3.69%)    Median: 130.7221ms (-3.95%)    [128.3665ms .. 157.7649ms]    
histogram_with_avg_sub_agg                     Memory: 50.2 KB             Avg: 83.1442ms (-0.89%)     Median: 82.8943ms (-1.06%)     [82.5001ms .. 85.7866ms]      
avg_and_range_with_avg_sub_agg                 Memory: 29.8 KB             Avg: 89.7483ms (-0.20%)     Median: 89.0812ms (-0.80%)     [88.2248ms .. 95.5685ms]      
sparse
terms_many_with_avg_sub_agg                    Memory: 7.5 MB              Avg: 74.9221ms (+0.32%)     Median: 73.5750ms (-1.05%)     [72.3872ms .. 86.7915ms]      
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 7.4 MB              Avg: 106.6722ms (-0.08%)    Median: 105.8150ms (+0.18%)    [104.3413ms .. 112.6063ms]    
cardinality_agg                                Memory: 895.4 KB            Avg: 75.0613ms (+0.25%)     Median: 74.9402ms (+0.47%)     [74.4727ms .. 76.0159ms]      
terms_few_with_cardinality_agg                 Memory: 680.3 KB            Avg: 89.4861ms (-0.14%)     Median: 89.3371ms (-0.13%)     [89.0572ms .. 90.4494ms]      
range_agg                                      Memory: 16.9 KB (-2.18%)    Avg: 52.7132ms (-0.17%)     Median: 52.7122ms (-0.19%)     [52.4725ms .. 53.2407ms]      
range_agg_with_avg_sub_agg                     Memory: 31.1 KB             Avg: 58.2852ms (-0.84%)     Median: 58.3058ms (-0.31%)     [58.0938ms .. 58.5358ms]      
range_agg_with_term_agg_few                    Memory: 45.7 KB             Avg: 60.6195ms (-0.15%)     Median: 60.5940ms (-0.07%)     [60.3866ms .. 61.0074ms]      
range_agg_with_term_agg_many                   Memory: 1.8 MB              Avg: 65.6621ms (-1.41%)     Median: 65.4542ms (-0.76%)     [65.1492ms .. 67.3898ms]      
histogram_with_avg_sub_agg                     Memory: 34.5 KB             Avg: 59.2327ms (-0.27%)     Median: 59.2153ms (-0.25%)     [59.0855ms .. 59.5221ms]      
avg_and_range_with_avg_sub_agg                 Memory: 27.0 KB             Avg: 106.3856ms (-0.51%)    Median: 106.3384ms (-0.50%)    [106.1846ms .. 107.1841ms]    
multivalue
terms_many_with_avg_sub_agg                    Memory: 27.8 MB             Avg: 376.7042ms (-2.76%)    Median: 376.3418ms (-2.78%)    [364.0526ms .. 396.7285ms]    
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 42.0 MB             Avg: 426.5047ms (-3.39%)    Median: 424.5364ms (-4.33%)    [412.2916ms .. 453.6147ms]    
cardinality_agg                                Memory: 3.6 MB              Avg: 93.3480ms (-0.01%)     Median: 93.3406ms (+0.44%)     [92.0163ms .. 95.2114ms]      
terms_few_with_cardinality_agg                 Memory: 10.8 MB (-0.00%)    Avg: 335.2001ms (-0.56%)    Median: 334.8423ms (-0.68%)    [327.5755ms .. 351.5157ms]    
range_agg                                      Memory: 17.5 KB (+2.84%)    Avg: 42.1272ms (+6.02%)     Median: 39.3474ms (-0.59%)     [38.8971ms .. 63.5646ms]      
range_agg_with_avg_sub_agg                     Memory: 32.8 KB             Avg: 97.5148ms (+0.14%)     Median: 97.3243ms (+0.10%)     [96.9425ms .. 103.4012ms]     
range_agg_with_term_agg_few                    Memory: 248.3 KB            Avg: 121.5217ms (-0.69%)    Median: 121.2504ms (-0.50%)    [120.7182ms .. 128.1189ms]    
range_agg_with_term_agg_many                   Memory: 6.9 MB              Avg: 166.5432ms (+1.32%)    Median: 163.6119ms (+0.40%)    [159.7448ms .. 201.7112ms]    
histogram_with_avg_sub_agg                     Memory: 50.5 KB             Avg: 112.6518ms (-0.12%)    Median: 112.0596ms (-0.52%)    [111.6872ms .. 116.1402ms]    
avg_and_range_with_avg_sub_agg                 Memory: 30.2 KB             Avg: 133.0242ms (-0.64%)    Median: 132.2856ms (-0.81%)    [131.8013ms .. 138.7579ms]  
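
One way to sanity-check result lines like the ones above is to pull out the Avg and Median deltas and flag only the benchmarks where both moved past a threshold, so that a single slow outlier run does not trigger a flag on its own. This is a minimal sketch, assuming the line layout shown above; `delta_after`, `flag_changes`, and the threshold value are hypothetical helpers for illustration and are not part of the bench tooling.

```rust
/// Extract the percentage that follows `label` (e.g. "Avg:") on a result line.
/// Expects the shape "<label> <value> (<signed pct>%)"; returns None otherwise.
fn delta_after(line: &str, label: &str) -> Option<f64> {
    let rest = &line[line.find(label)? + label.len()..];
    let mut tokens = rest.split_whitespace();
    let _value = tokens.next()?; // e.g. "232.0448ms"
    let pct = tokens.next()?; // e.g. "(-1.95%)"
    pct.strip_prefix('(')?.strip_suffix("%)")?.parse().ok()
}

/// Names of benchmarks whose Avg and Median deltas BOTH exceed `threshold`
/// (in percent, absolute value). Lines that do not parse are skipped.
fn flag_changes<'a>(lines: &[&'a str], threshold: f64) -> Vec<&'a str> {
    lines
        .iter()
        .copied()
        .filter(|line| {
            match (delta_after(line, "Avg:"), delta_after(line, "Median:")) {
                (Some(avg), Some(median)) => {
                    avg.abs() > threshold && median.abs() > threshold
                }
                _ => false,
            }
        })
        // The benchmark name is the first whitespace-separated token.
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}
```

The threshold is a judgment call: the thread does not state the run-to-run noise of these benchmarks, so a flagged line is only a prompt to rerun, not evidence of a real change.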

@trinity-1686a

Collaborator Author

`cargo bench 64`
full
average_u64          Memory: 20.1 KB (-0.42%)    Avg: 325.7222ms (+0.02%)    Median: 325.4104ms (-0.12%)    [323.1343ms .. 331.1226ms]    
average_f64          Memory: 20.1 KB (+1.15%)    Avg: 336.5785ms (-0.30%)    Median: 335.4815ms (-0.48%)    [333.1059ms .. 351.1640ms]    
average_f64_u64      Memory: 22.8 KB (+0.47%)    Avg: 565.1703ms (-0.15%)    Median: 564.0385ms (-0.25%)    [559.4533ms .. 587.6503ms]    
stats_f64            Memory: 20.3 KB (+1.79%)    Avg: 336.8848ms (-0.15%)    Median: 336.6342ms (-0.10%)    [332.8835ms .. 344.2083ms]    
extendedstats_f64    Memory: 20.2 KB (-0.31%)    Avg: 349.1945ms (-0.20%)    Median: 348.8795ms (-0.12%)    [346.0720ms .. 353.4762ms]    
percentiles_f64      Memory: 30.4 KB (+0.60%)    Avg: 356.7218ms (-0.86%)    Median: 356.5098ms (-0.57%)    [352.6974ms .. 362.2696ms]    
dense
average_u64          Memory: 21.9 KB (-0.76%)    Avg: 489.6951ms (-0.45%)    Median: 489.4872ms (-0.34%)    [486.4282ms .. 494.6191ms]      
average_f64          Memory: 25.8 KB (-1.88%)    Avg: 498.5461ms (-0.12%)    Median: 497.9900ms (-0.17%)    [494.7158ms .. 509.0261ms]      
average_f64_u64      Memory: 24.6 KB (-0.88%)    Avg: 937.0674ms (+4.23%)    Median: 905.6846ms (+0.93%)    [881.6890ms .. 1.288596875s]    
stats_f64            Memory: 21.7 KB (-1.64%)    Avg: 499.9421ms (-0.05%)    Median: 499.6962ms (+0.22%)    [494.8581ms .. 509.4400ms]      
extendedstats_f64    Memory: 22.0 KB (-1.52%)    Avg: 512.7254ms (-0.01%)    Median: 511.9617ms (+0.00%)    [507.2669ms .. 522.6218ms]      
percentiles_f64      Memory: 25.0 KB (-1.61%)    Avg: 518.3964ms (-0.61%)    Median: 517.9492ms (-0.55%)    [514.0569ms .. 527.9620ms]      
sparse
average_u64          Memory: 25.8 KB (+19.78%)    Avg: 1.161269192s (+0.62%)    Median: 1.160306875s (+0.54%)    [1.154250416s .. 1.170924542s]    
average_f64          Memory: 26.5 KB (+17.14%)    Avg: 1.161151794s (+0.54%)    Median: 1.159992812s (+0.61%)    [1.155041667s .. 1.168221041s]    
average_f64_u64      Memory: 27.7 KB (+15.39%)    Avg: 2.319196679s (+1.22%)    Median: 2.312986146s (+1.11%)    [2.296713333s .. 2.403568458s]    
stats_f64            Memory: 27.3 KB (+15.46%)    Avg: 1.162013528s (+0.77%)    Median: 1.161988812s (+0.81%)    [1.151636042s .. 1.171992042s]    
extendedstats_f64    Memory: 25.0 KB (+15.95%)    Avg: 1.162014403s (+0.37%)    Median: 1.161707812s (+0.42%)    [1.152589375s .. 1.17367175s]     
percentiles_f64      Memory: 24.4 KB (+0.19%)     Avg: 1.100068584s (+0.86%)    Median: 1.099738499s (+0.81%)    [1.09488475s .. 1.106139292s]     
multivalue
average_u64          Memory: 23.0 KB (-15.55%)    Avg: 698.0436ms (+1.26%)      Median: 697.5080ms (+1.17%)      [690.9101ms .. 707.7743ms]        
average_f64          Memory: 24.0 KB (-13.41%)    Avg: 707.8201ms (+1.09%)      Median: 707.8935ms (+1.22%)      [703.4621ms .. 713.6697ms]        
average_f64_u64      Memory: 27.5 KB (-13.23%)    Avg: 1.381387165s (+6.20%)    Median: 1.329452208s (+2.22%)    [1.302791333s .. 1.688878708s]    
stats_f64            Memory: 25.2 KB (-13.15%)    Avg: 709.4215ms (+1.33%)      Median: 709.1806ms (+1.46%)      [703.7535ms .. 719.2411ms]        
extendedstats_f64    Memory: 23.0 KB (-13.61%)    Avg: 721.7589ms (+1.38%)      Median: 720.2963ms (+1.36%)      [716.1771ms .. 749.8379ms]        
percentiles_f64      Memory: 25.8 KB (-18.15%)    Avg: 720.5841ms (-0.91%)      Median: 719.5029ms (-0.97%)      [714.3202ms .. 757.1915ms]        

@trinity-1686a trinity-1686a merged commit c39d91f into main Dec 17, 2024
@trinity-1686a trinity-1686a deleted the trinity/count-str branch December 17, 2024 14:27