@kedixa
Owner

kedixa commented May 17, 2025

No description provided.

kedixa added the do-benchmark label (Let github action perform benchmarking) on May 17, 2025
@github-actions

Benchmark

The benchmark runs on GitHub Actions.

Environment

Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               4
On-line CPU(s) list:                  0-3
Vendor ID:                            AuthenticAMD
Model name:                           AMD EPYC 7763 64-Core Processor
CPU family:                           25
Model:                                1
Thread(s) per core:                   2
Core(s) per socket:                   2
Socket(s):                            1
Stepping:                             1
BogoMIPS:                             4890.85
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves user_shstk clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
Virtualization:                       AMD-V
Hypervisor vendor:                    Microsoft
Virtualization type:                  full
L1d cache:                            64 KiB (2 instances)
L1i cache:                            64 KiB (2 instances)
L2 cache:                             1 MiB (2 instances)
L3 cache:                             32 MiB (1 instance)
NUMA node(s):                         1
NUMA node0 CPU(s):                    0-3
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Vulnerable: Safe RET, no microcode
Vulnerability Spec store bypass:      Vulnerable
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

RANGE                                 SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x000000003fffffff   1G online       yes    0-7
0x0000000100000000-0x00000004bfffffff  15G online       yes 32-151

Memory block size:       128M
Total online memory:      16G
Total offline memory:      0B

Go

Bench command bench_go --concurrency 64 --compute 8 --max-secs 60 --total 200000 --times 5 -y

| name | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- |
| wf_go_one_name | 755 | 5 | 151.00 | 71.44 | 1324503 |
| wf_go_five_name | 393 | 5 | 78.60 | 3.36 | 2544529 |
| wf_go_ten_name | 339 | 5 | 67.80 | 0.45 | 2949852 |
| go_one_name | 1149 | 5 | 229.80 | 24.61 | 870322 |
| go_five_name | 412 | 5 | 82.40 | 1.52 | 2427184 |
| go_ten_name | 347 | 5 | 69.40 | 0.89 | 2881844 |
| switch_one_name | 1174 | 5 | 234.80 | 27.34 | 851788 |
| switch_five_name | 401 | 5 | 80.20 | 2.86 | 2493765 |
| switch_ten_name | 350 | 5 | 70.00 | 1.00 | 2857142 |
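
The columns in these tables appear to be related by simple arithmetic: the mean is the cost divided by the number of runs, and the per-second figure is the `--total` value divided by the mean run time, truncated to an integer. This relationship is inferred from the reported numbers, not taken from the benchmark source, and the helper name `summarize` is hypothetical. A minimal sketch:

```python
def summarize(cost_ms, times, total):
    """Derive mean(ms) and per-second throughput from one table row.

    cost_ms: total wall time over all runs, in milliseconds (the cost column)
    times:   number of runs (--times)
    total:   operations per run (--total)
    """
    mean_ms = cost_ms / times
    # Integer division reproduces the truncated values shown in the tables.
    per_sec = total * times * 1000 // cost_ms
    return mean_ms, per_sec


# wf_go_one_name row: cost 755, times 5, run with --total 200000
mean_ms, per_sec = summarize(755, 5, 200000)
print(f"{mean_ms:.2f} {per_sec}")  # prints "151.00 1324503"
```

The same formula reproduces rows from the other tables when the matching `--total` is used, for example 500 for the Dag benchmark.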

Timer

Bench command bench_timer --concurrency 4096 --handler 12 --poller 6 --compute 8 --max-secs 60 --total 200000 --times 5 -y

| name | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- |
| wf_repeat | 660 | 5 | 132.00 | 3.39 | 1515151 |
| default_timer | 647 | 5 | 129.40 | 3.21 | 1545595 |
| yield | 390 | 5 | 78.00 | 1.87 | 2564102 |
| timer_in_task | 645 | 5 | 129.00 | 4.06 | 1550387 |
| timer_by_name | 1851 | 5 | 370.20 | 3.90 | 540248 |
| cancel_by_name | 797 | 5 | 159.40 | 13.30 | 1254705 |
| detach_by_name | 1695 | 5 | 339.00 | 2.83 | 589970 |
| detach3_by_name | 4945 | 5 | 989.00 | 7.38 | 202224 |
| one_name | 815 | 5 | 163.00 | 3.54 | 1226993 |
| two_name | 841 | 5 | 168.20 | 6.98 | 1189060 |
| ten_name | 889 | 5 | 177.80 | 4.38 | 1124859 |
| name_one_by_one | 1168 | 5 | 233.60 | 17.53 | 856164 |
| timer_by_id | 854 | 5 | 170.80 | 3.03 | 1170960 |
| timer_by_addr | 920 | 5 | 184.00 | 2.55 | 1086956 |
| cancel_by_id | 678 | 5 | 135.60 | 2.61 | 1474926 |
| detach_by_id | 1005 | 5 | 201.00 | 3.67 | 995024 |
| detach_inf_by_id | 922 | 5 | 184.40 | 1.67 | 1084598 |
| one_id | 838 | 5 | 167.60 | 4.04 | 1193317 |
| two_id | 727 | 5 | 145.40 | 2.07 | 1375515 |
| ten_id | 643 | 5 | 128.60 | 2.30 | 1555209 |
| id_one_by_one | 1570 | 5 | 314.00 | 13.45 | 636942 |

Dag

Bench command bench_graph --num-nodes 128 --total 500 --group-size 10 --task-per-node 3 --max-secs 60 --times 5 --handler 12 --poller 6 -y

| name | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- |
| wf_chain | 870 | 5 | 174.00 | 4.53 | 2873 |
| coke_chain_once | 863 | 5 | 172.60 | 4.04 | 2896 |
| coke_chain | 825 | 5 | 165.00 | 4.18 | 3030 |
| wf_tree | 1067 | 5 | 213.40 | 2.61 | 2343 |
| coke_tree_once | 1123 | 5 | 224.60 | 2.70 | 2226 |
| coke_tree | 1064 | 5 | 212.80 | 0.84 | 2349 |
| wf_net | 1132 | 5 | 226.40 | 2.97 | 2208 |
| coke_net_once | 1169 | 5 | 233.80 | 2.59 | 2138 |
| coke_net | 1066 | 5 | 213.20 | 1.10 | 2345 |
| wf_flower | 943 | 5 | 188.60 | 0.89 | 2651 |
| coke_flower_once | 964 | 5 | 192.80 | 1.64 | 2593 |
| coke_flower | 928 | 5 | 185.60 | 1.82 | 2693 |

Exception

Bench command bench_exception --concurrency 4096 --total 200000 --max-secs 60 --times 5 --handler 12 --poller 6 -y

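| name | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- |
| normal_yield | 438 | 5 | 87.60 | 1.67 | 2283105 |
| yield_catch | 434 | 5 | 86.80 | 1.48 | 2304147 |
| d1_p0 | 553 | 5 | 110.60 | 0.89 | 1808318 |
| d2_p0 | 552 | 5 | 110.40 | 2.51 | 1811594 |
| d5_p0 | 582 | 5 | 116.40 | 2.51 | 1718213 |
| d1_p100 | 1724 | 5 | 344.80 | 7.73 | 580046 |
| d2_p100 | 2507 | 5 | 501.40 | 2.70 | 398883 |
| d5_p100 | 4852 | 5 | 970.40 | 4.22 | 206100 |
| d1_p1 | 535 | 5 | 107.00 | 2.24 | 1869158 |
| d1_p5 | 596 | 5 | 119.20 | 0.45 | 1677852 |
| d1_p10 | 636 | 5 | 127.20 | 0.84 | 1572327 |
| d1_p20 | 702 | 5 | 140.40 | 1.52 | 1424501 |
| d1_p50 | 1140 | 5 | 228.00 | 1.00 | 877192 |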

Queue

Bench command bench_queue --concurrency 10 --total 1000000 --max-secs 60 --times 5 --que-size 1000 --batch-size 10 --handler 12 --poller 6 -y

| name | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- |
| try_push_pop | 1293 | 5 | 258.60 | 7.92 | 3866976 |
| push_pop | 1768 | 5 | 353.60 | 5.77 | 2828054 |
| push_pop_range | 318 | 5 | 63.60 | 14.36 | 15723270 |

Weighted NSPolicy

Bench command ./bazel-bin/benchmark/bench_weighted_policy --concurrency 1 --total 1000000 --times 5 -y

| name | num addrs | fail ratio | cost(ms) | times | mean(ms) | stddev | per sec |
| --- | --- | --- | --- | --- | --- | --- | --- |
| weighted_random | 1000 | 1/10000 | 244 | 5 | 48.80 | 0.45 | 20491803 |
| weighted_random | 10000 | 1/10000 | 399 | 5 | 79.80 | 0.45 | 12531328 |
| weighted_random | 100000 | 1/10000 | 879 | 5 | 175.80 | 8.84 | 5688282 |
| weighted_random | 1000000 | 1/10000 | 1916 | 5 | 383.20 | 41.69 | 2609603 |
| weighted_random | 10000 | 1/1000 | 402 | 5 | 80.40 | 0.55 | 12437810 |
| weighted_random | 10000 | 1/100 | 412 | 5 | 82.40 | 0.55 | 12135922 |
| weighted_random | 10000 | 1/10 | 515 | 5 | 103.00 | 0.71 | 9708737 |
| weighted_least_conn | 1000 | 1/10000 | 497 | 5 | 99.40 | 0.55 | 10060362 |
| weighted_least_conn | 10000 | 1/10000 | 738 | 5 | 147.60 | 0.55 | 6775067 |
| weighted_least_conn | 100000 | 1/10000 | 1202 | 5 | 240.40 | 1.14 | 4159733 |
| weighted_least_conn | 1000000 | 1/10000 | 1472 | 5 | 294.40 | 0.55 | 3396739 |
| weighted_least_conn | 10000 | 1/1000 | 737 | 5 | 147.40 | 0.55 | 6784260 |
| weighted_least_conn | 10000 | 1/100 | 745 | 5 | 149.00 | 0.00 | 6711409 |
| weighted_least_conn | 10000 | 1/10 | 821 | 5 | 164.20 | 0.45 | 6090133 |
| weighted_round_robin | 1000 | 1/10000 | 354 | 5 | 70.80 | 0.84 | 14124293 |
| weighted_round_robin | 10000 | 1/10000 | 453 | 5 | 90.60 | 0.55 | 11037527 |
| weighted_round_robin | 100000 | 1/10000 | 570 | 5 | 114.00 | 2.92 | 8771929 |
| weighted_round_robin | 1000000 | 1/10000 | 1035 | 5 | 207.00 | 6.60 | 4830917 |
| weighted_round_robin | 10000 | 1/1000 | 455 | 5 | 91.00 | 0.71 | 10989010 |
| weighted_round_robin | 10000 | 1/100 | 482 | 5 | 96.40 | 0.55 | 10373443 |
| weighted_round_robin | 10000 | 1/10 | 647 | 5 | 129.40 | 0.55 | 7727975 |

github-actions bot removed the do-benchmark label (Let github action perform benchmarking) on May 17, 2025
@kedixa
Owner Author

kedixa commented May 17, 2025

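The weighted_random policy is the most efficient when the number of addresses is small, but its efficiency also degrades the fastest as the number of addresses grows. This is because its randomness means that almost every selection misses the CPU cache. Fortunately, real workloads do not need to load-balance across a million addresses at once.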
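
To illustrate why random selection has a scattered memory access pattern, here is a minimal sketch of one common way to do weighted random selection: prefix sums plus binary search. This is an assumption for illustration only and is not taken from the coke implementation; the names `build_prefix_sums` and `pick_index` and the weights are hypothetical.

```python
import bisect
import itertools
import random


def build_prefix_sums(weights):
    """Return cumulative weights; prefix[i] is the total weight of items 0..i."""
    return list(itertools.accumulate(weights))


def pick_index(prefix, rng):
    """Pick index i with probability weights[i] / total via binary search."""
    # Draw an integer in [0, total) and find the first prefix sum above it.
    target = rng.randrange(prefix[-1])
    return bisect.bisect_right(prefix, target)


weights = [1, 3, 0, 6]               # hypothetical address weights
prefix = build_prefix_sums(weights)  # [1, 4, 4, 10]
rng = random.Random(2025)            # fixed seed for a repeatable run
counts = [0] * len(weights)
for _ in range(10000):
    counts[pick_index(prefix, rng)] += 1
print(counts)  # index 2 has weight 0, so its count stays 0
```

Each call probes the array at positions determined by a fresh random target, so consecutive selections have no locality. With a few thousand addresses the whole array likely stays in cache, but with a million entries most probes would plausibly touch memory that is no longer cached, which is consistent with the drop seen in the weighted_random rows.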

@kedixa kedixa merged commit ddb92c9 into master May 17, 2025
24 checks passed