Descriptions
- Based on the Intel white paper "How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures".
Downloads
Results
- 1000 runs per loop size, with most features that affect measurement turned off.
Loading hello module...
loop_size:0 >>>> variance(cycles): 3; max_deviation: 8 ;min time: 44
loop_size:1 >>>> variance(cycles): 3; max_deviation: 28 ;min time: 44
loop_size:2 >>>> variance(cycles): 3; max_deviation: 12 ;min time: 44
loop_size:3 >>>> variance(cycles): 5; max_deviation: 40 ;min time: 44
loop_size:4 >>>> variance(cycles): 4; max_deviation: 32 ;min time: 44
loop_size:5 >>>> variance(cycles): 5; max_deviation: 32 ;min time: 44
loop_size:6 >>>> variance(cycles): 6; max_deviation: 48 ;min time: 44
loop_size:7 >>>> variance(cycles): 1; max_deviation: 32 ;min time: 48
loop_size:8 >>>> variance(cycles): 4; max_deviation: 20 ;min time: 48
loop_size:9 >>>> variance(cycles): 7; max_deviation: 48 ;min time: 48
loop_size:10 >>>> variance(cycles): 5; max_deviation: 32 ;min time: 48
loop_size:11 >>>> variance(cycles): 10; max_deviation: 84 ;min time: 48
.........
.........
loop_size:994 >>>> variance(cycles): 1922; max_deviation: 1388 ;min time: 2028
loop_size:995 >>>> variance(cycles): 0; max_deviation: 0 ;min time: 2032
loop_size:996 >>>> variance(cycles): 1923; max_deviation: 1388 ;min time: 2032
loop_size:997 >>>> variance(cycles): 0; max_deviation: 0 ;min time: 2036
loop_size:998 >>>> variance(cycles): 3; max_deviation: 4 ;min time: 2036
loop_size:999 >>>> variance(cycles): 1815; max_deviation: 1348 ;min time: 2040
total number of spurious min values = 0
total variance = 2520492
absolute max deviation = 1144364
variance of variances = 17554753199565
variance of minimum values = 335594
- 1000000 runs per loop size, with most features that affect measurement turned off.
Loading hello module...
loop_size:0 >>>> variance(cycles): 809; max_deviation: 23816 ;min time: 44
loop_size:1 >>>> variance(cycles): 405; max_deviation: 19300 ;min time: 44
loop_size:2 >>>> variance(cycles): 41; max_deviation: 4992 ;min time: 44
loop_size:3 >>>> variance(cycles): 13; max_deviation: 1920 ;min time: 44
loop_size:4 >>>> variance(cycles): 6300; max_deviation: 65320 ;min time: 44
loop_size:5 >>>> variance(cycles): 378; max_deviation: 19012 ;min time: 44
loop_size:6 >>>> variance(cycles): 2512; max_deviation: 46956 ;min time: 44
loop_size:7 >>>> variance(cycles): 14308; max_deviation: 109424 ;min time: 48
loop_size:8 >>>> variance(cycles): 128449; max_deviation: 357728 ;min time: 48
loop_size:9 >>>> variance(cycles): 1696; max_deviation: 40980 ;min time: 48
loop_size:10 >>>> variance(cycles): 834; max_deviation: 22336 ;min time: 48
loop_size:11 >>>> variance(cycles): 4143; max_deviation: 63780 ;min time: 48
.........
.........
loop_size:994 >>>> variance(cycles): 914214; max_deviation: 668016 ;min time: 2028
loop_size:995 >>>> variance(cycles): 1596810; max_deviation: 728892 ;min time: 2032
loop_size:996 >>>> variance(cycles): 1775690; max_deviation: 866988 ;min time: 2032
loop_size:997 >>>> variance(cycles): 2589904; max_deviation: 984516 ;min time: 2036
loop_size:998 >>>> variance(cycles): 957907; max_deviation: 677884 ;min time: 2036
loop_size:999 >>>> variance(cycles): 1254143; max_deviation: 748936 ;min time: 2040
total number of spurious min values = 4
total variance = 2631291
absolute max deviation = 246593400
variance of variances = 17487031211352
variance of minimum values = 335929
- 1000000 runs per loop size, with most features that affect measurement turned on.
Loading hello module...
loop_size:0 >>>> variance(cycles): 2425; max_deviation: 49056 ;min time: 42
loop_size:1 >>>> variance(cycles): 20; max_deviation: 3444 ;min time: 42
loop_size:2 >>>> variance(cycles): 26; max_deviation: 2697 ;min time: 42
loop_size:3 >>>> variance(cycles): 97; max_deviation: 4395 ;min time: 42
loop_size:4 >>>> variance(cycles): 40; max_deviation: 2826 ;min time: 42
loop_size:5 >>>> variance(cycles): 1437; max_deviation: 27309 ;min time: 42
loop_size:6 >>>> variance(cycles): 30; max_deviation: 2802 ;min time: 42
loop_size:7 >>>> variance(cycles): 6; max_deviation: 2541 ;min time: 42
loop_size:8 >>>> variance(cycles): 13; max_deviation: 2433 ;min time: 45
loop_size:9 >>>> variance(cycles): 60; max_deviation: 3594 ;min time: 42
loop_size:10 >>>> variance(cycles): 35; max_deviation: 2661 ;min time: 45
loop_size:11 >>>> variance(cycles): 31; max_deviation: 3534 ;min time: 45
.........
.........
loop_size:994 >>>> variance(cycles): 32588; max_deviation: 46620 ;min time: 1935
loop_size:995 >>>> variance(cycles): 11208; max_deviation: 22932 ;min time: 1935
loop_size:996 >>>> variance(cycles): 9178; max_deviation: 15753 ;min time: 1938
loop_size:997 >>>> variance(cycles): 11525; max_deviation: 55938 ;min time: 1938
loop_size:998 >>>> variance(cycles): 62386; max_deviation: 229224 ;min time: 1941
loop_size:999 >>>> variance(cycles): 7847; max_deviation: 6255 ;min time: 1944
total number of spurious min values = 4
total variance = 103398
absolute max deviation = 1191852
variance of variances = 132114077117
variance of minimum values = 306145
Comments
- The results for 1000 and 1000000 runs per loop size are mostly the same, so it doesn't take many runs to get acceptable results. The results with every feature turned on are a little smaller than with them turned off, maybe because of multiple cores or turbo mode. I don't know whether there are problems using RDTSC on a multi-core processor, but the results look normal: the more instructions you run, the more time it takes. The measurements were made after a reboot, so the results are better. As I recall, the total number of spurious min values was about 100 before the reboot; the good news is that the deviations were less than 20. This won't affect me, since I only need relative performance, not absolute.
-
0000000000000000 <measured_loop>:
   0:   31 c0                   xor    eax,eax
   2:   85 c9                   test   ecx,ecx
   4:   74 17                   je     1d <measured_loop+0x1d>
   6:   66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
   d:   00 00 00
  10:   83 c0 01                add    eax,0x1
  13:   c7 02 01 00 00 00       mov    DWORD PTR [rdx],0x1
  19:   39 c8                   cmp    eax,ecx
  1b:   75 f3                   jne    10 <measured_loop+0x10>
  1d:   f3 c3                   repz ret
  1f:   90                      nop
- According to the disassembly, the loop body consists of add, mov, cmp and jne. The loop cost can be calculated from the output above: 4 cycles per 2 iterations after the 68th cycle with the features turned off, and 114 cycles per 65 iterations after the 112th cycle with them turned on. Both are very stable.