Timing resolution when using __rdtsc, __cpuid and __rdtscp to measure performance in Python
Descriptions
- The measurement method follows Intel's white paper "How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures".
Downloads
Results
- The results below come from timing an empty loop.
Loading hello module...
loop_size: 0 >>>> variance(cycles): 9875; max_deviation: 4248 ;min time: 1149
loop_size: 1 >>>> variance(cycles): 15458; max_deviation: 4950 ;min time: 1566
loop_size: 2 >>>> variance(cycles): 20495; max_deviation: 5643 ;min time: 1734
loop_size: 3 >>>> variance(cycles): 15559; max_deviation: 4608 ;min time: 1788
loop_size: 4 >>>> variance(cycles): 19601; max_deviation: 5550 ;min time: 1830
loop_size: 5 >>>> variance(cycles): 19101; max_deviation: 4878 ;min time: 1887
loop_size: 6 >>>> variance(cycles): 53959; max_deviation: 18093 ;min time: 1899
loop_size: 7 >>>> variance(cycles): 19669; max_deviation: 5022 ;min time: 1956
loop_size: 8 >>>> variance(cycles): 12512; max_deviation: 4458 ;min time: 1989
loop_size: 9 >>>> variance(cycles): 20429; max_deviation: 4977 ;min time: 2070
loop_size: 10 >>>> variance(cycles): 16398; max_deviation: 5097 ;min time: 2082
loop_size: 11 >>>> variance(cycles): 23348; max_deviation: 5352 ;min time: 2142
.........
.........
loop_size: 994 >>>> variance(cycles): 61068431; max_deviation: 527208 ;min time: 73176
loop_size: 995 >>>> variance(cycles): 55047806; max_deviation: 541896 ;min time: 73107
loop_size: 996 >>>> variance(cycles): 51354276; max_deviation: 527874 ;min time: 73275
loop_size: 997 >>>> variance(cycles): 39929555; max_deviation: 532137 ;min time: 73317
loop_size: 998 >>>> variance(cycles): 107756539; max_deviation: 718764 ;min time: 73431
loop_size: 999 >>>> variance(cycles): 104747306; max_deviation: 720498 ;min time: 73419
total number of spurious min values = 167
total variance = 12669681
absolute max deviation = 720498
variance of variances = 487873495499190
variance of minimum values = 473095955
minimum value = 1149
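The summary lines at the end of the log can be reproduced from the raw samples with a few lines of Python. This is only a sketch: the function name `summarize` and the dictionary keys are made up for illustration, and it assumes population variance, max_deviation = max - min per loop size, and a "spurious" minimum meaning a loop size whose minimum is lower than that of the previous, shorter loop. The actual module may compute these differently.

```python
from statistics import pvariance

def summarize(times):
    """times[j] is the list of cycle counts measured for loop size j.

    Hypothetical helper; the definitions below are assumptions.
    """
    variances, min_values = [], []
    spurious = 0
    max_dev_all = 0
    prev_min = None
    for samples in times:
        lo, hi = min(samples), max(samples)
        # A longer loop that measured faster than the previous one is "spurious".
        if prev_min is not None and lo < prev_min:
            spurious += 1
        max_dev_all = max(max_dev_all, hi - lo)
        variances.append(pvariance(samples))
        min_values.append(lo)
        prev_min = lo
    return {
        "spurious_min_values": spurious,
        "total_variance": sum(variances) / len(variances),
        "absolute_max_deviation": max_dev_all,
        "variance_of_variances": pvariance(variances),
        "variance_of_minimum_values": pvariance(min_values),
        "minimum_value": min(min_values),
    }
```

A low count of spurious minimum values and a small variance of variances would indicate a stable measurement; the log above shows neither.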
Comments
- The values are not stable. Sometimes a long loop even measures faster than a short one. Overall, only the digits from the hundreds place upward are trustworthy; the lower digits should be ignored. My computer runs at 3484046235 cycles per second and QueryPerformanceCounter ticks 3404332 times per second, so one QueryPerformanceCounter tick corresponds to about 1023 cycles. If we look only at the trustworthy digits, there is not much difference in precision between the two. Because QueryPerformanceCounter has a fixed time interval between ticks, it is the better choice.
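The comparison can be checked with a little arithmetic, and the same empty-loop test can be tried without a C extension. In the sketch below, the two frequencies are the ones quoted in this post; the names `TSC_HZ`, `QPC_HZ` and `time_empty_loop` are made up for illustration. It relies on `time.perf_counter_ns`, which on Windows is documented as being based on QueryPerformanceCounter; on other systems it uses a different clock, so the numbers are not comparable there.

```python
import time

TSC_HZ = 3_484_046_235  # cycles per second, as measured in this post
QPC_HZ = 3_404_332      # QueryPerformanceCounter ticks per second, from this post

# How many CPU cycles pass during a single QueryPerformanceCounter tick.
cycles_per_tick = TSC_HZ / QPC_HZ

def time_empty_loop(loop_size, repeats=1000):
    """Return the minimum elapsed time in nanoseconds over `repeats` runs."""
    best = None
    for _run in range(repeats):
        start = time.perf_counter_ns()
        for _i in range(loop_size):
            pass
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

if __name__ == "__main__":
    print(f"cycles per QPC tick: {cycles_per_tick:.1f}")
    for size in (0, 10, 100, 1000):
        print(size, time_empty_loop(size), "ns")
```

Taking the minimum over many runs mirrors the "min time" column in the results, since the minimum is the value least affected by interrupts and scheduling.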
Resources
Reviews
- The SDL library provides functions built on QueryPerformanceCounter (SDL_GetPerformanceCounter and SDL_GetPerformanceFrequency), so you can use those directly.