
Thursday, March 26, 2015

Timer resolution when using __rdtsc, __cpuid and __rdtscp to measure performance in Python

Descriptions

  • This test follows the method described in the Intel white paper How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures.

Downloads

Results

  • The results below come from timing an empty loop.
    Loading hello module...
    loop_size: 0 >>>> variance(cycles): 9875; max_deviation: 4248 ;min time: 1149
    loop_size: 1 >>>> variance(cycles): 15458; max_deviation: 4950 ;min time: 1566
    loop_size: 2 >>>> variance(cycles): 20495; max_deviation: 5643 ;min time: 1734
    loop_size: 3 >>>> variance(cycles): 15559; max_deviation: 4608 ;min time: 1788
    loop_size: 4 >>>> variance(cycles): 19601; max_deviation: 5550 ;min time: 1830
    loop_size: 5 >>>> variance(cycles): 19101; max_deviation: 4878 ;min time: 1887
    loop_size: 6 >>>> variance(cycles): 53959; max_deviation: 18093 ;min time: 1899
    loop_size: 7 >>>> variance(cycles): 19669; max_deviation: 5022 ;min time: 1956
    loop_size: 8 >>>> variance(cycles): 12512; max_deviation: 4458 ;min time: 1989
    loop_size: 9 >>>> variance(cycles): 20429; max_deviation: 4977 ;min time: 2070
    loop_size: 10 >>>> variance(cycles): 16398; max_deviation: 5097 ;min time: 2082
    loop_size: 11 >>>> variance(cycles): 23348; max_deviation: 5352 ;min time: 2142
    .........
    .........
    loop_size: 994 >>>> variance(cycles): 61068431; max_deviation: 527208 ;min time: 73176
    loop_size: 995 >>>> variance(cycles): 55047806; max_deviation: 541896 ;min time: 73107
    loop_size: 996 >>>> variance(cycles): 51354276; max_deviation: 527874 ;min time: 73275
    loop_size: 997 >>>> variance(cycles): 39929555; max_deviation: 532137 ;min time: 73317
    loop_size: 998 >>>> variance(cycles): 107756539; max_deviation: 718764 ;min time: 73431
    loop_size: 999 >>>> variance(cycles): 104747306; max_deviation: 720498 ;min time: 73419

    total number of spurious min values = 167
    total variance = 12669681
    absolute max deviation = 720498
    variance of variances = 487873495499190
    variance of minimum values = 473095955
    minimum value = 1149
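
  • For readers who want to reproduce the summary lines, here is a minimal sketch of how such statistics could be computed from raw samples. It assumes the definitions used in the Intel white paper: population variance per loop size, "total variance" as the mean of the per-loop variances, and a "spurious" minimum counted whenever a loop size has a smaller minimum than the previous one. The function name summarize and the dictionary keys are hypothetical, not taken from the original module.

```python
from statistics import pvariance


def summarize(times):
    """Summarize cycle counts; times[j] holds the samples for loop_size j.

    Assumption: definitions follow the Intel white paper (population
    variance, total variance = mean of the per-loop variances).
    """
    min_values = [min(row) for row in times]
    variances = [pvariance(row) for row in times]
    max_devs = [max(row) - min(row) for row in times]
    # A minimum is "spurious" when a longer loop looks faster than the previous one.
    spurious = sum(
        1 for prev, cur in zip(min_values, min_values[1:]) if prev > cur
    )
    return {
        "spurious_min_values": spurious,
        "total_variance": sum(variances) / len(variances),
        "absolute_max_deviation": max(max_devs),
        "variance_of_variances": pvariance(variances),
        "variance_of_minimum_values": pvariance(min_values),
        "minimum_value": min(min_values),
    }


if __name__ == "__main__":
    # Made-up sample data, only to show the call; not real measurements.
    sample = [[10, 12, 14], [9, 9, 9], [20, 22, 21]]
    for key, value in summarize(sample).items():
        print(key, "=", value)
```

    In the output above, loop_size 995 (min time 73107) after loop_size 994 (min time 73176) would be one of the spurious minimums under this definition.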

Comments

  • The measured values are not stable. Sometimes a longer loop appears to run faster than a shorter one (the "spurious min values" counted above). Overall, only the digits above the hundreds place can be trusted; the lower digits should be ignored. My computer runs at 3484046235 cycles per second and QueryPerformanceCounter ticks 3404332 times per second, so one QueryPerformanceCounter tick spans about 1023 cycles. If we look only at the trustworthy digits, the two methods therefore offer nearly the same precision. Because QueryPerformanceCounter ticks at a fixed time interval, it is the better choice.
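
  • The comparison can be checked with a few lines of arithmetic. The two frequencies are the ones quoted in this post; the 1000-cycle granularity is my reading of "digits above the hundreds place" and is an assumption, as are the variable names.

```python
TSC_HZ = 3484046235   # CPU cycles per second (from this post)
QPC_HZ = 3404332      # QueryPerformanceCounter ticks per second (from this post)
TRUSTED_CYCLES = 1000  # assumption: anything below the thousands digit is noise

# Duration of one raw TSC cycle, in nanoseconds.
tsc_tick_ns = 1e9 / TSC_HZ
# Effective TSC granularity once the noisy low digits are discarded.
tsc_trusted_ns = TRUSTED_CYCLES * 1e9 / TSC_HZ
# Duration of one QueryPerformanceCounter tick, in nanoseconds.
qpc_tick_ns = 1e9 / QPC_HZ
# How many CPU cycles elapse during one QueryPerformanceCounter tick.
cycles_per_qpc_tick = TSC_HZ / QPC_HZ

print(f"raw TSC cycle:         {tsc_tick_ns:.3f} ns")
print(f"trusted TSC step:      {tsc_trusted_ns:.1f} ns")
print(f"QPC tick:              {qpc_tick_ns:.1f} ns")
print(f"cycles per QPC tick:   {cycles_per_qpc_tick:.1f}")
```

    Both effective step sizes come out just under 300 ns, which is why the extra raw resolution of the cycle counter does not help much here.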

Resources

Reviews

  • The SDL library provides functions that wrap QueryPerformanceCounter on Windows (SDL_GetPerformanceCounter and SDL_GetPerformanceFrequency), so you can use them directly.
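
  • From pure Python, a similar measurement can be sketched with the standard library alone. On Windows, CPython's time.perf_counter is built on QueryPerformanceCounter; on other systems it uses a different clock, so the reported resolution will vary. The helper name time_empty_loop and the repeat count are hypothetical choices for illustration, and perf_counter_ns requires Python 3.7 or later.

```python
import time


def time_empty_loop(loop_size, repeats=1000):
    """Return the minimum elapsed time (ns) over `repeats` runs of an empty loop."""
    best = None
    for _run in range(repeats):
        start = time.perf_counter_ns()
        for _i in range(loop_size):
            pass
        elapsed = time.perf_counter_ns() - start
        # Keep the minimum, since it is the least disturbed measurement.
        if best is None or elapsed < best:
            best = elapsed
    return best


# Report which OS clock backs perf_counter and its advertised resolution.
info = time.get_clock_info("perf_counter")
print("implementation:", info.implementation, "resolution:", info.resolution)

for n in (0, 10, 100, 1000):
    print("loop_size:", n, "min time (ns):", time_empty_loop(n))
```

    The numbers include Python interpreter overhead, so they are not comparable to the cycle counts printed by the extension module above.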
