Thursday, March 26, 2015

An example of using __rdtsc, __cpuid and __rdtscp to measure performance

Descriptions

  • This example follows the method described in Intel's white paper How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures.
  • __rdtsc, __cpuid and __rdtscp are built into the compiler; such functions are called intrinsic functions, or intrinsics. An intrinsic is often faster than the equivalent inline assembly. I have to use them here because the Microsoft compiler does not support inline assembly when targeting x64.
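
As a rough illustration of how the three intrinsics fit together, here is a minimal user-mode sketch of the measurement sequence from the white paper: serialize with CPUID, read the counter with RDTSC, run the code, read the counter again with RDTSCP, then serialize once more. It is not the code from the download. The names serialize, measure_cycles, empty_work and cpuid_sink are hypothetical, and the non-Microsoft branch is only there so the sketch also builds with GCC or Clang, whose __cpuid macro takes different arguments.

```c
#include <stdint.h>

#ifdef _MSC_VER
#include <intrin.h>      /* __cpuid, __rdtsc, __rdtscp */
#else
#include <cpuid.h>       /* GCC/Clang __cpuid(level, a, b, c, d) macro */
#include <x86intrin.h>   /* __rdtsc, __rdtscp */
#endif

/* Hypothetical sink: keeps the compiler from discarding the CPUID results. */
volatile unsigned int cpuid_sink;

/* Execute CPUID, used here only as a serializing instruction. */
void serialize(void)
{
#ifdef _MSC_VER
    int regs[4];
    __cpuid(regs, 0);
    cpuid_sink = (unsigned int)regs[0];
#else
    unsigned int a, b, c, d;
    __cpuid(0, a, b, c, d);
    cpuid_sink = a ^ b ^ c ^ d;
#endif
}

/* Hypothetical placeholder for the code under test. */
void empty_work(void)
{
}

/* Return the number of TSC ticks spent in fn(), including the
   overhead of the timing sequence itself. */
uint64_t measure_cycles(void (*fn)(void))
{
    unsigned int aux;      /* receives IA32_TSC_AUX; not used here */
    uint64_t start, end;

    serialize();           /* earlier instructions finish before the first read */
    start = __rdtsc();
    fn();
    end = __rdtscp(&aux);  /* waits for fn() to finish before reading */
    serialize();           /* later instructions cannot start before the read */

    return end - start;
}
```

Calling measure_cycles(empty_work) gives the overhead of the timing sequence itself, which is what the "min time" column in the results reports. This sketch does not disable interrupts or preemption, so occasional large outliers are to be expected.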

Downloads

Results

  • Loading hello module...
    loop_size:0 >>>> variance(cycles): 3; max_deviation: 4 ;min time: 44
    loop_size:1 >>>> variance(cycles): 32; max_deviation: 1712 ;min time: 44
    loop_size:2 >>>> variance(cycles): 3; max_deviation: 200 ;min time: 44
    .........
    .........
    loop_size:997 >>>> variance(cycles): 2; max_deviation: 4 ;min time: 44
    loop_size:998 >>>> variance(cycles): 3; max_deviation: 4 ;min time: 44
    loop_size:999 >>>> variance(cycles): 48; max_deviation: 2128 ;min time: 44

    total number of spurious min values = 0
    total variance = 48316
    absolute max deviation = 1537444
    variance of variances = 740609432999
    variance of minimum values = 0
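
For readers who want to reproduce the per-loop figures, the sketch below shows how I understand them to be computed in the white paper: the variance is the population variance in integer arithmetic, max_deviation is the largest sample minus the smallest, and min time is the smallest sample. This is my reading rather than the code from the download; sample_stats, variance_u64 and summarize are hypothetical names, and the overflow checks are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three per-loop figures. */
typedef struct {
    uint64_t variance;       /* population variance, in cycles^2 */
    uint64_t max_deviation;  /* max sample - min sample */
    uint64_t min_time;       /* smallest sample */
} sample_stats;

/* Integer population variance: (n * sum(x^2) - (sum x)^2) / n^2.
   Assumes n >= 1 and that the sums do not overflow 64 bits. */
uint64_t variance_u64(const uint64_t *x, size_t n)
{
    uint64_t sum = 0, sum_sq = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += x[i];
        sum_sq += x[i] * x[i];
    }
    return ((uint64_t)n * sum_sq - sum * sum) / ((uint64_t)n * (uint64_t)n);
}

/* Summarize the n timing samples of one loop size. Assumes n >= 1. */
sample_stats summarize(const uint64_t *x, size_t n)
{
    sample_stats s;
    uint64_t min = x[0], max = x[0];
    size_t i;

    for (i = 1; i < n; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
    }
    s.variance = variance_u64(x, n);
    s.max_deviation = max - min;
    s.min_time = min;
    return s;
}
```

For example, the made-up samples 44, 44, 48, 44 give a variance of 3, a max_deviation of 4 and a min time of 44. Applying variance_u64 to the per-loop variances, or to the per-loop minimum values, gives the "variance of variances" and "variance of minimum values" totals.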

Comments

  • The results are the same as those of the version that uses inline assembly.

Resources
