I’ve been reading the Programming Pearls for the last few weeks. It is a very interesting and inspirational book. In the appendix, I found a small C speed benchmark program. I changed the source code (casting the pointer to right type) so that it runs on my 64 bit system. The test results is surprising but very logical. Modern CPU runs much faster on division and mod operation. Overall, my 2.8G Core 2 Due takes 13 seconds to terminate while the 1999 PII CPU takes 513 seconds.
Testing result on a 2.8Ghz Intel Core 2 Duo CPU (Date: Feb 23 2010)
C Time Cost Model, n=5000
Integer Arithmetic (n=5000) nanosecond per instruction
{} 66372 66265 66226 66217 66212 3
k++ 61226 61101 61105 61089 61112 2
k = i + j 66915 66618 66593 66589 66378 3
k = i - j 66905 66215 66413 66270 67377 3
k = i * j 66908 66768 66696 66800 66713 3
k = i / j 84009 83901 83915 83896 83916 3 division and mod operation cost the same amount of time as addition and subtraction!
k = i % j 87329 87328 87340 87317 87335 3
k = i & j 66861 66588 66563 66471 66507 3
k = i | j 67665 67547 67562 67493 67586 3
Floating Point Arithmetic (n=5000)
fj=j; 61255 61130 61096 61151 61094 2
fj=j; fk = fi + fj; 107793107778107782107780107777 4
fj=j; fk = fi - fj; 107816107781107784107822107783 4
fj=j; fk = fi * fj; 116801116769116760116763116767 5
fj=j; fk = fi / fj; 197412197467197431197405197428 8
Array Operations (n=5000)
k = i + j 68272 68202 68198 68208 68196 3
k = x[i] + j 61465 61294 61280 61305 61280 2
k = i + x[j] 62285 62135 62112 62137 62108 2
k = x[i] + x[j] 67986 67857 67861 67854 67871 3
Comparisons (n=5000)
if (i < j) k++ 67828 67620 67623 67623 67608 3
if (x[i] < x[j]) k++ 68377 68288 68286 68280 68276 3
Array Comparisons and Swaps (n=5000)
k = (x[i]<x[k]) ? -1:1 80980 80932 80932 80928 80931 3
k = intcmp(x+i, x+j) 142630142660142621142593142599 6
swapmac(i, j) 108802108758108734108766108734 4
swapfunc(i, j) 189412189421189412189460189490 8
Max Function, Macro and Inline (n=5000)
k = (i > j) ? i : j 75729 75485 75551 75480 75543 3
k = maxmac(i, j) 75842 75592 75522 75535 75518 3
k = maxfunc(i, j) 143934143780143783143748143777 6
Math Functions (n=1000)
fk = j+fi; 2733 2736 2739 2734 2734 3
k = rand(); 12512 12529 12557 12476 12485 13
fk = sqrt(j+fi) 11497 11496 11506 11503 11500 12
fk = sin(j+fi) 46135 46116 46092 46128 46106 46
fk = sinh(j+fi) 11829 11819 11824 11837 11832 12
fk = asin(j+fi) 6392 6387 6390 6386 6387 6
fk = cos(j+fi) 46199 46183 46177 46174 46179 46
fk = tan(j+fi) 55542 55496 55507 55488 55501 56
Memory Allocation (n=500)
free(malloc(16)); 13106 13017 12932 12779 12959 52
free(malloc(100)); 12972 12947 12867 13016 12923 52
free(malloc(2000)); 11653 11683 11536 11673 11620 47
Secs: 13.33
Testing result of a PII CPU in 1999
C Time Cost Model, n=5000
Integer Arithmetic (n=5000)
{} 1263 1313 1263 1263 1263 5
k++ 2011 2011 2010 2011 2011 8
k = i + j 2260 2260 2260 2260 2260 9
k = i - j 2244 2260 2259 2260 2260 9
k = i * j 4023 4005 4023 4022 4023 16
k = i / j 13814 13814 13980 13914 14163 56
k = i % j 13814 13814 13814 13797 13797 55
k = i & j 2260 2277 2260 2260 2260 9
k = i | j 2260 2260 2260 2260 2260 9
Floating Point Arithmetic (n=5000)
fj=j; 5019 5003 5021 4970 5036 20
fj=j; fk = fi + fj; 5021 5019 5019 5021 5019 20
fj=j; fk = fi - fj; 5019 5021 5036 5019 5021 20
fj=j; fk = fi * fj; 5036 5021 5019 5019 5038 20
fj=j; fk = fi / fj; 12549 12551 12551 12551 12567 50
Array Operations (n=5000)
k = i + j 2260 2260 2260 2260 2260 9
k = x[i] + j 2808 2810 2775 2808 2810 11
k = i + x[j] 2908 3341 2893 2875 2860 12
k = x[i] + x[j] 3490 3507 3490 3490 3490 14
Comparisons (n=5000)
if (i < j) k++ 2393 2393 2377 2393 2393 10
if (x[i] < x[j]) k++ 2926 2924 2926 2908 2926 12
Array Comparisons and Swaps (n=5000)
k = (x[i]<x[k]) ? -1:1 3042 3057 3059 3059 3057 12
k = intcmp(x+i, x+j) 7015 7015 7015 7015 7016 28
swapmac(i, j) 15077 15110 15077 15110 15094 60
swapfunc(i, j) 8477 8479 8477 8494 8495 34
Max Function, Macro and Inline (n=5000)
k = (i > j) ? i : j 2509 2509 2509 2526 2511 10
k = maxmac(i, j) 2509 2509 2526 2509 2509 10
k = maxfunc(i, j) 5536 5519 5536 5519 5534 22
Math Functions (n=1000)
fk = j+fi; 151 149 166 166 149 16
k = rand(); 715 680 698 682 697 69
fk = sqrt(j+fi) 2360 2343 2360 2343 2343 235
fk = sin(j+fi) 2743 2742 2742 2743 2758 275
fk = sinh(j+fi) 9209 9226 9226 9242 9226 923
fk = asin(j+fi) 7747 7730 7728 7714 7714 773
fk = cos(j+fi) 2758 2758 2760 2758 2760 276
fk = tan(j+fi) 3256 3258 3258 3258 3258 326
Memory Allocation (n=500)
free(malloc(16)); 1113 1113 1113 1113 1097 444
free(malloc(100)); 1296 1296 1296 1296 1296 519
free(malloc(2000)); 1296 1279 1298 1279 1296 516
Secs: 513.37