on xeon, does addq require 2 clock cycles?

How many cycles are required to add long ints on a xeon?

from timing trials, it appears that 2 clock ticks are needed.

/proc/cpuinfo says

model name : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz

sample c code (loop.02.c)

unsigned long i, j=0;
for(i=0; i<(0xFFFFFFFF);i++)   j+=3;

assembler code

# loop.02.c line 21: for(i=0; i<(0xFFFFFFFF);i++) j+=3;
        .loc 1 21 0
        movq    $0, -16(%rbp)
        jmp     .L2
.L3:
        addq    $3, -8(%rbp)
        addq    $1, -16(%rbp)
.L2:
        movl    $4294967294, %eax
        cmpq    %rax, -16(%rbp)
        jbe     .L3

so each iteration of the loop executes 5 instructions: addq, addq, movl, cmpq, jbe

the loop iterates 0xFFFFFFFF = 16^8 - 1 = 4294967295 times, about 4.29G

/usr/bin/time -f %e ./loop.02

yields 10.78 seconds, almost all of which is in the loop

4.29G*5 instructions / 10.78 seconds = 1.99G instructions/sec

versus the advertised clock rate of 2.80G cycles/sec

for an average of 2.8/1.99 = 1.4 cycles/instruction

that rate makes sense if, say, 2 of the 5 instructions in the loop take 2 cycles and the others take 1 cycle (7 cycles per iteration / 5 instructions = 1.4).

is this the correct interpretation of the timing results?

where is a specification of the number of cycles per instruction?

xeon
clockspeed
asked on Super User Sep 19, 2012 by boddyl • edited Jan 11, 2020 by Hennes


User contributions licensed under CC BY-SA 3.0