
An attempt to understand what a Clock cycle is through example

I'm trying to fully understand what a clock cycle is, so I've come up with a test example that I'd like someone to confirm or correct, ideally with a better explanation. Suppose I have this simple while loop running on a device:

while(true)
{
  int x = 5;
}

Does the statement int x = 5 get executed once per clock cycle? In other words, is the clock speed the rate at which the device is able to read and execute commands per unit of time?

kjh asked Dec 03 '22 00:12


2 Answers

A clock cycle is simply a single cycle of the oscillator that drives a processor's logic. What a processor can achieve in that cycle depends on the processor architecture and other factors such as memory speed.

The code in your example is in a high-level language and almost certainly translates to multiple machine-level instructions if translated directly. In pseudo-machine code, for example:

loop:
   MOV addrx,#5
   JMP loop

That would be two instructions, and so at least two machine cycles, per loop iteration. There is little or no deterministic relationship between high-level code and the generated machine instructions, although in this simple case it may seem so.

The issue is further complicated by how an instruction set is implemented by a processor. A typical RISC processor executes most instructions in a single cycle, while on a CISC processor individual instructions take different numbers of cycles depending on their complexity.
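To make the arithmetic concrete, here is a minimal sketch that turns per-instruction cycle counts for the two-instruction loop above into loop iterations per second. The struct, the function names and any cycle counts you feed it are illustrative assumptions, not figures for a real processor.

```c
#include <assert.h>

/* Hypothetical per-instruction cycle costs for the two-instruction loop.
   Real values depend on the processor; these are inputs, not facts. */
struct loop_cost {
    unsigned mov_cycles;  /* cycles taken by MOV addrx,#5 */
    unsigned jmp_cycles;  /* cycles taken by JMP loop */
};

/* Clock cycles needed for one pass through the loop. */
unsigned cycles_per_iteration(struct loop_cost c)
{
    return c.mov_cycles + c.jmp_cycles;
}

/* Loop iterations completed per second at the given clock frequency (Hz). */
double iterations_per_second(struct loop_cost c, double clock_hz)
{
    unsigned cycles = cycles_per_iteration(c);
    assert(cycles > 0);  /* a loop that costs zero cycles is meaningless */
    return clock_hz / cycles;
}
```

With one cycle per instruction at an assumed 100 MHz clock this gives 50 million iterations per second; with an assumed 3-cycle MOV and 2-cycle JMP the same clock gives 20 million.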

Another consideration is memory bus latency. Often a processor is capable of executing instructions faster than it is able to access memory; this is especially true of flash memory. An instruction accessing slower memory may introduce wait-states, where the processor is stalled until the data arrives.
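As a rough illustration of the effect of wait-states, the sketch below adds the stall cycles to an instruction's base cost. It assumes a simple model in which every memory access stalls for the same number of cycles; the function name and parameters are made up for this example.

```c
/* Total cycles for an instruction that needs base_cycles to execute and
   makes `accesses` memory accesses, each stalled by `wait_states` extra
   cycles. Simplified model: no caches, no prefetch, fixed stall per access. */
unsigned cycles_with_wait_states(unsigned base_cycles,
                                 unsigned accesses,
                                 unsigned wait_states)
{
    return base_cycles + accesses * wait_states;
}
```

Under this model a nominally single-cycle instruction that reads memory with two wait-states takes three cycles, so the clock rate alone says little about how fast it actually runs.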

Some processors have the ability to execute instructions in parallel, allowing multiple instructions in a single cycle. Others employ SIMD (single instruction, multiple data) instructions that can perform the same operation on different data at the same time.

Another technique that affects instruction throughput is pipelining, where an instruction may take multiple cycles but a new instruction can be started on each cycle. If, say, five four-cycle instructions are started one after the other, then once the first has completed, a result is yielded once per cycle.
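The pipelining arithmetic can be sketched as follows, assuming an ideal pipeline with no stalls or hazards (real pipelines rarely behave this cleanly). The function names are invented for this example.

```c
#include <assert.h>

/* Cycles to finish n instructions of `stages` cycles each when every
   instruction must complete before the next one starts. */
unsigned long unpipelined_cycles(unsigned long n, unsigned long stages)
{
    return n * stages;
}

/* Cycles to finish n instructions in an ideal pipeline with `stages`
   stages: the first result appears after `stages` cycles, then one more
   result appears on each following cycle. */
unsigned long pipelined_cycles(unsigned long n, unsigned long stages)
{
    assert(stages > 0);
    if (n == 0)
        return 0;
    return stages + (n - 1);
}
```

For five four-cycle instructions this gives 20 cycles without pipelining and 8 cycles with it: 4 cycles to fill the pipeline, then one result per cycle for the remaining four instructions.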

Some processors employ a Harvard architecture that uses separate buses to allow the simultaneous fetching of data and instructions.

Other techniques are employed to maintain instruction throughput such as branch prediction. A high-level language compiler will often generate code that will maximise the potential of all the techniques mentioned above.

Often a performance measure that is given for a particular architecture is MIPS/MHz - an indication of the number of instructions typically executed per clock cycle (amortized over many clock cycles). An ARM Cortex-M3 for example manages 1.25 MIPS/MHz, while a Renesas SH-4 achieves 1.8 MIPS/MHz.
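A MIPS/MHz figure converts to an estimated instruction rate by multiplying by the clock frequency. The sketch below shows that conversion; the 100 MHz clock mentioned afterwards is an assumption chosen for round numbers, not the rated speed of either part.

```c
#include <assert.h>

/* Estimated millions of instructions per second for a core rated at
   mips_per_mhz and clocked at clock_mhz. This is an average over many
   cycles, not a guarantee for any particular instruction sequence. */
double estimated_mips(double mips_per_mhz, double clock_mhz)
{
    assert(clock_mhz >= 0.0);
    return mips_per_mhz * clock_mhz;
}
```

At an assumed 100 MHz, a 1.25 MIPS/MHz core would average about 125 million instructions per second and a 1.8 MIPS/MHz core about 180 million.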

Clifford answered Dec 05 '22 14:12


Where do I begin...

Any processor has a "clock" that ensures that bits of electronics have time to transition from one state to another before the next thing happens. At the speeds of modern devices, nothing is "instantaneous" - a "step" becomes a "slope", and even a very short trace will cause delays in transmission of electrical signals.

Depending on the architecture of a CPU, it can do certain operations "in one clock cycle", while others take "multiple cycles". Think of long division - you do a series of subtract - shift operations, and you don't know what you need to do next until you have completed the previous part of the operation. For addition, it is easier to see how you could achieve a complete operation in one cycle.
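The long division analogy can be written out as shift-and-subtract (restoring) division. This is a software sketch of the idea, not how any specific CPU implements its divider; the function name and the step counter are added purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-subtract division of 32-bit unsigned integers.
   Each loop pass depends on the remainder left by the previous pass,
   which is why the work cannot all happen at once.
   Stores the remainder and the number of passes via the out-parameters. */
uint32_t shift_subtract_divide(uint32_t dividend, uint32_t divisor,
                               uint32_t *remainder, unsigned *steps)
{
    assert(divisor != 0);

    uint64_t rem = 0;        /* 64 bits so the left shift cannot overflow */
    uint32_t quotient = 0;
    unsigned count = 0;

    for (int bit = 31; bit >= 0; --bit) {
        /* Shift in the next dividend bit, most significant first. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        /* Subtract only if the divisor fits; record a 1 in the quotient. */
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
        ++count;
    }

    *remainder = (uint32_t)rem;
    *steps = count;
    return quotient;
}
```

Dividing 100 by 7 this way takes 32 dependent passes to produce a quotient of 14 and a remainder of 2, whereas an addition has no such chain of dependent decisions.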

When a particular "high level" instruction is translated into machine code, the resulting code can take one or more cycles - and a simple instruction can take one or more steps. Depending on the compiler, the target, and the optimizations chosen, any of the following could happen in your above code:

  • the compiler realizes that the "while" condition is always true, and that nothing changes inside the loop. It further realizes that you never use the value of x, and it chooses not to implement the instruction at all

  • the compiler decides to keep the int variable x in a register and initializes it once, before the loop starts. No time is taken during the execution of the loop

  • the compiler loads '5' into a register, looks up the offset of x in a table, computes a pointer, and copies the register into the offset address. Could be any number of cycles.
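If you want to rule out the first two outcomes and see an actual store on every iteration, one option in C is to declare the variable volatile, since writes to a volatile object count as observable behaviour that the compiler may not remove. The sketch below uses a bounded loop so that it terminates; the function name is invented, and volatile is not something the question's code used.

```c
/* Writes 5 to a volatile int n times and returns the final value.
   Because x is volatile, the compiler must emit every write rather
   than dropping the loop body or hoisting the store out of the loop. */
int store_n_times(unsigned n)
{
    volatile int x = 0;
    for (unsigned i = 0; i < n; ++i) {
        x = 5;  /* volatile write: performed on each iteration */
    }
    return x;
}
```

How many cycles each of those stores takes still depends on the target and its memory, so this only fixes what is executed, not how long it takes.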

Not sure this really helped you - but the question is rather complicated...

Floris answered Dec 05 '22 14:12