On the ARM Cortex-A8, does the ARM module wait or continue its operations while NEON is executing its instructions? How is this synchronization achieved, and how do the ARM and NEON cores synchronize if they are working on the same data/code segments?
The short answer is that they are automatically synchronized, in the same way that all other instructions are synchronized (i.e. by pipeline hazard checking). On processors that can issue multiple instructions per clock cycle, NEON instructions can be issued together with non-NEON instructions.
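To make that concrete, here is a minimal sketch in C using the standard `arm_neon.h` intrinsics. The function name `scale_then_sum` is hypothetical and not from the question. It stores results with NEON and then immediately reads the same buffer with ordinary scalar code, with no barrier or explicit synchronization in between. A plain C fallback is included so the sketch also builds on non-NEON targets. Note that an optimizing compiler may vectorize the "scalar" loop as well, so this illustrates the programming model rather than the exact instruction mix.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: scales buf in place (with NEON where available),
   then sums it with ordinary scalar code. */
float scale_then_sum(float *buf, size_t n, float k)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* NEON part: load four floats, multiply by k, store them back. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(buf + i);
        v = vmulq_n_f32(v, k);
        vst1q_f32(buf + i, v);
    }
#endif

    /* Scalar tail (and the whole array on non-NEON builds). */
    for (; i < n; ++i)
        buf[i] *= k;

    /* Non-NEON part: reads the values that were just stored above.
       No explicit synchronization is written here; the core orders
       the accesses itself. */
    float sum = 0.0f;
    for (i = 0; i < n; ++i)
        sum += buf[i];

    return sum;
}
```

For example, calling `scale_then_sum(buf, 6, 2.0f)` on `{1, 2, 3, 4, 5, 6}` leaves the doubled values in `buf` and returns their sum.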
NEON is part of the core, and uses the same caches as the regular load/store instructions. However, this also means that on some processors it can be inefficient to intermingle NEON and non-NEON loads and stores, or to move data between NEON and general-purpose registers.
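One common way to limit register transfers is to keep intermediate results in NEON registers and move a value to a general-purpose register only once, at the end. The sketch below assumes the standard `arm_neon.h` intrinsics; the function name `sum_u32` is hypothetical. How much this matters depends on the core (the Cortex-A8 is often cited as one where NEON-to-ARM transfers are expensive), so treat it as an illustration, not a measured optimization.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sums n 32-bit values (wrapping on overflow). */
uint32_t sum_u32(const uint32_t *p, size_t n)
{
    size_t i = 0;
    uint32_t total = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Accumulate four partial sums entirely in a NEON register. */
    uint32x4_t acc = vdupq_n_u32(0);
    for (; i + 4 <= n; i += 4)
        acc = vaddq_u32(acc, vld1q_u32(p + i));

    /* Reduce the four lanes to one while still in NEON registers. */
    uint32x2_t half = vadd_u32(vget_low_u32(acc), vget_high_u32(acc));
    half = vpadd_u32(half, half);

    /* Single transfer from a NEON register to a general-purpose one. */
    total = vget_lane_u32(half, 0);
#endif

    /* Scalar tail (and the whole array on non-NEON builds). */
    for (; i < n; ++i)
        total += p[i];

    return total;
}
```

The alternative, extracting a lane inside the loop on every iteration, would compute the same result but would move data between the two register files far more often.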
They are not separate cores: NEON is implemented as an additional execution unit within an ARM core, so the usual principles of superscalar architectures apply.