I would like some help understanding part of the following passage:
"The volatile keyword qualifier indicates that the variable can be changed outside of the program. For example, an external device may write data to a port. Compilers will sometimes temporarily use a cache, or register, to hold the value in a memory location for optimization purposes. If the external write modifies the memory location, then this change will not be reflected in the cached or register value."
(It comes from the book Understanding and Using C Pointers, pp. 178-179.)
The ambiguity I have is between these two phrases: "to hold the value in a memory location" and "If the external write modifies the memory location".
My problem is this: I get the impression that if an external device writes data to a port, that data is first stored in some location (???), then copied to the register/cache (??), and only then placed in the variable in the C source code. I must be misunderstanding something. From what I know, when data goes from a device to the MCU's RAM, the normal flow should be: external device -> small temporary buffer -> variable in RAM.
#define PORT 0xB0000000
unsigned int volatile * const port = (unsigned int*) PORT;
*port = 0x0BF4; // write to port
value = *port; // read from port
Memory mapped I/O devices don't go through the CPU core's registers (or cache, typically). That's why they're external, they just hang somewhere on the memory bus, pretending to be memory.
So values from such a device will appear directly in what (to the CPU) looks like memory.
In the example you gave, this:
*port = 0x0BF4; // write to port
could perhaps cause an A/D converter to start a conversion, and this
value = *port; // read from port
could read in the resulting value. This is not a very typical design (A/D converters tend to be a bit more complicated than that, and so on) but it's possible.
If a compiler thought "hey, that there is just a read from a location to which this value was written" it might replace the two statements with
value = 0x0BF4; // "optimized", but broken since no more I/O occurs
This would ruin your day, if you were trying to read values from that A/D converter.
Declaring the location volatile tells the compiler not to make any assumptions about the side effects of accesses to the location.
If you look at something like an STM32F4 ARM-based microcontroller, it has tons of memory-mapped I/O (serial ports, USB controller, Ethernet, timers, A/D and D/A converters, ... they're all there) plus a bunch of internal (to the core, but still memory-mapped) things.
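To make the "write starts a conversion, read fetches the result" idea concrete, here is a minimal sketch. The two-register A/D converter, its field names, `ADC_START`, and `fake_adc` are all hypothetical and do not match any real STM32 peripheral; an ordinary object stands in for the hardware so the code can run on a PC.

```c
#include <stdint.h>

/* Hypothetical two-register A/D converter; this layout is an assumption. */
typedef struct {
    volatile uint32_t ctrl;  /* writing ADC_START would begin a conversion */
    volatile uint32_t data;  /* the conversion result would appear here */
} adc_regs_t;

#define ADC_START 0x1u

/* On real hardware this would be a pointer to a fixed address taken from
   the datasheet. Here a plain object stands in for the peripheral. */
static adc_regs_t fake_adc;

static uint32_t adc_read(adc_regs_t *adc)
{
    adc->ctrl = ADC_START;  /* volatile store: the compiler must emit it */
    return adc->data;       /* volatile load: must be performed, not folded */
}
```

Because both fields are volatile, the compiler may not assume that the value read from `data` has anything to do with what was just written to `ctrl`.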
As others have stated, these are items that are external to the CPU core itself: it could be RAM, or it could be a memory-mapped peripheral (a UART status register, say, or a timer register, etc.).
#define SOME_STATUS_REGA (*((volatile unsigned int *)0x10008000))
void fun ( void )
{
    while(SOME_STATUS_REGA==0) continue;
}
#define SOME_STATUS_REGB (*((unsigned int *)0x10008000))
void more_fun ( void )
{
    while(SOME_STATUS_REGB==0) continue;
}
With one target and toolchain, this produces:
00000000 <fun>:
0: e59f200c ldr r2, [pc, #12] ; 14 <fun+0x14>
4: e5923000 ldr r3, [r2]
8: e3530000 cmp r3, #0
c: 0afffffc beq 4 <fun+0x4>
10: e12fff1e bx lr
14: 10008000 andne r8, r0, r0
00000018 <more_fun>:
18: e59f300c ldr r3, [pc, #12] ; 2c <more_fun+0x14>
1c: e5933000 ldr r3, [r3]
20: e3530000 cmp r3, #0
24: 112fff1e bxne lr
28: eafffffe b 28 <more_fun+0x10>
2c: 10008000 andne r8, r0, r0
You can see that in the more_fun (non-volatile) case it reads the location one time, does the comparison one time, and then, if the value was zero, goes into an infinite loop. The compiler has done what we told it to do: as far as it can tell there is no way that variable can change, so there is no reason to burn clock cycles re-reading something that won't change. If the first and only read returned zero, the compiler assumes it will always be zero, and the code spins forever.
If you make it volatile you are "asking" the compiler to read or write it every time your code accesses it. You can see this in the fun case: it goes back every time through the loop to read that address and see whether it has changed. The volatile keyword is what made the difference between these two behaviors.
It doesn't have to be hardware that changes these values. If you use a global variable to communicate between an ISR and foreground code, then that variable in memory can be changed by the ISR and/or by the foreground code, so both need to treat it as volatile.
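The ISR case can be imitated on a hosted system with a standard C signal handler playing the role of the interrupt. This is only a sketch under that assumption: `event_seen`, `on_signal`, and `wait_for_event` are made-up names, and a real ISR would be installed through the vector table rather than `signal()`.

```c
#include <signal.h>

/* Shared between the handler (the stand-in "ISR") and foreground code. */
static volatile sig_atomic_t event_seen = 0;

static void on_signal(int signo)
{
    (void)signo;
    event_seen = 1;  /* the "ISR" side of the handshake */
}

/* Returns 1 once the handler has run, or 0 if the setup failed. */
static int wait_for_event(void)
{
    event_seen = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return 0;
    if (raise(SIGINT) != 0)      /* simulate the interrupt firing */
        return 0;
    while (event_seen == 0)      /* volatile: re-read on every pass */
        continue;
    signal(SIGINT, SIG_DFL);     /* restore the default handler */
    return 1;
}
```

Without the volatile qualifier, the compiler would be free to read `event_seen` once before the loop, just as in the more_fun listing.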
You also have the case of a multicore/multithreaded processor where each core/thread independently has access to shared resources. Not only do you need volatile in that situation, but you might also need that RAM to be uncached if the cores do not share the same cache, and you may need hardware and/or software locking if atomic operations are needed (ldrex/strex in the ARM world are the first step for that).
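Note that volatile only forces each access to happen; it does not make a read-modify-write indivisible. One language-level option, assuming a C11 toolchain, is `<stdatomic.h>`; on ARMv7 targets compilers generally build these operations out of ldrex/strex. The names `shared_counter`, `bump_counter`, and `read_counter` below are hypothetical, and this is a sketch rather than a full locking scheme.

```c
#include <stdatomic.h>

/* Shared counter; static storage, so it starts at zero. */
static atomic_uint shared_counter;

/* Atomically adds one and returns the value seen just before the add. */
static unsigned bump_counter(void)
{
    return atomic_fetch_add(&shared_counter, 1u);
}

/* Atomically reads the current value. */
static unsigned read_counter(void)
{
    return atomic_load(&shared_counter);
}
```

Two threads calling `bump_counter()` at the same time cannot lose an increment, which is not guaranteed for `counter++` on a merely volatile variable.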
EDIT
Another demonstration: the problem is not only with reads, but with writes as well. Let's say you have a peripheral where you write a config register to set up some mode, then write it again to enable it with that mode. Or you have a hardware interface where each write increments some logic pointer, and you do a series of writes to do something.
#define SOMETHING1 (*((volatile unsigned char *)0x10002000))
void fun ( void )
{
    SOMETHING1=5;
    SOMETHING1=5;
    SOMETHING1=6;
}
#define SOMETHING2 (*((unsigned char *)0x10002000))
void more_fun ( void )
{
    SOMETHING2=5;
    SOMETHING2=5;
    SOMETHING2=6;
}
Without volatile, that peripheral is not going to operate properly. The earlier writes to the same pointer/address are treated as dead stores and optimized out, so only the final write remains.
00000000 <fun>:
0: e3a02005 mov r2, #5
4: e3a01006 mov r1, #6
8: e59f300c ldr r3, [pc, #12] ; 1c <fun+0x1c>
c: e5c32000 strb r2, [r3]
10: e5c32000 strb r2, [r3]
14: e5c31000 strb r1, [r3]
18: e12fff1e bx lr
1c: 10002000 andne r2, r0, r0
00000020 <more_fun>:
20: e3a02006 mov r2, #6
24: e59f3004 ldr r3, [pc, #4] ; 30 <more_fun+0x10>
28: e5c32000 strb r2, [r3]
2c: e12fff1e bx lr
30: 10002000 andne r2, r0, r0
EDIT2
Clang/LLVM demonstrates the problem as well:
#define A (*((volatile unsigned char *)0x10002000))
void afun ( void )
{
    A = 4;
    A = 5;
    A = 6;
    A |= 1;
    while(A==0) continue;
}
#define B (*((unsigned char *)0x10002000))
void bfun ( void )
{
    B = 4;
    B = 5;
    B = 6;
    B |= 1;
    while(B==0) continue;
}
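Producing the following. In bfun, the three stores and the |= collapse into a single store of 7, and the loop disappears because the compiler knows 7 is not zero: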
00000000 <afun>:
0: e3a00a02 mov r0, #8192 ; 0x2000
4: e3a01004 mov r1, #4
8: e3800201 orr r0, r0, #268435456 ; 0x10000000
c: e5c01000 strb r1, [r0]
10: e3a01005 mov r1, #5
14: e5c01000 strb r1, [r0]
18: e3a01006 mov r1, #6
1c: e5c01000 strb r1, [r0]
20: e5d01000 ldrb r1, [r0]
24: e3811001 orr r1, r1, #1
28: e5c01000 strb r1, [r0]
2c: e5d01000 ldrb r1, [r0]
30: e3510000 cmp r1, #0
34: 0afffffc beq 2c <afun+0x2c>
38: e12fff1e bx lr
0000003c <bfun>:
3c: e3a00a02 mov r0, #8192 ; 0x2000
40: e3a01007 mov r1, #7
44: e3800201 orr r0, r0, #268435456 ; 0x10000000
48: e5c01000 strb r1, [r0]
4c: e12fff1e bx lr
Leaving the volatile off won't hurt you if you are doing one-off accesses that are not in a domain where the optimizer can remove them (a single write to each register in some sequence, or a single read of a register, with "single" also implying no loops). It will most definitely hurt you if you are doing more than one write to the same location (which often happens when configuring a peripheral) or doing a read-modify-write (x |= something, y &= something, z ^= something, etc.).
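One way to keep read-modify-write accesses honest is to route them through small helpers that take a pointer to volatile. This is a sketch with hypothetical helper names (`reg_set_bits`, `reg_clear_bits`); it is not part of any vendor library.

```c
#include <stdint.h>

/* Set the bits in mask: exactly one volatile load and one volatile store. */
static void reg_set_bits(volatile uint8_t *reg, uint8_t mask)
{
    uint8_t v = *reg;        /* volatile load */
    v |= mask;
    *reg = v;                /* volatile store */
}

/* Clear the bits in mask, again with one load and one store. */
static void reg_clear_bits(volatile uint8_t *reg, uint8_t mask)
{
    uint8_t v = *reg;        /* volatile load */
    v &= (uint8_t)~mask;
    *reg = v;                /* volatile store */
}
```

On a target like the one above, a call might look like `reg_set_bits((volatile uint8_t *)0x10002000, 0x01);`. The sequence is still not atomic, so an interrupt that touches the same register between the load and the store can still interfere.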
If you are using a toolchain that doesn't have an optimizer, or you choose not to optimize, you won't have this problem. But code that leaves the volatiles off is not portable, and you will eventually run into trouble if you don't habitually take care with variables/code that cross compile or other similar domains (hardware is a separate compile domain from software).