union test {
    char a; // 1 byte
    int b;  // 4 bytes
};

int main() {
    test t;
    t.a = 5;
    return t.b;
}
This link says: https://en.cppreference.com/w/cpp/language/union
It's undefined behavior to read from the member of the union that wasn't most recently written.
According to this, does my sample code above have UB? If so, what is the point of a union? I thought the whole point was to read and write different value types from the same memory location.
If I need to access the most recently written value, then I will just use a regular variable and not a union.
Yes, the behaviour is undefined in C++.
When you write a value to a member of a union, think of that member becoming the active member.
The behaviour of reading any member of a union that is not the active member is undefined.
In C++, a union is often coupled with another variable that serves as a means of identifying the active member.
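As a minimal sketch of that pairing (the names Value, Kind and to_int are hypothetical, chosen only for illustration), a discriminator variable can record which member was written last, so that only the active member is ever read:

```cpp
#include <cassert>

// Hypothetical tagged union: `kind` records which member is active.
struct Value {
    enum class Kind { Char, Int };

    explicit Value(char v) : kind(Kind::Char), c(v) {}  // c becomes active
    explicit Value(int v) : kind(Kind::Int), i(v) {}    // i becomes active

    // Reads only the active member, chosen by inspecting the tag.
    int to_int() const {
        assert(kind == Kind::Char || kind == Kind::Int);
        return kind == Kind::Int ? i : static_cast<int>(c);
    }

    Kind kind;
    union {
        char c;
        int i;
    };
};
```

With this layout, Value('A').to_int() reads c and Value(7).to_int() reads i; neither call touches an inactive member.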
Your implication that unions are useless if you cannot read their inactive members is wrong. Consider the following simplified implementation of a string class:
class string {
    char* data_;
    size_t size_;
    union {
        size_t capacity_;
        char buffer_[16];
    };
    string(const char* str) : size_(strlen(str)) {
        if (size_ < 16)
            data_ = buffer_; // short string, buffer_ will be active
        else {
            capacity_ = size_; // long string, capacity_ is active
            data_ = new char[capacity_ + 1];
        }
        memcpy(data_, str, size_ + 1);
    }
    bool is_short() const { return data_ == buffer_; }
    ...
public:
    size_t capacity() const { return is_short() ? 15 : capacity_; }
    const char* data() const { return data_; }
    ...
};
Here, if the stored string has fewer than 16 characters, it is stored in buffer_ and data_ points to it. Otherwise, data_ points to a dynamically allocated buffer.
Consequently, you can distinguish between the two cases (short/long string) by comparing data_ with buffer_. When the string is short, buffer_ is active and you don't need to read capacity_, since you know it is 15. When the string is long, capacity_ is active and you don't need to read buffer_, since it is unused.
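For anyone who wants to try the idea out, here is a compilable sketch of the same scheme. The class name small_string, the size() accessor, the destructor and the deleted copy operations are my additions for the sake of a complete example; they are assumptions, not part of the simplified class above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, minimal small-buffer string; not a std::string replacement.
class small_string {
    char* data_;
    std::size_t size_;
    union {
        std::size_t capacity_;  // used only for long strings
        char buffer_[16];       // used only for short strings
    };

    // Compares addresses only; no union member value is read here.
    bool is_short() const { return data_ == buffer_; }

public:
    explicit small_string(const char* str) {
        assert(str != nullptr);
        size_ = std::strlen(str);
        if (size_ < 16) {
            data_ = buffer_;                  // short: characters live in buffer_
        } else {
            capacity_ = size_;                // long: capacity_ is active
            data_ = new char[capacity_ + 1];
        }
        std::memcpy(data_, str, size_ + 1);   // copies the terminating '\0' too
    }

    ~small_string() {
        if (!is_short()) delete[] data_;      // only long strings own heap memory
    }

    // Copying is disabled to keep the sketch short.
    small_string(const small_string&) = delete;
    small_string& operator=(const small_string&) = delete;

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return is_short() ? 15 : capacity_; }
    const char* data() const { return data_; }
};
```

Constructing small_string("hello") keeps the characters in buffer_ and reports a capacity of 15, while a string of 16 or more characters goes to the heap and reports the value stored in capacity_. In both cases only the member that was written is read back.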
Exactly this approach is used in libstdc++. It is a bit more complicated there, since std::string is just a specialization of the std::basic_string class template, but the idea is the same. Source code from include/bits/basic_string.h:
enum { _S_local_capacity = 15 / sizeof(_CharT) };

union
{
    _CharT _M_local_buf[_S_local_capacity + 1];
    size_type _M_allocated_capacity;
};
It can save a lot of space if your program works with a lot of strings at once (consider, e.g., databases). Without the union, each string object would take 8 more bytes in memory on a typical 64-bit platform.
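To check the saving on your own platform, a rough sketch like the following compares the two layouts. The struct names with_union and without_union and the constant bytes_saved are hypothetical, and the exact difference depends on the size and alignment of std::size_t and pointers.

```cpp
#include <cstddef>

// Layout that overlaps capacity and the local buffer.
struct with_union {
    char* data;
    std::size_t size;
    union {
        std::size_t capacity;
        char buffer[16];
    };
};

// Same fields, but capacity and buffer each get their own storage.
struct without_union {
    char* data;
    std::size_t size;
    std::size_t capacity;
    char buffer[16];
};

// Per-object saving from overlapping the two members.
constexpr std::size_t bytes_saved = sizeof(without_union) - sizeof(with_union);

static_assert(sizeof(with_union) < sizeof(without_union),
              "overlapping the members should shrink the object");
```

Printing bytes_saved on a given compiler shows the per-string saving; multiplied by millions of strings, it adds up.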