I want to implement a copy-on-write on my custom C++ String class, and I wonder how to...
I tried to implement some options, but they all turned out very inefficient.
Thank you guys :-)
Copy-on-write or CoW is a technique to efficiently copy data resources in a computer system. If a unit of data is copied but not modified, the "copy" can exist as a reference to the original data. Only when the copied data is modified is a copy created, and new bytes are actually written.
Swift arrays implement copy-on-write optimization, which means if we create an array and simply assign it to another variable (without any of the variables modifying the array), both the arrays will share the same reference to the underlying data.
Copy-on-write (or COW) is a technique to delay or altogether prevent copying of the data. Rather than duplicate the process address space, the parent and the child can share a single copy. The data, however, is marked in such a way that if it is written to, a duplicate is made and each process receives a unique copy.
Copy on Write or simply COW is a resource management technique. One of its main use is in the implementation of the fork system call in which it shares the virtual memory(pages) of the OS. In UNIX like OS, fork() system call creates a duplicate process of the parent process which is called as the child process.
In a multi-threaded environemnt (which is most of them nowadays) CoW is frequently a huge performance hit rather than a gain. And with careful use of const references, it's not much of a performance gain even in a single threaded environment.
This old DDJ article explains just how bad CoW can be in a multithreaded environment, even if there's only one thread.
Additionally, as other people have pointed out, CoW strings are really tricky to implement, and it's easy to make mistakes. That coupled with their poor performance in threading situations makes me really question their usefulness in general. This becomes even more true once you start using C++11 move construction and move assignment.
But, to answer your question....
Here are a couple of implementation techniques that may help with performance.
First, store the length in the string itself. The length is accessed quite frequently and eliminating the pointer dereference would probably help. I would, just for consistency put the allocated length there too. This will cost you in terms of your string objects being a bit bigger, but the overhead there in space and copying time is very small, especially since these values will then become easier for the compiler to play interesting optimization tricks with.
This leaves you with a string class that looks like this:
class MyString { ... private: class Buf { ... private: ::std::size_t refct_; char *data_; }; ::std::size_t len_; ::std::size_t alloclen_; Buf *data_; };
Now, there are further optimizations you can perform. The Buf class there looks like it doesn't really contain or do much, and this is true. Additionally, it requires allocating both an instance of Buf and a buffer to hold the characters. This seems rather wasteful. So, we'll turn to a common C implementation technique, stretchy buffers:
class MyString { ... private: struct Buf { ::std::size_t refct_; char data_[1]; }; void resizeBufTo(::std::size_t newsize); void dereferenceBuf(); ::std::size_t len_; ::std::size_t alloclen_; Buf *data_; }; void MyString::resizeBufTo(::std::size_t newsize) { assert((data_ == 0) || (data_->refct_ == 1)); if (newsize != 0) { // Yes, I'm using C's allocation functions on purpose. // C++'s new is a poor match for stretchy buffers. Buf *newbuf = ::std::realloc(data_, sizeof(*newbuf) + (newsize - 1)); if (newbuf == 0) { throw ::std::bad_alloc(); } else { data_ = newbuf_; } } else { // newsize is 0 if (data_ != 0) { ::std::free(data_); data_ = 0; } } alloclen_ = newsize; }
When you do things this way, you can then treat data_->data_
as if it contained alloclen_
bytes instead of just 1.
Keep in mind that in all of these cases you will have to make sure that you either never ever use this in a multi-threaded environment, or that you make sure that refct_
is a type that you have both an atomic increment, and an atomic decrement and test instruction for.
There is an even more advanced optimization technique that involves using a union to store short strings right inside the bits of data that you would use to describe a longer string. But that's even more complex, and I don't think I will feel inclined to edit this to put a simplified example here later, but you never can tell.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With