How to implement Copy-on-Write?

1 Answer

In a multi-threaded environment (which is most of them nowadays), CoW is frequently a huge performance hit rather than a gain. And with careful use of const references, it's not much of a performance gain even in a single-threaded environment.

This old DDJ article explains just how bad CoW can be in a multithreaded environment, even if the program itself only ever runs one thread.

Additionally, as other people have pointed out, CoW strings are really tricky to implement, and it's easy to make mistakes. That coupled with their poor performance in threading situations makes me really question their usefulness in general. This becomes even more true once you start using C++11 move construction and move assignment.

But, to answer your question....

Here are a couple of implementation techniques that may help with performance.

First, store the length in the string object itself rather than in the shared buffer. The length is accessed quite frequently, and eliminating the pointer dereference would probably help. I would, just for consistency, put the allocated length there too. This makes your string objects a bit bigger, but the overhead in space and copying time is very small, especially since these values then become easier for the compiler to play interesting optimization tricks with.

This leaves you with a string class that looks like this:

    class MyString {
       ...
     private:
       class Buf {
          ...
        private:
          ::std::size_t refct_;
          char *data_;
       };

       ::std::size_t len_;
       ::std::size_t alloclen_;
       Buf *data_;
    };

Now, there are further optimizations you can perform. The Buf class there looks like it doesn't really contain or do much, and this is true. Additionally, it requires allocating both an instance of Buf and a buffer to hold the characters. This seems rather wasteful. So, we'll turn to a common C implementation technique, stretchy buffers:

    #include <cassert>
    #include <cstddef>
    #include <cstdlib>
    #include <new>

    class MyString {
       ...
     private:
       struct Buf {
          ::std::size_t refct_;
          char data_[1];
       };

       void resizeBufTo(::std::size_t newsize);
       void dereferenceBuf();

       ::std::size_t len_;
       ::std::size_t alloclen_;
       Buf *data_;
    };

    void MyString::resizeBufTo(::std::size_t newsize)
    {
       // Only an unshared (or not yet allocated) buffer may be resized.
       assert((data_ == 0) || (data_->refct_ == 1));
       if (newsize != 0) {
          // Yes, I'm using C's allocation functions on purpose.
          // C++'s new is a poor match for stretchy buffers.
          const bool isnew = (data_ == 0);
          // realloc returns void *, which C++ will not convert implicitly.
          Buf *newbuf = static_cast<Buf *>(
             ::std::realloc(data_, sizeof(Buf) + (newsize - 1)));
          if (newbuf == 0) {
             throw ::std::bad_alloc();
          } else {
             if (isnew) {
                newbuf->refct_ = 1; // A brand-new buffer has exactly one owner.
             }
             data_ = newbuf;
          }
       } else { // newsize is 0
          if (data_ != 0) {
             ::std::free(data_);
             data_ = 0;
          }
       }
       alloclen_ = newsize;
    }

When you do things this way, you can then treat data_->data_ as if it contained alloclen_ bytes instead of just 1.
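To show where the actual copy-on-write step fits on top of this layout, here is a minimal sketch. It is an illustration under assumptions, not the class from the answer: `CowString`, `allocBuf`, `release`, `detach`, `setChar` and `sharesBufferWith` are hypothetical names, the string is single-threaded only, and it never holds a null buffer. Copying only bumps the reference count; any mutating operation first calls `detach()`, which makes a private copy if the buffer is shared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical illustration class; single-threaded use only.
class CowString {
 public:
   explicit CowString(const char *s)
      : len_(std::strlen(s)), alloclen_(len_ + 1), data_(allocBuf(alloclen_))
   {
      std::memcpy(data_->data_, s, len_ + 1);
   }

   // Copying shares the buffer and bumps the count; no characters move.
   CowString(const CowString &other)
      : len_(other.len_), alloclen_(other.alloclen_), data_(other.data_)
   {
      ++data_->refct_;
   }

   CowString &operator=(const CowString &other)
   {
      if (data_ != other.data_) {
         ++other.data_->refct_;
         release();
         data_ = other.data_;
         len_ = other.len_;
         alloclen_ = other.alloclen_;
      }
      return *this;
   }

   ~CowString() { release(); }

   const char *c_str() const { return data_->data_; }
   std::size_t size() const { return len_; }
   bool sharesBufferWith(const CowString &o) const { return data_ == o.data_; }

   // Every mutating operation must detach before it writes.
   void setChar(std::size_t i, char c)
   {
      assert(i < len_);
      detach();
      data_->data_[i] = c;
   }

 private:
   struct Buf {
      std::size_t refct_;
      char data_[1];   // stretchy: really holds alloclen_ bytes
   };

   // Allocates a buffer with room for nchars characters (nchars >= 1).
   static Buf *allocBuf(std::size_t nchars)
   {
      Buf *b = static_cast<Buf *>(std::malloc(sizeof(Buf) + (nchars - 1)));
      if (b == 0) {
         throw std::bad_alloc();
      }
      b->refct_ = 1;
      return b;
   }

   // Drops this string's reference; frees the buffer if it was the last one.
   void release()
   {
      if (--data_->refct_ == 0) {
         std::free(data_);
      }
   }

   // The copy-on-write step: make sure this string is the sole owner.
   void detach()
   {
      if (data_->refct_ == 1) {
         return;   // already unshared, nothing to copy
      }
      Buf *copy = allocBuf(alloclen_);
      std::memcpy(copy->data_, data_->data_, len_ + 1);
      --data_->refct_;   // other owners remain, so this cannot reach zero
      data_ = copy;
   }

   std::size_t len_;
   std::size_t alloclen_;
   Buf *data_;
};
```

With this sketch, `CowString b(a);` leaves both strings pointing at one buffer, and `b.setChar(0, 'j');` gives `b` its own copy while `a` is left unchanged.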

Keep in mind that in all of these cases you must either never use this in a multi-threaded environment, or make refct_ a type for which you have both an atomic increment and an atomic decrement-and-test operation.
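In C++11 and later, one way to get those two operations is `std::atomic`. The following is a minimal sketch, assuming a standalone header type; `SharedHeader`, `addRef` and `dropRef` are hypothetical names used only for illustration, and the memory orderings shown are the ones commonly used for reference counts rather than anything the answer prescribes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the reference-counted buffer header.
struct SharedHeader {
   std::atomic<std::size_t> refct_;
   explicit SharedHeader(std::size_t initial) : refct_(initial) {}
};

// Atomic increment: used when another string starts sharing the buffer.
inline void addRef(SharedHeader &h)
{
   h.refct_.fetch_add(1, std::memory_order_relaxed);
}

// Atomic decrement and test: returns true for exactly one caller, the one
// that dropped the last reference and is therefore responsible for freeing.
inline bool dropRef(SharedHeader &h)
{
   return h.refct_.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

If the header lives inside a buffer obtained from `malloc` or `realloc`, the atomic member still has to be properly constructed before use, for example with placement new.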

There is an even more advanced optimization technique that involves using a union to store short strings right inside the bits of data that you would use to describe a longer string. But that's even more complex, and I don't think I will feel inclined to edit this to put a simplified example here later, but you never can tell.
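For reference, a heavily simplified sketch of that union trick might look like the following. Everything in it is an assumption made for illustration: `SmallString`, `Large`, `Rep` and `isSmall` are hypothetical names, the class is non-copyable, and it leaves out reference counting entirely so that only the short-string storage is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical sketch: short strings live inside the bytes that would
// otherwise describe a heap buffer. No sharing or CoW, to keep it short.
class SmallString {
 public:
   explicit SmallString(const char *s) : len_(std::strlen(s))
   {
      if (isSmall()) {
         // Fits inline, including the terminating '\0'.
         std::memcpy(rep_.small_, s, len_ + 1);
      } else {
         rep_.large_.alloclen_ = len_ + 1;
         rep_.large_.data_ = static_cast<char *>(std::malloc(len_ + 1));
         if (rep_.large_.data_ == 0) {
            throw std::bad_alloc();
         }
         std::memcpy(rep_.large_.data_, s, len_ + 1);
      }
   }

   ~SmallString()
   {
      if (!isSmall()) {
         std::free(rep_.large_.data_);
      }
   }

   SmallString(const SmallString &) = delete;
   SmallString &operator=(const SmallString &) = delete;

   // The length alone decides which member of the union is in use.
   bool isSmall() const { return len_ < sizeof(rep_.small_); }
   std::size_t size() const { return len_; }
   const char *c_str() const
   {
      return isSmall() ? rep_.small_ : rep_.large_.data_;
   }

 private:
   struct Large {
      std::size_t alloclen_;
      char *data_;
   };
   union Rep {
      Large large_;
      char small_[sizeof(Large)];
   };

   std::size_t len_;
   Rep rep_;
};
```

Short strings then need no allocation at all, which also means they never touch a reference count.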


answered Sep 20 '22 16:09

Omnifarious
