I am working on implementing a database server in C that will handle requests from multiple clients. I am using fork() to handle connections for individual clients.
The server stores data in the heap, which consists of a root pointer to hash tables of dynamically allocated records. The records are structs that have pointers to various data-types. I would like for the processes to be able to share this data so that, when a client makes a change to the heap, the changes will be visible for the other clients.
I have learned that fork() uses COW (Copy On Write), and my understanding is that it copies the heap (and stack) memory of the parent process when the child tries to modify the data in memory.
I have found out that I can use the shm library to share memory.
Would the code below be a valid way to share heap memory (in shared_string)? If a child were to use similar code (i.e. starting from //start), would other children be able to read/write to it while the child is running and after it's dead?
key_t key;
int shmid;
key = ftok("/tmp",'R');
shmid = shmget(key, 1024, 0644 | IPC_CREAT);
//start
char * string;
string = malloc(sizeof(char) * 10);
strcpy(string, "a string");
char * shared_string;
shared_string = shmat(shmid, string, 0);
strcpy(shared_string, string);
Here are some of my thoughts/concerns regarding this:
I'm thinking about sharing the root pointer of the database. I'm not sure if that would work or if I have to mark all allocated memory as shared.
I'm not sure if the parent / other children are able to access memory allocated by a child.
I'm not sure if a child's allocated memory stays on the heap after it is killed, or if that memory is released.
First of all, fork is completely inappropriate for what you're trying to achieve. Even if you can make it work, it's a horrible hack. In general, fork only works for very simplistic programs anyway, and I would go so far as to say that fork should never be used except when followed quickly by exec, but that's beside the point here. You really should be using threads.
With that said, the only way to have memory that's shared between the parent and child after fork, and where the same pointers are valid in both, is to mmap (or shmat, but that's a lot fuglier) a file or anonymous map with MAP_SHARED prior to the fork. You cannot create new shared memory like this after fork because there's no guarantee that it will get mapped at the same address range in both.
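To make the "map before fork" idea concrete, here is a minimal sketch, assuming a platform that supports MAP_ANONYMOUS (Linux, the BSDs, macOS). The function name share_via_mmap and the REGION_SIZE constant are made up for illustration; they are not from the question.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict glibc modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 1024 /* hypothetical size, chosen for illustration */

/* Copies what the child wrote into out. Returns 0 on success, -1 on error. */
int share_via_mmap(char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);

    /* Create the mapping BEFORE fork() so that both processes inherit it
     * at the same address. */
    char *shared = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, REGION_SIZE);
        return -1;
    }
    if (pid == 0) {
        /* Child: write into the shared region, then exit. */
        strcpy(shared, "a string");
        _exit(0);
    }

    /* Parent: wait for the child so its write is complete before reading. */
    if (waitpid(pid, NULL, 0) < 0) {
        munmap(shared, REGION_SIZE);
        return -1;
    }
    snprintf(out, out_len, "%s", shared);
    munmap(shared, REGION_SIZE);
    return 0;
}
```

Note that this only covers the bytes inside the mapping. Anything the child later allocates with malloc still lives in its private heap, so records and hash tables would have to be carved out of the shared region, and real code would need some form of synchronization instead of the waitpid used here.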
Just don't use fork. It's not the right tool for the job.
I think you are basically looking to do what is done by Redis (and probably others). They describe it in http://redis.io/topics/persistence (search for "copy-on-write").
The primary benefit to using this method is avoidance of locking, which can be a pain to get right.
As far as I understand it the idea of using COW is to:
Things to watch out for:
You may want to take a gander at the Redis code as well.
I'm thinking about sharing the root pointer of the database. I'm not sure if that would work or if I have to mark all allocated memory as shared.
Each process will have its own private memory range. Copy-on-write is a kernel-space optimization that is transparent to user space.
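A quick way to convince yourself of this is the sketch below: the child overwrites a malloc'd buffer and the parent checks its own copy afterwards. The function name child_write_is_private and the BUF_LEN constant are hypothetical, picked just for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define BUF_LEN 16 /* hypothetical buffer size */

/* Returns 1 if the parent's heap buffer is unchanged after the child
 * overwrites its own copy, 0 if it changed, -1 on error. */
int child_write_is_private(const char *child_text)
{
    assert(child_text != NULL && strlen(child_text) < BUF_LEN);

    char *string = malloc(BUF_LEN);
    if (string == NULL)
        return -1;
    strcpy(string, "parent");

    pid_t pid = fork();
    if (pid < 0) {
        free(string);
        return -1;
    }
    if (pid == 0) {
        /* Child: this write triggers copy-on-write, so it lands in a
         * private copy of the page and the parent never sees it. */
        strcpy(string, child_text);
        _exit(0);
    }

    /* Parent: wait for the child to finish, then inspect our own copy. */
    if (waitpid(pid, NULL, 0) < 0) {
        free(string);
        return -1;
    }
    int unchanged = (strcmp(string, "parent") == 0);
    free(string);
    return unchanged;
}
```

The same applies to memory a child allocates after the fork: it belongs to that child's address space only and is released when the child exits.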
As others have said, SHM or mmap'd files are the only way to share memory between separate processes.
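For the SHM route, here is a rough sketch of how the System V calls fit together, assuming the segment is created and attached before fork. Unlike the code in the question, it passes NULL as the second argument to shmat so the kernel picks the attach address; shmat does not turn an existing malloc'd block into shared memory. The function name share_via_shm is made up, and IPC_PRIVATE is used in place of the question's ftok key only to keep the example self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copies what the child wrote into out. Returns 0 on success, -1 on error. */
int share_via_shm(char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);

    /* Create a fresh 1024-byte segment (size taken from the question). */
    int shmid = shmget(IPC_PRIVATE, 1024, 0600 | IPC_CREAT);
    if (shmid < 0)
        return -1;

    /* NULL lets the kernel choose where to attach the segment. */
    char *shared = shmat(shmid, NULL, 0);
    if (shared == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        shmdt(shared);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: the attachment is inherited across fork(). */
        strcpy(shared, "a string");
        shmdt(shared);
        _exit(0);
    }

    /* Parent: the data is still there after the child has exited. */
    int rc = (waitpid(pid, NULL, 0) < 0) ? -1 : 0;
    if (rc == 0)
        snprintf(out, out_len, "%s", shared);

    /* Detach and mark the segment for removal; otherwise it outlives
     * every process that used it. */
    shmdt(shared);
    shmctl(shmid, IPC_RMID, NULL);
    return rc;
}
```

A System V segment persists until it is removed with IPC_RMID and the last process detaches, so data written by a child remains readable after that child dies. Pointers stored inside the segment are only meaningful in processes that have it attached at the same address.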