I'm writing a multiprocess program for school in which each process reads a portion of a file, and together the processes count the number of words in the file. The problem is that when there are more than 2 processes, every process reads EOF from the file before it has finished reading its portion. Here's the relevant code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

int main(int argc, char *argv[]) {
    FILE *input_textfile = NULL;
    char input_word[1024];
    int num_processes = 0;
    int proc_num = 0; //The index of this process (used after forking)
    long file_size = -1;

    input_textfile = fopen(argv[1], "r");
    num_processes = atoi(argv[2]);
    //...Normally error checking would go here

    if (num_processes > 1) {
        //...create space for pipes
        for (proc_num = 0; proc_num < num_processes - 1; proc_num++) {
            //...create pipes
            pid_t proc = fork();
            if (proc == -1) {
                fprintf(stderr,"Could not fork process index %d", proc_num);
                perror("");
                return 1;
            } else if (proc == 0) {
                break;
            }
            //...link up the pipes
        }
    }

    //This code taken from http://stackoverflow.com/questions/238603/how-can-i-get-a-files-size-in-c
    //Interestingly, it also fixes a bug we had where the child would start reading at an unpredictable place
    //No idea why, but apparently the offset wasn't guaranteed to start at 0 for some reason
    fseek(input_textfile, 0L, SEEK_END);
    file_size = ftell(input_textfile);
    fseek(input_textfile, proc_num * (1.0 * file_size / num_processes), 0);

    //read all words from the file and add them to the linked list
    if (file_size != 0) {
        //Explanation of this mess of a while loop:
        // if we're a child process (proc_num < num_processes - 1), then loop until we make it to where the next
        // process would start (the ftell part)
        // if we're the parent (proc_num == num_processes - 1), loop until we reach the end of the file
        while ((proc_num < num_processes - 1 && ftell(input_textfile) < (proc_num + 1) * (1.0 * file_size / num_processes))
                || (proc_num == num_processes - 1 && ftell(input_textfile) < file_size)){
            int res = fscanf(input_textfile, "%s", input_word);
            if (res == 1) {
                //count the word
            } else if (res == EOF && errno != 0) {
                perror("Error reading file: ");
                exit(1);
            } else if (res == EOF && ftell(input_textfile) < file_size) {
                printf("Process %d found unexpected EOF at %ld.\n", proc_num, ftell(input_textfile));
                exit(1);
            } else if (res == EOF && feof(input_textfile)){
                continue;
            } else {
                printf("Scanf returned unexpected value: %d\n", res);
                exit(1);
            }
        }
    }
    //don't get here anyway, so no point in closing files and whatnot
    return 0;
}
Output when running the file with 3 processes:
All files opened successfully
Process 2 found unexpected EOF at 1323008.
Process 1 found unexpected EOF at 823849.
Process 0 found unexpected EOF at 331776.
The test file that causes the error: https://dl.dropboxusercontent.com/u/16835571/test34.txt
Compile with:
gcc main.c -o wordc-mp
and run as:
wordc-mp test34.txt 3
It's worth noting that only that particular file gives me issues, but the offsets of the error keep changing so it's not the contents of the file.
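You open the file, and therefore create its file descriptor, before forking. Each child inherits a copy of that descriptor, and every copy refers to the same open file description as the parent's. The file offset is stored in that shared description, so whenever one process reads or seeks, the offset moves for all the other processes as well. To give each process an independent offset, open the file after the fork, separately in each process.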
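The shared offset can be observed directly. The following is a minimal sketch, not part of the original program: the helper name `offset_seen_by_parent` is hypothetical, and it uses the raw `open`/`lseek` calls rather than `fopen` so that stdio buffering does not obscure the effect. The child seeks on the inherited descriptor, and the parent then asks where its own descriptor is positioned.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: opens `path`, forks, lets the child seek to
 * `child_offset`, and returns the offset the parent observes afterwards.
 * Returns -1 on error. */
off_t offset_seen_by_parent(const char *path, off_t child_offset) {
    int fd = open(path, O_RDONLY);      /* opened BEFORE fork, as in the question */
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: moves the offset stored in the shared open file description. */
        lseek(fd, child_offset, SEEK_SET);
        _exit(0);
    }

    /* Parent: wait for the child, then query the offset without moving it. */
    waitpid(pid, NULL, 0);
    off_t seen = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return seen;
}
```

The parent never seeks to `child_offset` itself, yet that is the position it reports, because both descriptors refer to one open file description.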
The fork(2) man page confirms this:
The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.
The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).
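One way to avoid the shared offset is for every process to call `fopen` itself after the fork, so that each gets its own open file description. The sketch below illustrates that idea under several assumptions and is not the asker's actual design: the function names `count_words_in_slice` and `count_words_parallel` are hypothetical, all slices are handled by children, a single pipe collects the results, and a word is attributed to the slice in which it starts.

```c
#include <ctype.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counts the words that START in byte range [start, end) of `path`.
 * Opens a private stream, so the offset is not shared with anyone. */
long count_words_in_slice(const char *path, long start, long end) {
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    int c, prev_space = 1;
    if (start > 0) {
        /* Peek at the byte before the slice: if it is not whitespace,
         * the slice begins mid-word and that word belongs to the previous slice. */
        if (fseek(fp, start - 1, SEEK_SET) != 0) { fclose(fp); return -1; }
        c = fgetc(fp);
        prev_space = (c == EOF) || isspace(c);
    }

    long count = 0;
    for (long pos = start; pos < end && (c = fgetc(fp)) != EOF; pos++) {
        int space = isspace(c);
        if (!space && prev_space)
            count++;                    /* a new word begins at `pos` */
        prev_space = space;
    }
    fclose(fp);
    return count;
}

/* Forks `nproc` children, one per slice, and sums their counts (-1 on error). */
long count_words_parallel(const char *path, int nproc) {
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    fseek(fp, 0L, SEEK_END);
    long size = ftell(fp);
    fclose(fp);                         /* closed before forking: nothing is inherited */

    int fds[2];
    if (size < 0 || pipe(fds) == -1)
        return -1;

    int forked = 0;
    for (int i = 0; i < nproc; i++) {
        long start = size * i / nproc;
        long end = size * (i + 1) / nproc;
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            close(fds[0]);
            long n = count_words_in_slice(path, start, end);   /* opens AFTER fork */
            write(fds[1], &n, sizeof n);
            _exit(n < 0);
        }
        forked++;
    }
    close(fds[1]);

    long total = 0, n;
    int failed = (forked != nproc);
    while (read(fds[0], &n, sizeof n) == (ssize_t)sizeof n) {
        if (n < 0) failed = 1; else total += n;
    }
    close(fds[0]);
    while (wait(NULL) > 0) { }          /* reap all children */
    return failed ? -1 : total;
}
```

Calling `count_words_parallel("test34.txt", 3)` from `main` would split the file into three byte ranges. Because each child's `fopen` creates a new open file description, one child's reads no longer move the offset seen by the others.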