I am trying to write a simple piece of code to read values from a CSV file with a max of 100 entries into an array of structs.
Example of a line of the CSV file:
1,Mr,James,Quigley,Director,200000,0
I use the following code to read in the values, but when I print them out they are incorrect:
for(i = 0; i < 3; i++) /*just assuming number of entries here to demonstrate problem*/
{
fscanf(f, "%d,%s,%s,%s,%s,%d,%d", &inArray[i].ID, inArray[i].salutation, inArray[i].firstName, inArray[i].surName, inArray[i].position, &inArray[i].sal, &inArray[i].deleted);
}
Then when I print out the first name, the values are all assigned to the first name:
for(j = 0; j < 3; j++) /* test by printing values*/
{
printf("Employee name is %s\n", inArray[j].firstName);
}
Gives ames,Quigley,Director,200000,0
and so on in that way. I am sure it's how I format the fscanf line, but I can't get it to work.
Here is my struct I'm reading into:
typedef struct Employee
{
int ID;
char salutation[4];
char firstName[21];
char surName[31];
char position[16];
int sal;
int deleted;
} Employee;
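This is because a string read with %s can contain the comma: %s stops only at whitespace, so the whole rest of the line gets scanned into the first string (salutation). That array holds only 4 bytes, so the write presumably overflows into the members that follow it, which would explain why firstName prints as ames,Quigley,Director,200000,0. There's no "look-ahead" in the scanf() format specifiers; the fact that the %s is followed by a comma in the format string means nothing.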
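To see the effect in isolation, here is a minimal sketch that runs the same kind of format against an in-memory string with sscanf(). The helper name scan_with_percent_s and the 64-byte buffers are assumptions made for the illustration, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scans `line` with a "%d,%s,%s" style format and
   copies whatever landed in the first string into `first`, which must
   have room for at least 64 bytes.
   Returns the number of fields sscanf() converted. */
int scan_with_percent_s(const char *line, char *first)
{
    int id = 0;
    char a[64] = "";
    char b[64] = "";

    assert(line != NULL && first != NULL);

    /* %63s bounds the write to the buffer size, but it still stops
       only at whitespace, never at ','. */
    int got = sscanf(line, "%d,%63s,%63s", &id, a, b);

    strcpy(first, a);
    return got;
}
```

Called with the line "1,Mr,James", this should return 2 rather than 3, with first holding "Mr,James": the first %s consumed the comma and everything after it, leaving nothing for the second one.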
Use character groups (search the manual for [).
const int got = fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID,
inArray[i].salutation, inArray[i].firstName,
inArray[i].surName, inArray[i].position, &inArray[i].sal,
&inArray[i].deleted);
And learn to check the return value, since I/O calls can fail! Don't depend on the data being valid unless got is 7.
To make your program read the entire file (multiple records, i.e. lines), I would recommend loading entire lines into a (large) fixed-size buffer with fgets(), then using sscanf() on that buffer to parse out the column values. That is much easier and will ensure that you really do scan separate lines. Calling fscanf() in a loop will not, since to fscanf() a linefeed is just whitespace.
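A minimal sketch of that fgets() plus sscanf() approach could look like the following. The helper names parse_employee and load_employees and the 256-byte line buffer are assumptions for illustration. The field widths in the format (3, 20, 30, 15) are each one less than the array sizes in the question's struct, so a long field cannot overflow its array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct Employee {
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
} Employee;

/* Hypothetical helper: parse one CSV line into *e.
   Returns 1 only if all 7 fields were converted, otherwise 0. */
int parse_employee(const char *line, Employee *e)
{
    assert(line != NULL && e != NULL);

    /* Each width is the array size minus one, leaving room for '\0'. */
    int got = sscanf(line, "%d,%3[^,],%20[^,],%30[^,],%15[^,],%d,%d",
                     &e->ID, e->salutation, e->firstName, e->surName,
                     e->position, &e->sal, &e->deleted);
    return got == 7;
}

/* Hypothetical helper: read up to `max` records from `f`, one line at a
   time, skipping lines that do not parse.
   Returns the number of records stored in `out`. */
size_t load_employees(FILE *f, Employee *out, size_t max)
{
    char line[256]; /* assumed to be longer than any line in the file */
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0'; /* drop the trailing newline */
        if (parse_employee(line, &out[n]))
            n++;
    }
    return n;
}
```

With the question's array this would be called as something like load_employees(f, inArray, 100). Note that a scanset must match at least one character, so a line with an empty field (two commas in a row) is rejected by this sketch; handling that case would need a different parsing strategy.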
Might as well post my comment as an answer:
%s reads a full word by default.
It finds the %d (the integer part), then the comma, and then it has to read a string. A comma is considered valid in a word (it is not whitespace), so %s reads until the end of the line (there is no whitespace until then), not until the first comma... And the rest remains empty. (From this answer)
You have to change the delimiter by specifying a scanset (it looks like a regex character class, but scanf() does not support regular expressions):
fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID, inArray[i].salutation, inArray[i].firstName, inArray[i].surName, inArray[i].position, &inArray[i].sal, &inArray[i].deleted);
Instead of %s, use %[^,], which means "grab all chars, and stop when you find a ,".
EDIT
%[^,]s is bad: it would need a literal s after the end of the scanset... Thanks @MichaelPotter
(From Changing the scanf() delimiter and Reading values from CSV file into variables)
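The difference between %[^,] and %[^,]s can be checked with a small sketch like this one. The function names are made up for the illustration, and the 32-byte buffers are an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scan two comma-separated fields from `line`
   using plain scansets. Returns the number of fields converted. */
int fields_without_s(const char *line)
{
    char a[32] = "", b[32] = "";
    assert(line != NULL);
    return sscanf(line, "%31[^,],%31[^,]", a, b);
}

/* Same scan, but with a stray 's' after the first scanset. After the
   scanset stops at ',', the format demands a literal 's' next, which
   does not match, so scanning ends after the first field. */
int fields_with_s(const char *line)
{
    char a[32] = "", b[32] = "";
    assert(line != NULL);
    return sscanf(line, "%31[^,]s,%31[^,]", a, b);
}
```

For the input "James,Quigley", fields_without_s() should return 2, while fields_with_s() should return 1.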