
Reading values from CSV file into variables

I am trying to write a simple piece of code to read values from a CSV file with a max of 100 entries into an array of structs.

Example of a line of the CSV file:

1,Mr,James,Quigley,Director,200000,0

I use the following code to read in the values, but when I print them out they are incorrect:

for(i = 0; i < 3; i++) /*just assuming number of entries here to demonstrate problem*/
    {
    fscanf(f, "%d,%s,%s,%s,%s,%d,%d", &inArray[i].ID, inArray[i].salutation, inArray[i].firstName, inArray[i].surName, inArray[i].position, &inArray[i].sal, &inArray[i].deleted);
    } 

Then when I print out the first name, the values are all assigned to the first name:

for(j = 0; j < 3; j++) /* test by printing values*/
    {
    printf("Employee name is %s\n", inArray[j].firstName);
    } 

This prints ames,Quigley,Director,200000,0 and so on for each entry. I am sure the problem is in how I format the fscanf line, but I can't get it to work.

Here is my struct I'm reading into:

typedef struct Employee
    {
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
    } Employee;
Dawson asked Sep 11 '13 09:09




2 Answers

This is because %s stops only at whitespace, so the string it matches can contain commas, and the rest of the line gets scanned into the first string. There's no "look-ahead" in scanf() format specifiers: the fact that %s is followed by a comma in the format string means nothing.
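A minimal sketch of that behaviour, using sscanf() on the sample line from the question. The helper name scan_with_s and the 64-byte buffer size are assumptions made for illustration; they are not from the original code.

```c
#include <stdio.h>

/* Hypothetical helper: scans "<int>,<word>" using %s.
   word must point to a buffer of at least 64 bytes.
   Returns the number of fields sscanf() converted. */
int scan_with_s(const char *line, int *id, char *word)
{
    /* %63s stops only at whitespace (or after 63 characters),
       so any commas in the input end up inside word. */
    return sscanf(line, "%d,%63s", id, word);
}
```

Called on "1,Mr,James,Quigley,Director,200000,0", this converts two fields: id becomes 1 and word receives the whole remainder, "Mr,James,Quigley,Director,200000,0", commas included.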

Use a scanset instead (search the manual for the [ conversion). %[^,] matches a run of characters up to, but not including, the next comma.

const int got = fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID,
                       inArray[i].salutation, inArray[i].firstName,
                       inArray[i].surName, inArray[i].position, &inArray[i].sal, 
                       &inArray[i].deleted);

And learn to check the return value, since I/O calls can fail! Don't depend on the data being valid unless got is 7.

To make your program read the entire file (multiple records, i.e. lines), I would recommend loading each line into a (large) fixed-size buffer with fgets(), then using sscanf() on that buffer to parse out the column values. That is much easier and ensures that you really do scan separate lines. Calling fscanf() in a loop does not, since to fscanf() a linefeed is just whitespace.
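The fgets() plus sscanf() approach could be sketched roughly as follows. The function names parse_employee and read_employees, the LINE_MAX_LEN buffer size, and the choice to skip malformed lines are assumptions for illustration. The field widths (3, 20, 30, 15) are an addition not present in the format string above; they are one less than the array sizes in the question's struct, so an over-long field cannot overflow its buffer.

```c
#include <stdio.h>

#define LINE_MAX_LEN 256   /* assumed upper bound on one CSV line */

typedef struct Employee {
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
} Employee;

/* Parse one CSV line into *e.
   Returns 1 if all 7 fields were converted, 0 otherwise. */
int parse_employee(const char *line, Employee *e)
{
    int got = sscanf(line, "%d,%3[^,],%20[^,],%30[^,],%15[^,],%d,%d",
                     &e->ID, e->salutation, e->firstName,
                     e->surName, e->position, &e->sal, &e->deleted);
    return got == 7;
}

/* Read up to max records from f into out.
   Returns the number of records stored. */
int read_employees(FILE *f, Employee *out, int max)
{
    char line[LINE_MAX_LEN];
    int n = 0;

    while (n < max && fgets(line, sizeof line, f) != NULL) {
        if (parse_employee(line, &out[n]))
            n++;   /* lines that fail to parse are skipped */
    }
    return n;
}
```

With the question's limit of 100 entries, a call might look like read_employees(f, inArray, 100). A field longer than its width makes the following literal comma fail to match, so sscanf() returns fewer than 7 and the line is rejected rather than silently truncated.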

unwind answered Sep 28 '22 10:09


Might as well post my comment as an answer:

%s reads a full word by default.

It finds the %d, the integer part, then the ,, and then it has to read a string. , is considered valid in a word (it is not a whitespace), so it reads until the end of the line (there is no whitespace until then), not until the first comma... And the rest remains empty. (From this answer)

You have to change the delimiter by specifying a scanset (it looks like a regex character class, but scanf() does not support regular expressions):

fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID, inArray[i].salutation, inArray[i].firstName, inArray[i].surName, inArray[i].position, &inArray[i].sal, &inArray[i].deleted);

Instead of %s, use %[^,], which means "grab all characters, and stop when a , is found".

EDIT

%[^,]s is wrong: the s is not part of the conversion, so it would require a literal s in the input right after the scanset. Thanks @MichaelPotter
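A small sketch of the difference, assuming two comma-separated strings as input. The helper names and the 16-byte buffer size are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helpers; a and b must each hold at least 16 bytes.
   Both return the number of fields sscanf() converted. */

/* With the stray s: after the scanset, the format demands a literal
   's' in the input, which does not match the ',' that follows. */
int scan_with_stray_s(const char *line, char *a, char *b)
{
    return sscanf(line, "%15[^,]s,%15[^,]", a, b);
}

/* Without it: the literal ',' matches and the second field is read. */
int scan_with_scanset(const char *line, char *a, char *b)
{
    return sscanf(line, "%15[^,],%15[^,]", a, b);
}
```

On the input "Mr,James", the first helper converts only one field because matching stops at the unmatched s, while the second converts both.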

(From "Changing the scanf() delimiter" and "Reading values from CSV file into variables")

ppeterka answered Sep 28 '22 10:09