Using fread() to read text file into a buffer - why are the values in the buffer not each character's respective ASCII value?

Tags:

c

First off, this isn't homework. Just trying to understand why I'm seeing what I'm seeing on my screen.

The stuff below (my own work) currently takes an input file and reads it as a binary file. I want it to store each byte read in an array (for later use). For the sake of brevity the input file (Hello.txt) just contains 'Hello World', without the apostrophes.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {

    FILE *input;
    int i, size;
    int *array;

    input = fopen("Hello.txt", "rb");
    if (input == NULL) {
        perror("Invalid file specified.");
        exit(-1);
    }

    fseek(input, 0, SEEK_END);
    size = ftell(input);
    fseek(input, 0, SEEK_SET);

    array = (int*) malloc(size * sizeof(int));
    if (array == NULL) {
        perror("Could not allocate array.");
        exit(-1);
    }
    else {
        input = fopen("Hello.txt", "rb");
        fread(array, sizeof(int), size, input);
        // some check on return value of fread?
        fclose(input);
    }

    for (i = 0; i < size; i++) {
        printf("array[%d] == %d\n", i, array[i]);
    }

    return 0;
}

Why is it that having the print statement in the for loop as it is above causes the output to look like this

array[0] == 1819043144
array[1] == 1867980911
array[2] == 6581362
array[3] == 0
array[4] == 0
array[5] == 0
array[6] == 0
array[7] == 0
array[8] == 0
array[9] == 0
array[10] == 0

while having it like this

printf("array[%d] == %d\n", i, ((char *)array)[i]);

makes the output look like this (decimal ASCII value for each character)

array[0] == 72
array[1] == 101
array[2] == 108
array[3] == 108
array[4] == 111
array[5] == 32
array[6] == 87
array[7] == 111
array[8] == 114
array[9] == 108
array[10] == 100

? If I'm reading it as a binary file and want to read byte by byte, why don't I get the right ASCII value using the first print statement?

On a related note, what happens if the input file I send in isn't a text document (e.g., jpeg)?

Sorry if this is an entirely trivial matter, but I can't seem to figure out why.

user2809475 asked Sep 24 '13 04:09



2 Answers

The behaviour is not surprising:

  • You have a file containing 11 characters. sizeof(char) is 1.
  • You allocate an array of 11 ints. sizeof(int) is very likely 4 on your machine, so the array is 44 bytes long.
  • You instruct fread to read up to 11 ints (up to 44 bytes). The first 4 characters are read as one int and stored in array[0], and the next 4 in array[1].
    • If you had checked the return value of fread, it would have told you that it only read 2 elements: the file holds 11 bytes, so only 2 complete ints fit, and the remaining 3 bytes do not make up a full int. In your output those 3 bytes still ended up in array[2], but that element does not count as read.
  • Now you loop over the array and print each int, which is the number built from 4 characters at a time.
  • In your alternative version you pretend that array points to a sequence of chars, so the index advances in 1-byte steps.

The memory layout basically looks like this:

array[0]
|       array[1]
|       |
1 2 3 4 5 6 7 8 9 10 11
| |
| ((char *)array)[1]
((char *)array)[0]

ChrisWue answered Nov 15 '22 07:11


After fseek(input, 0, SEEK_END), ftell returns the current value of the stream's position indicator, which here is the number of bytes in the file.

You then read the file as a sequence of 4-byte ints, so you ask for 4 x size bytes from a file that holds only size bytes. The later elements are never filled in; they happen to print as 0 here, but malloc does not initialise memory, so that is not guaranteed.

Your array should be of type char.

Something like

char *array = malloc(sizeOfFile * sizeof(char));
if (array == NULL) {
  ...
}

/* Item size is 1, so fread's return value is the number of bytes read. */
fread(array, sizeof(char), sizeOfFile, filePointer);
// ..

This is just the idea, not complete code. Hope this helps.

simpletron answered Nov 15 '22 06:11