Consider these two files:
file1.txt (Windows newline)
abc\r\n
def\r\n
file2.txt (Unix newline)
abc\n
def\n
I've noticed that for file2.txt the position obtained with fgetpos
is not incremented correctly. I'm working on Windows.
Let me show you an example. The following code:
#include <cstdio>

void read(FILE *file)
{
    int c = fgetc(file);
    printf("%c (%d)\n", (char)c, c);

    fpos_t pos;
    fgetpos(file, &pos);  // save the position

    c = fgetc(file);
    printf("%c (%d)\n", (char)c, c);

    fsetpos(file, &pos);  // restore the position - should point to previous
    c = fgetc(file);      // character, which is not the case for file2.txt
    printf("%c (%d)\n", (char)c, c);

    c = fgetc(file);
    printf("%c (%d)\n", (char)c, c);
}

int main()
{
    FILE *file = fopen("file1.txt", "r");
    printf("file1:\n");
    read(file);
    fclose(file);

    file = fopen("file2.txt", "r");
    printf("\n\nfile2:\n");
    read(file);
    fclose(file);

    return 0;
}
produces this output:
file1:
a (97)
b (98)
b (98)
c (99)
file2:
a (97)
b (98)
(-1)
(-1)
file1.txt works as expected, while file2.txt behaves strangely. To find out what is going wrong, I tried the following code:
#include <cstdio>

void read(FILE *file)
{
    int c;
    fpos_t pos;
    while (1)
    {
        fgetpos(file, &pos);
        // Non-portable: the cast only works where fpos_t is an integer type.
        printf("pos: %d ", (int)pos);
        c = fgetc(file);
        if (c == EOF) break;
        printf("c: %c (%d)\n", (char)c, c);
    }
}

int main()
{
    FILE *file = fopen("file1.txt", "r");
    printf("file1:\n");
    read(file);
    fclose(file);

    file = fopen("file2.txt", "r");
    printf("\n\nfile2:\n");
    read(file);
    fclose(file);

    return 0;
}
I got this output:
file1:
pos: 0 c: a (97)
pos: 1 c: b (98)
pos: 2 c: c (99)
pos: 3 c:
(10)
pos: 5 c: d (100)
pos: 6 c: e (101)
pos: 7 c: f (102)
pos: 8 c:
(10)
pos: 10
file2:
pos: 0 c: a (97) // something is going wrong here...
pos: -1 c: b (98)
pos: 0 c: c (99)
pos: 1 c:
(10)
pos: 3 c: d (100)
pos: 4 c: e (101)
pos: 5 c: f (102)
pos: 6 c:
(10)
pos: 8
I know that fpos_t is not meant to be interpreted by the programmer, because its representation depends on the implementation. However, the above example illustrates the problem with fgetpos/fsetpos.
How is it possible that the newline sequence affects the internal position of the file, even before those characters are encountered?
I would say the problem is probably caused by the second file confusing the implementation: it is being opened in text mode, but it doesn't follow the requirements for a text file on this platform.
In the standard,
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character
Your second file contains no valid newline sequences for this platform (the implementation looks for \r\n to convert to the newline character internally). As a result, the implementation may not compute line lengths properly, and may get hopelessly confused when you try to move about in the file.
Additionally,
Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment.
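To make that translation concrete, here is a minimal sketch of what a Windows text-mode stream does to the bytes on input. The helper name translate_crlf is hypothetical and this is not the runtime library's actual code; it only mimics the \r\n to \n conversion described above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mimics the CRLF -> LF translation that a Windows
// text-mode stream applies on input. Illustration only, not the CRT's code.
std::string translate_crlf(const std::string &raw)
{
    std::string text;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        // Drop a '\r' only when it is immediately followed by '\n'.
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n')
            continue;
        text += raw[i];
    }
    return text;
}
```

With this sketch, the 10 bytes of file1.txt become 8 characters, which matches the jump from pos 3 to pos 5 around each newline in the output above. The bytes of file2.txt pass through unchanged, so a position calculation that assumes every newline took two bytes in the file would be off by one for each newline it counts.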
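Bear in mind that the library will not just read each byte from the file as you call fgetc; it will read the entire file (for one so small) into the stream's buffer and operate on that. That is how the newline sequence can affect the reported position before you reach those characters: they are already in the buffer when the position is computed.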
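If the file may contain Unix newlines, one way to sidestep the translation is to open it in binary mode ("rb"), where positions refer to raw bytes. The following is a minimal sketch under that assumption; the function name char_after_restore is hypothetical, and it repeats the save/restore sequence from the question.

```cpp
#include <cstdio>

// Hypothetical helper: reads one character, saves the position, reads a
// second character, restores the position, and returns the character read
// after the restore. Returns EOF if any step fails.
int char_after_restore(const char *path)
{
    FILE *file = std::fopen(path, "rb");  // binary mode: no newline translation
    if (!file)
        return EOF;

    int result = EOF;
    std::fpos_t pos;
    if (std::fgetc(file) != EOF && std::fgetpos(file, &pos) == 0) {
        std::fgetc(file);                  // consume the second character
        if (std::fsetpos(file, &pos) == 0)
            result = std::fgetc(file);     // should be the second character again
    }
    std::fclose(file);
    return result;
}
```

For a file containing abc\ndef\n this returns 'b'. The trade-off is that in binary mode a Windows-style file is no longer translated, so each \r is returned to the caller and has to be handled explicitly.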