I have written a program that takes a file as input and, whenever it finds a line longer than 80 characters, adds \ and \n to the file so that no line is wider than 80 characters.
The problem is that I use fseek to position the stream and then write \ and \n, which overwrites two characters of the line that is too long. Is there a way to insert text without overwriting the existing text?
Here is my code:
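#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>

int main(int argc, char *argv[])
{
    FILE *fp1, *fp2;
    int prev = 0, now = 0;
    int ch;          /* int, not char, so EOF can be told apart from data */
    int flag = 0;
    long cur = 0;    /* offset just past the most recent space */

    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    fp1 = fopen(argv[1], "r+");
    if (fp1 == NULL) {
        printf("Unable to open the file to read. Program will exit.");
        exit(EXIT_FAILURE);
    }
    else {
        while ((ch = fgetc(fp1)) != EOF) {
            if (ch != ' ' && ch != '\n') {
                now = now + 1;
            }
            else {
                if (now >= 80) {
                    /* This is the problem: writing at `cur` overwrites the
                       two characters already stored there. */
                    fseek(fp1, cur, SEEK_SET);
                    fputc('\\', fp1);
                    fputc('\n', fp1);
                    /* A positioning call is required before reading again
                       on a stream opened in update mode. */
                    fseek(fp1, 0, SEEK_CUR);
                    now = 0;
                    continue;
                }
                if (ch == '\n') {
                    flag = 0;
                    now = 0;
                    continue;
                }
                else {
                    prev = now;
                    cur = ftell(fp1);
                }
                now = now + 1;
            }
        }
    }
    fclose(fp1);
    return 0;
}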
To run it, do the following:
user@ubuntu$ cc xyz.c
user@ubuntu$ ./a.out file_to_check.txt
There are a couple of techniques for doing this in place, but you are working with a text file and want to perform insertions. Operating systems typically don't offer insertion into the middle of a file as a file system primitive, and there's no reason they should.
The best way to do that kind of thing is to open your file for reading, open a new file for writing, copy the part of the file before the insertion point, insert the data, copy the rest, and then move the new file over the old one.
This is a common technique, and it has a purpose. If anything goes wrong (e.g. the system crashes), you still have the original file and can repeat the transaction later. If you start two instances of the process and the new file is created under a predictable name, the second instance can detect that the transaction has already been started. With exclusive file access, it can even detect whether the transaction was interrupted or is still running.
This approach is much less error-prone than any technique that works directly on the original file, and it is what traditional tools like sed use even when you ask them to work in place (sed -i). Another bonus is that you can always rename the original file to one with a backup suffix before overwriting it (sed offers such an option as well).
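The copy-then-replace approach described above could look roughly like the following C sketch. The names wrap_stream and wrap_file, the .tmp suffix, and the hard wrap at a fixed column (rather than at a word boundary, as the question's code attempts) are assumptions made for illustration. It also assumes POSIX rename semantics, where renaming over an existing file replaces it; on some other platforms rename fails if the target already exists.

```c
#include <stdio.h>

/* Copy `in` to `out`, breaking every line longer than `width` characters by
 * inserting a backslash and a newline. Assumes width >= 2.
 * Returns 0 on success, -1 on an I/O error. */
int wrap_stream(FILE *in, FILE *out, int width)
{
    int ch;
    int col = 0;   /* characters already written on the current output line */

    while ((ch = fgetc(in)) != EOF) {
        if (ch == '\n') {
            col = 0;
        } else {
            if (col == width - 1) {
                /* Peek one character ahead: break only if the line really
                   continues past `width`. */
                int next = fgetc(in);
                if (next != EOF)
                    ungetc(next, in);
                if (next != EOF && next != '\n') {
                    if (fputs("\\\n", out) == EOF)
                        return -1;
                    col = 0;
                }
            }
            col++;
        }
        if (fputc(ch, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}

/* Wrap the file at `path` by writing a new file next to it and renaming the
 * new file over the old one only after it has been written completely. */
int wrap_file(const char *path, int width)
{
    char tmp[4096];   /* hypothetical temporary name: "<path>.tmp" */
    FILE *in, *out;
    int rc;
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);

    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    in = fopen(path, "r");
    if (in == NULL)
        return -1;
    out = fopen(tmp, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    rc = wrap_stream(in, out, width);
    fclose(in);
    if (fclose(out) == EOF)
        rc = -1;

    /* The original file is untouched until this rename succeeds. */
    if (rc != 0 || rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Calling wrap_file(argv[1], 80) from main would take the place of the fseek-based loop in the question. If anything fails before the rename, the original file is left as it was.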
The same technique is often used for configuration files, even when your program writes an entirely new version and doesn't need the original file to do so. Not long ago, many online magazines claimed that ext4 accidentally truncates configuration files to zero length. The real cause was that some applications had truncated their configuration files and still held them open when the system was forcibly shut down. Those applications often modified the original configuration files before the new data was ready, and then kept them open without syncing them, which made the window for data corruption much larger.
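For the configuration-file case, a minimal sketch on a POSIX system might look like this. The function name save_config and the .tmp suffix are hypothetical. The point is only the ordering: write the new data to a separate file, sync it, and rename it over the old file last. A short write interrupted by a signal is treated as a plain failure here to keep the example small.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace the file at `path` with `data` so that a crash leaves either the
 * complete old file or the complete new file, never a truncated one.
 * Returns 0 on success, -1 on failure. */
int save_config(const char *path, const char *data)
{
    char tmp[4096];   /* hypothetical temporary name: "<path>.tmp" */
    size_t len = strlen(data);
    size_t done = 0;
    int fd;
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);

    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    /* The original file is not opened or truncated at any point. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (done < len) {
        ssize_t w = write(fd, data + done, len - done);
        if (w < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        done += (size_t)w;
    }

    /* Make sure the new data has reached the disk before it becomes visible
       under the real name. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

The file is held open only for the duration of the write, and the old contents stay in place until the rename.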
TL;DR version:
When you value your data, don't destroy it before you have the replacement data ready.