
C function to insert text at particular location in file without over-writing the existing text

Tags: c, file-io

I have written a program that takes a file as input and, whenever it finds a line longer than 80 characters, adds \ and \n to the file so that no line is wider than 80 characters.

The problem is that I use fseek to position the file pointer before writing \ and \n, so the two characters overwrite two existing characters of the long line. Is there a way to insert text without overwriting the existing text?

Here is my code:

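#include<stdio.h>
#include<stdlib.h>   /* exit() */
#include<string.h>

int main(int argc, char *argv[])
{
  FILE *fp1,*fp2;
  int prev=0,now=0;
  int ch;        /* int, not char, so that EOF can be told apart from data */
  int flag=0;
  long cur=0;    /* file offset just after the most recent space */
  if(argc<2){
    fprintf(stderr,"Usage: %s file\n",argv[0]);
    exit(1);
  }
  fp1=fopen(argv[1],"r+");
  if(fp1==NULL){
    fprintf(stderr,"Unable to open the file to read. Program will exit.\n");
    exit(1);
  }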
  else{
    while((ch=fgetc(fp1))!=EOF){
      if(ch!=' ' && ch!='\n'){
        now=now+1;
      }
      else{
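        if(now>=80){
            /* Seek back to the saved position; the two writes below
               overwrite the two characters stored there. */
            fseek(fp1,cur,SEEK_SET);
            fputc('\\',fp1);
            fputc('\n',fp1);
            fseek(fp1,0,SEEK_CUR);  /* required before reading again after a write */
            now=0;
            continue;
        }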
        if(ch=='\n'){
          flag=0;
          now=0;
          continue;
          }
        else{
          prev=now;
          cur=ftell(fp1);
        }
        now=now+1;
      }
    }
  }
  fclose(fp1);
  return 0;
}

To run it, do the following:

user@ubuntu$ cc xyz.c
user@ubuntu$ ./a.out file_to_check.txt

Rahul asked Jan 27 '12 12:01


1 Answer

There are a couple of techniques for doing this in place, but you're working with a text file and want to perform insertions. Operating systems typically don't offer insertion into the middle of a file as a file system primitive, and there's no reason they should.

The best way to do that kind of thing is to open your file for reading, open a new file for writing, copy the part of the file before the insertion point, insert the data, copy the rest, and then move the new file over the old one.
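
The steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function name insert_at, the ".tmp" suffix and the byte-offset interface are assumptions made for the example, and it relies on rename() replacing an existing file, which holds on POSIX systems such as Linux.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: insert `text` at byte `offset` of the file `path`
 * by writing a new file and renaming it over the original.
 * Returns 0 on success, -1 on failure (the original is left untouched). */
int insert_at(const char *path, long offset, const char *text)
{
    char tmp_path[4096];
    int n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    int ch;
    long pos = 0;
    int ok = 1;

    /* 1. Copy the part of the file before the insertion point. */
    while (pos < offset && (ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF) {
            ok = 0;
            break;
        }
        pos++;
    }

    /* 2. Write the inserted data. */
    size_t len = strlen(text);
    if (ok && fwrite(text, 1, len, out) != len)
        ok = 0;

    /* 3. Copy the rest of the original file. */
    while (ok && (ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF)
            ok = 0;
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) == EOF)
        ok = 0;

    /* 4. Move the new file over the old one, or discard it on error. */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

For example, insert_at("file_to_check.txt", 3, "XY") would turn a file containing "abcdef" into "abcXYdef". For the line-wrapping program in the question, the same idea applies: read from the original, write each character to the new file, and emit \ and \n to the new file whenever the width limit is reached.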

This is a common technique and it has a purpose. If anything goes wrong (e.g. the system crashes partway through), you still have the original file and can repeat the transaction later. If you start two instances of the process and use a fixed naming pattern for the new file, the second instance is able to detect that the transaction has already been started. With exclusive file access, it can even detect whether the transaction was interrupted or is still running.
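
One way the detection described above might look, assuming a POSIX system: create the new file with O_CREAT | O_EXCL, which fails with EEXIST if the file is already there. The names begin_transaction and TXN_BUSY are hypothetical, chosen only for this sketch. Telling an interrupted run apart from one that is still running would additionally need some form of locking on the file (for instance an advisory lock), which is not shown here.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define TXN_BUSY (-2)  /* hypothetical code: the new file already exists */

/* Hypothetical helper: start a "transaction" by creating `tmp_path`
 * exclusively. Returns a file descriptor (>= 0) on success, TXN_BUSY if
 * another instance already created the file (or an earlier run left it
 * behind), and -1 on any other error. */
int begin_transaction(const char *tmp_path)
{
    /* O_EXCL together with O_CREAT makes open() fail if the file exists. */
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0)
        return fd;
    return (errno == EEXIST) ? TXN_BUSY : -1;
}
```

A caller would write the new contents through the returned descriptor, close it, and rename the file over the original; the rename also removes the marker, so the next run can start a fresh transaction.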

This approach is much less error-prone than any technique that operates directly on the original file, and it is what traditional tools such as sed do even when you ask them to work in place (sed -i). Another bonus is that you can always rename the original file to one with a backup suffix before overwriting it (sed offers such an option as well, e.g. sed -i.bak).

The same technique is often used for configuration files, even if your program is writing an entirely new version and doesn't use the original file for that. It hasn't been long since many internet magazines claimed that ext4 accidentally truncates configuration files to zero length. This happened precisely because some applications had their configuration files open and truncated at the moment the system was forcibly shut down. Those applications often tampered with the original configuration files before they had the data ready, and then even kept them open without syncing them, which made the window for data corruption much larger.
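
For the configuration-file case, a sketch under POSIX assumptions might write the complete new contents to a separate file, flush and fsync() it, and only then rename it over the old file. The function name save_config and the ".new" suffix are made up for this example; a more thorough version might also sync the containing directory.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `contents` without ever
 * truncating the original. Returns 0 on success, -1 on failure. */
int save_config(const char *path, const char *contents)
{
    char tmp_path[4096];
    int n = snprintf(tmp_path, sizeof tmp_path, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;

    FILE *out = fopen(tmp_path, "w");
    if (out == NULL)
        return -1;

    size_t len = strlen(contents);
    int ok = (fwrite(contents, 1, len, out) == len);

    /* Push the data out of the stdio buffer, then ask the kernel to
     * write it to disk before the rename makes it visible. */
    ok = ok && fflush(out) == 0;
    ok = ok && fsync(fileno(out)) == 0;
    if (fclose(out) != 0)
        ok = 0;

    /* The old file stays intact until the replacement is complete. */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

The point of the ordering is that at every moment either the complete old file or the complete new file exists under the real name.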

TL;DR version:

When you value your data, don't destroy it before you have the replacement data ready.

Pavel Šimerda answered Sep 25 '22 08:09