Faster way to merge multiple files

Tags: file, linux, bash

I have about 70,000 small files on Linux. I want to append a word (the filename) to the end of each line of every file and then merge them all into a single file.

I'm using this script:

for fn in *.sms.txt 
do 
    sed 's/$/'$fn'/' $fn >> sms.txt
    rm -f $fn
done

Is there a faster way to do this?

asked by user1815910


2 Answers

I tried with these files:

for ((i=1;i<70000;++i)); do printf -v fn 'file%.5d.sms.txt' $i; echo -e "HAHA\nLOL\nBye" > "$fn"; done

I tried your solution: it took about 4 minutes (real) to finish. The problem with your approach is that it forks a sed process 70,000 times, and forking is rather slow.
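
To get a feel for the cost of forking alone, here is a rough sketch of my own (it is not part of the original timings, and the numbers will vary by machine): it pits thousands of short-lived sed processes against a single sed reading the same amount of data.

# 10,000 one-line sed invocations: one fork+exec per line of input
time for ((i=0; i<10000; ++i)); do echo x | sed 's/x/y/' >/dev/null; done

# The same 10,000 lines through a single sed process
time seq 10000 | sed 's/.*/y/' >/dev/null

The script below avoids that cost by driving a single ed process instead: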

#!/bin/bash

filename="sms.txt"

# Create file "$filename" or empty it if it already existed
> "$filename"

# Start editing with ed, the standard text editor
ed -s "$filename" < <(
   # Go into insert mode:
   echo i
   # Loop through files
   for fn in *.sms.txt; do
      # Loop through lines of file "$fn"; IFS= and -r keep leading
      # whitespace and backslashes intact
      while IFS= read -r l; do
         # Insert line "$l" with "$fn" appended to
         echo "$l$fn"
      done < "$fn"
   done
   # Tell ed to quit insert mode (.), to save (w) and quit (q)
   echo -e ".\nwq"
)

This solution took ca. 6 seconds.
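
For comparison (my own sketch, not something I timed on these files), a single awk pass can also append the filename and merge everything without forking once per file. It assumes a POSIX awk and that the expanded *.sms.txt glob stays under the kernel's argument-length limit:

# One awk process reads every input file; FILENAME holds the name of the
# file currently being read, so it gets appended to each line verbatim
awk '{ print $0 FILENAME }' *.sms.txt > sms.txt

Unlike the loop in the question, this forks exactly one process.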

Don't forget, ed is the standard text editor, and don't overlook it! If you enjoyed ed, you'll probably also enjoy ex!

Cheers!

answered by gniourf_gniourf


Almost the same as gniourf_gniourf's solution, but without ed:

for i in *.sms.txt
do
   # Read each line verbatim and append the filename, as in the question
   while IFS= read -r line
   do
     echo "$line$i"
   done < "$i"
done > sms.txt
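
As a quick sanity check (my own addition; it assumes the input files have not been deleted yet), the merged file should contain exactly as many lines as all the inputs combined:

# Total lines across the input files vs. lines in the merged output
cat *.sms.txt | wc -l
wc -l < sms.txt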
answered by Guru