I have many large CSV files (1-10 GB each) that I'm importing into databases. For each file, I need to replace the first line so that the headers are formatted as column names. My current solution is:
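using (var reader = new StreamReader(file))
{
    // fixedFile is the path of the corrected copy ("fixed" alone is a reserved word in C#)
    using (var writer = new StreamWriter(fixedFile))
    {
        var line = reader.ReadLine();
        var fixedLine = parseHeaders(line);
        writer.WriteLine(fixedLine);
        // copy every remaining line unchanged
        while ((line = reader.ReadLine()) != null)
            writer.WriteLine(line);
    }
}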
What is a quicker way to only replace line 1 without iterating through every other line of these huge files?
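If you can guarantee that fixedLine takes the same number of bytes as line, or fewer, you can update the files in place instead of copying them. A shorter header has to be padded to the original length (with trailing spaces, for example, if your import tolerates them) so that the rest of the file does not shift.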
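A minimal sketch of the in-place update, under some assumptions that are not in the question: the file is ASCII or UTF-8 without a byte-order mark, lines end in \n or \r\n, and trailing spaces after the last column name are acceptable. HeaderPatcher and TryReplaceHeaderInPlace are hypothetical names; parseHeaders is passed in as a delegate standing in for the function from the question.

```csharp
using System;
using System.IO;
using System.Text;

static class HeaderPatcher
{
    // Overwrites line 1 of the file in place. Returns false and leaves the file
    // untouched when the new header needs more bytes than the old one.
    // Assumption: ASCII/UTF-8 content without a byte-order mark.
    public static bool TryReplaceHeaderInPlace(string path, Func<string, string> parseHeaders)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            // Collect the bytes of line 1, stopping at the first '\n' or end of file.
            var headerBytes = new MemoryStream();
            int b;
            while ((b = stream.ReadByte()) != -1 && b != '\n')
                headerBytes.WriteByte((byte)b);

            byte[] oldBytes = headerBytes.ToArray();
            int oldLength = oldBytes.Length;
            // Leave a trailing '\r' (Windows line ending) where it is.
            if (oldLength > 0 && oldBytes[oldLength - 1] == '\r')
                oldLength--;

            string oldHeader = Encoding.UTF8.GetString(oldBytes, 0, oldLength);
            byte[] newBytes = Encoding.UTF8.GetBytes(parseHeaders(oldHeader));
            if (newBytes.Length > oldLength)
                return false; // does not fit; fall back to copying the file

            // Pad with spaces so every byte after the header keeps its offset.
            byte[] padded = new byte[oldLength];
            for (int i = 0; i < padded.Length; i++)
                padded[i] = (byte)' ';
            Array.Copy(newBytes, padded, newBytes.Length);

            stream.Seek(0, SeekOrigin.Begin);
            stream.Write(padded, 0, padded.Length);
            return true;
        }
    }
}
```

Only the header bytes are read and written, so the cost no longer depends on the file size. When the method returns false, the file has not been modified and the copying approach is still needed.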
If not, you can possibly get a little performance improvement by accessing the .BaseStream of your StreamReader and StreamWriter and doing big block copies (using, say, a 32 KB buffer) to do the copying, which will at least eliminate the time spent checking every character to see if it's an end-of-line character, as happens now with reader.ReadLine().
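The block-copy idea might look like the sketch below. One caveat: StreamReader reads ahead into its own buffer, so after reader.ReadLine() the position of BaseStream is usually well past the end of line 1. To avoid that, this sketch reads the header directly from the FileStream instead of going through a StreamReader. HeaderCopier and CopyWithNewHeader are hypothetical names, parseHeaders is passed in as a delegate, and the file is assumed to be ASCII or UTF-8 without a byte-order mark.

```csharp
using System;
using System.IO;
using System.Text;

static class HeaderCopier
{
    // Writes a copy of sourcePath to targetPath with line 1 replaced.
    // Assumption: ASCII/UTF-8 content without a byte-order mark.
    public static void CopyWithNewHeader(string sourcePath, string targetPath, Func<string, string> parseHeaders)
    {
        const int BufferSize = 32 * 1024; // the 32 KB block size suggested above

        using (var input = new FileStream(sourcePath, FileMode.Open, FileAccess.Read,
                                          FileShare.Read, BufferSize, FileOptions.SequentialScan))
        using (var output = new FileStream(targetPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, BufferSize))
        {
            // Read line 1 byte by byte so the stream position ends right after its '\n'.
            var headerBytes = new MemoryStream();
            int b;
            while ((b = input.ReadByte()) != -1 && b != '\n')
                headerBytes.WriteByte((byte)b);

            string oldHeader = Encoding.UTF8.GetString(headerBytes.ToArray());
            // Reuse the line ending style of the source file for the new header.
            string newLine = oldHeader.EndsWith("\r") ? "\r\n" : "\n";
            oldHeader = oldHeader.TrimEnd('\r');

            byte[] newHeader = Encoding.UTF8.GetBytes(parseHeaders(oldHeader) + newLine);
            output.Write(newHeader, 0, newHeader.Length);

            // Copy the rest in 32 KB blocks without looking for line breaks.
            input.CopyTo(output, BufferSize);
        }
    }
}
```

This still reads and writes the whole file, so disk throughput remains the limit; the saving is only the per-character line scanning and string allocation.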