I am making a text editor, and to edit a file I need a way to read only certain bytes from it. I've achieved that using fs.createReadStream with the start and end options.
I also need to replace certain bytes in the file, and I am not sure how this can be done. So far the best solution I've come up with is to read the file using a stream and write to a new file; when I come across the bytes I'm looking for, I write my new content instead, thus replacing the old content with the new.
This is not the best way, as you probably know. To edit 4 bytes of a 2GB file, I am reading the whole 2GB and writing 2GB back, which is not efficient in the least.
What is the best way to achieve this? I've spent two weeks on this. I've also thought of using Buffers, but reading the file into a Buffer loads the entire file into memory, which again is inefficient for a 2GB file.
How would you replace certain bytes in a file without reading the entire file and without installing an npm package that contains C++ code? I don't want my editor to have to compile C++ code.
If doing that is not straightforward, how about deleting certain bytes from a file without reading the entire file? If I can do that, then I can delete the bytes to be replaced and use something like fs.write() to add the ones I want in their place.
Edit #1:
After playing around, I've found that if I open a file with fs.open using the r+ flag and then call fs.write, the written bytes replace the existing ones. So if the text is "Lorem ipsum" and I fs.write "!!!!", the result will be "!!!!m ipsum".
This would work fine if only all the stuff I was going to write was the perfect length. :/
I know what to do in the case that the new content isn't the perfect length, but I don't know how. :/ Maybe if there was some sort of "empty byte"...
Edit #2:
So, as said above, fs.open (with the r+ flag) plus fs.write allows me to overwrite content in a file without reading the entire file, which is terrific. Now with this I am running into a new problem. Let's take the following file:
one\n
two\n
three\n
If I fs.open the file and then fs.write "yes" at byte 0, I end up with:
yes\n
two\n
three\n
If I do the same but instead fs.write "niet", I end up with:
niettwo\n
three\n
Notice how the \n character was replaced with the "t". This is because fs.write overwrites existing bytes when the file is opened with r+ in fs.open. This is the problem I am trying to solve right now.
How would one go about doing something like "from this byte to this byte, replace it with these other bytes", so that my function could be something like function replaceBytes(filePath, newBytes, startByte, endByte), and it would replace only from startByte to endByte, no matter how long newBytes is, whether shorter or longer than endByte - startByte?
Edit #3:
OK, I figured out the case where the new content is shorter than the old content that is being replaced. Thanks to \x00, I've been able to pad the difference. The case where the new and the old content are the same length is not hard, as there's nothing extra to do there.
But the case where the old content is shorter than the new content is still unresolved.
For those curious, this is the working code for old content longer than new content: https://github.com/noedit/file/blob/592a35134440a03d3e3c3e366f6cda7f565c11aa/lib/replaceBytes.js#L27-L34
It does put null bytes in there, though, which, depending on the editor, may show up as characters and look weird. :/
As you've discovered, fs.write with r+ mode allows you to overwrite bytes. This suffices for the case where the added and deleted pieces are exactly the same length.
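The same-length case can be sketched as follows. This is a minimal illustration, not a definitive implementation: `overwriteAt` and the scratch file path are hypothetical names introduced here, and the sketch uses the synchronous `fs` calls for brevity.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: overwrite bytes in place, starting at `position`.
// The file length is unchanged as long as the write stays inside the
// existing content.
function overwriteAt(filePath, position, data) {
  const fd = fs.openSync(filePath, 'r+'); // read/write, no truncation
  try {
    const buf = Buffer.from(data);
    let written = 0;
    // writeSync may write fewer bytes than requested, so loop until done.
    while (written < buf.length) {
      written += fs.writeSync(fd, buf, written, buf.length - written, position + written);
    }
  } finally {
    fs.closeSync(fd);
  }
}

// Demo on a scratch file in the OS temp directory (assumed location).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'overwrite-demo-'));
const demoFile = path.join(demoDir, 'lorem.txt');
fs.writeFileSync(demoFile, 'Lorem ipsum');
overwriteAt(demoFile, 0, '!!!!');
console.log(fs.readFileSync(demoFile, 'utf8')); // !!!!m ipsum
```

Only the four bytes being written are touched; nothing else in the file is read or rewritten.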
When the added text is shorter than the deleted text, I advise that you not fill in with \x00 bytes, as you suggest in one of your edits. Null bytes are not "empty" bytes: they are perfectly valid characters in most types of files, so they stay part of the content (in source code, they will usually cause the compiler/interpreter to throw an error).
The short answer is that this is not generally possible. This is not an abstraction issue; at the file system level, files are stored in chunks of contiguous bytes. There is no generic way to insert/remove from the middle of a file.
The correct way to do this is to seek to the first byte you need to change, and then write the rest of the file (unless you get to a point at which you've added/deleted the same number of bytes, in which case you can stop writing).
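A rough sketch of that approach, assuming a half-open byte range `[start, end)` and synchronous `fs` calls. The names `replaceBytesInPlace`, `readFully`, `writeFully` and the `chunkSize` parameter are made up for this illustration. Everything before `start` is left untouched; only the tail after `end` is moved, one chunk at a time, so memory use stays bounded.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read `length` bytes at `position`, looping over short reads.
function readFully(fd, buf, length, position) {
  let done = 0;
  while (done < length) {
    const n = fs.readSync(fd, buf, done, length - done, position + done);
    if (n === 0) break; // unexpected end of file
    done += n;
  }
}

// Write `length` bytes at `position`, looping over short writes.
function writeFully(fd, buf, length, position) {
  let done = 0;
  while (done < length) {
    done += fs.writeSync(fd, buf, done, length - done, position + done);
  }
}

// Replace bytes [start, end) with newBytes, shifting the tail as needed.
function replaceBytesInPlace(filePath, start, end, newBytes, chunkSize = 64 * 1024) {
  const data = Buffer.from(newBytes);
  const fd = fs.openSync(filePath, 'r+');
  try {
    const size = fs.fstatSync(fd).size;
    const shift = data.length - (end - start); // > 0 grows, < 0 shrinks
    const tailLen = size - end;
    const buf = Buffer.alloc(chunkSize);
    if (shift > 0) {
      // Growing: move the tail right, last chunk first, so that no
      // unread byte gets overwritten.
      for (let remaining = tailLen; remaining > 0;) {
        const n = Math.min(chunkSize, remaining);
        const from = end + remaining - n;
        readFully(fd, buf, n, from);
        writeFully(fd, buf, n, from + shift);
        remaining -= n;
      }
    } else if (shift < 0) {
      // Shrinking: move the tail left, first chunk first, then cut off
      // the leftover bytes at the end.
      for (let moved = 0; moved < tailLen;) {
        const n = Math.min(chunkSize, tailLen - moved);
        readFully(fd, buf, n, end + moved);
        writeFully(fd, buf, n, end + moved + shift);
        moved += n;
      }
      fs.ftruncateSync(fd, size + shift);
    }
    writeFully(fd, data, data.length, start);
  } finally {
    fs.closeSync(fd);
  }
}

// Demo with the file from the question (scratch path is an assumption).
const shiftDir = fs.mkdtempSync(path.join(os.tmpdir(), 'replace-demo-'));
const shiftFile = path.join(shiftDir, 'lines.txt');
fs.writeFileSync(shiftFile, 'one\ntwo\nthree\n');
replaceBytesInPlace(shiftFile, 0, 3, 'niet');
console.log(JSON.stringify(fs.readFileSync(shiftFile, 'utf8'))); // "niet\ntwo\nthree\n"
```

When the lengths match, `shift` is 0 and the function degenerates to a plain overwrite. Note that a crash midway through the shift leaves the file in a mixed state, since the changes are made directly to the original.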
In order to avoid issues with crashing during a long write or something like that, it is common to write to a temporary file location and then mv (rename) the temporary file in place of the actual file you wish to save.
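The temporary-file variant might look like the sketch below. The helper names (`copyRange`, `replaceBytesViaTemp`) and the temp-file naming scheme are assumptions for illustration; the range is again taken as half-open, `[start, end)`. The temporary file is created next to the original so that the rename stays on one file system.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Append bytes [from, to) of srcFd to dstFd, one chunk at a time.
function copyRange(srcFd, dstFd, from, to) {
  const buf = Buffer.alloc(64 * 1024);
  let pos = from;
  while (pos < to) {
    const n = fs.readSync(srcFd, buf, 0, Math.min(buf.length, to - pos), pos);
    if (n === 0) break; // unexpected end of file
    let w = 0;
    while (w < n) w += fs.writeSync(dstFd, buf, w, n - w); // current position
    pos += n;
  }
}

// Build the edited copy in a temp file, then rename it over the original.
function replaceBytesViaTemp(filePath, start, end, newBytes) {
  const tmpPath = `${filePath}.${process.pid}.tmp`; // hypothetical naming
  const data = Buffer.from(newBytes);
  const src = fs.openSync(filePath, 'r');
  const dst = fs.openSync(tmpPath, 'w');
  try {
    const size = fs.fstatSync(src).size;
    copyRange(src, dst, 0, start);     // untouched prefix
    let w = 0;
    while (w < data.length) w += fs.writeSync(dst, data, w, data.length - w);
    copyRange(src, dst, end, size);    // untouched suffix
    fs.fsyncSync(dst);                 // flush to disk before the swap
  } finally {
    fs.closeSync(src);
    fs.closeSync(dst);
  }
  fs.renameSync(tmpPath, filePath);    // swap the new file into place
}

// Demo on a scratch file (path is an assumption).
const tmpDemoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'rename-demo-'));
const tmpDemoFile = path.join(tmpDemoDir, 'lines.txt');
fs.writeFileSync(tmpDemoFile, 'one\ntwo\nthree\n');
replaceBytesViaTemp(tmpDemoFile, 0, 3, 'niet');
console.log(JSON.stringify(fs.readFileSync(tmpDemoFile, 'utf8'))); // "niet\ntwo\nthree\n"
```

This version does copy the whole file, which is the cost of crash safety: the original stays intact until the rename. If an error occurs partway, the sketch leaves the temporary file behind; a real editor would probably want to clean it up.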
Try the code snippet below:
New Solution:
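var fs = require('fs');

var startByte = 3,
    endByte = 6, // inclusive: bytes 3 through 6 are replaced
    newBytes = 'replacing with this line',
    filePath = 'sample.txt',
    tempPath = 'temp.txt';

function replaceBytes(filePath, startByte, endByte, newBytes) {
  var newBuffer = Buffer.from(newBytes);
  // Final size = original size - replaced bytes + new bytes.
  var finalSize = fs.statSync(filePath).size - (endByte - startByte + 1) + newBuffer.length;

  // Step 1: copy everything after endByte into a temporary file.
  var tailReader = fs.createReadStream(filePath, {start: endByte + 1});
  var tempWriter = fs.createWriteStream(tempPath);
  tailReader.pipe(tempWriter);

  tempWriter.on('finish', function () {
    // Step 2: write the new bytes at startByte, then append the saved tail.
    var tempReader = fs.createReadStream(tempPath);
    var fileWriter = fs.createWriteStream(filePath, {start: startByte, flags: 'r+'});
    fileWriter.write(newBuffer);
    tempReader.pipe(fileWriter); // pipe() ends fileWriter once the tail is copied

    fileWriter.on('close', function () {
      // Step 3: drop leftover bytes if the file shrank, then clean up.
      fs.truncateSync(filePath, finalSize);
      fs.unlinkSync(tempPath);
    });
  });
}

replaceBytes(filePath, startByte, endByte, newBytes);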
Old Solution:
s - start byte
R - replacement text
file - file in which the text has to be replaced
var fs = require('fs');

var s = 3,
    R = 'replacing with this line',
    file = 'sample.txt';

// Overwrites bytes starting at s. Nothing is shifted, so this only
// replaces exactly as many bytes as R contains.
function replace(file, s, R) {
  var fsWriteStream = fs.createWriteStream(file, {start: s, flags: 'r+'});
  fsWriteStream.write(R);
  fsWriteStream.end();
}

replace(file, s, R);