After looking all over the Internet, I've come to this.
Let's say I have already made a text file that reads: Hello World
Well, I want to remove the very last character (in this case d) from this text file.
So now the text file should look like this: Hello Worl
But I have no idea how to do this.
All I want, more or less, is a single backspace function for text files on my HDD.
This needs to work on Linux as that's what I'm using.
For a string held in memory in Java, the easiest way is to use the built-in substring() method of the String class. To remove the last character of a given String, pass two parameters: 0 as the starting index, and length() - 1 as the end index. The end index is exclusive, so the last character is left out.
tell() b/c you do not have 'b' access. then you can set the cursor to the starting of the last element. Then you can delete the last element by an empty string. Or cleaner to f.
Using the truncate command. The truncate command shrinks or extends a file to a given size. With the option -s -1 (for example, truncate -s -1 filename), it reduces the size of the file by one byte, removing the last byte from the end of the file. Because truncate only changes the file's size instead of rewriting its contents, it takes very little time, even on large files.
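If you would rather do the same thing from Python than from the shell, here is a minimal sketch of the equivalent operation. It assumes Python 3.3+ on Linux, where os.truncate() is available; the helper name shrink_by_one_byte is made up for illustration. Like truncate -s -1, it removes one byte, not necessarily one character.

```python
import os


def shrink_by_one_byte(path):
    """Cut the final byte off the file at `path` (hypothetical helper)."""
    size = os.path.getsize(path)
    if size == 0:
        # An empty file has nothing to remove.
        return
    # Resize the file in place to one byte less than its current size.
    os.truncate(path, size - 1)
```

Calling shrink_by_one_byte on a file containing Hello World (with no trailing newline) should leave Hello Worl.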
To look at the last few lines of a file, use the tail command. tail works the same way as head: type tail and the filename to see the last 10 lines of that file, or type tail -number filename to see the last number lines of the file.
Use fileobject.seek() to seek 1 position from the end, then use file.truncate() to remove the remainder of the file:
    import os

    with open(filename, 'rb+') as filehandle:
        filehandle.seek(-1, os.SEEK_END)
        filehandle.truncate()
This works fine for single-byte encodings. If you have a multi-byte encoding (such as UTF-16 or UTF-32) you need to seek back enough bytes from the end to account for a single codepoint.
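As a rough sketch of that idea for a fixed-width encoding: in UTF-32 every code point takes 4 bytes, so seeking back 4 bytes instead of 1 removes exactly one code point. The helper name remove_last_bytes and its width parameter are assumptions for illustration. With width=2 it removes one UTF-16 code unit, which is only a whole character when that character is not part of a surrogate pair.

```python
import os


def remove_last_bytes(path, width):
    """Drop the final `width` bytes of the file (hypothetical helper).

    width=4 removes one UTF-32 code point;
    width=2 removes one UTF-16 code unit.
    """
    with open(path, 'rb+') as filehandle:
        filehandle.seek(0, os.SEEK_END)
        if filehandle.tell() < width:
            # File is too short to hold a full unit; leave it alone.
            return
        # Step back `width` bytes from the end and cut the file there.
        filehandle.seek(-width, os.SEEK_END)
        filehandle.truncate()
```

For a file written with the 'utf-32-le' codec, remove_last_bytes(path, 4) should turn Hello World into Hello Worl.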
For variable-byte encodings, it depends on the codec if you can use this technique at all. For UTF-8, you need to find the first byte (from the end) where bytevalue & 0xC0 != 0x80 is true, and truncate from that point on. That ensures you don't truncate in the middle of a multi-byte UTF-8 codepoint:
    import os

    with open(filename, 'rb+') as filehandle:
        # move to the last byte, then scan backward until a
        # non-continuation byte is found
        filehandle.seek(-1, os.SEEK_END)
        # read(1) returns a bytes object, so convert it with ord()
        # before applying the bit mask
        while ord(filehandle.read(1)) & 0xC0 == 0x80:
            # we just read 1 byte, which moved the file position forward,
            # skip back 2 bytes to move to the byte before the current.
            filehandle.seek(-2, os.SEEK_CUR)

        # last read byte is our truncation point, move back to it.
        filehandle.seek(-1, os.SEEK_CUR)
        filehandle.truncate()
Note that UTF-8 is a superset of ASCII, so the above works for ASCII-encoded files too.
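To try this on both ASCII and multi-byte text, here is a sketch that wraps the same backward scan in a function and also tolerates an empty file. It assumes Python 3 (indexing a bytes object yields an int); the name remove_last_utf8_char is made up for illustration.

```python
import os


def remove_last_utf8_char(path):
    """Remove the last UTF-8 encoded code point from a file (hypothetical helper)."""
    with open(path, 'rb+') as filehandle:
        filehandle.seek(0, os.SEEK_END)
        pos = filehandle.tell()
        # Walk backward one byte at a time until we hit a byte that is
        # not a continuation byte (continuation bytes look like 10xxxxxx).
        while pos > 0:
            pos -= 1
            filehandle.seek(pos)
            if filehandle.read(1)[0] & 0xC0 != 0x80:
                break
        # `pos` is now the offset of the first byte of the last code
        # point (or 0 for an empty file); cut the file there.
        filehandle.truncate(pos)
```

On a file containing Hello World this should leave Hello Worl, and on a UTF-8 file containing café it should leave caf, removing both bytes of the é.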