
Why isn't truncate defaulting properly to the current position for files?

In an answer to another question, an odd behavior was observed that is specific to Python 3. The documentation for the truncate method states (emphasis mine):

Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn’t changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.

However...

>>> open('temp.txt', 'w').write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')
32
>>> f = open('temp.txt', 'r+')
>>> f.readline()
'ABCDE\n'
>>> f.tell()
6                   # As expected, current position is 6 after the readline
>>> f.truncate()
32                  # ?!

Instead of truncating at the current position (6), it truncated at the end of the file (i.e. not at all). This was verified by checking the file on disk.

This process works as expected (file truncated to 6 bytes) in Python 2, and also in Python 3 using a StringIO instead of a file. Why is it not working as expected with files in Python 3? Is this a bug?

(Edit: it also works properly if an explicit f.seek(6) is given right before the truncate.)
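
For comparison, the StringIO case mentioned above can be reproduced with a short script. This is only a sketch of the in-memory behavior using the same sample text; it says nothing about why real files differ.

```python
import io

# Same content as temp.txt in the question, held in memory instead of on disk.
s = io.StringIO('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')

first_line = s.readline()   # 'ABCDE\n'
position = s.tell()         # 6
new_size = s.truncate()     # no size given, so the current position is used
remaining = s.getvalue()    # only the first line is left

print(position, new_size, repr(remaining))   # 6 6 'ABCDE\n'
```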

asked Jan 19 '16 14:01 by glibdud


1 Answer

>>> open('temp.txt', 'w').write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')
32
>>> f = open('temp.txt', 'r+')
>>> f.readline()
'ABCDE\n'
>>> f.seek(6)
6
>>> f.truncate()
6

If nothing else, this fixes the issue. I have no idea why it happens, but it would be worth reporting upstream if it hasn't been already.
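
The same workaround can be wrapped in a small helper. This is a minimal sketch, not an official fix: truncate_at_current_position is a hypothetical name, and it passes the position to truncate() explicitly instead of seeking first, which should have the same effect. The temporary directory, encoding and newline='' arguments are assumptions made so the example behaves the same everywhere.

```python
import os
import tempfile


def truncate_at_current_position(f):
    """Truncate f at its logical position by passing the size explicitly.

    Hypothetical helper: it avoids relying on truncate()'s default size.
    """
    return f.truncate(f.tell())


# Recreate the file from the question in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'temp.txt')
with open(path, 'w', encoding='utf-8', newline='') as out:
    out.write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')

with open(path, 'r+', encoding='utf-8', newline='') as f:
    f.readline()                                 # logical position is now 6
    new_size = truncate_at_current_position(f)   # truncates at 6

final_size = os.path.getsize(path)
with open(path, encoding='utf-8', newline='') as f:
    contents = f.read()

print(new_size, final_size, repr(contents))
```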

These are the only textual differences I could find between the truncate() functions in Python 3 and Python 2 (apart from the functions that truncate() itself calls, obviously):

33,34c33,34
<             except AttributeError as err:
<                 raise TypeError("an integer is required") from err
---
>             except AttributeError:
>                 raise TypeError("an integer is required")
54c54
<         """Truncate size to pos, where pos is an int."""
---
>         """Truncate size to pos."""

I'm sure someone will slap my fingers on the subject, but I think it's more related to the flush() calls and how the buffer is handled once you call flush. It's almost as if it doesn't reset to its previous position after flushing all the I/O. It's a wild assumption with nothing technical to back it up yet, but it would be my first guess.
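
One way to probe that guess is to compare the position reported by the text file object with the position of the binary buffer underneath it (f.buffer). This is only an exploratory sketch: it assumes CPython's default read-ahead, where the text layer pulls in a whole chunk of the file on the first readline(), and it does not by itself show where truncate() goes wrong.

```python
import os
import tempfile

# Recreate the 32-byte file from the question in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'temp.txt')
with open(path, 'w', encoding='utf-8', newline='') as out:
    out.write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')

with open(path, 'r+', encoding='utf-8', newline='') as f:
    f.readline()
    logical_pos = f.tell()          # where the text layer says we are
    buffered_pos = f.buffer.tell()  # where the underlying binary buffer is

print(logical_pos, buffered_pos)
```

With a file this small, the whole file has been read ahead after the first readline(), so the underlying position sits at the end of the file (32) while the logical position is 6. That is at least consistent with truncate() acting on the wrong position.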

I checked into the flush() situation. The only difference between the two is that Python 2 performs the following operation, which Python 3 does not (it even lacks the source code for it):

def _flush_unlocked(self):
    if self.closed:
        raise ValueError("flush of closed file")
    while self._write_buf:
        try:
            n = self.raw.write(self._write_buf)
        except BlockingIOError:
            raise RuntimeError("self.raw should implement RawIOBase: it "
                               "should not raise BlockingIOError")
        except IOError as e:
            if e.errno != EINTR:
                raise
            continue
        if n is None:
            raise BlockingIOError(
                errno.EAGAIN,
                "write could not complete without blocking", 0)
        if n > len(self._write_buf) or n < 0:
            raise IOError("write() returned incorrect number of bytes")
        del self._write_buf[:n]

It's a method of BufferedWriter, which appears to be used in this I/O operation.
Now I'm late for a date so gotta dash, but it will be interesting to see what you guys find in the meantime!

answered Nov 14 '22 22:11 by Torxed