
file name vs file object as a function argument

If a function takes as an input the name of a text file, I can refactor it to instead take a file object (I call it "stream"; is there a better word?). The advantages are obvious - a function that takes a stream as an argument is:

  • much easier to write a unit test for, since I don't need to create a temporary file just for the test
  • more flexible, since I can use it in situations where I somehow already have the contents of the file in a variable
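As a rough illustration of the testing point, here is a minimal sketch. The function `count_words` is a hypothetical example, not something from the question; it assumes the caller passes an open text stream, so a test can hand it an in-memory `io.StringIO` instead of a temporary file.

```python
import io


def count_words(stream):
    """Count whitespace-separated words in a text stream (hypothetical example)."""
    return sum(len(line.split()) for line in stream)


# In a test, an in-memory io.StringIO stands in for a real file on disk.
fake_file = io.StringIO("one two\nthree\n")
word_count = count_words(fake_file)
```

The same function works unchanged with a real file, e.g. inside `with open(path) as f: count_words(f)`.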

Are there any disadvantages to streams? Or should I always refactor a function from a file name argument to a stream argument (assuming, of course, the file is text-only)?

max asked Sep 25 '12 05:09





1 Answer

... Here is how the xml.etree.ElementTree module implements the parse function:

def parse(self, source, parser=None):
    close_source = False
    if not hasattr(source, "read"):
        source = open(source, "rb")
        close_source = True
    ...

Since a file name is a string, it has no read() method (the hasattr check accepts any attribute with that name, not specifically a method), whereas an open file object does. These four lines let the rest of the code handle both cases the same way. The only complication is that you have to remember whether to close the file object (here named source): if the function opened it, the function must close it; if the caller passed it in, it must be left open.
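The open-or-reuse pattern, including the close bookkeeping, might look like the following sketch. `read_first_line` is a hypothetical helper written for illustration, not part of ElementTree; it assumes text mode with UTF-8 encoding when it opens the file itself.

```python
import io


def read_first_line(source):
    """Return the first line from a path or an open text file object.

    Hypothetical helper; mirrors the hasattr(source, "read") check above.
    """
    close_source = False
    if not hasattr(source, "read"):
        # A path was passed in: open it here and remember to close it.
        source = open(source, "r", encoding="utf-8")
        close_source = True
    try:
        return source.readline()
    finally:
        # Close only what this function opened; the caller owns anything else.
        if close_source:
            source.close()


stream = io.StringIO("first\nsecond\n")
first = read_first_line(stream)
still_open = not stream.closed
```

Because the stream was supplied by the caller, it stays open afterwards and the caller can keep reading from it.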

Actually, files differ slightly from streams. Streams are potentially infinite, while files usually are not (unless some device is mapped as if it were a file). The important difference for processing is that you cannot rely on reading a whole stream into memory at once; you have to process it in chunks.
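Chunked processing could be sketched as below. `count_chars` and its `chunk_size` parameter are hypothetical names chosen for illustration; the sketch assumes the stream's `read(n)` returns an empty string at end of input, as Python text file objects do.

```python
import io


def count_chars(stream, chunk_size=4096):
    """Count characters in a text stream without loading it all at once.

    Hypothetical example; reads at most chunk_size characters per call.
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty result signals end of input.
            break
        total += len(chunk)
    return total


total = count_chars(io.StringIO("abcdefghij"), chunk_size=4)
```

For a truly endless stream the loop never terminates, so real code would also need some stopping condition of its own.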

pepr answered Sep 19 '22 06:09
