I am trying to get a list of strings with the file path and the file name. At the moment I only get the file names into the list.
Code:
hamFileNames = os.listdir("train_data\ham")
Output:
['0002.1999-12-13.farmer.ham.txt',
'0003.1999-12-14.farmer.ham.txt',
'0005.1999-12-14.farmer.ham.txt']
I would like output similar to this:
['train_data\ham\0002.1999-12-13.farmer.ham.txt',
'train_data\ham\0003.1999-12-14.farmer.ham.txt',
'train_data\ham\0005.1999-12-14.farmer.ham.txt']
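Since you already have the directory path, you can join it onto each name yourself:
import os
directory = r"train_data\ham"
# list() is needed on Python 3, where map returns a lazy iterator
output = list(map(lambda p: os.path.join(directory, p), os.listdir(directory)))
or, more simply:
output = [os.path.join(directory, p) for p in os.listdir(directory)]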
Here, os.path.join joins your directory path with each file name inside it, using the platform's path separator.
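As a rough, self-contained sketch of the same idea: the helper name list_full_paths and the file names below are made up for illustration, and a temporary directory stands in for train_data\ham so the snippet runs anywhere.

```python
import os
import tempfile


def list_full_paths(directory):
    """Return every entry of `directory` with the directory path prepended."""
    # os.listdir gives bare names in arbitrary order; sort for stable output.
    return sorted(os.path.join(directory, name) for name in os.listdir(directory))


# Build a throwaway directory holding two empty files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    ham_dir = os.path.join(tmp, "ham")
    os.mkdir(ham_dir)
    for name in ("0002.ham.txt", "0003.ham.txt"):
        open(os.path.join(ham_dir, name), "w").close()

    full_paths = list_full_paths(ham_dir)
    base_names = [os.path.basename(p) for p in full_paths]
    print(full_paths)
```

Each returned string starts with the directory that was passed in, so it can be handed straight to open() without further joining.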
If you're on Python 3.5 or higher, skip os.listdir in favor of os.scandir, which is both more efficient and does the work for you (path is an attribute of the result objects):
hamFileNames = [entry.path for entry in os.scandir(r"train_data\ham")]
This also lets you filter cheaply (scandir includes some file info for free, without stat-ing the file), e.g. to keep only files (no directories or special file-system objects):
hamFileNames = [entry.path for entry in os.scandir(r"train_data\ham") if entry.is_file()]
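A small runnable sketch of that filter, under a few assumptions: list_files_only is a hypothetical helper name, the files and subdirectory are created in a temporary directory purely for the demo, and the with form of os.scandir needs Python 3.6 or later (on 3.5, iterate over os.scandir(...) directly as in the one-liner above).

```python
import os
import tempfile


def list_files_only(directory):
    """Return full paths of regular files in `directory`, skipping subdirectories."""
    # The context manager closes the scandir iterator promptly (Python 3.6+).
    with os.scandir(directory) as entries:
        return sorted(entry.path for entry in entries if entry.is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Two files and one subdirectory; only the files should be returned.
    open(os.path.join(tmp, "a.txt"), "w").close()
    open(os.path.join(tmp, "b.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))

    files = list_files_only(tmp)
    names = [os.path.basename(p) for p in files]
    print(names)
```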
If you're on 3.4 or below, you may want to look at the PyPI scandir module (which provides the same API on earlier Python).
Also note: I used a raw string for the path. While \h happens to work without it (Python leaves unrecognized escapes as they are, although Python 3.6 and later flag them with a warning), you should always use raw strings for Windows path literals, or you'll get a nasty shock when you try to use "train_data\foo" (where \f is the ASCII form feed character), while r"train_data\foo" works just fine (because the r prefix stops Python from interpreting backslash escapes; the one exception is that a backslash still keeps a following quote character from ending the string).
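To see the difference concretely, here is a short sketch comparing the two literals from the paragraph above (the variable names plain and raw are only for illustration):

```python
plain = "train_data\foo"   # \f is parsed as one form feed character
raw = r"train_data\foo"    # the backslash is kept literally

print(len(plain))          # 13: the backslash and "f" collapsed into one character
print(len(raw))            # 14: the backslash and "f" are separate characters
print("\x0c" in plain)     # True: the form feed is really there
print(raw == "train_data\\foo")  # True: same as escaping the backslash by hand
```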