
os.walk() not picking up my file names

Tags: python, os.walk

I'm trying to use a Python script to edit a large directory of .html files in a loop, and I'm having trouble looping through the filenames using os.walk(). The chunk of code below just reads each html file into a string that I can work with, but the script never enters the loop, as if the files don't exist: it prints point1 but never reaches point2, and it ends without an error message. The top-level folder is called "amazon"; it contains one level of 20 subfolders, each holding 20 html files.

Oddly the code works perfectly on a neighboring directory that only contains .txt files, but it seems like it's not grabbing my .html files for some reason. Is there something I don't understand about the structure of the for root, dirs, filenames in os.walk() loop? This is my first time using os.walk, and I've looked at a number of other pages on this site to try to make it work.

import os

rootdir = 'C:\filepath\amazon'
print "point1"
for root, dirs, filenames in os.walk(rootdir):
    print "point2"
    for file in filenames:
        with open (os.path.join(root, file), 'r') as myfile:
             g = myfile.read()
        print g

Any help is much appreciated.

user3087978 asked Dec 26 '22 08:12


2 Answers

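In a Python string literal the backslash is an escape character, so the \f and \a in your path are read as a form feed and a bell character, and os.walk() is handed a path that does not exist. Either double each backslash, or use a "raw string" by prefixing the literal with r.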

Example:

>>> 'C:\filepath\amazon'
'C:\x0cilepath\x07mazon'
>>> r'\x'
'\\x'
>>> '\x'
ValueError: invalid \x escape
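
Applied to the path from the question, the two fixes can be compared side by side. This is only a minimal sketch: the variable names broken, doubled and raw are made up for illustration, and print() is written as a function so the snippet also runs on Python 3.

```python
# The literal from the question: \f becomes a form feed, \a becomes a bell.
broken = 'C:\filepath\amazon'

# Fix 1: double every backslash.
doubled = 'C:\\filepath\\amazon'

# Fix 2: raw string, backslashes are kept literally.
raw = r'C:\filepath\amazon'

print(repr(broken))      # shows the control characters that replaced \f and \a
print(doubled == raw)    # both fixes spell the same path
```

Either fixed value can be assigned to rootdir in the question's script.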

Explanation: In Python, what does preceding a string literal with “r” mean?
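
The missing error message has a separate cause: by default os.walk() ignores errors raised while listing a directory, so a root that does not exist simply yields nothing. The optional onerror argument makes the failure visible. A rough sketch, where walk_or_raise is a hypothetical helper name and the missing folder is created under a temporary directory purely for the demonstration:

```python
import os
import tempfile


def walk_or_raise(rootdir):
    """Like os.walk(rootdir), but re-raise listing errors instead of hiding them."""
    def fail(error):
        # os.walk calls this with the OSError it would otherwise swallow.
        raise error
    return os.walk(rootdir, onerror=fail)


# A path that is guaranteed not to exist (illustrative only).
missing = os.path.join(tempfile.mkdtemp(), 'no_such_folder')

# Default behaviour: the error is ignored and the loop body never runs.
silent = list(os.walk(missing))

# With onerror: the same mistake raises as soon as the walk starts.
try:
    list(walk_or_raise(missing))
    raised = False
except OSError:
    raised = True

print(silent, raised)
```

Checking os.path.isdir(rootdir) before the loop would be another way to catch the same mistake early.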

johntellsall answered Dec 31 '22 14:12


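You can avoid typing the separators between path components by using os.path.join. On Windows the drive still needs its own trailing separator, because os.path.join('C:', 'filepath') produces the drive-relative path 'C:filepath' rather than 'C:\filepath':

rootdir = os.path.join('C:\\', 'filepath', 'amazon')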
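
One caveat when joining onto a bare drive letter on Windows: a drive without a trailing separator gives a path relative to the current directory on that drive. The sketch below uses ntpath, the Windows implementation behind os.path, so it behaves the same on any operating system; the variable names are illustrative.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

# Bare drive letter: the result is relative to the current directory on C:.
drive_relative = ntpath.join('C:', 'filepath', 'amazon')

# Drive letter plus separator: the result is an absolute path.
absolute = ntpath.join('C:\\', 'filepath', 'amazon')

print(drive_relative)
print(absolute)
print(ntpath.isabs(drive_relative), ntpath.isabs(absolute))
```

On a Windows machine os.path.join gives the same results, since os.path is ntpath there.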

huu answered Dec 31 '22 13:12