
Browse files and subfolders in Python

I'd like to walk through the current folder and all its subfolders and collect every file with an .htm or .html extension. I found out that I can check whether an entry is a directory or a file like this:

    import os

    dirList = os.listdir("./")  # current directory
    for dir in dirList:
        if os.path.isdir(dir) == True:
            pass  # I don't know how to get into this dir and do the same thing here
        else:
            pass  # I got a file and I can regexp if it is .htm|.html

In the end, I would like to have all the matching files and their paths in an array. Is something like that possible?

Blackie123 asked Apr 28 '11 10:04





2 Answers

You can use os.walk() to recursively iterate through a directory and all its subdirectories:

    import os

    path = "."  # starting directory; "." is the current folder, as in the question

    for root, dirs, files in os.walk(path):
        for name in files:
            if name.endswith((".html", ".htm")):
                pass  # whatever

To build a list of these names, you can use a list comprehension:

    htmlfiles = [os.path.join(root, name)
                 for root, dirs, files in os.walk(path)
                 for name in files
                 if name.endswith((".html", ".htm"))]
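
For comparison, the manual approach from the question can be completed with a recursive function. This is only a sketch: `find_html_files` is a hypothetical name, and os.walk() remains the simpler choice. Note that each entry has to be joined with its parent directory before calling os.path.isdir(); the bare name from os.listdir() only resolves correctly while you are listing the current working directory.

```python
import os


def find_html_files(directory):
    """Return paths of .htm/.html files under directory, found by manual recursion."""
    matches = []
    for entry in os.listdir(directory):
        # os.listdir() returns bare names, so build the full path first.
        full_path = os.path.join(directory, entry)
        if os.path.isdir(full_path):
            # Descend into the subdirectory and repeat the same steps there.
            matches.extend(find_html_files(full_path))
        elif entry.endswith((".html", ".htm")):
            matches.append(full_path)
    return matches
```

Unlike os.walk(), which does not follow symbolic links to directories by default, os.path.isdir() does follow them, so this sketch would keep recursing on a link loop until it fails.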
Sven Marnach answered Sep 21 '22 14:09



I had a similar thing to work on, and this is how I did it.

    import os

    rootdir = os.getcwd()

    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            # Build the full path from the containing directory and the file name
            filepath = os.path.join(subdir, file)

            # Match both extensions asked for in the question
            if filepath.endswith((".html", ".htm")):
                print(filepath)
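
If the paths should end up in a list rather than being printed, as the question asks, the same loop can be wrapped in a function. A minimal sketch: `collect_html_files` is a made-up name, and the case-insensitive match is my own assumption, not something the question requires.

```python
import os


def collect_html_files(rootdir):
    """Walk rootdir and return the paths of all .htm/.html files as a list."""
    filepaths = []
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            # lower() lets the check match names such as INDEX.HTML as well.
            if file.lower().endswith((".html", ".htm")):
                filepaths.append(os.path.join(subdir, file))
    return filepaths
```

Calling `collect_html_files(os.getcwd())` then gives a list you can loop over or pass on to other code.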

Hope this helps.

Pragyaditya Das answered Sep 21 '22 14:09
