Python: "looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup"

Tags:

beautifulsoup

python-2.7

I have changed my Python 2.7 routine to accept a file path as a parameter, so that I don't have to duplicate the code for each file path.

When my method is called I get the following error:

looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup.
  '"%s" looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup.' % markup)

My method implementation is:

def extract_data_from_report3(filename):
    html_report_part1 = open(filename,'r').read()
    soup = BeautifulSoup(filename, "html.parser")
    th = soup.find_all('th')
    td = soup.find_all('td')

    headers = [header.get_text(strip=True) for header in soup.find_all("th")]
    rows = [dict(zip(headers, [td.get_text(strip=True) for td in row.find_all("td")]))
        for row in soup.find_all("tr")[1:-1]]
    print(rows)
    return rows

I call the method as follows:

rows_part1 =  report.extract_data_from_report3(r"E:\test_runners\selenium_regression_test_5_1_1\TestReport\SeleniumTestReport_part1.html")
print "part1 = "
print rows_part1

How can I pass the file name as a parameter?


asked May 17 '16 12:05

Riaz Ladhani

1 Answer

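The message appears because BeautifulSoup(filename, "html.parser") receives the path string itself, not the contents of the file. If you want to pass a file handle, don't call read; just pass open(filename) or the file object directly: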

def extract_data_from_report3(filename):
    html_report_part1 = open(filename,'r')
    soup = BeautifulSoup(html_report_part1, "html.parser")

Or:

def extract_data_from_report3(filename):
    soup = BeautifulSoup(open(filename), "html.parser")

You can also pass html_report_part1 after calling read, as suggested, but you don't need to: BeautifulSoup can take a file object directly.
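Putting this together with the parsing logic from the question, the whole method might look like the sketch below. It assumes bs4 is installed and that the report contains a single table laid out as in the question. The with block is an addition not present in the original snippets; it only ensures the file is closed once parsing is done.

```python
from bs4 import BeautifulSoup


def extract_data_from_report3(filename):
    # Hand the open file object (not the path string) to BeautifulSoup.
    # The with block closes the file as soon as the document is parsed.
    with open(filename) as report_file:
        soup = BeautifulSoup(report_file, "html.parser")

    headers = [th.get_text(strip=True) for th in soup.find_all("th")]
    # Same slicing as in the question: skip the header row and the last row.
    rows = [dict(zip(headers, [td.get_text(strip=True) for td in row.find_all("td")]))
            for row in soup.find_all("tr")[1:-1]]
    return rows
```

The caller still passes only the path, exactly as in the question; the function opens the file itself.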


answered Oct 01 '22 15:10

Padraic Cunningham
