
pd.read_html() imports a list rather than a dataframe

I used pd.read_html() to import a table from a webpage, but instead of a DataFrame I got back a list. How can I import the data as a DataFrame? Thank you!

The code is the following:

import pandas as pd
import html5lib

url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)
type(dfs)

Out[1]: list
AlK asked Sep 26 '16 19:09


2 Answers

pd.read_html() returns a list of DataFrames, one per table it finds, because an HTML source can contain several tables. Pick the one you want by index. In your case, there is a single DataFrame:

dfs = pd.read_html(url)
df = dfs[0]
print(df)

Note that if the HTML source contains no tables, read_html() raises a ValueError; it never returns an empty list.
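To make the list behaviour concrete, here is a minimal sketch that parses a small inline HTML snippet instead of the FDIC page. The HTML, the bank names, and the variable names are made up for illustration, and flavor="lxml" is pinned only so the sketch needs a single parser installed (that choice is an assumption, not part of the question). The match argument is a regex that keeps only tables containing matching text.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two tables (made-up data, not the FDIC list).
html = """
<table>
  <tr><th>Bank Name</th><th>City</th></tr>
  <tr><td>First Bank</td><td>Springfield</td></tr>
  <tr><td>Second Bank</td><td>Shelbyville</td></tr>
</table>
<table>
  <tr><th>Bank Name</th><th>Closing Date</th></tr>
  <tr><td>First Bank</td><td>September 2016</td></tr>
</table>
"""

# read_html always returns a list, one DataFrame per table found.
tables = pd.read_html(StringIO(html), flavor="lxml")
print(type(tables), len(tables))

# Index into the list to get an actual DataFrame.
df = tables[0]
print(type(df))

# match= narrows the list to tables whose text matches the regex.
closing = pd.read_html(StringIO(html), match="Closing Date", flavor="lxml")
print(len(closing))

# With no tables in the source, a ValueError is raised instead of
# an empty list being returned.
error_message = None
try:
    pd.read_html(StringIO("<p>no tables here</p>"), flavor="lxml")
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If the page layout might change, wrapping the call in try/except ValueError like this is usually safer than checking the length of the list.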

alecxe answered Oct 23 '22 00:10


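import pandas as pd
import html5lib  # not used directly; pandas loads its HTML parser itself
url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)  # list of DataFrames, one per table on the page
df = pd.concat(dfs)      # stack every table into a single DataFrame
df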
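A rough sketch of what pd.concat() does with such a list, using hand-built DataFrames as hypothetical stand-ins for the read_html() output (the data and column names are invented). With a single table the result matches dfs[0]; with several tables the rows are stacked, the columns become the union of all columns, and missing cells are filled with NaN, so this only makes sense when the tables share a layout.

```python
import pandas as pd

# Hypothetical stand-ins for the list that pd.read_html() returns.
dfs = [
    pd.DataFrame({"Bank Name": ["First Bank", "Second Bank"],
                  "City": ["Springfield", "Shelbyville"]}),
    pd.DataFrame({"Bank Name": ["Third Bank"], "ST": ["IL"]}),
]

# One-element list: concat yields the same data as indexing with [0].
single = pd.concat(dfs[:1])
print(single.equals(dfs[0]))

# Several tables: rows are stacked and columns are unioned; cells that
# a table does not have become NaN. ignore_index=True renumbers rows.
combined = pd.concat(dfs, ignore_index=True)
print(combined)
```

Without ignore_index=True, each table keeps its own row labels, so the combined index can contain duplicates.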
Nikhil Chawla answered Oct 23 '22 02:10