
Pandas: read_html


I'm trying to extract the list of US states from a Wikipedia page, using Python pandas.

    import pandas as pd
    import html5lib
    f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

However, the above code gives me an error:

    ImportError                               Traceback (most recent call last)
    in ()
          1 import pandas as pd
    ----> 2 f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

            if flavor in ('bs4', 'html5lib'):
        662     if not _HAS_HTML5LIB:
    --> 663         raise ImportError("html5lib not found, please install it")
        664     if not _HAS_BS4:
        665         raise ImportError("BeautifulSoup4 (bs4) not found, please install it")

    ImportError: html5lib not found, please install it

I installed html5lib and beautifulsoup4 as well, but it still does not work. Can someone please help?

user4943236 asked Jan 01 '16 09:01


People also ask

What is pandas read_html?

The pandas read_html() function is a quick and convenient way to turn an HTML table into a pandas DataFrame. This function can be useful for quickly incorporating tables from various websites without figuring out how to scrape the site's HTML.

Can pandas read HTML file?

We can read the tables in an HTML document with the read_html() function, which returns them as a list of pandas DataFrames. It can read from a file or a URL.

How do I read a .TXT file in pandas?

We can read data from a text file using read_table() in pandas. This function reads a general delimited file into a DataFrame. It is essentially the same as read_csv(), except that the default delimiter is a tab ('\t') instead of a comma.


1 Answer

I'm running Python 3.4 on a Mac.

In a new pyvenv, install the packages:

    pip install pandas
    pip install lxml
    pip install html5lib
    pip install BeautifulSoup4

Then run your example and it should work:

    import pandas as pd
    import html5lib
    f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
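
If the ImportError still appears after installing, one possible cause (an assumption, since the question does not say how the packages were installed) is that pip installed them for a different Python interpreter than the one running the code. A minimal sketch for checking this from the same interpreter or notebook, using only the standard library; `PARSER_MODULES` and `missing_modules` are hypothetical names introduced here for illustration:

```python
import importlib.util
import sys

# Import names of the parser packages involved here.
# "bs4" is the import name of the beautifulsoup4 package.
PARSER_MODULES = ("lxml", "html5lib", "bs4")


def missing_modules(names=PARSER_MODULES):
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Show which interpreter is running, then report what it cannot import.
print("Interpreter:", sys.executable)
missing = missing_modules()
if missing:
    print("Not importable here:", ", ".join(missing))
    # Calling pip through this interpreter installs into the same environment.
    print("Try:", sys.executable, "-m pip install", " ".join(missing))
else:
    print("All parser modules are importable from this interpreter.")
```

Anything reported as missing can be installed with the printed command, which uses the pip that belongs to the running interpreter. An already running notebook kernel may need a restart before pandas picks up the newly installed packages.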
Tim Seed answered Sep 24 '22 22:09