What's the easiest way of getting this data into a pandas DataFrame?

Tags:

pandas

I came across this dataset:

http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data

and I couldn't find a simple way of getting it into a pandas DataFrame. I manually parsed the file into a list of lists and then called the DataFrame constructor, but is there an easier way of doing this? Thanks!


asked Nov 06 '12 03:11

vgoklani

1 Answer

Try using pandas.read_fwf and specify a list of column widths (including whitespace):

In [35]: url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'

In [36]: widths = [7, 4, 10, 10, 11, 7, 4, 4, 30]

In [37]: df = pd.read_fwf(url, widths=widths, header=None, na_values=['?'])

In [38]: df.irow(0)
Out[38]: 
X0                              18
X1                               8
X2                             307
X3                             130
X4                            3504
X5                              12
X6                              70
X7                               1
X8    "chevrolet chevelle malibu"
Name: 0
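The session above was recorded on an old pandas release; `irow` is no longer available in current versions, where `iloc` does the same job. Here is a minimal sketch of the same approach that runs offline. It makes some assumptions: the two sample rows are rebuilt by hand and padded to the widths given in the answer rather than downloaded, and the column names are taken from the dataset's description, not from the answer itself.

```python
import io

import pandas as pd

# Column widths from the answer above (each width includes trailing whitespace).
widths = [7, 4, 10, 10, 11, 7, 4, 4, 30]

# Assumed column names, based on the dataset's description; the data file has no header row.
names = ["mpg", "cylinders", "displacement", "horsepower", "weight",
         "acceleration", "model_year", "origin", "car_name"]

# Two hand-built sample rows, padded to the widths above so the example
# needs no network access. "?" marks a missing value in this dataset.
rows = [
    ["18.0", "8", "307.0", "130.0", "3504.", "12.0", "70", "1",
     '"chevrolet chevelle malibu"'],
    ["25.0", "4", "98.00", "?", "2046.", "19.0", "71", "1", '"ford pinto"'],
]
sample = "\n".join(
    "".join(field.ljust(width) for field, width in zip(row, widths))
    for row in rows
)

df = pd.read_fwf(
    io.StringIO(sample),   # replace with the dataset URL to read the real file
    widths=widths,
    header=None,
    names=names,
    na_values=["?"],       # treat "?" as missing
)

# Strip any surrounding double quotes from the car names.
df["car_name"] = df["car_name"].str.strip('"')

# Modern replacement for df.irow(0): positional row access.
first = df.iloc[0]
print(first)
```

Passing `names` gives the columns readable labels in place of the default ones shown in the transcript above. If the widths do not line up exactly with the real file, calling `pd.read_fwf` without `widths` lets pandas infer the column boundaries from the first rows, which may be worth trying as well.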

answered Oct 08 '22 11:10

Chang She
