pandas.read_csv from string or package data

I have some CSV text data in a package that I want to read using read_csv. I was doing this with:

from pkgutil import get_data
from StringIO import StringIO

data = read_csv(StringIO(get_data('package.subpackage', 'path/to/data.csv')))

However, StringIO.StringIO no longer exists in Python 3, and io.StringIO only accepts Unicode text. Is there a simple way to do this?

Edit: the following does not appear to work

import pandas as pd

import pkgutil
from io import StringIO

def get_data_file(pkg, path):
    f = StringIO()
    contents = unicode(pkgutil.get_data('pymc.examples', 'data/wells.dat'))
    f.write(contents)
    return f

wells = get_data_file('pymc.examples', 'data/wells.dat')

data = pd.read_csv(wells, delimiter=' ', index_col='id',
                   dtype={'switch': np.int8})

failing with

  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 401, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 209, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 509, in __init__
    self._make_engine(self.engine)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 611, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 893, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "parser.pyx", line 441, in pandas._parser.TextReader.__cinit__ (pandas/src/parser.c:3940)
  File "parser.pyx", line 551, in pandas._parser.TextReader._get_header (pandas/src/parser.c:5096)
pandas._parser.CParserError: Passed header=0 but only 0 lines in file
John Salvatier asked Dec 20 '13 04:12



1 Answer

To pass a string to pandas read_csv(), you can wrap it in io.StringIO, for example:

import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO("csv string..."))
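For the package-data case in the question, note that pkgutil.get_data returns bytes on Python 3, so the contents need to be decoded before going into io.StringIO, or passed as a binary buffer via io.BytesIO. The following is a minimal sketch of both routes, not a definitive recipe: the helper name read_csv_bytes is hypothetical, the sample bytes are made up as a stand-in for real package data, and UTF-8 is an assumed encoding.

```python
import io

import pandas as pd


def read_csv_bytes(raw, encoding="utf-8", **kwargs):
    """Parse CSV content held in a bytes object (hypothetical helper)."""
    # io.StringIO only accepts str, so decode the bytes to text first.
    text = raw.decode(encoding)
    return pd.read_csv(io.StringIO(text), **kwargs)


# Stand-in for pkgutil.get_data('package.subpackage', 'path/to/data.csv').
# The contents below are invented purely for illustration.
raw = b"id,switch\n1,0\n2,1\n"

# Route 1: decode to text, then wrap in StringIO.
df_text = read_csv_bytes(raw, index_col="id")

# Route 2: skip decoding and give pandas a binary buffer directly.
df_bytes = pd.read_csv(io.BytesIO(raw), index_col="id")
```

Building the buffer from its constructor, as above, leaves the read position at the start. If you instead create an empty StringIO and fill it with write(), as in the question's edit, the position ends up at the end of the data, so call seek(0) on the buffer before handing it to read_csv.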
Pedro Lobito answered Sep 22 '22 04:09