
Should we use pandas.compat.StringIO or Python 2/3 StringIO?

StringIO is the file-like string buffer object we use when reading a pandas DataFrame from text, e.g. in "How to create a Pandas DataFrame from a string?"
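For context, here is a minimal sketch of the pattern in question, assuming Python 3 and an installed pandas. The sample CSV text and column names are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data; any delimited text works the same way.
csv_text = """name,score
alice,90
bob,85
"""

# StringIO wraps the string in a file-like object that read_csv can consume.
df = pd.read_csv(StringIO(csv_text))
print(df.shape)
```

The only part that varies between the options discussed here is where StringIO is imported from; the read_csv call stays the same.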

Which of these two imports should we use for StringIO (within pandas)? This is a long-running question that has gone unresolved for four years.

  1. StringIO.StringIO (Python 2) / io.StringIO (Python 3)
    • Advantage: more stable, so better for future-proofing code. Disadvantage: it forces us to version-fork; see the code from EmilH at the bottom.
  2. pandas.compat.StringIO
    • pandas.compat is a 2/3 compatibility package ("without the need for 2to3") introduced back in 0.13.0 (Jan 2014)
    • The pandas.compat package is still marked 'private' as of 0.22, with no plans to make it 'public'. The docs say: "Warning: The pandas.core, pandas.compat, and pandas.util top-level modules are considered to be PRIVATE. Stability of functionality in those modules is not guaranteed." In practice, though, they essentially haven't broken since 0.13.
    • The pandas.compat source defines imports for builtins, StringIO/cStringIO, BytesIO, cPickle, httplib, and iterator versions of range, filter, map and zip, plus other elements needed for Python 3 compatibility; see the 0.13.0 whatsnew.

Python 2/3 version-forking code for importing StringIO from the standard library (from EmilH):

import sys

# Pick the StringIO implementation that matches the running interpreter.
if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2
else:
    from io import StringIO  # Python 3

# Note: this is very much a poor man's version of pandas.compat, which covers much more.
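An alternative to checking sys.version_info is to attempt the Python 2 import and fall back on ImportError. This is sketched here as a common idiom, not as anything taken from the pandas docs, and the sample string is made up.

```python
# Try the Python 2 module first; on Python 3 it does not exist,
# so the import raises ImportError and we fall back to io.StringIO.
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

# Either way, StringIO now names a text buffer with a file-like interface.
buf = StringIO(u"a,b\n1,2\n")
print(buf.readline().strip())
```

This keeps the fork to a single try/except, but it is still a fork; code that only targets Python 3 can import from io directly.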

Note:

  • pandas.compat has existed since pandas 0.13.0 (Jan 2014) as a subpackage within pandas
  • it also seems to have been released as a standalone package: 0.1.0 (Jun 10, 2017) and 0.1.1 (Jun 10, 2017)
smci asked May 11 '18 00:05


2 Answers

I know this is an old question, but I followed breadcrumbs here, so it is perhaps still worth answering. It's not totally definitive, but the current pandas documentation suggests using the built-in StringIO rather than its own internal methods:

"For examples that use the StringIO class, make sure you import it with `from io import StringIO` for Python 3."

s_pike answered Oct 11 '22 08:10


FYI, as of pandas 0.25, StringIO was removed from pandas.compat (PR #25954), so you'll now see:

from pandas.compat import StringIO

ImportError: cannot import name 'StringIO' from 'pandas.compat'

This means the only answer is to import from the io module.

Mike T answered Oct 11 '22 08:10