Pandas (Python) reading and working on Java BigInteger/ large numbers

Question

I have a data file (csv) with Nilsimsa hash values. Some of them are as long as 80 characters. I want to read them in Python for data analysis tasks. Is there a way to import the data in Python without information loss?

EDIT: I have tried the implementations proposed in the comments, but they do not work for me. Example data in the csv file would be: 77241756221441762028881402092817125017724447303212139981668021711613168152184106

JohnE · Accepted Answer

Start with a simple text file to read in, just one variable and one row.

%more foo.txt
x
77241756221441762028881402092817125017724447303212139981668021711613168152184106

In [268]: df=pd.read_csv('foo.txt')

Pandas will read it in as a string because the value overflows int64, and a float64 could only hold it approximately. But the info is all there; you didn't lose anything.

In [269]: df.x
Out[269]: 
0    7724175622144176202888140209281712501772444730...
Name: x, dtype: object

In [270]: type(df.x[0])
Out[270]: str
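If you would rather not rely on type inference, one option is to ask for strings explicitly. A minimal sketch, assuming the column is named x as above; io.StringIO stands in for foo.txt so the snippet is self-contained:

```python
import io

import pandas as pd

# Stand-in for foo.txt: one column named x, one 80-digit value.
raw = "77241756221441762028881402092817125017724447303212139981668021711613168152184106"
csv_text = "x\n" + raw + "\n"

# dtype={"x": str} asks pandas to keep every field in x as text,
# so no digits can be lost to a numeric conversion.
df = pd.read_csv(io.StringIO(csv_text), dtype={"x": str})

first = df["x"].iloc[0]
print(type(first), len(first))
```

With a real file you would pass the path instead of the StringIO object.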

And you can use plain Python to treat it as a number. Recall the caveats from the links in the comments: this isn't going to be as fast as numpy and pandas operations on a whole column stored as int64. It uses the more flexible but slower object mode to handle things.
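To illustrate why plain Python is enough here: Python integers have arbitrary precision, so the 80-digit value converts exactly, whereas a float64 keeps only about 15 to 17 significant decimal digits. A small sketch using the value from the question:

```python
raw = "77241756221441762028881402092817125017724447303212139981668021711613168152184106"

n = int(raw)         # exact: Python ints grow as large as needed
approx = float(raw)  # nearest float64: most of the digits are rounded away

print(str(n) == raw)     # True: the int converts back to the same text
print(int(approx) == n)  # False: the float is only an approximation
print(n.bit_length())    # number of bits needed, well beyond the 64 of int64
```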

You can change a column to be stored as longs (long integers; in Python 3 these are simply int) like this. But note that the dtype is still object, because everything except the core numpy types (int32, int64, float64, etc.) is stored as objects.

In [271]: df.x = df.x.map(int)

And then you can more or less treat it like a number.

In [272]: df.x * 2
Out[272]: 
0    1544835124428835240577628041856342500354488946...
Name: x, dtype: object

You'll have to do some formatting to see the whole number, or go the numpy route, which shows the whole number by default. (The trailing L in the output is how Python 2 marks a long integer.)

In [273]: df.x.values * 2
Out[273]: array([ 154483512442883524057762804185634250035448894606424279963336043423226336304368212L], dtype=object)
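On Python 3 the same steps look slightly different, since long was merged into int and the trailing L is gone. A sketch, assuming a recent pandas; the inline Series stands in for the column read from foo.txt, and display.max_colwidth is used here as one way to stop the printed output from being truncated:

```python
import pandas as pd

raw = "77241756221441762028881402092817125017724447303212139981668021711613168152184106"
s = pd.Series([raw], name="x")  # stand-in for df.x as read from the file

# Convert each string to an arbitrary-precision Python int.
# The values do not fit in int64, so the Series stays dtype object.
big = s.map(int)
doubled = big * 2

# Lift the column-width limit so the full value is printed, not "1544...".
with pd.option_context("display.max_colwidth", None):
    print(doubled)

# Or pull the Python ints out directly and print them yourself.
values = doubled.tolist()
print(values[0])
```

Arithmetic on such a column runs element by element in Python, so it is exact but slower than int64 operations.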
