Pandas SparseDataFrame from list of dicts

Question

I'm trying to convert a list of Python dicts into a Pandas DataFrame. Because every dict has different keys, the resulting frame is mostly NaN and takes up too much memory, so a SparseDataFrame should help here.

import pandas

df = pandas.DataFrame(keyword_data).to_sparse(fill_value=.0)

This works, but it uses a lot of memory because a regular (dense) DataFrame is created as an intermediate step, and it sometimes raises MemoryError.

Is it possible to create a SparseDataFrame with this data without that step? The Pandas documentation doesn't help much in this case... Doing this:

pandas.SparseDataFrame(keyword_data, default_fill_value=.0)

Raises:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The data looks something like:

[{'a': 0.672366,
  'b': 0.667276,
  # ...
 },
 {'c': 0.507752,
  'd': 0.532593,
  'e': 0.507793
  # ...
 },
 # ...
]

The keys are always strings and differ from dict to dict; the values are floats.

Is there a way to create a SparseDataFrame directly from this data, without going through a regular DataFrame?

Qusai Alothman · Accepted Answer

As of pandas v1.0.0, SparseDataFrame and SparseSeries were removed.

There is no need for them anymore. Quoting the documentation:

"There’s no performance or memory penalty to using a Series or DataFrame with sparse values, rather than a SparseSeries or SparseDataFrame."
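To make this concrete for the data in the question, here is a minimal sketch, assuming pandas 1.0 or later. The helper name `sparse_frame_from_dicts` is hypothetical (it is not part of pandas); it only relies on `pd.arrays.SparseArray` and the `DataFrame.sparse` accessor. It builds one dense column at a time and immediately converts it to a sparse array, so the full dense DataFrame never has to exist in memory. Whether this is enough to avoid MemoryError depends on the data, since the per-key index lists still take space.

```python
from collections import defaultdict

import numpy as np
import pandas as pd


def sparse_frame_from_dicts(records, fill_value=np.nan):
    """Hypothetical helper: build a DataFrame of sparse float columns
    from a list of {str: float} dicts, one column at a time."""
    n_rows = len(records)

    # First pass: for each key, remember which rows hold a value.
    rows_by_key = defaultdict(list)
    vals_by_key = defaultdict(list)
    for i, rec in enumerate(records):
        for key, value in rec.items():
            rows_by_key[key].append(i)
            vals_by_key[key].append(value)

    # Second pass: only one dense column is alive at any moment.
    columns = {}
    for key in sorted(rows_by_key):
        dense = np.full(n_rows, fill_value, dtype="float64")
        dense[rows_by_key[key]] = vals_by_key[key]
        columns[key] = pd.arrays.SparseArray(dense, fill_value=fill_value)

    return pd.DataFrame(columns)


# Sample shaped like the data in the question.
keyword_data = [
    {"a": 0.672366, "b": 0.667276},
    {"c": 0.507752, "d": 0.532593, "e": 0.507793},
]

df = sparse_frame_from_dicts(keyword_data)
print(df.dtypes)          # every column has a Sparse dtype
print(df.sparse.density)  # fraction of entries actually stored
```

Missing keys are filled with NaN by default; passing `fill_value=0.0` would instead treat them as zeros, which is closer to the `fill_value=.0` call in the question.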


Tags: python, pandas, numpy

yprez
