After storing a DataFrame in Redis and then reading it back, Redis returns a string, and I can't figure out how to convert that string back into a DataFrame.
How can I do both steps (set and get) properly?
To interface with pandas, PyArrow provides various conversion routines to consume pandas structures and convert back to them.
Yes, pyarrow is a library for building data frame internals (and other data processing applications). It is not an end user library like pandas.
Pandas is more user-friendly, but NumPy is often faster for plain numerical work. Pandas has many more options for handling missing data and labeled data. Pandas is built on top of NumPy arrays and adds indexing and alignment on top of them, which is why raw NumPy operations usually carry less overhead.
set:
redisConn.set("key", df.to_msgpack(compress='zlib'))
get:
pd.read_msgpack(redisConn.get("key"))
I couldn't use msgpack because of Decimal objects in my dataframe. Instead I combined pickle and zlib together like this, assuming a dataframe df and a local instance of Redis:
import pickle
import redis
import zlib

EXPIRATION_SECONDS = 600

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Set
r.setex("key", EXPIRATION_SECONDS, zlib.compress(pickle.dumps(df)))

# Get
rehydrated_df = pickle.loads(zlib.decompress(r.get("key")))
There isn't anything dataframe specific about this.
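To make that point concrete, here is a minimal sketch of the same pickle + zlib round trip wrapped in two helpers and applied to a plain dict containing a Decimal. The helper names `pack` and `unpack`, and the dict `fake_store` standing in for a Redis connection, are hypothetical and only there so the example runs without a server.

```python
import pickle
import zlib
from decimal import Decimal


def pack(obj):
    """Pickle any picklable object and compress the result into bytes."""
    return zlib.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def unpack(payload):
    """Reverse pack(): decompress the bytes, then unpickle them."""
    return pickle.loads(zlib.decompress(payload))


# Hypothetical stand-in for Redis: a dict mapping keys to bytes values.
fake_store = {}

# Decimal values survive the round trip, which is what ruled out msgpack above.
record = {"price": Decimal("19.99"), "tags": ["a", "b"]}
fake_store["key"] = pack(record)
restored = unpack(fake_store["key"])
```

With a real client the calls would look like `r.setex("key", EXPIRATION_SECONDS, pack(df))` and `unpack(r.get("key"))`. This assumes the client returns raw bytes; a redis-py connection created with `decode_responses=True` returns `str` instead, which would not work for a compressed binary payload.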
Caveats
msgpack is better -- use it if it works for you