After setting a DataFrame in Redis and then getting it back, Redis returns a string and I can't figure out a way to convert this str to a DataFrame.
How can I do these two operations (set and get) properly?
To interface with pandas, PyArrow provides various conversion routines to consume pandas structures and convert back to them.
Yes, pyarrow is a library for building data frame internals (and other data processing applications). It is not an end user library like pandas.
Pandas is more user-friendly, but NumPy is often faster for raw numerical array operations. Pandas has many more options for handling missing data, while NumPy tends to perform better on large homogeneous numeric arrays. Pandas is built on top of NumPy and adds labeled axes and mixed-type columns, which makes it easier to work with for tabular data.
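set:
redisConn.set("key", df.to_msgpack(compress='zlib'))
get:
pd.read_msgpack(redisConn.get("key"))
Note that to_msgpack and read_msgpack were deprecated in pandas 0.25 and removed in pandas 1.0, so this only works on older pandas versions.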
I couldn't use msgpack because of Decimal objects in my dataframe. Instead I combined pickle and zlib together like this, assuming a dataframe df and a local instance of Redis:
import pickle
import redis
import zlib

EXPIRATION_SECONDS = 600

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Set
r.setex("key", EXPIRATION_SECONDS, zlib.compress(pickle.dumps(df)))

# Get
rehydrated_df = pickle.loads(zlib.decompress(r.get("key")))

There isn't anything dataframe specific about this.
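To illustrate that the pickle plus zlib round trip is not tied to DataFrames, here is a minimal sketch that stores a plain dict containing a Decimal. It uses only the standard library. FakeRedis is a hypothetical in-memory stand-in for a real Redis client, and dumps_compressed and loads_compressed are helper names invented for this example. With a real server you would swap in the redis client from the answer above.

```python
import pickle
import zlib
from decimal import Decimal


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (bytes in, bytes out)."""

    def __init__(self):
        self._store = {}

    def setex(self, name, time, value):
        # A real server would expire the key after `time` seconds;
        # this stub ignores the expiration entirely.
        self._store[name] = bytes(value)

    def get(self, name):
        # Returns the stored bytes, or None if the key is missing.
        return self._store.get(name)


def dumps_compressed(obj):
    """Pickle any picklable object, then zlib-compress the result."""
    return zlib.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def loads_compressed(blob):
    """Reverse of dumps_compressed. Only call this on data you wrote yourself,
    because unpickling untrusted bytes can execute arbitrary code."""
    return pickle.loads(zlib.decompress(blob))


r = FakeRedis()
record = {"price": Decimal("19.99"), "tags": ["a", "b"]}

# Set
r.setex("key", 600, dumps_compressed(record))

# Get
restored = loads_compressed(r.get("key"))
```

The stored value is bytes, not text, so a real client should be left in its default mode where replies are returned as bytes. Decoding the reply to str before decompressing would break the round trip.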
Caveats
msgpack is better -- use it if it works for you