
What does the key parameter mean in DataFrame.to_hdf()?

Tags: python, pandas

DataFrame.to_hdf(path_or_buf, key, **kwargs)

The official pandas documentation says that key is the "identifier for the group in the store."
But what does that mean? I cannot find sufficient examples of it. I have tried some arbitrary values for the key parameter, but I didn't see any difference between them. The API reference can be quite ambiguous at times. Can anyone offer some examples to help me better understand the key parameter?

Yu Gu asked Jun 24 '17 06:06




1 Answer

In pandas' to_hdf, the key parameter is the name under which the object is stored in the HDF5 file. A single HDF5 file can hold multiple objects (DataFrames). For instance, you can store DataFrame 'xyz' and DataFrame 'abc' in the same file; to store DataFrame 'xyz', you would pass key='xyz'.
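As a minimal sketch of the above, assuming PyTables (the `tables` package) is installed, the following writes two DataFrames to one file under different keys and reads each back. The DataFrame contents and the file name `example.h5` are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Two hypothetical DataFrames, named after the 'xyz' and 'abc' in the answer
xyz = pd.DataFrame({"a": [1, 2, 3]})
abc = pd.DataFrame({"b": [4.0, 5.0]})

# Illustrative file location in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Each call stores its object under its own key; with the default
# mode="a", the second call adds to the file instead of replacing it
xyz.to_hdf(path, key="xyz")
abc.to_hdf(path, key="abc")

# The key selects which stored object to load
xyz_back = pd.read_hdf(path, key="xyz")
abc_back = pd.read_hdf(path, key="abc")

print(xyz_back)
print(abc_back)
```

Changing the key does not change the data that is written, only the name it is filed under, which may be why trying arbitrary values on a single DataFrame shows no visible difference.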

The key is simply whatever name you want to give the specific object you are storing, much like a key in a dictionary. You use the same key later to read that object back.
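The dictionary analogy can be seen more directly through pandas' HDFStore, which supports item assignment, lookup, and membership tests by key. This is a small sketch, again assuming PyTables is installed; the DataFrames and file name are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Illustrative file location in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "store.h5")

with pd.HDFStore(path) as store:
    # Assigning by key stores the object, like setting a dict entry
    store["xyz"] = pd.DataFrame({"a": [1, 2, 3]})
    store["abc"] = pd.DataFrame({"b": [4.0, 5.0]})

    # Keys are listed as HDF5 group paths, so they carry a leading "/"
    stored_keys = sorted(store.keys())

    # Membership test and lookup by key, as with a dict
    has_xyz = "xyz" in store
    xyz_back = store["xyz"]

print(stored_keys)
print(has_xyz)
print(xyz_back)
```

The leading slash in the listed keys reflects the documentation's wording: each key identifies a group inside the HDF5 file.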

clg4 answered Oct 19 '22 09:10