
Print pandas data frame for reproducible example (equivalent to dput in R)

Tags:

pandas

Lately I keep asking Pandas questions that depend on the data I'm using. So far it takes me quite a while to build a data frame similar to my data (a reproducible data frame) that SO users can easily copy to their machine.

I would prefer a convenient way to print my small DF within my question, so that other users can recreate it with minimal effort.

In R I'm used to printing a small sample of my data with the dput function in the console and then pasting the output into my question (example): Getting the error "level sets of factors are different" when running a for loop

I've noticed this explanation, but I don't think it's suitable for printing a sample of data for other SO users: Python's equivalent for R's dput() function

Is there an equivalent method in Pandas for doing that?

Thanks in advance!

Yehoshaphat Schellekens asked Nov 23 '17 08:11



1 Answer

If binary data is OK for you, you can use the pickle library. It can usually serialize and deserialize arbitrary objects (provided that their class definition is available, which is true for dataframes if pandas is installed).
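As a rough sketch of the pickle route (the column names and values here are made up for illustration), a dataframe can be turned into bytes and restored again. Note that pickled bytes are not human-readable, so this is less convenient for pasting into a question, and pickle data should only be loaded from sources you trust.

```python
import pickle

import pandas as pd

# Illustrative data, not taken from the question.
df = pd.DataFrame({"a": [1, 2], "b": [3, 3]})

# Serialize the dataframe to a bytes object.
payload = pickle.dumps(df)

# Restore it; this requires pandas to be installed on the loading side.
restored = pickle.loads(payload)

# Check that the round trip preserved the data.
print(restored.equals(df))
```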

If you need a human-readable format, you can create a Python dictionary from your dataframe with df_dict = df.to_dict(), and print this dictionary (to look at it and maybe copy-paste), or dump it to a JSON string.

When you want to convert a dict back to pandas, use df = pd.DataFrame.from_dict(df_dict).
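A minimal sketch of the JSON variant, assuming a small dataframe with integer row labels (the data is illustrative). One thing to keep in mind is that JSON object keys are always strings, so the integer row labels come back as '0', '1', ... and may need to be converted after loading.

```python
import json

import pandas as pd

# Illustrative data, not taken from the question.
df = pd.DataFrame({"a": [1, 2], "b": [3, 3]})

# Dump the dictionary form to a JSON string.
json_text = json.dumps(df.to_dict())

# Load it back; the row labels are now strings because of JSON.
restored = pd.DataFrame.from_dict(json.loads(json_text))

# Convert the row labels back to integers to match the original.
restored.index = restored.index.astype(int)

print(restored.equals(df))
```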

A minimal example of decoding and encoding:

import pandas as pd
df = pd.DataFrame.from_dict({'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}})
print(df.to_dict())

which prints {'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}}, a literal that can be copied and pasted.
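To get closer to what dput does in R, one possible approach is to print a complete line of code that rebuilds a small sample of the dataframe, so readers can paste it directly. This is only a sketch: the sample data, the choice of head(2), and the variable name snippet are assumptions for illustration, and it works best for simple dtypes such as integers and strings.

```python
import pandas as pd

# Illustrative data, not taken from the question.
df = pd.DataFrame({"a": [1, 2, 5], "b": [3, 3, 7]})

# Take a small sample to keep the question short.
sample = df.head(2)

# Build a paste-able line of Python that recreates the sample.
snippet = "df = pd.DataFrame.from_dict(" + repr(sample.to_dict()) + ")"
print(snippet)

# Sanity check: run the snippet in a separate namespace and compare.
namespace = {"pd": pd}
exec(snippet, namespace)
print(namespace["df"].equals(sample))
```

The exec call is there only to verify the generated line; in a question you would paste the printed line itself.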

David Dale answered Oct 11 '22 19:10