Python Pandas: Merge Columns of Data Frame with column name into one column

Tags:

I have the data in the following format in my Data Frame:

>>> df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
>>> df
          A         B         C         D
0  0.578095 -1.985742 -0.269517 -0.180319
1 -0.618431 -0.937284  0.556290 -1.416877
2  1.695109  0.122219  0.182450  0.411448
3  0.228466  0.268943 -1.249488  3.227840
4  0.005990 -0.805618 -1.941092 -0.146649
5 -1.116451 -0.649854  1.272314  1.422760

I want to combine some columns at each row by appending the row data and column names creating the following output:

          A         B        New Column
0  0.578095 -1.985742  {"C":"-0.269517","D":"-0.180319"}
1 -0.618431 -0.937284  {"C":"0.556290","D":"-1.416877"}
2  1.695109  0.122219  {"C":"0.182450","D":"0.411448"}
3  0.228466  0.268943  {"C":"-1.249488","D":"3.227840"}
4  0.005990 -0.805618  {"C":"-1.941092","D":"-0.146649"}
5 -1.116451 -0.649854  {"C":"1.272314","D":"1.422760"}

How can I achieve this in pandas?

The end game is to have the data in JSON format where Column C-D are taken as Measures for the Dimensions A-B and then store them into the table in Snowflake.

298

asked Jul 31 '18 15:07

abdthe1

2 Answers

Drop the columns and create a new one with agg:

df2 = df.drop(['C', 'D'], axis=1).assign(New_Column=
    df[['C', 'D']].agg(pd.Series.to_dict, axis=1))

df2
          A         B                                         New_Column
0 -0.645719 -0.757112  {'D': 0.8923148471642509, 'C': -0.685995130541...
1 -0.124200 -0.578526  {'D': -0.5457121278891495, 'C': -1.46006615752...
2  2.160417 -0.985475  {'D': -0.49915307027471345, 'C': 0.85388172610...
3  2.111050  1.384887  {'D': -0.4617380879640236, 'C': 0.907519279458...
4  0.781630 -0.366445  {'D': -0.3105127375402184, 'C': 0.295808587414...
5  0.460773  0.549545  {'D': -0.993162129461116, 'C': 0.8163378188816...

191

answered Oct 17 '22 06:10

cs95

Using to_dict with 'records'

df['New c']=df[['C','D']].to_dict('records')
df
Out[580]: 
          A         B         C         D  \
0  0.578095 -1.985742 -0.269517 -0.180319   
1 -0.618431 -0.937284  0.556290 -1.416877   
2  1.695109  0.122219  0.182450  0.411448   
3  0.228466  0.268943 -1.249488  3.227840   
4  0.005990 -0.805618 -1.941092 -0.146649   
5 -1.116451 -0.649854  1.272314  1.422760   
                                        New c  
0            {'C': -0.269517, 'D': -0.180319}  
1              {'C': 0.55629, 'D': -1.416877}  
2               {'C': 0.18245, 'D': 0.411448}  
3              {'C': -1.249488, 'D': 3.22784}  
4  {'C': -1.9410919999999998, 'D': -0.146649}  
5               {'C': 1.272314, 'D': 1.42276}

answered Oct 17 '22 08:10

BENY

Related questions
                            
                                Python: Dictionary that only retains last n inserted keys
                            
                                Change default expiry period of "Pyotp"
                            
                                Exclude null values in map function of Python3
                            
                                tf.assign to variable slice doesn't work inside tf.while_loop
                            
                                What does pyspark need psutil for? (faced "UserWarning: Please install psutil to have better support with spilling")?
                            
                                Why does the shape of the selection of my pandas dataframe is wrong
                            
                                AttributeError: module 'plotly' has no attribute 'plotly'
                            
                                plotly basic example shows no plot in jupyter lab
                            
                                Use custom function in ModelViewSet with Django Rest Framework
                            
                                How can I override Django Rest Framework viewset perform_create() method to set a default value to a field
                            
                                Array in a program
                            
                                python - Sum of digits in 2^1000? [duplicate]
                            
                                How to get exactly one element from list in Jinja?
                            
                                PyQt5 - How to display image in QMainWindow class?
                            
                                create a tensor proto whose content is larger than 2GB
                            
                                how to import selenium WebElement in python
                            
                                Why doesn't concurrent.futures.ThreadPoolExecutor().submit return immediately?
                            
                                Why am I getting a KeyError when trying to use an Enum as a dictionary key in another file?
                            
                                Create barplot from string data using groupby and multiple columns in pandas dataframe
                            
                                How can i extract day of week from timestamp in pandas

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Python Pandas: Merge Columns of Data Frame with column name into one column

Tags:

python

json

pandas

dataframe

snowflake-cloud-data-platform

abdthe1

People also ask

2 Answers

cs95

BENY

Recent Activity

Donate For Us