Pandas combining sparse columns in dataframe

I am using Python and pandas for data analysis. I have data sparsely distributed across different columns, like the following:

| id | col1a | col1b | col2a | col2b | col3a | col3b |
|----|-------|-------|-------|-------|-------|-------|
|  1 |   11  |   12  |  NaN  |  NaN  |  NaN  |  NaN  |
|  2 |  NaN  |  NaN  |   21  |   86  |  NaN  |  NaN  |
|  3 |   22  |   87  |  NaN  |  NaN  |  NaN  |  NaN  |
|  4 |  NaN  |  NaN  |  NaN  |  NaN  |  545  |   32  |

I want to combine this sparsely distributed data into tightly packed columns, like the following:

| id | group |  cola |  colb |
|----|-------|-------|-------|
| 1  |  g1   |   11  |   12  |
| 2  |  g2   |   21  |   86  |
| 3  |  g1   |   22  |   87  |
| 4  |  g3   |  545  |   32  |

This is what I have tried, but I have not been able to get it to work properly:

df['cola']=np.nan
df['colb']=np.nan
df['cola'].fillna(df.col1a,inplace=True)
df['colb'].fillna(df.col1b,inplace=True)
df['cola'].fillna(df.col2a,inplace=True)
df['colb'].fillna(df.col2b,inplace=True)
df['cola'].fillna(df.col3a,inplace=True)
df['colb'].fillna(df.col3b,inplace=True)

But I think there must be a more concise and efficient way of doing this. How can I do this in a better way?
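For reference, the chain of fillna calls above amounts to taking the first non-NaN value per row within each set of same-suffix columns. The following is a minimal sketch of that idea, assuming the sample frame from the question; the names `packed` and `block` are hypothetical, and the sketch does not produce the group column.

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "col1a": [11, np.nan, 22, np.nan],
    "col1b": [12, np.nan, 87, np.nan],
    "col2a": [np.nan, 21, np.nan, np.nan],
    "col2b": [np.nan, 86, np.nan, np.nan],
    "col3a": [np.nan, np.nan, np.nan, 545],
    "col3b": [np.nan, np.nan, np.nan, 32],
})

packed = df[["id"]].copy()  # hypothetical name for the packed result
for suffix in ("a", "b"):
    # Select col1a, col2a, col3a (or the "b" columns).
    block = df.filter(regex=rf"^col\d+{suffix}$")
    # Back-fill across columns, so the first column holds
    # the first non-NaN value of each row.
    packed[f"col{suffix}"] = block.bfill(axis=1).iloc[:, 0]

print(packed)
```

This relies on each row having values in only one column group, as in the sample data.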

asked Jun 07 '20 16:06

Prabhu

1 Answer

You can use df.stack(), assuming 'id' is your index (otherwise set it first with df.set_index('id')), then reshape the result with df.pivot_table().

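# Reshape to long form: one row per (id, original column) pair.
df = df.stack().reset_index(name='val', level=1)
# Derive the group label (g1, g2, ...) from the digits in the column name.
df['group'] = 'g' + df['level_1'].str.extract(r'col(\d+)', expand=False)
# Drop the digits so that 'col1a' becomes 'cola'.
df['level_1'] = df['level_1'].str.replace(r'\d+', '', regex=True)
df.pivot_table(index=['id', 'group'], columns='level_1', values='val')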

level_1    cola  colb
id group
1  g1      11.0  12.0
2  g2      21.0  86.0
3  g1      22.0  87.0
4  g3     545.0  32.0
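The same long-then-pivot idea can also be sketched end to end with melt in place of stack, which avoids relying on 'id' being the index. This is an illustrative variant rather than the code from the answer: the sample frame is copied from the question, and the names `long_df`, `parts`, `source`, `letter`, `column`, and `tidy` are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "col1a": [11, np.nan, 22, np.nan],
    "col1b": [12, np.nan, 87, np.nan],
    "col2a": [np.nan, 21, np.nan, np.nan],
    "col2b": [np.nan, 86, np.nan, np.nan],
    "col3a": [np.nan, np.nan, np.nan, 545],
    "col3b": [np.nan, np.nan, np.nan, 32],
})

# Long form: one row per (id, original column), keeping only filled cells.
long_df = df.melt(id_vars="id", var_name="source", value_name="val")
long_df = long_df.dropna(subset=["val"])

# Split names such as "col2b" into the group digits and the trailing letter.
parts = long_df["source"].str.extract(r"^col(?P<group>\d+)(?P<letter>[a-z])$")
long_df["group"] = "g" + parts["group"]
long_df["column"] = "col" + parts["letter"]

# Back to wide form: one row per (id, group), one column per letter.
tidy = (
    long_df.pivot(index=["id", "group"], columns="column", values="val")
    .reset_index()
    .rename_axis(columns=None)
)

print(tidy)
```

Because every (id, group, column) combination occurs at most once in this data, pivot is sufficient and no aggregation is needed. If duplicates were possible, pivot_table with an explicit aggfunc would be the safer choice.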

answered Nov 10 '22 17:11

Ch3steR

Tags: python, pandas, dataframe