Expand pandas dataframe and consolidate columns

Tags: python, pandas

I have a dataframe that looks like this:

    desc    item    type1    date1    type2    date2    type3    date3
0   this    foo1      A        9/1      B        9/2      C        9/3
1   this    foo2      D        9/4      E        9/5      F        9/6

How do I get it to look like this?

     desc    item    type    date
0    this    foo1      A      9/1
1    this    foo1      B      9/2
2    this    foo1      C      9/3
3    this    foo2      D      9/4
4    this    foo2      E      9/5
5    this    foo2      F      9/6

asked Sep 14 '20 02:09

user12405981

2 Answers

You can use pd.wide_to_long:

out = pd.wide_to_long(df.reset_index(), ['type', 'date'], i='index', j='drop').reset_index(drop=True)
out
Out[127]: 
  type date
0    A  9/1
1    B  9/2
2    C  9/3

For your updated question the same concept still applies; you just don't need to reset the index, since item is unique:

pd.wide_to_long(df, stubnames=['type', 'date'], i='item', j='drop').droplevel(-1).reset_index()

    item  type  date
0   foo1    A   9/1
1   foo2    D   9/4
2   foo1    B   9/2
3   foo2    E   9/5
4   foo1    C   9/3
5   foo2    F   9/6
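
The frame in the question also has a desc column, and the desired output is grouped by item. A minimal sketch of one way to get that shape, assuming the sample data from the question: pass both desc and item as the id columns, then sort on the resulting index before dropping the suffix level. The level name "n" and the variable names are arbitrary choices for this example, not anything pandas requires.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "desc": ["this", "this"],
    "item": ["foo1", "foo2"],
    "type1": ["A", "D"], "date1": ["9/1", "9/4"],
    "type2": ["B", "E"], "date2": ["9/2", "9/5"],
    "type3": ["C", "F"], "date3": ["9/3", "9/6"],
})

out = (
    # desc and item together identify each row; "n" holds the numeric suffix
    pd.wide_to_long(df, stubnames=["type", "date"], i=["desc", "item"], j="n")
      .sort_index()      # order by desc, item, then suffix 1, 2, 3
      .droplevel("n")    # the suffix is no longer needed
      .reset_index()     # turn desc and item back into columns
)
print(out)
```

Sorting while the suffix is still in the index keeps each item's rows in type1, type2, type3 order, which is why the droplevel call comes after sort_index.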

answered Oct 17 '22 16:10

BENY

You can also call .melt twice, once on the type columns and once on the date columns, selecting each set by passing a list comprehension to value_vars. Both melted dataframes get the same default integer index in the same row order, so you can then merge them on the index:

df = pd.merge(df.melt(id_vars='item', value_vars=[col for col in df.columns if 'type' in col], value_name='type')[['item','type']],
              df.melt(id_vars='item', value_vars=[col for col in df.columns if 'date' in col], value_name='date')['date'],
              how='left', left_index=True, right_index=True).sort_values('type')
df
Out[1]: 
   item type date
0  foo1    A  9/1
2  foo1    B  9/2
4  foo1    C  9/3
1  foo2    D  9/4
3  foo2    E  9/5
5  foo2    F  9/6
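
For reference, here is a self-contained sketch of the same melt-and-merge idea that also carries the desc column through, assuming the sample data from the question. The intermediate names (type_cols, date_cols, types, dates) are chosen here for readability. It relies on both melts producing rows in the same order, which holds when the type and date columns appear in matching order in the frame.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "desc": ["this", "this"],
    "item": ["foo1", "foo2"],
    "type1": ["A", "D"], "date1": ["9/1", "9/4"],
    "type2": ["B", "E"], "date2": ["9/2", "9/5"],
    "type3": ["C", "F"], "date3": ["9/3", "9/6"],
})

type_cols = [c for c in df.columns if c.startswith("type")]
date_cols = [c for c in df.columns if c.startswith("date")]

# Melt each group of columns separately; both results get a 0..n-1 index
types = df.melt(id_vars=["desc", "item"], value_vars=type_cols, value_name="type")
dates = df.melt(id_vars=["desc", "item"], value_vars=date_cols, value_name="date")

# Pair each type with its date by row position, then group rows by item
out = (
    pd.merge(types[["desc", "item", "type"]], dates[["date"]],
             left_index=True, right_index=True)
      .sort_values(["item", "type"])
      .reset_index(drop=True)
)
print(out)
```

Sorting on type only works here because the sample values happen to be alphabetical within each item; with real data you may prefer to keep the melted variable column and sort on its numeric suffix instead.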

answered Oct 17 '22 17:10

David Erickson
