I have an input dataframe, which can be generated with the code below:
import pandas as pd
df = pd.DataFrame({'subjectID': [1, 1, 2, 2],
                   'keys': ['H1Date', 'H1', 'H2Date', 'H2'],
                   'Values': ['10/30/2006', 4, '8/21/2006', 6.4]})
The input dataframe looks like this:
   subjectID    keys      Values
0          1  H1Date  10/30/2006
1          1      H1           4
2          2  H2Date   8/21/2006
3          2      H2         6.4
This is what I tried:
s1 = df.set_index('subjectID').stack().reset_index()
s1.rename(columns={0: 'values'}, inplace=True)
# .copy() avoids a SettingWithCopyWarning when column 'g' is added below
d1 = s1[s1['level_1'].str.contains('Date')].copy()
d2 = s1[~s1['level_1'].str.contains('Date')].copy()
# number the rows within each subject so Date rows pair up with value rows
d1['g'] = d1.groupby('subjectID').cumcount()
d2['g'] = d2.groupby('subjectID').cumcount()
d3 = pd.merge(d1, d2, on=['subjectID', 'g'], how='left').drop(['g', 'level_1_x', 'level_1_y'], axis=1)
Though it works, I am afraid this may not be the best approach, as we might have more than 200 columns and 50k records. Any help improving my code would be much appreciated.
I expect my output dataframe to look like the one shown below.
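Maybe something like this:
# every 'Date' row starts a new group; number the rows within each group from 1
s = df.groupby(df['keys'].str.contains('Date').cumsum()).cumcount() + 1
final = (df.assign(s=s.astype(str))
           .set_index(['subjectID', 's'])
           .unstack()
           .sort_values(by='s', axis=1))
# flatten the two-level column labels, e.g. ('keys', '1') becomes 'keys1'
final.columns = final.columns.map(''.join)
print(final)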
            keys1     Values1 keys2 Values2
subjectID
1          H1Date  10/30/2006    H1       4
2          H2Date   8/21/2006    H2     6.4
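To make the steps in the answer above easier to follow, here is a minimal sketch that spells out the intermediate values on the sample dataframe from the question. The names `block`, `pos`, `order` and `wide` are only illustrative, and `pivot` is swapped in for `set_index(...).unstack()` as one possible equivalent, not necessarily the faster one. It assumes each `subjectID` has exactly one Date row followed by one value row. If a subject had several Date/value pairs, the `(subjectID, pos)` combinations would repeat and the reshape would raise an error.

```python
import pandas as pd

df = pd.DataFrame({
    'subjectID': [1, 1, 2, 2],
    'keys': ['H1Date', 'H1', 'H2Date', 'H2'],
    'Values': ['10/30/2006', 4, '8/21/2006', 6.4],
})

# Every row whose key contains 'Date' starts a new block;
# the cumulative sum labels the blocks 1, 1, 2, 2.
block = df['keys'].str.contains('Date').cumsum()

# Position of each row inside its block, starting at 1: 1, 2, 1, 2.
pos = df.groupby(block).cumcount() + 1

# Assumption: (subjectID, pos) is unique, so pivot can spread the rows into columns.
wide = df.assign(pos=pos).pivot(index='subjectID', columns='pos',
                                values=['keys', 'Values'])

# Put the columns in position order: keys1, Values1, keys2, Values2.
order = [(name, p) for p in sorted(pos.unique()) for name in ('keys', 'Values')]
wide = wide[order]

# Flatten the two-level column labels, e.g. ('keys', 1) becomes 'keys1'.
wide.columns = [f'{name}{p}' for name, p in wide.columns]

print(wide)
```

Keeping `pos` as an integer until the final flattening step means the column order stays numeric even if a block ever has ten or more rows.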