I have a DataFrame in a "yes/no" format like this:
      7   22
1   NaN    t
25    t  NaN
where "t" stands for yes. I need to transform it into an X-Y table, where the column name is the X coordinate and the index is the Y coordinate:
    X   Y
1  22   1
2   7  25
In pseudo-code, something like:
if a cell = "t":
newdf.X = df.column(t)
newdf.Y = df.index(t)
You can change the index to a different column by calling set_index() after reset_index().
The transpose() function swaps index and columns: it reflects the DataFrame over its main diagonal, writing rows as columns and vice versa. With copy=True the underlying data is copied; otherwise (the default) no copy is made if possible.
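To make the transpose() description concrete, here is a minimal sketch on a frame shaped like the one in the question. The way the frame is built (integer labels, the string "t" for yes) is an assumption, since the question only shows the printed table.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame: integer labels, "t" marks yes
df = pd.DataFrame(
    [[np.nan, "t"], ["t", np.nan]],
    index=[1, 25],
    columns=[7, 22],
)

# Transposing swaps the roles of index and columns (df.T is shorthand for this)
df_t = df.transpose()

print(df_t)
```

After the call, the old column labels (7, 22) are the row labels and the old index (1, 25) labels the columns.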
The tolist() function returns a list of the values. Each value is a scalar: a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period). For example, Index.tolist() converts the index into a list.
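A short sketch of the two kinds of scalars tolist() can return. The index values are hypothetical (the integers are borrowed from the question's frame, the dates are made up for illustration).

```python
import pandas as pd

# Hypothetical integer index, borrowed from the question's frame
idx = pd.Index([1, 25])
as_list = idx.tolist()  # plain Python list of Python ints

# A datetime index yields pandas scalars (Timestamp) instead
dates = pd.date_range("2024-01-01", periods=2)
date_list = dates.tolist()

print(as_list)
print(date_list)
```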
The reset_index() method resets the index back to the default 0, 1, 2, etc. By default it keeps the old index in a column named "index"; to avoid this, use the drop parameter.
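The following sketch shows reset_index() with and without drop, and set_index() moving a column back into the index. The frame and its "value" column are hypothetical and only serve to illustrate the calls.

```python
import pandas as pd

# Hypothetical frame with a non-default, unnamed index
df = pd.DataFrame({"value": ["a", "b"]}, index=[1, 25])

kept = df.reset_index()               # old index kept in a column named "index"
dropped = df.reset_index(drop=True)   # old index discarded
restored = kept.set_index("index")    # move that column back into the index

print(kept)
print(dropped)
print(restored)
```

Note that the column is called "index" only because the original index has no name; a named index would keep its own name.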
Try this:
import numpy as np
import pandas as pd

# Use np.where to get the integer location of the 't's in the dataframe
r, c = np.where(df == 't')
# Use dataframe constructor with dataframe indexes to define X, Y
df_out = pd.DataFrame({'X':df.columns[c], 'Y':df.index[r]})
df_out
Output:
    X   Y
0  22   1
1   7  25
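For anyone who wants to run this end to end, here is a self-contained sketch that also shows what r and c hold. The frame construction is an assumption (integer labels, the string "t"), and the explicit .to_numpy() is my own defensive addition so that np.where sees a plain boolean array.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame
df = pd.DataFrame(
    [[np.nan, "t"], ["t", np.nan]],
    index=[1, 25],
    columns=[7, 22],
)

mask = (df == "t").to_numpy()  # boolean array; NaN cells compare as False
r, c = np.where(mask)          # positional row and column numbers of the True cells

# Translate positions back into labels: columns give X, index gives Y
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]})

print(r, c)
print(df_out)
```

Here r is [0, 1] and c is [1, 0], so the first "t" sits at row position 0, column position 1, which maps to Y=1 and X=22.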
Update to address @RajeshC comment:
Given df,
      7   22
1   NaN    t
13  NaN  NaN
25    t  NaN
Then:
r, c = np.where(df == 't')
df_out = pd.DataFrame({'X':df.columns[c], 'Y':df.index[r]}, index=r)
df_out = df_out.reindex(range(df.shape[0]))
df_out
Output:
     X     Y
0   22   1.0
1  NaN   NaN
2    7  25.0
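A runnable sketch of why this works: passing index=r labels each result row with the row position it came from, and reindex over every position then fills the rows that had no "t" with NaN. The frame construction is an assumption; I used string column labels because the printed X values show no decimal point, which suggests they are not numeric.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the three-row frame, with string column labels
df = pd.DataFrame(
    [[np.nan, "t"], [np.nan, np.nan], ["t", np.nan]],
    index=[1, 13, 25],
    columns=["7", "22"],
)

r, c = np.where((df == "t").to_numpy())
# r is [0, 2]: row position 1 (label 13) contains no "t"

# Label each result row with the source row position
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]}, index=r)

# Reindex over all row positions so the missing one appears as NaN
df_out = df_out.reindex(range(df.shape[0]))

print(df_out)
```

Y turns into floats (1.0, 25.0) because an integer column cannot hold NaN.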
Another option with stack:
pd.DataFrame.from_records(
df.stack().index.swaplevel(),
columns=['X', 'Y'])
Output:
    X   Y
0  22   1
1   7  25
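A self-contained sketch of the stack idea, with two defensive tweaks that are my own assumptions rather than part of the answer: an explicit dropna(), in case the installed pandas version keeps NaN cells when stacking, and tolist() so that from_records receives a plain list of tuples.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame
df = pd.DataFrame(
    [[np.nan, "t"], ["t", np.nan]],
    index=[1, 25],
    columns=[7, 22],
)

# stack() produces a Series indexed by (row label, column label);
# dropna() makes sure only the non-null cells remain
stacked = df.stack().dropna()

# Swap the levels so each pair reads (column label, row label) = (X, Y)
pairs = stacked.index.swaplevel().tolist()

df_out = pd.DataFrame.from_records(pairs, columns=["X", "Y"])

print(df_out)
```

Unlike the np.where version, this keeps every non-null cell, so the two approaches agree only when "t" is the sole non-null value in the frame.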