
Transform a Pandas Dataframe Column and Index to Values

I have a dataframe in a "yes/no" format like this:

    7   22
1   NaN t
25  t   NaN

where "t" stands for yes. I need to transform it into an X-Y table, where the column name is the X coordinate and the index is the Y coordinate:

  X  Y
1 22  1
2  7 25

In pseudo-code, something like:

if a cell = "t":
     newdf.X = df.column(t)
     newdf.Y = df.index(t)

Fred asked Apr 06 '21 16:04




2 Answers

Try this:

import numpy as np
import pandas as pd

# Use np.where to get the integer (row, column) positions of the 't' cells
r, c = np.where(df == 't')

# Look up the labels at those positions: column labels become X, index labels become Y
df_out = pd.DataFrame({'X':df.columns[c], 'Y':df.index[r]})
df_out

Output:

    X   Y
0  22   1
1   7  25
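
For anyone who wants to run this end to end, here is a self-contained sketch. The construction of df is an assumption based on the table in the question (integer labels, "t" for yes, NaN elsewhere), and the explicit .to_numpy() call is an addition so that np.where receives a plain boolean array.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame shown in the question
df = pd.DataFrame({7: [np.nan, "t"], 22: ["t", np.nan]}, index=[1, 25])

# Row and column positions of every "t" cell, in row-major order
r, c = np.where((df == "t").to_numpy())

# Translate positions back into labels
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]})
print(df_out)
```

Because cells are matched by position, a row or column holding more than one "t" simply produces one output row per match.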

Update to address @RajeshC's comment (keeping one output row for every input row, including rows that contain no "t"):

Given df,

      7   22
1   NaN    t
13  NaN  NaN
25    t  NaN

Then:

r, c = np.where(df == 't')
df_out = pd.DataFrame({'X':df.columns[c], 'Y':df.index[r]}, index=r)
df_out = df_out.reindex(range(df.shape[0]))
df_out

Output:

     X     Y
0   22   1.0
1  NaN   NaN
2    7  25.0
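
Note that reindexing introduces NaN, which turns integer labels into floats (1.0, 25.0). If that matters, one possible follow-up is to cast to pandas' nullable integer dtype. This is only a sketch: the df below is an assumed reconstruction with integer row and column labels, and the cast would not apply if your labels are strings.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the three-row frame, with integer labels
df = pd.DataFrame(
    {7: [np.nan, np.nan, "t"], 22: ["t", np.nan, np.nan]},
    index=[1, 13, 25],
)

r, c = np.where((df == "t").to_numpy())

# Index the result by row position so reindex can restore the empty rows
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]}, index=r)
df_out = df_out.reindex(range(df.shape[0]))

# Nullable integers keep whole numbers and show <NA> for missing rows
df_out = df_out.astype("Int64")
print(df_out)
```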

Scott Boston answered Oct 16 '22 18:10


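Another option uses stack, which by default drops the NaN cells so that only the "t" entries remain; swapping the levels of the resulting MultiIndex puts the column label first: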

pd.DataFrame.from_records(
    df.stack().index.swaplevel(),
    columns=['X', 'Y'])

Output:

    X   Y
0  22   1
1   7  25
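
The stack approach depends on NaN cells being dropped, and whether that happens can vary with the pandas version and the stack options in use. A more defensive sketch filters for "t" explicitly instead. The df construction here is an assumption based on the question's table.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame shown in the question
df = pd.DataFrame({7: [np.nan, "t"], 22: ["t", np.nan]}, index=[1, 25])

# Long format: one entry per (row label, column label) pair
s = df.stack()

# Keep only the "t" cells, whether or not NaN entries were dropped by stack
s = s[s == "t"]

# Level 0 holds the row labels (Y), level 1 the column labels (X)
df_out = pd.DataFrame({
    "X": s.index.get_level_values(1),
    "Y": s.index.get_level_values(0),
})
print(df_out)
```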

perl answered Oct 16 '22 18:10