
How to concatenate values of all rows in a dataframe into a single row without altering the columns?

I have a data frame input that looks like:

  col1 col2 col3
0    3    1  NaN
1  NaN    7    8

How can I collapse all rows into a single row, joining the non-NaN values of each column with ', '?

The desired data frame output:

  col1  col2 col3
0    3  1, 7    8

Sample input code:

import pandas as pd
import numpy as np


d = {'col1': ["3", np.nan], 'col2': ["1", "7"], 'col3': [np.nan, "8"]}
df = pd.DataFrame(data=d)

asked Mar 17 '18 at 23:03 by Nae


3 Answers

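agg + dropna + str.join comes to mind. Each column is reduced to a single string with its NaN values dropped, and to_frame().T turns the resulting Series back into a one-row DataFrame.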

df.agg(lambda x: ', '.join(x.dropna())).to_frame().T

  col1  col2 col3
0    3  1, 7    8

There are other solutions, my peers will find them for you :)
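
As a minimal sketch of the same idea in a reusable form: the helper name collapse_rows and its sep parameter are hypothetical (not part of pandas), and the astype(str) cast is an added assumption so that non-string values do not make str.join fail.

```python
import numpy as np
import pandas as pd

# Sample input from the question.
df = pd.DataFrame({'col1': ["3", np.nan], 'col2': ["1", "7"], 'col3': [np.nan, "8"]})

def collapse_rows(frame, sep=', '):
    """Hypothetical helper: join the non-NaN values of each column into one string."""
    # apply() passes each column as a Series; the lambda returns one string per column.
    joined = frame.apply(lambda col: sep.join(col.dropna().astype(str)))
    # joined is a Series indexed by column name; transpose it into a one-row frame.
    return joined.to_frame().T

out = collapse_rows(df)
print(out)
```

Note that the cast uses each value's default string form, so a numeric column that contains NaN is stored as float and its values would be rendered with a decimal point.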

answered Oct 17 '22 at 21:10 by cs95


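# zip(*df.values) yields one tuple per column; NaN entries are filtered out
# before joining. Note: int() assumes every remaining value is integer-like.
pd.DataFrame(
    [[
        ', '.join(map(str, map(int, filter(pd.notna, c))))
        for c in zip(*df.values)
    ]], columns=df.columns
)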

  col1  col2 col3
0    3  1, 7    8

answered Oct 17 '22 at 21:10 by piRSquared


One way to get what you want is to create a new dataframe with the same columns as the original and populate its first row (index 0) with the desired data. In your case, that is the values of each column joined by ', ', with the NaN values removed:

new_df = pd.DataFrame(columns=df.columns)

for col in df.columns:
    new_df.loc[0, col] = ', '.join(df[col].dropna().tolist())

>>> new_df
  col1  col2 col3
0    3  1, 7    8
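
A possible variation on the loop above, as a sketch only: build the row as a dict first and construct the frame in one step, rather than filling an empty frame cell by cell. The astype(str) cast is an added assumption to tolerate non-string values; the name row is just illustrative.

```python
import numpy as np
import pandas as pd

# Sample input from the question.
df = pd.DataFrame({'col1': ["3", np.nan], 'col2': ["1", "7"], 'col3': [np.nan, "8"]})

# One joined string per column, with NaN values dropped first.
row = {col: ', '.join(df[col].dropna().astype(str)) for col in df.columns}

# A list with a single dict produces a one-row frame with index 0.
new_df = pd.DataFrame([row], columns=df.columns)
print(new_df)
```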

answered Oct 17 '22 at 21:10 by sacuL