
Pandas: adding column with the length of other column as value

Tags:

python

pandas

I want to add a column to an existing DataFrame that holds the length of each value in the 'seller_name' column.

The output should be like so:

seller_name    name_length
-------------|-------------
Rick         |      4
Hannah       |      6

However, I'm having difficulty getting the code right.

df['name_length']  = len(df['seller_name'])

just gives me the length of the entire column (6845), and

df['nl']  = df[len('seller_name')]

throws a KeyError.

Does anyone know the correct command to achieve my goal?

Many thanks!

Jasper asked Mar 15 '17 16:03



2 Answers

Use the .str string accessor to perform vectorized string operations on a Series (here, a single DataFrame column). In particular, you want .str.len():

df['name_length'] = df['seller_name'].str.len()

The resulting output:

  seller_name  name_length
0        Rick            4
1      Hannah            6
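
For context on why the two attempts in the question fail, here is a minimal sketch. It assumes a two-row DataFrame built by hand from the names in the question (the real data has 6845 rows). `len(df['seller_name'])` returns the number of rows as a single integer, which pandas then broadcasts to every row. `len('seller_name')` is just the length of the string literal (11), so `df[11]` looks for a column labelled 11 and raises a KeyError.

```python
import pandas as pd

# Hand-built stand-in for the asker's data (assumption: only two rows).
df = pd.DataFrame({"seller_name": ["Rick", "Hannah"]})

# Attempt 1: len() of a Series is the row count, a single scalar.
row_count = len(df["seller_name"])

# Attempt 2: len('seller_name') is 11, so this looks up a column labelled 11.
try:
    df[len("seller_name")]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Working version: .str.len() computes the length of each string.
df["name_length"] = df["seller_name"].str.len()
print(df)
```

The `.str.len()` call is evaluated element by element, which is what the question asks for.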
root answered Oct 23 '22 04:10


Say you have this data:

import pandas as pd

y_1980 = pd.read_csv('y_1980.csv', sep='\t')

  country  y_1980
0     afg     196
1     ago     125
2      al      23

If you want to calculate the length of each value in a column, you can apply len to it:

y_1980['length'] = y_1980['country'].apply(len)  # len is called on each value
print(y_1980)

  country  y_1980  length
0     afg     196       3
1     ago     125       3
2      al      23       2

The same approach works for any column of strings.
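
One caveat worth knowing when choosing between apply with len and .str.len(): they behave differently when the column contains missing values. The sketch below assumes a hand-built column with one missing entry (not the y_1980.csv file from this answer). Calling len on a missing value raises a TypeError, while .str.len() leaves the missing entry as missing.

```python
import pandas as pd

# Hypothetical data with one missing country code.
df = pd.DataFrame({"country": ["afg", None, "al"]})

# apply(len) calls len() on the missing value, which raises TypeError.
try:
    df["country"].apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True

# .str.len() skips the missing value and keeps it as missing in the result.
df["length"] = df["country"].str.len()
print(df)
```

If the column may contain missing values, .str.len() is likely the safer choice; the resulting column then holds a missing value in those rows rather than an integer.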

everestial007 answered Oct 23 '22 04:10