Adding a new column with specific dtype in pandas

Can we assign a new column to a pandas DataFrame and also declare its datatype in one fell swoop?

import pandas as pd

df = pd.DataFrame({'BP': ['100/80'], 'Sex': ['M']})
df2 = (df.drop('BP', axis=1)
       .assign(BPS=lambda x: df.BP.str.extract(r'(?P<BPS>\d+)/'))
       .assign(BPD=lambda x: df.BP.str.extract(r'/(?P<BPD>\d+)'))
       )

print(df2)
df2.dtypes

Can we get a float dtype (np.float64) for the new columns using only the chained expression?

BhishanPoudel asked Jan 23 '19 03:01


2 Answers

Obviously, you don't have to do this, but you can.

df.drop(columns='BP').join(
    df['BP'].str.split('/', expand=True)
            .set_axis(['BPS', 'BPD'], axis=1)
            .astype(float))

  Sex    BPS   BPD
0   M  100.0  80.0

Your two str.extract calls can be done away with in favour of a single str.split call. You can then make one astype call.
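
If the two new columns should end up with different dtypes, the same single astype call can take a column-to-dtype mapping instead of one type. A minimal sketch, assuming pandas 1.0 or newer; the choice of int64 for BPD is purely illustrative and not part of the question:

```python
import pandas as pd

df = pd.DataFrame({'BP': ['100/80'], 'Sex': ['M']})

parts = (df['BP'].str.split('/', expand=True)
                 .set_axis(['BPS', 'BPD'], axis=1)
                 # A dict maps each column name to its own target dtype
                 # (int64 for BPD is an illustrative assumption).
                 .astype({'BPS': 'float64', 'BPD': 'int64'}))

result = df.drop(columns='BP').join(parts)
print(result.dtypes)
```

Columns left out of the mapping keep whatever dtype they already had.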


Personally, if you ask me about style, I would say this looks more elegant:

u = (df['BP'].str.split('/', expand=True)
             .set_axis(['BPS', 'BPD'], axis=1)
             .astype(float))
df.drop(columns='BP').join(u)


  Sex    BPS   BPD
0   M  100.0  80.0
cs95 answered Oct 26 '22 09:10


Add astype when you assign the values:

df2 = (df.drop('BP', axis=1)
       .assign(BPS=lambda x: df.BP.str.extract(r'(?P<BPS>\d+)/').astype(float))
       .assign(BPD=lambda x: df.BP.str.extract(r'/(?P<BPD>\d+)').astype(float))
       )
df2.dtypes
Sex     object
BPS    float64
BPD    float64
dtype: object

What I would do:

df.assign(**df.pop('BP').str.extract(r'(?P<BPS>\d+)/(?P<BPD>\d+)').astype(float))
  Sex    BPS   BPD
0   M  100.0  80.0
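
Note that df.pop('BP') removes the column from df itself. If the original frame has to stay intact, a variant along these lines might be preferable. It is only a sketch: it swaps pop for drop(columns=...), and the names parts and df2 are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'BP': ['100/80'], 'Sex': ['M']})

# One regex with two named groups yields a DataFrame with columns BPS and BPD.
parts = df['BP'].str.extract(r'(?P<BPS>\d+)/(?P<BPD>\d+)').astype(float)

# drop() returns a new frame, so df keeps its 'BP' column,
# whereas df.pop('BP') would remove it from df in place.
df2 = df.drop(columns='BP').assign(**parts)

print(df2.dtypes)
```

Unpacking with ** works here because a DataFrame behaves like a mapping from column names to Series.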
BENY answered Oct 26 '22 07:10