I have a question to merge two columns into one in the same dataframe(start_end), also remove null value. I intend to merge 'Start station' and 'End station' into 'station', and keep 'duration' according to the new column 'station'. I have tried pd.merge, pd.concat, pd.append, but I cannot work it out.
dataFrame of Start_end:
Duration End station Start station
14 1407 NaN 14th & V St NW
19 509 NaN 21st & I St NW
20 638 15th & P St NW. NaN
27 1532 NaN Massachusetts Ave & Dupont Circle NW
28 759 NaN Adams Mill & Columbia Rd NW
Expected output:
Duration stations
14 1407 14th & V St NW
19 509 21st & I St NW
20 638 15th & P St NW
27 1532 Massachusetts Ave & Dupont Circle NW
28 759 Adams Mill & Columbia Rd NW
Code i have so far:
#start_end is the dataframe, 'start station', 'end station', 'duration'
start_end = pd.concat([df_start, df_end])
This is what I attempted to:
station = pd.merge([start_end['Start station'],start_end['End station']])
fillna
If NaN
are truly nulls
df.assign(**{
'Start station': df['Start station'].fillna(df['End station'])})
Duration End station Start station
14 1407 NaN 14th & V St NW
19 509 NaN 21st & I St NW
20 638 15th & P St NW. 15th & P St NW.
27 1532 NaN Massachusetts Ave & Dupont Circle NW
28 759 NaN Adams Mill & Columbia Rd NW
mask
If NaN
are strings
df.assign(**{
'Start station': df['Start station'].mask(
lambda x: x == 'NaN', df['End station'])})
Duration End station Start station
14 1407 NaN 14th & V St NW
19 509 NaN 21st & I St NW
20 638 15th & P St NW. 15th & P St NW.
27 1532 NaN Massachusetts Ave & Dupont Circle NW
28 759 NaN Adams Mill & Columbia Rd NW
Using combine_first
. replaces null values in col1 with col2
df["station"] = df["End station"].combine_first(df["Start station"])
df.drop(["End station", "Start station"], 1, inplace=True)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With