
How to extract the first 2 digits of all numbers in a column of a dataframe?

I am completely new to Python (this is my first assignment), and I am trying to take the first two digits of each number in column D of the following dataframe and put them in a new column F:

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A' : [1, 1, 1, 4, 5, 3, 3, 4, 1, 4], 
                    'B' : [8, 4, 3, 1, 1, 6, 4, 6, 9, 8], 
                    'C' : [69,82,8,25,56,79,98,68,49,82], 
                    'D' : [1663, 8818, 9232, 9643, 4900, 8568, 4975, 8938, 7513, 1515],
                    'E' : ['Married','Single','Single','Divorced','Widow(er)','Single','Married','Divorced','Married','Widow(er)']})

I found several possible solutions here on Stack Overflow and tried to apply them, but none of them works for me. Either I get an error message (a different one depending on which solution I tried), or I do not get the result that I am expecting.

SamR asked Jan 03 '23 04:01


2 Answers

You could use something like:

df1['F'] = df1['D'].astype(str).str[:2].astype(int)
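As a quick check, here is a minimal sketch of the string-slicing approach applied to the D column from the question. The second series, `short`, is made up for illustration; it shows that slicing a value with fewer than two digits simply keeps what is there.

```python
import pandas as pd

# Column D from the question.
df1 = pd.DataFrame({'D': [1663, 8818, 9232, 9643, 4900, 8568, 4975, 8938, 7513, 1515]})

# Convert to string, keep the first two characters, convert back to int.
df1['F'] = df1['D'].astype(str).str[:2].astype(int)
print(df1['F'].tolist())  # [16, 88, 92, 96, 49, 85, 49, 89, 75, 15]

# Hypothetical extra data: a single-digit value is returned unchanged.
short = pd.Series([7, 45, 12345])
first_two = short.astype(str).str[:2].astype(int)
print(first_two.tolist())  # [7, 45, 12]
```

Note that this assumes non-negative integers; a leading minus sign would count as one of the two characters.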
Kumar answered Jan 27 '23 15:01


Here's a solution using NumPy. It requires every number in D to be positive and to have at least 2 digits.

df = pd.DataFrame({'D': [1663, 8818, 9232, 9643, 31, 455, 43153, 45]})

df['F'] = df['D'] // np.power(10, np.log10(df['D']).astype(int) - 1)

print(df)

       D   F
0   1663  16
1   8818  88
2   9232  92
3   9643  96
4     31  31
5    455  45
6  43153  43
7     45  45
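If the one-liner above looks opaque, it may help to split it into steps. This is only a sketch of the same calculation; the intermediate names `digits_minus_1` and `divisor` are mine, chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'D': [1663, 8818, 9232, 9643, 31, 455, 43153, 45]})

# Truncating log10 gives the number of digits minus one, e.g. 43153 -> 4.
digits_minus_1 = np.log10(df['D']).astype(int)

# Power of ten that leaves exactly two leading digits, e.g. 10**(4 - 1) = 1000.
divisor = np.power(10, digits_minus_1 - 1)

# Floor division drops everything after the first two digits.
df['F'] = df['D'] // divisor
print(df['F'].tolist())  # [16, 88, 92, 96, 31, 45, 43, 45]
```

The two-digit requirement comes from the exponent: for a one-digit number it would be -1, and NumPy does not allow raising an integer to a negative integer power.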

If all your numbers have 4 digits, you can simply use df['F'] = df['D'] // 100.

For larger dataframes, these numeric methods should be more efficient than converting the integers to strings, extracting the first 2 characters, and converting back to int.
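To measure this on your own machine, a rough timing sketch could look like the following. The data is random 4-digit numbers (my assumption, so that all three approaches apply), and the helper names `via_string`, `via_log10` and `via_floor_div` are hypothetical. Actual timings depend on your hardware and library versions, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: 100,000 random 4-digit integers.
rng = np.random.default_rng(0)
df = pd.DataFrame({'D': rng.integers(1001, 10000, size=100_000)})

def via_string(s):
    # String slicing, as in the first answer.
    return s.astype(str).str[:2].astype(int)

def via_log10(s):
    # General numeric method (positive numbers with at least 2 digits).
    return s // np.power(10, np.log10(s).astype(int) - 1)

def via_floor_div(s):
    # Shortcut that is only valid when every number has exactly 4 digits.
    return s // 100

for func in (via_string, via_log10, via_floor_div):
    seconds = timeit.timeit(lambda: func(df['D']), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```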

jpp answered Jan 27 '23 15:01