I am trying to replace all the letters in a column of a pandas DataFrame with numbers.
Example: I have a column of 3000 course codes, e.g. A0101P. I want to replace every letter in the course code with its position in the alphabet (A = 1, B = 2, etc.) so the output looks like 1010116 and, most importantly, is an integer rather than an object/string.
The course code column initially had dtype object, so I used
course.to_string()
to change it to a string value.
Then I created a mapping and used str.replace to replace the values.
mapping = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8, "I": 9, "J": 10, "K": 11, "L": 12, "M": 13, "N": 14, "O": 15, "P": 16, "Q":17,"R":18, "S": 19, "T": 20,"U": 21, "V": 22, "W": 23, "X": 24, "Y": 25, "Z":26}
courseDone = course.str.replace(course["Cursus code"], mapping)
It raises an error
AttributeError: 'str' object has no attribute 'str'
I have also tried
for key, value in mapping.items():
    course = course.replace(key, value)
and it raises the error
TypeError: replace() argument 2 must be str, not int
Even though the datatype is a string.
Example data:
1 A0101P
2 A0111P
3 A0200P
4 A0201P
5 A0202P
Desired output:
1 1010116
2 1011116
3 1020016
4 1020116
5 1020216
I have also tried to change the datatype with str(), and the resulting errors are the same.
When I use
for key, value in mapping.items():
    course["Cursus code"] = course["Cursus code"].replace(key, value)
I receive no error, but the output is unchanged.
I am new to Python and have tried my best to find a solution, but nothing seems to work. Can anyone help me, please?
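A simple solution is to replace the letters one at a time, similar to your last attempt, but through the .str accessor and with the numbers converted to strings first. Series.replace only matches whole cell values, whereas .str.replace works on substrings:
for k, v in mapping.items():
    # .str.replace needs a string replacement, so convert the number first
    course["Cursus code"] = course["Cursus code"].str.replace(k, str(v))
# convert the finished digit strings to integers
course["Cursus code"] = course["Cursus code"].astype(int)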
Output:
0 1010116
1 1011116
2 1020016
3 1020116
4 1020216
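As an alternative to the loop above, the same substitution can be sketched in a single pass with Series.str.translate. This is only an illustration: the frame below is rebuilt from the question's example data, and the name table is made up for this sketch.

```python
import string

import pandas as pd

# Sample frame rebuilt from the example data in the question.
course = pd.DataFrame(
    {"Cursus code": ["A0101P", "A0111P", "A0200P", "A0201P", "A0202P"]}
)

# Translation table for A=1 ... Z=26. str.translate expects the keys to be
# Unicode code points, and the replacements may be multi-character strings.
table = {
    ord(letter): str(position)
    for position, letter in enumerate(string.ascii_uppercase, start=1)
}

# Replace every uppercase letter in one pass, then convert to integers.
course["Cursus code"] = course["Cursus code"].str.translate(table).astype("int64")

print(course)
```

Digits are not in the table, so they pass through unchanged. Lowercase letters are also left alone and would make the integer conversion fail.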
Strings are kept as 'object' dtype in pandas. You can use the info() method of a DataFrame to see which columns are integers, objects (for strings), timestamps, etc.:
df.info()
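To connect this to the AttributeError in the question, here is a small sketch (the two-row frame is made up for illustration). DataFrame.to_string() renders the whole frame as one plain Python str, which has no .str accessor, while the column itself can be used with .str directly.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the real data.
df = pd.DataFrame({"Course": ["A0101P", "A0111P"]})

# Per-column dtypes; the same information appears in df.info().
print(df.dtypes)

# to_string() returns a single Python str meant for display, not a Series.
rendered = df.to_string()
print(type(rendered))
print(hasattr(rendered, "str"))

# The column works with the .str accessor without any conversion.
first_letters = df["Course"].str[0]
print(first_letters.tolist())
```

So the column does not need to be converted before the string methods are applied.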
As for your question, you can use the apply method and replace each character of the string using the desired mapping, like this:
def str_to_int_map(string, mapping):
    return int(''.join([str(mapping.get(x, x)) for x in string]))
mapping = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8, "I": 9, "J": 10, "K": 11, "L": 12, "M": 13, "N": 14, "O": 15, "P": 16, "Q":17,"R":18, "S": 19, "T": 20,"U": 21, "V": 22, "W": 23, "X": 24, "Y": 25, "Z":26}
df['Course'] = df['Course'].apply(lambda x: str_to_int_map(x, mapping))
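For reference, the apply approach can be put together as one runnable sketch. The sample frame is an assumption built from the question's example data, and the mapping is generated from string.ascii_uppercase rather than typed out by hand; it holds the same A=1 to Z=26 values.

```python
import string

import pandas as pd

# Same A=1 ... Z=26 mapping as above, generated instead of written out.
mapping = {
    letter: position
    for position, letter in enumerate(string.ascii_uppercase, start=1)
}


def str_to_int_map(code, mapping):
    # Swap each mapped letter for its number, keep every other character
    # as it is, then join the pieces and convert the result to int.
    return int("".join(str(mapping.get(ch, ch)) for ch in code))


# Sample frame rebuilt from the example data in the question.
df = pd.DataFrame({"Course": ["A0101P", "A0111P", "A0200P", "A0201P", "A0202P"]})

df["Course"] = df["Course"].apply(lambda x: str_to_int_map(x, mapping))

print(df)
print(df.dtypes)
```

Because the function returns int, the resulting column is numeric. A code containing a character that is neither a digit nor an uppercase letter would raise a ValueError at the int() call.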