Here is some dummy data I have created for my question. I have two questions regarding this:
1. Why does split need the str accessor in the first part of the query but not in the second part?
2. Why does [0] pick up the first row in part 1 but the first element from each row in part 2?

import pandas as pd
chess_data = pd.DataFrame({"winner": ['A:1','A:2','A:3','A:4','B:1','B:2']})
chess_data.winner.str.split(":")[0]
['A', '1']
chess_data.winner.map(lambda n: n.split(":")[0])
0 A
1 A
2 A
3 A
4 B
5 B
Name: winner, dtype: object
chess_data is a DataFrame.
chess_data.winner is a Series.
chess_data.winner.str is an accessor to methods that are string specific and optimized (to a degree).
chess_data.winner.str.split is one such method.
chess_data.winner.map is a different method that takes a dictionary or a callable object. It either calls that callable with each element in the series or looks each element up in the dictionary (much like calling the dictionary's get method).

When you use chess_data.winner.str.split, pandas still loops over the elements and performs a kind of str.split on each one. map is a cruder way of doing the same thing.
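To make the two behaviours of map concrete, here is a minimal sketch using the same dummy data. The lookup dictionary is a hypothetical mapping invented for illustration; it is not part of the original question.

```python
import pandas as pd

chess_data = pd.DataFrame({"winner": ["A:1", "A:2", "A:3", "A:4", "B:1", "B:2"]})

# map with a callable: the function is called once per element.
first_letters = chess_data.winner.map(lambda s: s.split(":")[0])

# map with a dictionary: each element is looked up as a key.
# Elements that have no matching key come back as missing values.
lookup = {"A:1": "first", "B:1": "second"}  # hypothetical mapping
labels = chess_data.winner.map(lookup)

print(first_letters.tolist())
print(labels.tolist())
```

Inside the lambda, s is a plain Python string, which is why split works there without the str accessor.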
With your data:
chess_data.winner.str.split(':')
0 [A, 1]
1 [A, 2]
2 [A, 3]
3 [A, 4]
4 [B, 1]
5 [B, 2]
Name: winner, dtype: object
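This also explains the [0] in the question. The result of str.split is itself a Series of lists, so indexing it with [0] selects the row whose index label is 0 rather than the first item of every list. A small sketch of the difference, assuming the default integer index from the dummy data:

```python
import pandas as pd

chess_data = pd.DataFrame({"winner": ["A:1", "A:2", "A:3", "A:4", "B:1", "B:2"]})
parts = chess_data.winner.str.split(":")

# Indexing the Series: returns the list stored at index label 0.
row_zero = parts[0]

# Indexing through the str accessor: returns item 0 of every list.
firsts = parts.str[0]

print(row_zero)
print(firsts.tolist())
```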
To get the first element of each row, use the string accessor again:
chess_data.winner.str.split(':').str[0]
0 A
1 A
2 A
3 A
4 B
5 B
Name: winner, dtype: object
This is equivalent to what you did with map:
chess_data.winner.map(lambda x: x.split(':')[0])
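One place where the two approaches can differ is missing data. The sketch below uses a hypothetical Series with one missing value (not part of the original data): the str accessor passes the missing value through, while the plain lambda fails because the missing value has no split method. Passing na_action="ignore" to map is one way to skip such values.

```python
import pandas as pd

# Hypothetical data with one missing value, for illustration only.
s = pd.Series(["A:1", None, "B:2"], dtype="object")

# The str accessor propagates the missing value instead of raising.
via_str = s.str.split(":").str[0]

# The plain lambda is called on the missing value too, which fails.
try:
    s.map(lambda x: x.split(":")[0])
    map_failed = False
except AttributeError:
    map_failed = True

# na_action="ignore" tells map to leave missing values untouched.
guarded = s.map(lambda x: x.split(":")[0], na_action="ignore")

print(via_str.isna().tolist())
print(map_failed)
print(guarded.isna().tolist())
```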
You could also have used a comprehension:
chess_data.assign(new_col=[x.split(':')[0] for x in chess_data.winner])
winner new_col
0 A:1 A
1 A:2 A
2 A:3 A
3 A:4 A
4 B:1 B
5 B:2 B