I am trying to slice a Series of strings using a Series of integers, as in the code below, but all I get back is a Series of NaNs.
import numpy as np
import pandas as pd
numData = np.array([4,6,4,3,6])
txtData = np.array(['bluebox','yellowbox','greybox','redbox','orangebox'])
n = pd.Series(numData)
t = pd.Series(txtData)
x = t.str[:n]
print (x)
The output is:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
I would like the output to be:
0 blue
1 yellow
2 grey
3 red
4 orange
Is there an easy way to do this?
If in reality you can't just chop off the last 3 characters and need to rely on your slice ranges, you can use a simple list comprehension. You will need error handling if your data aren't guaranteed to be all strings. If end exceeds the length of a string, the slice does not raise; it silently returns the whole string, so check for that case if it matters to you.
pd.Series([x[:end] for x,end in zip(t,n)], index=t.index)
0 blue
1 yellow
2 grey
3 red
4 orange
dtype: object
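As a rough sketch of the error handling mentioned above, the loop below skips anything that is not a string and leaves NaN in its place. The helper name slice_each and the sample data (a missing value and an end of 99) are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

def slice_each(texts: pd.Series, ends: pd.Series) -> pd.Series:
    """Slice each string to its own end position; non-strings become NaN."""
    out = []
    for text, end in zip(texts, ends):
        if isinstance(text, str) and pd.notna(end):
            # An end past the string length is harmless: the whole string comes back.
            out.append(text[:int(end)])
        else:
            out.append(np.nan)
    return pd.Series(out, index=texts.index, dtype=object)

# Hypothetical data: one missing string and one end that is too large.
t = pd.Series(['bluebox', 'yellowbox', None, 'redbox', 'orangebox'])
n = pd.Series([4, 6, 4, 3, 99])

result = slice_each(t, n)
print(result)
```

The row with the missing string stays NaN, and the row with end 99 keeps its full text ('orangebox') rather than raising.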
You can use pd.Series.str.slice, since every string here ends in the same three-character suffix:
t.str.slice(stop=-3)
# shorthand for this is t.str[:-3]
0 blue
1 yellow
2 grey
3 red
4 orange
dtype: object
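The fixed slice only matches the per-row lengths when every string ends in a suffix of the same length. One way to check that before relying on it is sketched below; the variable names same_suffix and trimmed are just for illustration.

```python
import pandas as pd

t = pd.Series(['bluebox', 'yellowbox', 'greybox', 'redbox', 'orangebox'])
n = pd.Series([4, 6, 4, 3, 6])

# True only if dropping the last 3 characters gives exactly the requested lengths.
same_suffix = bool((t.str.len() - 3 == n).all())

# Use the vectorised slice only when the check passes.
trimmed = t.str.slice(stop=-3) if same_suffix else None

print(same_suffix)
print(trimmed)
```

If the check fails, fall back to one of the per-row approaches in the other answers.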
Or turn numData into an iterator using iter and use slice:
it = iter(numData)
t.map(lambda x:x[slice(next(it))])
0 blue
1 yellow
2 grey
3 red
4 orange
dtype: object
numdata_iter = iter(numData)
x = t.apply(lambda text: text[:next(numdata_iter)])
We turn numData into an iterator and then call next on it for each slice inside apply.
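One caveat with the iterator approach: an iterator is used up after a single pass, so reusing it for a second call would raise StopIteration. A small sketch that avoids this by creating a fresh iterator on every call is shown below. The function name slice_with_iter is hypothetical, and the approach assumes map calls the function exactly once per element, in order, which is how it behaves in practice but is not something I can point to as a documented guarantee.

```python
import numpy as np
import pandas as pd

def slice_with_iter(texts: pd.Series, ends) -> pd.Series:
    """Slice each string using the next value from a fresh iterator over ends."""
    it = iter(ends)  # new iterator per call, so the function is safe to call repeatedly
    return texts.map(lambda s: s[:next(it)])

numData = np.array([4, 6, 4, 3, 6])
t = pd.Series(['bluebox', 'yellowbox', 'greybox', 'redbox', 'orangebox'])

first = slice_with_iter(t, numData)
second = slice_with_iter(t, numData)  # works again because the iterator is rebuilt
print(first)
```

If the order-of-calls assumption makes you uneasy, the zip-based list comprehension pairs each string with its length explicitly and does not depend on it.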