This works (using Pandas 12 dev)
table2=table[table['SUBDIVISION'] =='INVERNESS']
Then I realized I needed to select the field using "starts with", since I was missing a bunch of rows. So, as near as I could follow the Pandas docs, I tried
criteria = table['SUBDIVISION'].map(lambda x: x.startswith('INVERNESS'))
table2 = table[criteria]
And got AttributeError: 'float' object has no attribute 'startswith'
So I tried an alternate syntax with the same result
table[[x.startswith('INVERNESS') for x in table['SUBDIVISION']]]
Reference http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing Section 4: List comprehensions and map method of Series can also be used to produce more complex criteria:
What am I missing?
Using “contains” to Find a Substring in a Pandas DataFrame
The str.contains method in Pandas lets you search a column for a specific substring. It returns a boolean Series: True where the original value contains the substring and False where it does not.
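As a rough illustration of the boolean result described above, here is a minimal sketch. The sample values and variable names are made up for the example; na=False and regex=False are optional arguments I am adding so that missing values count as "no match" and the pattern is treated as literal text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original table.
subdivisions = pd.Series(
    ['INVERNESS', 'INVERNESS POINT', 'OLD INVERNESS', 'HOOVER', np.nan]
)

# na=False: missing values become False instead of NaN.
# regex=False: 'INVERNESS' is matched as a plain substring, not a regex.
contains_mask = subdivisions.str.contains('INVERNESS', na=False, regex=False)

# Use the boolean Series to keep only the matching values.
matches = subdivisions[contains_mask].tolist()
```

Without na=False the mask would hold a missing value for the last row, and boolean indexing with it would fail.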
.at returns a single element, whereas .loc may return a Series or a DataFrame. A single value is not always returned, though: if the given index label occurs multiple times, an array of values comes back.
You can use the Series str.startswith method to give more consistent results:
In [11]: s = pd.Series(['a', 'ab', 'c', 11, np.nan])

In [12]: s
Out[12]:
0      a
1     ab
2      c
3     11
4    NaN
dtype: object

In [13]: s.str.startswith('a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
4    False
dtype: bool
and the boolean indexing will work just fine (I prefer to use loc, but it works just the same without):
In [14]: s.loc[s.str.startswith('a', na=False)]
Out[14]:
0     a
1    ab
dtype: object
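Applied to a DataFrame like the one in the question, the same idea might look like the sketch below. The table contents and the LOT column are invented for illustration; only the SUBDIVISION column name and the 'INVERNESS' prefix come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's table; the values are made up.
table = pd.DataFrame({
    'SUBDIVISION': ['INVERNESS', 'INVERNESS POINT', 'HOOVER', np.nan],
    'LOT': [1, 2, 3, 4],
})

# na=False turns missing values into False, so the mask is purely boolean.
mask = table['SUBDIVISION'].str.startswith('INVERNESS', na=False)

# Select the rows whose SUBDIVISION starts with the prefix.
table2 = table.loc[mask]
```

Here table2 keeps the first two rows; the row with the missing SUBDIVISION is simply excluded rather than raising an error.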
It looks like at least one of the elements in your Series/column is a float, which doesn't have a startswith method, hence the AttributeError. The list comprehension should raise the same error.
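To make that concrete, here is a small sketch that reproduces the error with a missing value (pandas stores it as a float NaN) and then guards the map-based approach with an isinstance check. The data is hypothetical, and the isinstance guard is just one possible workaround, not something from the pandas docs.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value, kept as plain Python objects.
subdivision = pd.Series(['INVERNESS', 'HOOVER', np.nan], dtype=object)

# The NaN is a float, so calling startswith on it fails.
try:
    subdivision.map(lambda x: x.startswith('INVERNESS'))
    raised = False
except AttributeError:
    raised = True

# Checking the type first lets non-string values fall through as False.
criteria = subdivision.map(
    lambda x: isinstance(x, str) and x.startswith('INVERNESS')
)
```

The resulting criteria Series is boolean and can be used directly for indexing.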
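To retrieve all the rows that start with the required string (str.match anchors the pattern at the start of each value and treats it as a regular expression):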
dataFrameOut = dataFrame[dataFrame['column name'].str.match('string')]
To retrieve all the rows that contain the required string:
dataFrameOut = dataFrame[dataFrame['column name'].str.contains('string')]
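One caveat with these two calls: both treat the string as a regular expression by default, and both produce missing values in the mask if the column has NaN. A hedged sketch of how that could be handled follows. The DataFrame, column name and prefix are invented for the example, and using re.escape and regex=False is my suggestion rather than part of the answer above.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical data; the parentheses would act as a regex group if unescaped.
df = pd.DataFrame({
    'name': ['INVERNESS (NORTH)', 'OLD INVERNESS (NORTH)', 'HOOVER', np.nan]
})
text = 'INVERNESS (NORTH)'

# str.match anchors at the start; re.escape makes the text literal.
starts = df[df['name'].str.match(re.escape(text), na=False)]

# str.contains searches anywhere; regex=False makes the text literal.
contains = df[df['name'].str.contains(text, na=False, regex=False)]
```

With na=False in both calls, the row holding NaN is treated as a non-match instead of breaking the boolean indexing.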