
Select rows if string begins with certain characters in pandas

I have a CSV file (shown as a picture in the original post).

I'm trying to find every word that starts with the letter A or G, or with any letter from a list that I choose,

but my code returns an error. Any idea what I'm doing wrong? This is my code:

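import sys
import pandas as pd

if len(sys.argv) == 1:
    print("please provide a CSV file to analyze")
else:
    fileinput = sys.argv[1]

wdata = pd.read_csv(fileinput)

# This is the line that fails: startswith is a str method, not a standalone function
print( list(filter(startswith("a","g"), wdata)) )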

asked Jan 27 '20 07:01 by programming freak


1 Answer

To get relevant rows, extract the first letter, then use isin:

df
  words  frequency
0  what         10
1   and          8
2   how          8
3  good          5
4   yes          7

df[df['words'].str[0].isin(['a', 'g'])]
  words  frequency
1   and          8
3  good          5

If you want a specific column, use loc:

df.loc[df['words'].str[0].isin(['a', 'g']), 'words']
1     and
3    good
Name: words, dtype: object

df.loc[df['words'].str[0].isin(['a', 'g']), 'words'].tolist()
# ['and', 'good']
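
If the prefixes can be longer than one character, or the match should ignore case, the first-letter check above no longer fits. Here is a rough sketch of one alternative using Series.str.match, which anchors the pattern at the start of each string. The helper name rows_starting_with and the extra sample rows ("Apple" and a missing value) are made up for illustration and are not part of the original data.

```python
import re

import pandas as pd

# Sample frame mirroring the one above, plus two hypothetical rows:
# a capitalised word and a missing value.
df = pd.DataFrame({
    "words": ["what", "and", "how", "good", "yes", "Apple", None],
    "frequency": [10, 8, 8, 5, 7, 3, 1],
})


def rows_starting_with(frame, column, prefixes):
    """Return the rows whose `column` value starts with any of `prefixes`.

    Hypothetical helper: matching is case-insensitive and missing values
    are treated as non-matching.
    """
    # Escape each prefix so it is matched literally, and wrap the
    # alternatives in a non-capturing group so the start anchor
    # applies to all of them.
    pattern = "(?:" + "|".join(re.escape(p) for p in prefixes) + ")"
    mask = frame[column].str.match(pattern, case=False, na=False)
    return frame[mask]


selected = rows_starting_with(df, "words", ["a", "g"])
print(selected["words"].tolist())
```

Passing na=False keeps the mask strictly boolean, so rows with a missing word are dropped instead of breaking the indexing step.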
answered Oct 08 '22 03:10 by cs95