How can I get the first word from each string in my DataFrame using Python?

Tags:

pandas

I have a Pandas DataFrame called "data" with 2 columns and 50 rows filled with one or two lines of text each, imported from a .tsv file. Some of the questions may contain integers and floats, besides strings. I am trying to extract the first word of every sentence (in both columns), but consistently get this error: AttributeError: 'DataFrame' object has no attribute 'str'.

At first, I thought the error was due to my wrong use of "data.str.split", but every variation I found by searching failed. Then I thought the file might not be composed entirely of strings, so I tried "data.astype(str)" on it, but the same error remained. Any suggestions? Thanks a lot!

Here is my code:

import pandas as pd
questions = "questions.tsv"
data = pd.read_csv(questions, usecols = [3], nrows = 50, header=1, sep="\t")
data = data.astype(str)
first_words = data.str.split(None, 1)[0]

382

asked Sep 15 '17 04:09

twhale

1 Answer

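The .str accessor is available on a Series (a single column), not on a whole DataFrame, which is why "data.str" raises the AttributeError. Apply it to each column instead: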

first_words = data.apply(lambda x: x.str.split().str[0])

Or apply a plain Python function to every cell:

first_words = data.applymap(lambda x: x.split()[0])
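A note on versions, offered as a sketch rather than a definitive recipe: from pandas 2.1 onward, DataFrame.applymap is deprecated in favor of DataFrame.map. The helper names first_word and elementwise below are hypothetical, introduced only for illustration. The helper also returns an empty string for blank cells, where x.split()[0] would raise an IndexError.

```python
import pandas as pd


def first_word(value):
    """Return the first whitespace-separated token of value, or '' if there is none."""
    parts = str(value).split()
    return parts[0] if parts else ""


def elementwise(df, func):
    """Apply func to every cell, using DataFrame.map where it exists."""
    # Check the class, not the instance, so a column named "map" cannot interfere.
    if hasattr(pd.DataFrame, "map"):
        return df.map(func)
    return df.applymap(func)  # older pandas versions


data = pd.DataFrame({"a": ["aa ss ss", "ee rre", 1, "r"],
                     "b": [4, "rrt ee", "ee www ee", 6]})
first_words = elementwise(data, first_word)
print(first_words)
```

Because first_word calls str() on each value itself, no separate data.astype(str) step is needed with this variant.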

Sample:

data = pd.DataFrame({'a':['aa ss ss','ee rre', 1, 'r'],
                   'b':[4,'rrt ee', 'ee www ee', 6]})
print (data)
          a          b
0  aa ss ss          4
1    ee rre     rrt ee
2         1  ee www ee
3         r          6

data = data.astype(str)
first_words = data.apply(lambda x: x.str.split().str[0])
print (first_words)
    a    b
0  aa    4
1  ee  rrt
2   1   ee
3   r    6

first_words = data.applymap(lambda x: x.split()[0])
print (first_words)
    a    b
0  aa    4
1  ee  rrt
2   1   ee
3   r    6
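If only one column is needed (the question's read_csv call uses usecols = [3], which still yields a one-column DataFrame), selecting that column as a Series makes .str available directly. The following is a minimal sketch with made-up sample values. It also shows that a blank string yields a missing value with the .str approach rather than an error.

```python
import pandas as pd

# Made-up sample data; the second row is deliberately blank.
data = pd.DataFrame({"a": ["aa ss ss", "", 1, "r"]})

# The .str accessor is defined on Series, not on DataFrame.
dataframe_has_str = hasattr(pd.DataFrame, "str")
series_has_str = hasattr(pd.Series, "str")

# Selecting a single column returns a Series, so .str works here.
column = data["a"].astype(str)
first_words = column.str.split().str[0]
print(first_words)
```

Here the blank row comes back as NaN, because .str[0] on an empty list returns a missing value instead of raising.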

177

answered Oct 08 '22 09:10

jezrael
