
Python get the x first words in a string

Tags:

python

string

I'm looking for code that takes the first 4 (or 5) words of a string. I tried this:

import re    
my_string = "the cat and this dog are in the garden"    
a = my_string.split(' ', 1)[0]
b = my_string.split(' ', 1)[1]

But I can't take more than 2 strings:

a = the
b = cat and this dog are in the garden

I would like to have:

a = the
b = cat
c = and
d = this
...
Guillaume asked Mar 31 '14 16:03


3 Answers

You can use slice notation on the list created by split:

my_string.split()[:4] # first 4 words
my_string.split()[:5] # first 5 words

N.B. these are example commands. You should use one or the other, not both in a row.
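
As a minimal sketch of the slice approach, using the string from the question: the name n is a hypothetical variable introduced here for the word count, and joining the words back into one string is an optional extra step.

```python
my_string = "the cat and this dog are in the garden"

n = 4  # hypothetical: how many leading words to keep

# split() with no arguments breaks the string on any whitespace;
# the slice keeps at most the first n items of the resulting list.
first_words = my_string.split()[:n]
print(first_words)  # ['the', 'cat', 'and', 'this']

# Optional: join the words back into a single string.
joined = " ".join(first_words)
print(joined)  # the cat and this
```

If the string has fewer than n words, the slice simply returns all of them rather than raising an error.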

Two-Bit Alchemist answered Sep 22 '22 07:09


The second argument of the split() method is the limit (maxsplit), the maximum number of splits to perform. Leave it out and you will get all the words. Split once and index the result:

my_string = "the cat and this dog are in the garden"    
splitted = my_string.split()

first = splitted[0]
second = splitted[1]

...

Also, don't call split() every time you want a word; each call splits the whole string again. Do it once and reuse the result, as in my example.
As you can see, there is no need to pass the ' ' delimiter: the default delimiter for split() (None) splits on any run of whitespace. Pass ' ' explicitly only if, for example, you don't want to split on tabs.
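
To illustrate the difference, here is a small sketch comparing the default delimiter, an explicit ' ' delimiter, and the limit argument. The sample string text is made up for this example: it has two spaces after "the" and a tab after "cat".

```python
text = "the  cat\tand this"  # hypothetical sample: double space, then a tab

# Default (None): splits on any run of whitespace, no empty strings.
default_split = text.split()
print(default_split)  # ['the', 'cat', 'and', 'this']

# Explicit ' ': splits on single spaces only, so the double space
# yields an empty string and the tab is left inside a word.
space_split = text.split(' ')
print(space_split)  # ['the', '', 'cat\tand', 'this']

# With a limit of 2, at most two splits are made; the rest stays together.
limited = text.split(None, 2)
print(limited)  # ['the', 'cat', 'and this']
```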

bosnjak answered Sep 20 '22 07:09


You can split a string on whitespace easily enough, but if your string doesn't have enough words in it, unpacking into a fixed number of names will fail with a ValueError because the list is too short.

a, b, c, d, e = my_string.split()[:5] # raises ValueError if there are fewer than 5 words

You'd be better off keeping the list as is instead of assigning each member to an individual name.

words = my_string.split()
at_most_five_words = words[:5] # terrible variable name

That's a terrible variable name, but I used it to illustrate that you're not guaranteed to get five words: you're only guaranteed to get at most five.
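
A rough sketch of how the short-string case might be handled. The helper first_words and the padding with None are assumptions made for illustration, not part of the question; padding is only one possible way to keep fixed names if they are really needed.

```python
def first_words(text, n):
    """Return at most n leading whitespace-separated words of text."""
    return text.split()[:n]

short = "the cat"  # hypothetical input with fewer than five words
words = first_words(short, 5)
print(words)  # ['the', 'cat']

# If individual names are required, pad the list to exactly five items
# so that the unpacking below cannot raise ValueError.
padded = words + [None] * (5 - len(words))
a, b, c, d, e = padded
print(a, b, c)  # the cat None
```

Working with the list directly is usually simpler; the padding step just makes the missing words explicit.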

kojiro answered Sep 21 '22 07:09