Pandas: Split string on last occurrence

Tags:

python

pandas

I'm trying to split a column in a pandas dataframe based on a separator character, and obtain the last section.

pandas has the str.rsplit and the str.rpartition functions.

If I try:

df_client["Subject"].str.rsplit("-", n=1)

I get

0 [Activity -Location , UserCode]
1 [Activity -Location , UserCode]

and if I try

df_client["Subject"].str.rpartition("-")

I get

                    0  1         2
0  Activity -Location  -  UserCode
1  Activity -Location  -  UserCode

If I do

df_client["Subject"].str.rpartition("-")[2]

I get

0 UserCode

which is what I want.

To me, str.rsplit seems unintuitive.

After getting the list of the split string, how would I then select the single item that I need?

687

asked Sep 02 '18 16:09

Alan

1 Answer

I think you need indexing with str, which also works on iterables such as the lists returned by rsplit:

# select the last item of each list
df_client["Subject"].str.rsplit("-", n=1).str[-1]
# select the second item of each list
df_client["Subject"].str.rsplit("-", n=1).str[1]
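
As a self-contained sketch of the indexing step (the two-row df_client below is made up to resemble the question's data, and n=1 is passed by keyword because, as far as I know, recent pandas versions no longer accept it positionally):

```python
import pandas as pd

# Hypothetical sample data resembling the question's column
df_client = pd.DataFrame(
    {"Subject": ["Activity-Location-UserCode", "Activity-Location-UserCode"]}
)

# Split once from the right: each row becomes a two-item list
parts = df_client["Subject"].str.rsplit("-", n=1)

# .str[-1] indexes into each list and returns its last item
last = parts.str[-1]
print(last.tolist())  # ['UserCode', 'UserCode']
```

Since the split happens only once, .str[-1] and .str[1] return the same item whenever the separator is present; .str[-1] still returns the whole string for rows that contain no separator.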

If performance is important, use a list comprehension:

df_client['last_col'] = [x.rsplit("-", 1)[-1] for x in df_client["Subject"]]
print(df_client)
                      Subject  last_col
0  Activity-Location-UserCode  UserCode
1  Activity-Location-UserCode  UserCode
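
One caveat with the list comprehension, as far as I can tell: it calls the Python string method on every value directly, so a missing value in the column raises an AttributeError, while the str accessor passes it through. A hedged sketch, using a made-up Series with one missing value and an isinstance guard that is my own addition:

```python
import pandas as pd

# Hypothetical data with one missing value
s = pd.Series(["Activity-Location-UserCode", None])

# The .str accessor propagates the missing value instead of failing
via_accessor = s.str.rsplit("-", n=1).str[-1]

# Guarded list comprehension: only split values that are real strings
via_listcomp = [x.rsplit("-", 1)[-1] if isinstance(x, str) else x for x in s]

print(via_accessor.tolist())
print(via_listcomp)
```

If the column is known to contain only strings, the unguarded comprehension above is enough.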

127

answered Sep 21 '22 19:09

jezrael
