
Extracting a string between multiple occurrences of the same delimiter in Python pandas


The column "Test" has strings with multiple occurrences of the same delimiter. I am trying to fetch the string that sits between those delimiters. Can you please help?

Example:

Test
|||||CHNBAD||POC-RM0EP7-01-A

My code:

df["Fetch"]=df["Test"].str.rsplit("|", 2).str[-2]

But it's giving me POC-RM0EP7-01-A as the output.

I am looking to get "CHNBAD" from the string.

Raghav asked May 22 '21 18:05



1 Answer

With your shown samples, please try the following. We can use the str.extract function of pandas here, applying it to the Test column and storing the result in a new column named Fetch in the DataFrame.

df['Fetch'] = df['Test'].str.extract(r'^\|+([^|]*)\|.*',expand=False)

The DataFrame will be as follows:

    Test                            Fetch
0   |||||CHNBAD||POC-RM0EP7-01-A    CHNBAD

Explanation of regex:

^\|+     ##Match 1 or more occurrences of | at the start of the value.
([^|]*)  ##1st capturing group: everything up to the next |.
\|.*     ##Match that | and everything after it to the end of the value.
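
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same str.extract call. The first row is the sample from the question; the other two rows are hypothetical values added only to show what happens with a single leading delimiter and with no delimiter at all.

```python
import pandas as pd

# First row is the sample from the question; the other two rows are
# hypothetical and only illustrate edge cases.
df = pd.DataFrame({
    "Test": [
        "|||||CHNBAD||POC-RM0EP7-01-A",
        "|ABC|DEF",        # hypothetical: one leading delimiter
        "no-delimiter",    # hypothetical: pattern does not match
    ]
})

# Skip the leading run of |, then capture everything up to the next |.
# expand=False returns a Series because there is a single capture group.
df["Fetch"] = df["Test"].str.extract(r"^\|+([^|]*)\|.*", expand=False)

print(df)
```

Rows where the pattern does not match (such as the third one here) get NaN in Fetch, so it may be worth checking for missing values before using the column further.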
RavinderSingh13 answered Sep 30 '22 17:09