
python regular expression - re.findall() in list

This is my list:

lista=[u'REG_S_3_UMTS_0_0 (RNC)', u'REG_S_3_UMTS_0_1 (RNC)', u'REG_S_3_UMTS_0_2 (RNC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_0 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_1 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_0 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_1 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_2 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_0 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_1 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_2 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_0 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_1 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_2 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_0 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_1 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_2 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_0 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_1 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_2 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_0 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_1 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_2 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_0 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_1 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_2 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_0 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_1 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_2 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_0 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_1 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_2 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'Pool ID: 200']

And this is my function:

import re

def Filter_List(lista):
    string = ''.join(lista)
    match = re.findall(r"\(([A-Z]+)\)|Pool ID", string)
    return match

As a result I get:

[u'RNC', u'RNC', u'RNC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'BSC', u'']

But the last element, which should be Pool ID, is missing: I only get u'' in its place. Does anyone know how I should change my expression? Thanks in advance!

pingwin850 asked Aug 12 '16 09:08



1 Answer

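Note that when a regex pattern contains a capturing group, re.findall returns the text captured by that group instead of the whole match (and a list of tuples if there are several groups). When the Pool ID alternative matches, the group does not participate in the match, so you get an empty string for it. Remove the capturing group: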

import re
lista=[u'REG_S_3_UMTS_0_0 (RNC)', u'REG_S_3_UMTS_0_1 (RNC)', u'REG_S_3_UMTS_0_2 (RNC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_2_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_0 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_1 (BSC)', u'REG_S_3_GSM_ERIC_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_0 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_1 (BSC)', u'REG_S_3_GSM_HUAP_CBSM_bsc_0_2 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_0 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_1 (BSC)', u'REG_S_3_GSM_HUA_CBSM_bsc_0_2 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_0 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_1 (BSC)', u'REG_S_3_GSM_IPAC_SABP_bsc_0_2 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_0 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_1 (BSC)', u'REG_S_3_GSM_NOKI_CLNS_bsc_0_2 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_0 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_1 (BSC)', u'REG_S_3_GSM_NOKI_RFC1_bsc_0_2 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_3_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_0 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_1 (BSC)', u'REG_S_3_GSM_SIEM_BSCI_bsc_0_2 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_0 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_1 (BSC)', u'REG_S_GSM_ERIC_CBSP_bsc_0_2 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_0 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_1 (BSC)', u'REG_S_GSM_HUAP_CBSM_bsc_0_2 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_0 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_1 (BSC)', u'REG_S_GSM_HUA_CBSM_bsc_0_2 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_0 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_1 (BSC)', u'REG_S_GSM_NORT_CBSP_bsc_0_2 (BSC)', u'Pool ID: 200']
string = ''.join(lista)
match = re.findall(r"\([A-Z]+\)|Pool ID", string)
print(match)

Running this code returns:

[u'(RNC)', u'(RNC)', u'(RNC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'(BSC)', u'Pool ID']
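If you want the values without the surrounding parentheses and still want Pool ID, here is a minimal sketch of two possible variations. It is not part of the original answer, and the shortened `string` below is a hypothetical stand-in for the joined list from the question. The first variation uses lookarounds so the parentheses are required but not included in the match; the second keeps the capturing group and falls back to the whole match when the group did not participate.

```python
import re

# Hypothetical shortened sample standing in for ''.join(lista) from the question.
string = u'REG_S_3_UMTS_0_0 (RNC)REG_S_GSM_NORT_CBSP_bsc_0_2 (BSC)Pool ID: 200'

# Original pattern: with one capturing group, findall returns the group's text,
# so a match of the "Pool ID" branch comes back as an empty string.
with_group = re.findall(r"\(([A-Z]+)\)|Pool ID", string)

# Variation 1: lookarounds check for the parentheses without consuming them,
# and there is no capturing group, so findall returns the whole match.
lookaround = re.findall(r"(?<=\()[A-Z]+(?=\))|Pool ID", string)

# Variation 2: keep the group, but use the whole match (group 0)
# whenever group 1 did not participate in the match.
fallback = [m.group(1) or m.group(0)
            for m in re.finditer(r"\(([A-Z]+)\)|Pool ID", string)]

print(with_group)
print(lookaround)
print(fallback)
```

On this sample, the first list ends with an empty string, while the other two end with Pool ID. Which variation reads better is a matter of taste; the lookaround version stays a single findall call.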

Wiktor Stribiżew answered Oct 14 '22 11:10