Python: regex to catch data

Question

I'd like to ask for your help.

I have a large piece of data, which looks like this:

     a
  b : c 901
   d : e sda
 v
     w : x ads
  any
   abc : def 12132
   ghi : jkl dasf
  mno : pqr fas
   stu : vwx utu

Description: the file begins with a line containing a single word (it may have leading and trailing whitespace). Next comes a line of attributes separated by a colon (also possibly surrounded by whitespace), followed by either another attribute line or another single-word line. I can't come up with the right regex to capture it in this form:

{
  "a": [["b", "c 901"], ["d", "e sda"]],
  "v": [["w", "x ads"]],
  "any": [["abc", "def 12132"], ["ghi", "jkl dasf"], ["mno", "pqr fas"], ["stu", "vwx utu"]],
  # etc.
}

Here is what I've tried:

import re

regex = str()
regex += r"^(?:(?:\s*)(.*?)(?:\s*))$"
regex += r"(?:(?:^(?:\s*)(.*?)(?:\s*):(?:\s*)(.*?)(?:\s*))$)*$"
pattern = re.compile(regex, re.S | re.M)

However, it doesn't find what I need. Could you help me? I know I could process the file without a regex, iterating line by line and checking for the ":" character, but the file is too big to process that way. (If you know how to process it quickly without a regex, that would also be a valid answer, but the first approach that comes to mind is too slow.)

Thanks in advance!

P.S. In canonical form, the file looks like this:

a
  b : c 901
  d : e sda

Every section begins with a single word, followed by attribute lines (each indented by two spaces) in which the name and value are separated by " : ". After that comes either another attribute line or a line with a single word. No other whitespace is allowed. This will probably make it easier.

freakish · Accepted Answer

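Are regular expressions really necessary here? Try something like this:

result = {}
last = None

# `data` is any iterable of lines, e.g. an open file object
for raw_line in data:
    stripped = raw_line.strip()
    if not stripped:
        continue  # ignore blank lines
    # split on the first colon only, so values may contain ":"
    parts = stripped.split(":", 1)
    if len(parts) == 1:
        # no colon: this line starts a new section
        last = parts[0]
        result.setdefault(last, [])
    elif last is not None:
        # "key : value" line: attach it to the current section
        result[last].append([parts[0].strip(), parts[1].strip()])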
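
To make the loop easy to try, here is a minimal self-contained sketch of the same idea wrapped in a function and run on part of the sample from the question. The function name parse_sections and the use of io.StringIO as a stand-in for the real file are assumptions made for illustration only.

```python
import io


def parse_sections(lines):
    """Group "key : value" lines under the most recent single-word header.

    `lines` can be any iterable of strings (hypothetical helper, not from
    the original post).
    """
    result = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition(":")
        if not sep:
            # No colon on the line: treat it as a section header.
            current = line
            result.setdefault(current, [])
        elif current is not None:
            # Attribute line: store the stripped key and value.
            result[current].append([key.strip(), value.strip()])
    return result


# StringIO stands in for an open file; both yield one line at a time.
sample = io.StringIO(
    "     a\n"
    "  b : c 901\n"
    "   d : e sda\n"
    " v\n"
    "     w : x ads\n"
)
parsed = parse_sections(sample)
print(parsed)
# {'a': [['b', 'c 901'], ['d', 'e sda']], 'v': [['w', 'x ads']]}
```

Passing an open file object instead of the StringIO lets Python read the file lazily, line by line, so the whole file does not have to be held in memory at once.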

I hope I've understood your data structure correctly.
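
If a regex is still wanted, the following is a rough sketch for the canonical form only (header with no indentation, attribute lines indented by exactly two spaces and separated by " : "). The names SECTION, ATTR and parse_canonical are made up for this example, and the approach assumes the whole text is already in memory as one string.

```python
import re

# One section: an unindented header line followed by zero or more
# lines that start with two spaces (hypothetical pattern names).
SECTION = re.compile(r"^(\S.*)\n?((?:  .*\n?)*)", re.M)
# One attribute line inside a section body: "  key : value".
ATTR = re.compile(r"^  (.+?) : (.*)$", re.M)


def parse_canonical(text):
    """Parse canonical-form text into {header: [[key, value], ...]}."""
    result = {}
    for section in SECTION.finditer(text):
        name, body = section.group(1), section.group(2)
        # findall returns (key, value) tuples; convert them to lists.
        result.setdefault(name, []).extend(
            [key, value] for key, value in ATTR.findall(body)
        )
    return result


text = "a\n  b : c 901\n  d : e sda\nv\n  w : x ads\n"
parsed = parse_canonical(text)
print(parsed)
# {'a': [['b', 'c 901'], ['d', 'e sda']], 'v': [['w', 'x ads']]}
```

Splitting the work into two small patterns sidesteps the fact that a repeated capture group in Python's re module only keeps its last match, which is one reason a single all-in-one pattern is hard to get right here.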

Tags:

python

regex

ghostmansd

1 Answers

freakish
