Process a text file using various delimiters

Question

My text file (unfortunately) looks like this...

<amar>[amar-1000#Fem$$$_Y](1){india|1000#Fem$$$,mumbai|1000#Mas$$$}
<akbar>[akbar-1000#Fem$$$_Y](1){}
<john>[-0000#$$$_N](0){USA|0100#$avi$$,NJ|0100#$avi$$}

It contains the customer name followed by some information. The sequence is:

a text string followed by a list, a set and then a dictionary

<> [] () {}

This is not a Python-compatible file, so the data cannot be loaded as is. I want to process the file and extract the following information:

amar 1000 | 1000 | 1000
akbar 1000
john 0000 | 0100 | 0100

1) name between <>

2) The number between - and # in the list

3 & 4) Split the dictionary on commas and take the numbers between | and # (there can be more than 2 entries here)

I am open to using any tool best suited for this task.

Martin Evans · Accepted Answer

The following Python script will read your text file and give you the desired results:

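import re, itertools

with open("input.txt", "r") as f_input:
    for line in f_input:
        # Capture the name in <>, the contents of [] and the contents of {}
        reLine = re.match(r"<(\w+)>\[(.*?)\].*?\{(.*?)\}", line)
        if not reLine:
            continue  # skip blank or malformed lines
        # Collect every run of digits from the [] and {} parts
        lNumbers = [re.findall(r"\d+", entry) for entry in reLine.groups()[1:]]
        lNumbers = list(itertools.chain.from_iterable(lNumbers))
        print(reLine.group(1), " | ".join(lNumbers))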

This prints the following output:

amar 1000 | 1000 | 1000
akbar 1000
john 0000 | 0100 | 0100
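
If you would rather tie each pattern to one of the numbered rules in the question, something along these lines might also work. This is only a sketch: the names `LINE_RE`, `parse_line` and `format_record` are made up for illustration, and it assumes every record has exactly the `<> [] () {}` layout shown in the sample, with the wanted numbers always sitting between `-` and `#` or between `|` and `#`.

```python
import re

# Hypothetical pattern: one named group per bracketed part of a record.
LINE_RE = re.compile(
    r"<(?P<name>[^>]*)>\[(?P<lst>[^\]]*)\]\([^)]*\)\{(?P<dct>[^}]*)\}"
)


def parse_line(line):
    """Return (name, numbers) for one record, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # Rule 2: digits between '-' and '#' inside [...]
    numbers = re.findall(r"-(\d+)#", m.group("lst"))
    # Rules 3 & 4: split {...} on commas, then take digits between '|' and '#'
    for entry in m.group("dct").split(","):
        numbers.extend(re.findall(r"\|(\d+)#", entry))
    return m.group("name"), numbers


def format_record(name, numbers):
    """Join the name and numbers in the layout asked for in the question."""
    return (name + " " + " | ".join(numbers)).rstrip()


if __name__ == "__main__":
    sample = "<john>[-0000#$$$_N](0){USA|0100#$avi$$,NJ|0100#$avi$$}"
    print(format_record(*parse_line(sample)))
```

Compared with grabbing every run of digits, this is stricter: a stray number elsewhere in the list or dictionary part would be ignored, at the cost of breaking if the delimiters ever change.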

Tags:

python

grep

sed

awk

shantanuo
