
joining output from regex search

  • I have a regex that looks for numbers in a file.
  • I put results in a list

The problem is that it prints each result on a new line, one for every number it finds. It also ignores the list I've created.

What I want is to have all the numbers in one list. I used join(), but it doesn't work.

Code:

import re

def readfile():
    regex = re.compile(r'\d+')
    for num in regex.findall(open('/path/to/file').read()):
        lst = [num]
        jn = ''.join(lst)
        print(jn)

Output:

122
34
764
shimo asked Jan 01 '19 10:01


2 Answers

What goes wrong:

# this iterates the single numbers you find - one by one
for num in regex.findall(open('/path/to/file').read()):  
    lst = [num]                  # this puts one number back into a new list
    jn = ''.join(lst)            # this gets the number back out of the new list
    print(jn)                    # this prints one number

Fixing it:

The documentation for re.findall() shows that it already returns a list.

There is little need to use a for loop on it just to print it.

If you want a list, simply use the return value of re.findall(). If you want to print it, use one of the methods from "Printing an int list in a single line python3" (there are several more posts on SO about printing on one line):

import re

my_r = re.compile(r'\d+')                 # define pattern as raw-string

numbers = my_r.findall("123 456 789")     # get the list

print(numbers)

# different methods to print a list on one line
# adjust sep  / end to fit your needs
print( *numbers, sep=", ")                # print #1

for n in numbers[:-1]:                    # print #2
    print(n, end = ", ")
print(numbers[-1])

print(', '.join(numbers))                 # print #3

Output:

['123', '456', '789']   # list of found strings that are numbers
123, 456, 789
123, 456, 789
123, 456, 789
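
Applied to the file case from the question, the same idea can be sketched as a small function that returns the list instead of printing inside a loop. The name read_numbers and the sample file contents are assumptions made for illustration; the with block only makes sure the file gets closed.

```python
import os
import re
import tempfile

def read_numbers(path):
    """Return every run of digits found in the file as a list of strings."""
    with open(path) as fh:
        return re.findall(r'\d+', fh.read())

# Hypothetical sample file standing in for '/path/to/file'
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("a 122 b\n34 c\n764\n")
    sample_path = tmp.name

numbers = read_numbers(sample_path)
os.remove(sample_path)          # clean up the temporary sample file

print(numbers)                  # ['122', '34', '764']
print(', '.join(numbers))       # 122, 34, 764
```

The join happens once, on the complete list, which is what the loop in the question never got to do.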

Documentation:

  • print() function for sep= and end=
  • Printing an int list in a single line python3
  • Convert all strings in a list to int ... if you need the list as numbers
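
If the list is needed as numbers, as the last link suggests, a minimal sketch could look like this. The variable names are illustrative. Keep in mind that str.join() only accepts strings, so integers have to be converted back before joining, or printed with * unpacking.

```python
import re

found = re.findall(r'\d+', "123 456 789")    # list of strings
as_ints = [int(s) for s in found]            # list of ints

print(as_ints)                               # [123, 456, 789]
print(*as_ints, sep=", ")                    # 123, 456, 789
print(', '.join(str(n) for n in as_ints))    # 123, 456, 789
```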

More on printing in one line:

  • Print in one line dynamically
  • Python: multiple prints on the same line
  • How to print without newline or space?
  • Print new output on same line
Patrick Artner answered Nov 09 '22 21:11


In your case, regex.findall() returns a list, but you are joining and printing inside each iteration.

That is why you're seeing this problem.

You can try something like this.

numbers.txt

Xy10Ab
Tiger20
Beta30Man
56
My45one

statements:

>>> import re
>>>
>>> regex = re.compile(r'\d+')
>>> lst = []
>>>
>>> for num in regex.findall(open('numbers.txt').read()):
...     lst.append(num)
...
>>> lst
['10', '20', '30', '56', '45']
>>>
>>> jn = ''.join(lst)
>>>
>>> jn
'1020305645'
>>>
>>> jn2 = '\n'.join(lst)
>>> jn2
'10\n20\n30\n56\n45'
>>>
>>> print(jn2)
10
20
30
56
45
>>>
>>> nums = [int(n) for n in lst]
>>> nums
[10, 20, 30, 56, 45]
>>>
>>> sum(nums)
161
>>>
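
As a side note, the append loop in the session builds the same list that findall() already returns. A small sketch to check this, with the contents of numbers.txt inlined as a string (an assumption, so that the snippet runs without the file):

```python
import re

# Contents of numbers.txt, inlined so the sketch is self-contained
text = "Xy10Ab\nTiger20\nBeta30Man\n56\nMy45one\n"

regex = re.compile(r'\d+')

looped = []
for num in regex.findall(text):   # the loop from the session above
    looped.append(num)

direct = regex.findall(text)      # findall's return value, used as-is

print(looped == direct)           # True
print(' '.join(direct))           # 10 20 30 56 45
```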
hygull answered Nov 09 '22 20:11