
Remove spaces and tabs within a string in Python

How can I remove specific whitespace within a string in Python?

My input string is:

str1 = """vendor_id\t: GenuineIntel
        cpu family\t: 6
        model\t\t: 58
        model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
        stepping\t: 9
        cpu MHz\t\t: 2485.659
        cache size\t: 6144 KB
        fpu\t\t: yes
        fpu_exception\t: yes
        cpuid level\t: 5
        wp\t\t: yes"""

My required output is:

>>> print(str1)
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
stepping: 9
cpu MHz: 2485.659
cache size: 6144 KB
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
user27 asked Dec 15 '22 01:12


2 Answers

Looks like you want to remove the whitespace from the start of lines, and remove all whitespace before a colon. Use regular expressions:

import re

str1 = re.sub(r'(^[ \t]+|[ \t]+(?=:))', '', str1, flags=re.M)

This matches spaces and tabs at the start of a line (^[ \t]+, where ^ anchors to the start of a line, [ \t] is a space or tab, and + means one or more), or spaces and tabs right before a colon ([ \t]+ is one or more spaces or tabs, and (?=:) is a lookahead requiring that a : character follows without including it in the match). Whatever is matched is replaced with an empty string. The flags=re.M (re.MULTILINE) argument makes ^ match at the start of every line rather than only at the start of the whole string.

Demo:

>>> import re
>>> str1 = """vendor_id\t: GenuineIntel
...         cpu family\t: 6
...         model\t\t: 58
...         model name\t: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
...         stepping\t: 9
...         cpu MHz\t\t: 2485.659
...         cache size\t: 6144 KB
...         fpu\t\t: yes
...         fpu_exception\t: yes
...         cpuid level\t: 5
...         wp\t\t: yes"""
>>> print(re.sub(r'(^[ \t]+|[ \t]+(?=:))', '', str1, flags=re.M))
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
stepping: 9
cpu MHz: 2485.659
cache size: 6144 KB
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
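
If the one-line pattern is hard to read, the same expression can be written with re.VERBOSE so that each alternative carries its own comment. This is only a sketch: the names PADDING, strip_key_padding and sample are made up for illustration, and the sample is a shortened copy of the input above.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE.
# Whitespace inside a character class such as [ \t] is still significant
# in verbose mode, so the space in the class is kept.
PADDING = re.compile(
    r"""
    ^[ \t]+          # spaces/tabs at the start of a line (relies on re.M)
    |                # or
    [ \t]+(?=:)      # spaces/tabs immediately followed by a colon
    """,
    flags=re.M | re.VERBOSE,
)


def strip_key_padding(text):
    """Hypothetical helper: drop leading indentation and padding before colons."""
    return PADDING.sub("", text)


sample = "vendor_id\t: GenuineIntel\n        model\t\t: 58\n        cpu MHz\t\t: 2485.659"
cleaned = strip_key_padding(sample)
print(cleaned)
```

Note that the pattern removes whitespace before every colon on a line, not only the first one, so a value that itself contains " :" would be changed as well.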

If your input string does not have leading whitespace (and you just indented your sample yourself to make it look lined up), then all you want to remove is tabs:

str1 = str1.replace('\t', '')

and be done with it.
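
If the leading spaces are real and you would rather not use a regular expression, a line-by-line approach with str.partition is another option. The sketch below is a different technique from the re.sub call above, not a rewrite of it; clean_line, clean and sample are hypothetical names, and it assumes the first colon on each line separates the key from the value.

```python
def clean_line(line):
    # Split at the first colon only; sep is '' when the line has no colon.
    key, sep, value = line.partition(":")
    # Strip spaces and tabs on both sides of the key, keep the value untouched.
    return key.strip(" \t") + sep + value


def clean(text):
    return "\n".join(clean_line(line) for line in text.split("\n"))


sample = "vendor_id\t: GenuineIntel\n        cpu family\t: 6\n        wp\t\t: yes"
print(clean(sample))
```

Unlike the regular expression, this only touches whitespace around the part before the first colon, and on a line without a colon it strips spaces and tabs from both ends of the line.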

Martijn Pieters answered Jan 07 '23 13:01


I don't know what you mean by "randomly", but you can remove all tabs with:

str1 = str1.replace("\t", "")
jonrsharpe answered Jan 07 '23 13:01