Check string indentation?

I'm building an analyzer for a series of strings. I need to check how much each line is indented (either by tabs or by spaces).

Each line is just a string in a text editor. How do I check by how much a string is indented?

Or rather, maybe I could count how many spaces or \t characters come before the text on each line, but I'm unsure how.

Dhruv Govil asked Nov 05 '12 22:11


3 Answers

To count the number of whitespace characters at the beginning of a string, you can compare the length of the original string with the length of the left-stripped string (leading whitespace removed):

a = "    indented string"
leading_spaces = len(a) - len(a.lstrip())
print(leading_spaces) 
# >>> 4

The width of a tab is context-specific: it depends on the settings of whatever program is displaying the tab characters. This approach only tells you the total number of leading whitespace characters (each tab counts as one character).

Or to demonstrate:

a = "\t\tindented string"
leading_spaces = len(a) - len(a.lstrip())
print(leading_spaces)
# >>> 2
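Since the question mentions indentation by either tabs or spaces, it may help to report the two counts separately. The following is a minimal sketch, not part of the original answer; the function name leading_whitespace_counts is hypothetical, and it assumes only tabs and spaces count as indentation.

```python
def leading_whitespace_counts(line):
    """Return (tabs, spaces) found in the leading whitespace of line."""
    # Whatever lstrip(" \t") removes is the indentation prefix.
    prefix = line[:len(line) - len(line.lstrip(" \t"))]
    return prefix.count("\t"), prefix.count(" ")

print(leading_whitespace_counts("\t\tindented string"))   # (2, 0)
print(leading_whitespace_counts("  \t indented string"))  # (1, 3)
```

Keeping the counts apart leaves the decision about how wide a tab is to the caller.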

EDIT:

If you want to do this for a whole file, you might want to try:

with open("myfile.txt") as afile:
    line_lengths = [len(line) - len(line.lstrip()) for line in afile]
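One thing to watch for with the file version: lstrip() with no arguments also removes the trailing newline on blank or whitespace-only lines, so those lines report a nonzero count. Here is a hedged sketch that strips the line ending first and skips blank lines; the function name indentation_by_line is hypothetical, and io.StringIO stands in for a real file.

```python
import io

def indentation_by_line(lines):
    """Map 1-based line numbers to leading-whitespace counts, skipping blank lines."""
    result = {}
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")   # drop the line ending so it is never counted
        if not text.strip():          # blank or whitespace-only line
            continue
        result[number] = len(text) - len(text.lstrip())
    return result

# io.StringIO stands in for an open file here.
sample = io.StringIO("top\n    four spaces\n\n\tone tab\n")
print(indentation_by_line(sample))   # {1: 0, 2: 4, 4: 1}
```

Whether blank lines should be skipped or reported as 0 depends on what the analyzer needs; skipping is just one choice.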

Gizmo answered Sep 19 '22 17:09


I think Gizmo's basic idea is good, and it's relatively easy to extend it to handle any mixture of leading tabs and spaces by using a string object's expandtabs() method:

def indentation(s, tabsize=4):
    sx = s.expandtabs(tabsize)
    return 0 if sx.isspace() else len(sx) - len(sx.lstrip())

print(indentation("  tindented string"))
print(indentation("\t\tindented string"))
print(indentation("  \t  \tindented string"))

The last two calls print the same value (8 with the default tabsize of 4).

Edit: I modified it to check and return 0 if a line of all tabs and spaces is encountered.
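The reason mixed tabs and spaces can give the same result as tabs alone is that expandtabs() advances each tab to the next tab stop rather than adding a fixed number of spaces. A small illustration, using a hypothetical helper named expanded_indent and a tab size of 4 as an assumption:

```python
def expanded_indent(raw, tabsize=4):
    """Width of the leading whitespace after tabs are expanded to tab stops."""
    expanded = raw.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip())

# Three spaces plus a tab only reaches column 4; four spaces plus a tab reaches column 8.
for raw in ("\t\tx", "  \t  \tx", "   \tx", "    \tx"):
    print(repr(raw), expanded_indent(raw))
```

This prints 8, 8, 4 and 8 for the four inputs, so spaces that do not fill a whole tab stop are absorbed by the following tab.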

martineau answered Sep 22 '22 17:09


len() counts each tab (\t) as a single character, which in some cases is not what you want. My approach is to use re.sub to keep only the leading whitespace and then count the spaces in it (tabs are ignored):

import re
indent_count = re.sub(r'^(\s*).*$', r'\g<1>', line).count(' ')
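An alternative sketch of the same idea uses re.match instead of re.sub. The pattern [ \t]* always matches at the start of the string (possibly matching nothing), so unindented lines give 0. The function name count_leading_spaces is hypothetical, and ignoring tabs in the count is an assumption about the intent of this answer.

```python
import re

def count_leading_spaces(line):
    """Count space characters in the leading run of spaces and tabs."""
    # The match can be empty but is never None, because * allows zero characters.
    prefix = re.match(r"[ \t]*", line).group()
    return prefix.count(" ")

print(count_leading_spaces("    indented string"))  # 4
print(count_leading_spaces("\t  indented string"))  # 2
print(count_leading_spaces("not indented"))         # 0
```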

Alex Fang answered Sep 20 '22 17:09