I'm confused by how the length of a string is calculated when expandtabs is used. I thought expandtabs replaces tabs with the appropriate number of spaces (with the default number of spaces per tab being 8). However, when I ran the commands using strings of varying lengths and varying numbers of tabs, the length calculation was different from what I expected (i.e., the string length didn't always increase by 8 for each instance of "\t").
Below is a detailed script output with comments explaining what I thought the result of the preceding command should be. Would someone please explain how the length is calculated when expandtabs is used?
IDLE 2.6.5
>>> s = '\t'
>>> print len(s)
1
>>> #the length of the string without expandtabs was one (1 tab counted as a single space), as expected.
>>> print len(s.expandtabs())
8
>>> #the length of the string with expandtabs was eight (1 tab counted as eight spaces).
>>> s = '\t\t'
>>> print len(s)
2
>>> #the length of the string without expandtabs was 2 (2 tabs, each counted as a single space).
>>> print len(s.expandtabs())
16
>>> #the length of the string with expandtabs was 16 (2 tabs counted as 8 spaces each).
>>> s = 'abc\tabc'
>>> print len(s)
7
>>> #the length of the string without expandtabs was seven (6 characters and 1 tab counted as a single space).
>>> print len(s.expandtabs())
11
>>> #the length of the string with expandtabs was NOT 14 (6 characters and one 8-space tab).
>>> s = 'abc\tabc\tabc'
>>> print len(s)
11
>>> #the length of the string without expandtabs was 11 (9 characters and 2 tabs counted as a single space).
>>> print len(s.expandtabs())
19
>>> #the length of the string with expandtabs was NOT 25 (9 characters and two 8-space tabs).
>>>
The expandtabs() method replaces each '\t' with spaces up to the next tab stop. For example, in a string that begins with 'xyz\t', the '\t' is at position 3 and the first tab stop is at column 8, so 5 spaces are inserted after 'xyz'. The following tab stops are at successive multiples of tabsize.
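The tab-stop rule described above can be sketched as a small function. This is only an illustrative re-implementation under the assumption that the column counter resets at line breaks; the name expand_tabs is hypothetical and is not part of the standard library. It is written to run on both Python 2 and Python 3.

```python
def expand_tabs(text, tabsize=8):
    """Illustrative stand-in for str.expandtabs (hypothetical helper)."""
    pieces = []
    column = 0  # current column within the current line
    for ch in text:
        if ch == '\t':
            if tabsize > 0:
                # Pad up to the next multiple of tabsize (1 to tabsize spaces).
                pad = tabsize - (column % tabsize)
                pieces.append(' ' * pad)
                column += pad
            # With tabsize <= 0 the tab is simply dropped.
        elif ch in '\r\n':
            # A line break restarts the column count.
            pieces.append(ch)
            column = 0
        else:
            pieces.append(ch)
            column += 1
    return ''.join(pieces)


print(len(expand_tabs('abc\tabc')))  # 3 chars + 5 spaces + 3 chars = 11
```

The padding is computed from the current column, not from a fixed width, which is why a tab after 'abc' adds 5 spaces rather than 8.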
The expandtabs() method accepts an optional tabsize argument that sets the spacing between tab stops; the default is 8.
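As a quick illustration of the tabsize argument, the same string can be expanded with different tab-stop spacings. The variable names here are only for the example.

```python
s = 'abc\tabc'

default_len = len(s.expandtabs())    # tab stops every 8 columns
narrow_len = len(s.expandtabs(4))    # tab stops every 4 columns
wide_len = len(s.expandtabs(16))     # tab stops every 16 columns

print(default_len)  # 11: 'abc' ends at column 3, tab pads to 8, then 3 more
print(narrow_len)   # 7: tab pads from 3 to 4, then 3 more
print(wide_len)     # 19: tab pads from 3 to 16, then 3 more
```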
Finding the indices of the spaces tells you how many there are, so you can calculate how many tabs there are (a tab is N spaces, as defined by the user). You don't need to find the indices to count them, though; num_spaces = elem.count(' ') works fine.
As Alex Martelli points out in a comment, in Python 2, tabs are equivalent to 8 spaces, and adapting the example with a tab and 8 spaces shows that this is indeed the case.
As when you enter tabs in a text editor, each tab character advances the length to the next multiple of 8 rather than adding a fixed 8 spaces.
So:
'\t' by itself is 8, obviously.
'\t\t' is 16.
'abc\tabc' starts at 3 characters, then a tab pushes it up to 8, and then the last 'abc' pushes it from 8 to 11.
'abc\tabc\tabc' likewise starts at 3, a tab bumps it to 8, another 'abc' goes to 11, then another tab pushes it to 16, and the final 'abc' brings the length to 19.
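The walkthrough for these four strings can be checked by tracking only the running length, jumping to the next multiple of 8 at each tab. This is a minimal sketch that assumes single-line strings; TABSIZE and lengths are names chosen for the example.

```python
TABSIZE = 8
lengths = {}

for s in ['\t', '\t\t', 'abc\tabc', 'abc\tabc\tabc']:
    length = 0
    for ch in s:
        if ch == '\t':
            # Jump to the next multiple of TABSIZE.
            length = (length // TABSIZE + 1) * TABSIZE
        else:
            length += 1
    # Compare against the built-in method.
    assert length == len(s.expandtabs(TABSIZE))
    lengths[s] = length
    print('%r -> %d' % (s, length))
```

The running totals come out as 8, 16, 11 and 19, matching the IDLE session in the question.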