
Suppressing treatment of string as iterable

UPDATE:

An idea to make built-in strings non-iterable was proposed on python.org in 2006. My question differs in that I only want to suppress this feature once in a while; still, the whole thread is quite relevant.

Here are the critical comments by Guido, who implemented a non-iterable str on a trial basis:

[...] I implemented this (it was really simple to do) but then found I had to fix tons of places that iterate over strings. For example:

  • The sre parser and compiler use things like set("0123456789") and also iterate over the characters of the input regexp to parse it.

  • difflib has an API defined for either two lists of strings (a typical line-by-line diff of a file), or two strings (a typical intra-line diff), or even two lists of anything (for a generalized sequence diff).

  • small changes in optparse.py, textwrap.py, string.py.

And I'm not even at the point where the regrtest.py framework even works (due to the difflib problem).

I'm abandoning this project; the patch is SF patch 1471291. I'm no longer in favor of this idea; it's just not practical, and the premise that there are few good reasons to iterate over a string has been refuted by the use cases I found in both sre and difflib.

ORIGINAL QUESTION:

While it's a neat feature of the language that a string is an iterable, when combined with duck typing it can lead to disaster:

# record has to support [] operation to set/retrieve values
# fields has to be an iterable that contains the fields to be set
def set_fields(record, fields, value):
  for f in fields:
    record[f] = value

set_fields(weapon1, ('Name', 'ShortName'), 'Dagger')
set_fields(weapon2, ('Name',), 'Katana')
set_fields(weapon3, 'Name', 'Wand') # I was tired and forgot to put parentheses

No exception will be raised: the last call iterates over the characters of 'Name' and sets the fields 'N', 'a', 'm' and 'e' instead. There's no easy way to catch this except by testing for isinstance(fields, str) in myriad places. In some circumstances, this bug will take a very long time to find.
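To make the failure concrete, here is a minimal sketch that uses a plain dict as the record (an assumption; the question only requires that record supports []). The name set_fields_checked is hypothetical and simply shows what the isinstance test mentioned above might look like.

```python
def set_fields(record, fields, value):
    for f in fields:
        record[f] = value

weapon3 = {}
set_fields(weapon3, 'Name', 'Wand')  # parentheses forgotten
print(weapon3)  # {'N': 'Wand', 'a': 'Wand', 'm': 'Wand', 'e': 'Wand'}


# Hypothetical guarded variant: reject a bare string up front.
def set_fields_checked(record, fields, value):
    if isinstance(fields, str):
        raise TypeError("fields must be an iterable of names, not a single string")
    for f in fields:
        record[f] = value

weapon1 = {}
set_fields_checked(weapon1, ('Name', 'ShortName'), 'Dagger')
print(weapon1)  # {'Name': 'Dagger', 'ShortName': 'Dagger'}
```

The guard works, but it has to be repeated in every function that accepts an iterable of names, which is exactly the burden described above.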

I want to stop strings from being treated as iterables anywhere in my project. Is it a good idea? Can it be done easily and safely?

Perhaps I could subclass built-in str such that I would need to explicitly call get_iter() if I wanted its object to be treated as an iterable. Then whenever I need a string literal, I would instead create an object of this class.
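A rough sketch of what such a subclass might look like follows. The class name NonIterableStr is hypothetical, and get_iter() is the method name suggested above; this is one possible shape, not a tested design.

```python
class NonIterableStr(str):
    """Hypothetical str subclass that refuses implicit iteration."""

    def __iter__(self):
        raise TypeError("NonIterableStr is not iterable; call get_iter() explicitly")

    def get_iter(self):
        # Delegate to the built-in str iterator when iteration is really wanted.
        return str.__iter__(self)


name = NonIterableStr("Name")
print(list(name.get_iter()))  # ['N', 'a', 'm', 'e']

try:
    for ch in name:
        pass
except TypeError as exc:
    print("refused:", exc)

# Methods inherited from str return plain str objects again.
print(type(name.upper()) is str)  # True
```

Note that the protection is easily lost: string literals and the results of str methods are ordinary str objects, so every string entering the program would have to be wrapped by hand.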

Here are some tangentially related questions:

How can I tell if a python variable is a string or a list?

how to tell a variable is iterable but not a string

asked Feb 06 '12 23:02 by max



1 Answer

There's no way to do this automatically, unfortunately. The solution you propose (a str subclass that isn't iterable) suffers from the same problem as isinstance(): you have to remember to use it everywhere you use a string, because there's no way to make Python use it in place of the native class. And of course you can't monkey-patch the built-in objects.
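As a quick illustration of the monkey-patching point, the following sketch tries to replace str.__iter__ at runtime. It assumes CPython, where attributes of built-in types cannot be assigned; the variable name patch_error is only for illustration.

```python
patch_error = None
try:
    # Attempt to disable iteration on the built-in str type itself.
    str.__iter__ = None
except TypeError as exc:
    patch_error = exc

print("patch rejected:", patch_error)
```

The assignment is rejected with a TypeError, so the built-in class keeps its iteration behavior no matter what the project does.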

I might suggest that if you find yourself writing a function that takes either an iterable container or a string, maybe there's something wrong with your design. Sometimes you can't avoid it, though.

In my mind, the least intrusive thing to do is to put the check into a function and call that when you get into a loop. This at least puts the behavior change where you are most likely to see it: in the for statement, not buried away somewhere in a class.

def iterate_no_strings(item):
    # Treat a lone string as a single item, not as a sequence of characters.
    if isinstance(item, str):   # isinstance(item, basestring) for Py 2.x
        return iter([item])
    else:
        return iter(item)

for thing in iterate_no_strings(things):
    pass  # do something...
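Applied to the set_fields function from the question, the helper might be used as in the sketch below. It assumes Python 3, plain dicts as records, and an isinstance-based check inside the helper.

```python
def iterate_no_strings(item):
    # A lone string is wrapped so it is yielded once, as a whole.
    if isinstance(item, str):
        return iter([item])
    return iter(item)


def set_fields(record, fields, value):
    for f in iterate_no_strings(fields):
        record[f] = value


weapon1, weapon3 = {}, {}
set_fields(weapon1, ('Name', 'ShortName'), 'Dagger')
set_fields(weapon3, 'Name', 'Wand')  # forgotten parentheses are now harmless

print(weapon1)  # {'Name': 'Dagger', 'ShortName': 'Dagger'}
print(weapon3)  # {'Name': 'Wand'}
```

This variant silently accepts the bare string instead of raising, which is a design choice: it forgives the mistake rather than reporting it.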
answered Sep 17 '22 12:09 by kindall