
What datatype is considered 'list-like' in Python?

The Pandas documentation for Series.isin(values) states:

values : set or list-like

What is considered list-like? For a Python dictionary temp_dict, would temp_dict.keys() and temp_dict.values() be considered list-like?

Yandle asked Apr 18 '26 15:04


2 Answers

"List-like" isn't a standard Python term. Googling pandas list-like turns up pandas.api.types.is_list_like, but the documentation for that just says

Check if the object is list-like.

Objects that are considered list-like are for example Python lists, tuples, sets, NumPy arrays, and Pandas Series.

Strings and datetime objects, however, are not considered list-like.

which isn't really much of a spec. So, as a last resort, we turn to the source code, and after following a lot of imports and aliasing, we eventually find this function:

cdef bint c_is_list_like(object obj, bint allow_sets) except -1:
    # first, performance short-cuts for the most common cases
    if util.is_array(obj):
        # exclude zero-dimensional numpy arrays, effectively scalars
        return not cnp.PyArray_IsZeroDim(obj)
    elif isinstance(obj, list):
        return True
    # then the generic implementation
    return (
        # equiv: `isinstance(obj, abc.Iterable)`
        getattr(obj, "__iter__", None) is not None and not isinstance(obj, type)
        # we do not count strings/unicode/bytes as list-like
        # exclude Generic types that have __iter__
        and not isinstance(obj, (str, bytes, _GenericAlias))
        # exclude zero-dimensional duck-arrays, effectively scalars
        and not (hasattr(obj, "ndim") and obj.ndim == 0)
        # exclude sets if allow_sets is False
        and not (allow_sets is False and isinstance(obj, abc.Set))
    )

So Pandas considers an object list-like if it passes this complicated series of checks.

  • If the object is a NumPy array, it's list-like unless it's 0-dimensional.
  • Otherwise, if it's a list, it's list-like.
  • Otherwise, it needs to pass all the following checks to be list-like:
    • It needs to have an __iter__ attribute that's not None.
    • It needs to not be a type.
    • It needs to not be a string, a bytestring, or a "generic alias" (a type used for some typing module things).
    • It needs to not have an ndim attribute equal to 0.
    • If allow_sets is False, it needs to not be an instance of collections.abc.Set, which covers sets, frozensets, and certain other set-like objects such as dictionary key views. (abc is collections.abc here.)
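
The checks above can be sketched in pure Python, which makes them easier to experiment with than the Cython source. This is only an approximation under stated assumptions: the name is_list_like_sketch is made up for illustration, the NumPy shortcut is folded into the generic ndim check, and the generic-alias class is looked up as type(List[int]) rather than imported the way the pandas source does it.

```python
from collections import abc
from enum import Enum
from typing import List

# Class of typing generic aliases such as List[int]; looked up indirectly
# (assumption: this matches the _GenericAlias the pandas source refers to).
_GenericAlias = type(List[int])


def is_list_like_sketch(obj, allow_sets=True):
    """Hypothetical pure-Python approximation of the checks listed above."""
    if isinstance(obj, list):
        return True
    return (
        # must have a usable __iter__ and must not be a class
        getattr(obj, "__iter__", None) is not None
        and not isinstance(obj, type)
        # strings, bytes and typing generic aliases are excluded
        and not isinstance(obj, (str, bytes, _GenericAlias))
        # zero-dimensional array-likes are treated as scalars
        and not (hasattr(obj, "ndim") and obj.ndim == 0)
        # set-like objects are excluded only when allow_sets is False
        and not (allow_sets is False and isinstance(obj, abc.Set))
    )


class Color(Enum):  # an iterable type object: list(Color) works
    RED = 1


temp_dict = {"a": 1, "b": 2}
results = {
    "list": is_list_like_sketch([1, 2]),
    "str": is_list_like_sketch("abc"),
    "dict keys": is_list_like_sketch(temp_dict.keys()),
    "dict values": is_list_like_sketch(temp_dict.values()),
    "dict keys, allow_sets=False": is_list_like_sketch(
        temp_dict.keys(), allow_sets=False
    ),
    "Enum class": is_list_like_sketch(Color),
}
for name, value in results.items():
    print(f"{name}: {value}")
```

In this sketch the list and both dictionary views come out as list-like, the string and the Enum class do not, and the key view stops counting once allow_sets is False, because dictionary key views are registered as collections.abc.Set.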

That means Pandas considers most iterable objects to be list-like. Strings, bytestrings, generic aliases, and iterable type objects (like Enum classes) are excluded. Excluding iterable type objects is probably a bug: the check is meant to exclude non-iterable type objects whose instances are iterable, such as the class list itself, which has an __iter__ attribute even though the class can't be iterated.

The 0-dimensional array and ndim == 0 checks exclude objects whose type is iterable in its positive-dimensional form but whose 0-dimensional instances aren't, such as 0-dimensional NumPy arrays.

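Sets and other collections.abc.Set subclasses are excluded only when allow_sets is False, and Series.isin doesn't pass the flag to exclude them. So for a dictionary temp_dict, both temp_dict.keys() and temp_dict.values() count as list-like there.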
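
One way to check this for the dictionary views from the question is to call pandas.api.types.is_list_like directly. The snippet below is a small sketch, assuming a reasonably recent pandas in which is_list_like accepts the allow_sets keyword; the contents of temp_dict are made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_list_like

# Illustrative dictionary; the question only names it temp_dict.
temp_dict = {1: "a", 2: "b"}

keys_ok = is_list_like(temp_dict.keys())      # key view: iterable, set-like
values_ok = is_list_like(temp_dict.values())  # value view: iterable, not set-like
string_ok = is_list_like("abc")               # strings are explicitly excluded

# With allow_sets=False, set-like objects (including key views) are rejected.
keys_no_sets = is_list_like(temp_dict.keys(), allow_sets=False)

# Series.isin accepts the key view because it passes the list-like check.
mask = pd.Series([1, 2, 3]).isin(temp_dict.keys())

print(keys_ok, values_ok, string_ok, keys_no_sets)
print(mask.tolist())
```

If a particular pandas version rejects a view object, wrapping it in list(...) before passing it to isin is a simple fallback.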

user2357112 supports Monica answered Apr 20 '26 05:04


If you check the Pandas documentation, you find a function that determines whether something is list-like. After enough searching, you eventually end up at a .pyx file that defines a Cython version of the function:

cdef bint c_is_list_like(object obj, bint allow_sets) except -1:
    # first, performance short-cuts for the most common cases
    if util.is_array(obj):
        # exclude zero-dimensional numpy arrays, effectively scalars
        return not cnp.PyArray_IsZeroDim(obj)
    elif isinstance(obj, list):
        return True
    # then the generic implementation
    return (
        # equiv: `isinstance(obj, abc.Iterable)`
        getattr(obj, "__iter__", None) is not None and not isinstance(obj, type)
        # we do not count strings/unicode/bytes as list-like
        # exclude Generic types that have __iter__
        and not isinstance(obj, (str, bytes, _GenericAlias))
        # exclude zero-dimensional duck-arrays, effectively scalars
        and not (hasattr(obj, "ndim") and obj.ndim == 0)
        # exclude sets if allow_sets is False
        and not (allow_sets is False and isinstance(obj, abc.Set))
    )

It's a number of conditions. NumPy arrays are handled first, with zero-dimensional ones excluded, and then all lists are allowed. Anything else must be iterable, must not be a type, a string, bytes, or a generic alias, must not have ndim == 0, and must not be a set when allow_sets is False.

ifly6 answered Apr 20 '26 04:04


