 

Custom IDE-compatible static-types in Python


For the sake of nicer design and OOP, I would like to create a custom IDE-compatible static type. For instance, consider the following idealized class:

class IntOrIntString(Union[int, str]):

    @staticmethod
    def is_int_string(item):
        try:
            int(item)
            return True
        except:
            return False

    def __instancecheck__(self, instance):
        # I know __instancecheck__ must be declared in the metaclass. It's written here for the sake of the argument.
        return isinstance(instance, int) or (isinstance(instance, str) and self.is_int_string(instance))

    @staticmethod
    def as_integer(item):
        return int(item)

Now, this is a silly class, I know, but it serves as a simple example. Defining such a class has the following advantages:

  1. It allows for static type-checking in the IDE (e.g. def parse(s: IntOrIntString): ...).
  2. It allows dynamic type-checking (e.g. isinstance(item, IntOrIntString)).
  3. It can be used to better encapsulate type-related static functions (e.g. integer = IntOrIntString.as_integer(item)).

However, this code won't run because Union[int, str] cannot be subclassed. I get:

TypeError: Cannot subclass typing.Union

So, I tried to work around this by creating the "type" as an instance of Union (which it actually is). Meaning:

IntOrIntString = Union[int, str]
IntOrIntString.as_integer = lambda item: int(item)
...

but that didn't work either; I get the error message:

AttributeError: '_Union' object has no attribute 'as_integer'

Any thoughts on how that could be accomplished, or, perhaps, justifications for why it shouldn't be possible to accomplish?

I use python 3.6, but that's not set in stone because I could change the version if needed. The IDE I use is PyCharm.

Thanks

Edit: Two more possible examples for where this is useful:

  1. The type AnyNumber, which can accept any kind of number I wish. It might start with float and int, but could be extended to support any number-like type I want, such as int-strings or single-item iterables. Such an extension takes effect system-wide immediately, which is a huge bonus. As an example, consider the function
def func(n: AnyNumber):
    n = AnyNumber.get_as_float(n)
    # The rest of the function is implemented just for float.
    ...
  2. Working with pandas, you can usually perform similar operations on Series, DataFrame and Index. Suppose there is a "type-class" like the one above, called SeriesContainer, that simplifies usage: it lets me handle all of these data types uniformly by invoking SeriesContainer.as_series_collection(...) or SeriesContainer.as_data_frame(...), depending on the use case.
EZLearner asked Apr 17 '20 13:04



2 Answers

If I were you, I would avoid creating such classes, since they create unnecessary type ambiguity. Instead, taking your example, here is how I would go about differentiating between a regular string and an int string. First, make a (non-static) intString class:

from typing import Union
class intString(object):
    def __init__(self, item: str):
        try:
            int(item)
        except ValueError:
            print("error message")
            exit(1)
        self.val = item

    def __int__(self):
        return int(self.val)

(It might be better to inherit from str, but I'm not sure how to do it correctly and it's not material to the issue).
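
For what it's worth, here is a minimal sketch of what the str-subclass variant could look like. The name IntString is hypothetical, and I'm assuming you would rather get a ValueError than have the program exit. Because str is immutable, the validation goes in __new__ rather than __init__:

```python
class IntString(str):
    """A str that is guaranteed to parse as an int (hypothetical name)."""

    def __new__(cls, item: str) -> "IntString":
        # str is immutable, so validation has to happen in __new__, not __init__.
        int(item)  # raises ValueError if the text is not an integer literal
        return super().__new__(cls, item)


s = IntString("42")
print(isinstance(s, str))  # True: it is still a string
print(int(s) + 1)          # 43
```

The trade-off is that such an object passes anywhere a plain str is accepted, which may or may not be what you want.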

Let's say we have the following three variables:

regular_string = "3"
int_string = intString(regular_string)
int_literal = 3

Now we can use the built-in Python tools to achieve our three objectives:

  1. Static type checking:
def foo(f: Union[int, intString]):
    pass

foo(regular_string)      # Warning
foo(3)                   # No warnings
foo(int_string)          # No warnings

You will notice that here we have stricter type checking than what you were proposing: even though the first string can be converted into an intString, the IDE will recognize before runtime that it isn't one and warn you.

  2. Dynamic type checking:
print(isinstance(regular_string, (intString, int)))  # <<False
print(isinstance(int_string, (intString, int)))      # <<True
print(isinstance(int_literal, (intString, int)))     # <<True

Notice that isinstance returns True if the object's own class, or any of its parent classes, matches any of the items in the tuple.

  3. Honestly, I'm not sure I understood how this relates to encapsulation. But since we defined the __int__ method in the intString class, int() works polymorphically on both ints and intStrings, as desired:
for i in [intString("4"), 5, intString("77"), "5"]:
    print(int(i))

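will print 4, 5, 77 and 5, one per line. Note that int() also accepts the plain string "5", so the conversion itself does not distinguish an intString from a regular str.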

I'm sorry if I got too hung up on this specific example, but I just found it hard to imagine a situation where merging different types like this would be useful, since I believe that the three advantages you brought up can be achieved in a more pythonic manner.

I suggest you take a look at https://docs.python.org/3/library/typing.html#newtype for more basic functionality relating to defining new types.
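
To give a rough idea of what NewType buys you, here is a small sketch. The names IntString and to_int_string are made up for illustration. A type checker treats IntString as distinct from str, while at runtime the value stays an ordinary str:

```python
from typing import NewType

# Hypothetical distinct type for "a string that holds an integer".
IntString = NewType("IntString", str)


def to_int_string(text: str) -> IntString:
    """Validate once, then hand out the stricter static type."""
    int(text)  # raises ValueError for anything that is not an integer literal
    return IntString(text)


value = to_int_string("12")
print(type(value) is str)  # True: NewType adds no runtime wrapper
print(int(value) + 1)      # 13
```

Since NewType only exists for the static checker, isinstance(value, IntString) is not available; the validation has to live in a helper like the one above.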

mattan answered Sep 21 '22 04:09



A couple thoughts. First, Union[int, str] includes all strings, even strings like "9.3" and "cat", which don't look like an int.

If you're okay with this, you could do something like the following:

from typing import Union

intStr = Union[int, str]

isinstance(5, intStr.__args__) # True
isinstance(5.3, intStr.__args__) # False
isinstance("5.3", intStr.__args__) # True
isinstance("howdy", intStr.__args__) # True

Note that when using a Union type, or a type with an origin of Union, you have to use .__args__ for isinstance() to work, because isinstance() raises a TypeError when given a Union directly (at least on the Python 3.6 to 3.9 interpreters this question targets). It can't differentiate Unions from generic types.
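
If you'd rather not touch the dunder attribute, a possible alternative is typing.get_args, assuming you can move to Python 3.8 or later. The helper name matches_union below is made up for this sketch:

```python
from typing import Union, get_args

IntOrStr = Union[int, str]


def matches_union(value, union_hint) -> bool:
    """Check value against the member types of a Union of plain classes."""
    # get_args(Union[int, str]) returns the tuple (int, str),
    # which isinstance() accepts as its second argument.
    return isinstance(value, get_args(union_hint))


print(matches_union(5, IntOrStr))      # True
print(matches_union(5.3, IntOrStr))    # False
print(matches_union("5.3", IntOrStr))  # True
```

This only works when every member of the Union is an ordinary class; members such as List[int] would still make isinstance() raise.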

I'm assuming, though, that intStr shouldn't include all strings, but only a subset of strings. In this case, why not separate the type-checking methods from the type hinting?

def intStr_check(x):
    "checks if x is an instance of intStr"
    if isinstance(x, int):
        return True
    elif isinstance(x, str):
        try:
            int(x)  # only the success or failure of the conversion matters
            return True
        except ValueError:
            return False
    else:
        return False

Then simply use that function in place of isinstance() when checking if the type is an intStr.

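Note that the is_int_string helper in your original class has a flaw when called on its own: int(3.14) does not raise an error, so a float would pass the check. Your __instancecheck__ happens to avoid this only because it tests isinstance(instance, str) first.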
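
To make the int(3.14) point concrete, here is a small sketch. The two function names are mine: the first mirrors the helper from the question, the second adds the str test up front:

```python
def is_int_string_loose(item):
    # Mirrors the helper from the question: anything int() accepts passes.
    try:
        int(item)
        return True
    except (TypeError, ValueError):
        return False


def is_int_string_strict(item):
    # Only real strings are considered, so floats can no longer slip through.
    return isinstance(item, str) and is_int_string_loose(item)


print(is_int_string_loose(3.14))   # True, because int(3.14) == 3
print(is_int_string_strict(3.14))  # False
print(is_int_string_strict("42"))  # True
```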

Now that we've gotten isinstance() out of the way, if for parsing purposes you need to differentiate intStr objects from Union[int,str] objects, you could use the NewType from typing like so:

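from typing import NewType, Union

# Works at runtime; note that static checkers such as mypy may reject a NewType over a Union.
IntStr = NewType("IntStr", Union[int,str])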

def some_func(a: IntStr):
    if intStr_check(a):
        return int(a) + 1
    else:
        raise ValueError("Argument must be an intStr (an int or string of an int)")


some_num = IntStr("9")

print(some_func(some_num)) # 10

There's no need to create an as_integer() function or method, as it's exactly the same as int(), which is more concise and readable.


My opinion on style: nothing should be done simply for the sake of OOP. Sure, sometimes you need to store state and update parameters, but in cases where that's unnecessary, I believe OOP tends to lead to more verbose code, and potentially more headaches maintaining mutable state and avoiding unintended side effects. Hence, I prefer to declare new classes only when necessary.


EDIT: Since you insist on reusing the function name isinstance, you can overwrite isinstance to add additional functionality like so:

from typing import NewType, Union, get_origin  # get_origin requires Python 3.8+

isinstance_original = isinstance  # keep a reference to the built-in

def intStr_check(x):
    "checks if x is an instance of intStr"
    if isinstance_original(x, int):
        return True
    elif isinstance_original(x, str):
        try:
            int(x)  # only the success or failure of the conversion matters
            return True
        except ValueError:
            return False
    else:
        return False

def isinstance(x, t):
    if isinstance_original(t, str) and t == 'IntStr': # run intStr_check
        return intStr_check(x)
    elif get_origin(t) is Union: # check Union types member by member
        return any(isinstance_original(x, arg) for arg in t.__args__)
    else: # regular isinstance
        return isinstance_original(x, t)

# Some tests
assert isinstance("4", 'IntStr') == True
assert isinstance("4.2", 'IntStr') == False
assert isinstance("4h", 'IntStr') == False
assert isinstance(4, 'IntStr') == True
assert isinstance(4.2, int) == False
assert isinstance(4, int) == True
assert isinstance("4", int) == False
assert isinstance("4", str) == True
assert isinstance(4, Union[str,int]) == True
assert isinstance(4, Union[str,float]) == False

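Just be careful not to run isinstance_original = isinstance again after the new isinstance has been defined (for example, by re-running a notebook cell). isinstance_original would then point at the wrapper itself, and every call would recurse until a RecursionError is raised.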
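
One way to sidestep that pitfall, as a sketch: take the reference from the builtins module instead of from the current namespace. The wrapper below is a trimmed-down, hypothetical version that only handles the 'IntStr' case:

```python
import builtins

# Always bind to the true built-in, no matter how often this line is re-run.
isinstance_original = builtins.isinstance


def isinstance(x, t):
    # Hypothetical trimmed-down wrapper: only the 'IntStr' special case is shown.
    if isinstance_original(t, str) and t == 'IntStr':
        if isinstance_original(x, int):
            return True
        if isinstance_original(x, str):
            try:
                int(x)
                return True
            except ValueError:
                return False
        return False
    return isinstance_original(x, t)


# Simulate re-running the binding line after the wrapper has been defined.
isinstance_original = builtins.isinstance
print(isinstance("4", 'IntStr'))  # True
print(isinstance("4", int))       # False
```

Because the module-level isinstance only shadows the name locally, builtins.isinstance keeps pointing at the original function.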

You could still use IntStr = NewType("IntStr", Union[int,str]) for static type checking, but since you're in love with OOP, you could also do something like the following:

class IntStr:
    "an integer or a string of an integer"
    def __init__(self, value):
        self.value = value
        if not (isinstance(self.value, 'IntStr')):
            raise ValueError(f"could not convert {type(self.value)} to IntStr (an int or string of int): {self.value}")

    def check(self):
        return isinstance(self.value, 'IntStr')

    def as_integer(self):
        return int(self.value)

    def __call__(self):
        return self.value

# Some tests
try:
    a = IntStr("4.2")
except ValueError:
    print("it works")

a = IntStr("4")

print(f"a == {a()}")

assert a.as_integer() + 1 == 5
assert isinstance(a, IntStr) == True
assert isinstance(a(), str) == True
assert a.check() == True

a.value = 4.2

assert a.check() == False
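
If shadowing isinstance feels too invasive, the metaclass route that the question hints at is another option. This is only a sketch, and the metaclass name IntStrMeta is made up. It covers the dynamic check and the static helpers, but as far as I can tell a static checker would still not treat a plain int or str argument as an IntOrIntString:

```python
class IntStrMeta(type):
    """Metaclass that customizes isinstance() for the classes it creates."""

    def __instancecheck__(cls, instance):
        if isinstance(instance, int):
            return True
        if isinstance(instance, str):
            try:
                int(instance)
                return True
            except ValueError:
                return False
        return False


class IntOrIntString(metaclass=IntStrMeta):
    @staticmethod
    def as_integer(item):
        return int(item)


print(isinstance("4", IntOrIntString))    # True
print(isinstance("4.2", IntOrIntString))  # False
print(isinstance(4.2, IntOrIntString))    # False
print(IntOrIntString.as_integer("4"))     # 4
```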
bug_spray answered Sep 19 '22 04:09
