Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to create a regex-constrained type hint?

I have a helper function that converts a %Y-%m-%d %H:%M:%S-formatted string to a datetime.datetime:

def ymdt_to_datetime(ymdt: str) -> datetime.datetime:
    return datetime.datetime.strptime(ymdt, '%Y-%m-%d %H:%M:%S')

I can validate the ymdt format in the function itself, but it'd be more useful to have a custom object to use as a type hint for the argument, something like

from typing import NewType, Pattern

ymdt_pattern = '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]'
YmdString = NewType('YmdString', Pattern[ymdt_pattern])

def ymdt_to_datetime(ymdt: YmdString)...

Am I going down the wrong rabbit hole? Should this be an issue in mypy or someplace? Or can this be accomplished with the current type hint implementation (3.61)?

like image 807
Zev Averbach Avatar asked Jun 13 '17 20:06

Zev Averbach


1 Answers

There currently is no way for types to statically verify that your string matches a precise format, unfortunately. This is partially because checking at compile time the exact values a given variable can hold is exceedingly difficult to implement (and in fact, is NP-hard in some cases), and partially because the problem becomes impossible in the face of things like user input. As a result, it's unlikely that this feature will be added to either mypy or the Python typing ecosystem in the near future, if at all.

One potential workaround would be to leverage NewType, and carefully control when exactly you construct a string of that format. That is, you could do:

from typing import NewType
YmdString = NewType('YmdString', str)

def datetime_to_ymd(d: datetime) -> YmdString:
    # Do conversion here
    return YmdStr(s)

def verify_is_ymd(s: str) -> YmdString:
    # Runtime validation checks here
    return YmdString(s)

If you use only functions like these to introduce values of type YmdString and do testing to confirm that your 'constructor functions' are working perfectly, you can more or less safely distinguish between strings and YmdString at compile time. You'd then want to design your program to minimize how frequently you call these functions to avoid incurring unnecessary overhead, but hopefully, that won't be too onerous to do.

like image 174
Michael0x2a Avatar answered Oct 03 '22 01:10

Michael0x2a