Example converted to pattern matching
Here's the equivalent code using match and case:
match x:
    case int():
        pass
    case str():
        x = int(x)
    case float() | Decimal():
        x = round(x)
    case _:
        raise TypeError('Unsupported type')
Explanation
PEP 634, available since Python 3.10, specifies that isinstance() checks are performed with class patterns. To check for an instance of str, write case str(): .... Note that the parentheses are essential. That is how the grammar determines that this is a class pattern.
To check multiple classes at a time, PEP 634 provides an or-pattern using the | operator. For example, to check whether an object is an instance of float or Decimal, write case float() | Decimal(): .... As before, the parentheses are essential.
Using python match case
Without Exception handling
match x:
    case int():
        pass
    case str():
        x = int(x)
    case float() | Decimal():
        x = round(x)
    case _:
        raise TypeError('Unsupported type')
Some extras
There are still some flaws in this code. For example, int(x) raises ValueError, not TypeError, when the string is not a valid integer.
With exception handling
match x:
    case int():
        pass
    case str():
        try:
            x = int(x)
        except ValueError:
            raise TypeError('Unsupported type')
    case float() | Decimal():
        x = round(x)
    case _:
        raise TypeError('Unsupported type')
As a function
def func(x):
    match x:
        case int():
            pass
        case str():
            try:
                x = int(x)
            except ValueError:
                raise TypeError('Unsupported type')
        case float() | Decimal():
            x = round(x)
        case _:
            raise TypeError('Unsupported type')
    return x
It's a pretty nifty feature, and it makes code much easier to extend, for instance to selectively flatten some values in a dictionary based on the key. I've written about it extensively in Mastering Structural Pattern Matching.
This is a common "gotcha" of the new syntax: case clauses are not expressions. That is, if you put a variable name in a case clause, the syntax assigns to that name rather than reading that name.
It's a common misconception to think of match as like switch in other languages: it is not, not even really close. switch cases are expressions which test for equality against the switch expression; conversely, match cases are structured patterns which unpack the match expression. It's really much more akin to generalized iterable unpacking. It asks the question: "does the structure of the match expression look like the structure of the case clause?", a very different question from what a switch statement asks.
For example:
t = 12.0
match t:
    case newvar:  # this is equivalent to `newvar = t`
        print(f"bound a new variable called newvar: {newvar}")
        # prints "bound a new variable called newvar: 12.0"
        # this pattern matches anything at all, so none of the following
        # cases can ever run (CPython in fact rejects a catch-all case
        # that is not the last one with a SyntaxError)
    case 13.0:
        print("found 13.0")
    case [a, b, c]:  # matches a sequence with exactly 3 elements,
        # and *assigns* those elements to the variables `a`, `b` and `c`
        print("found a sequence of length exactly 3.")
        print(f"these are the values in the sequence: {a} {b} {c}")
    case [*_]:
        print("found some sort of sequence, but it's definitely")
        print("not of length 3, because that already matched earlier")
    case my_fancy_type():  # match statement magic: this is how to type check!
        print(f"variable t = {t} is of type {my_fancy_type}")
    case _:
        print("no match")
So what the code in your question actually does is roughly this:
t = 12.0
tt = type(t)  # float obviously
match tt:
    case int:  # assigns to int! `int = tt`, overwriting the builtin
        print(f"the value of int: {int}")
        # output: "the value of int: <class 'float'>"
        print(int == float)  # output: True (!!!!!!!!)
        # In order to get the original builtin type, you'd have to do
        # something like `from builtins import int as int2`
    case float:  # assigns to float, in this case the no-op `float = float`
        # in fact this clause is identical to the previous clause:
        # match anything and bind the match to its new name
        print(f"match anything and bind it to name 'float': {float}")
        # never prints, because we already matched the first case
    case float():  # since this isn't a variable name, no assignment happens.
        # under the hood, this equates to an `isinstance` check.
        # `float` is not an instance of itself, so this wouldn't match.
        print(f"tt: {tt} is an instance of float")  # never prints
        # of course, this case never executes anyways because the
        # first case matches anything, skipping all following cases
Frankly, I'm not entirely sure how the under-the-hood instance check works, but it definitely works like the other answer says: by definition of the match syntax, type checks are done like this:
match instance:
    case type():
        print(f"object {instance} is of type {type}!")
So we come back to where we started: case clauses are not expressions. As the PEP says, it's better to think of case clauses as kind of like function declarations, where we name the arguments to the function and possibly bind some default values to those newly-named arguments. But we never, ever read existing variables in case clauses, only make new variables. (There are some other subtleties involved as well, for instance a dotted access doesn't count as a "variable" for this purpose, but this is complicated already, so it's best to end this answer here.)
Lose the type() and also add parentheses to your types:
t = 12.0
match t:
    case int():
        print("int")
    case float():
        print("float")
I'm not sure why what you wrote is not working, but this one works.
Update
I condensed this answer into a Python package to make matching as easy as pip install regex-spm:
import regex_spm

match regex_spm.fullmatch_in("abracadabra"):
    case r"\d+": print("It's all digits")
    case r"\D+": print("There are no digits in the search string")
    case _: print("It's something else")
Original answer
As Patrick Artner correctly points out in the other answer, there is currently no official way to do this. Hopefully the feature will be introduced in a future Python version and this question can be retired. Until then:
PEP 634 specifies that Structural Pattern Matching compares literal and value patterns against the subject with the == operator (None, True and False are compared with is). We can override that.
import re
from dataclasses import dataclass

# noinspection PyPep8Naming
@dataclass
class regex_in:
    string: str

    def __eq__(self, other: str | re.Pattern):
        if isinstance(other, str):
            other = re.compile(other)
        assert isinstance(other, re.Pattern)
        # TODO extend for search and match variants
        return other.fullmatch(self.string) is not None
Now you can do something like:
match regex_in(validated_string):
    case r'\d+':
        print('Digits')
    case r'\s+':
        print('Whitespaces')
    case _:
        print('Something else')
Caveat #1 is that you can't pass re.compile'd patterns to the case directly, because then Python wants to match based on class. You have to save the pattern somewhere first.
Caveat #2 is that you can't actually use local variables either, because Python then interprets it as a name for capturing the match subject. You need to use a dotted name, e.g. putting the pattern into a class or enum:
class MyPatterns:
    DIGITS = re.compile(r'\d+')  # raw string, so \d is not treated as a string escape

match regex_in(validated_string):
    case MyPatterns.DIGITS:
        print("This works, it's all digits")
Groups
This could be extended even further to provide an easy way to access the re.Match object and the groups.
# noinspection PyPep8Naming
@dataclass
class regex_in:
    string: str
    match: re.Match | None = None  # filled in by the last comparison

    def __eq__(self, other: str | re.Pattern):
        if isinstance(other, str):
            other = re.compile(other)
        assert isinstance(other, re.Pattern)
        # TODO extend for search and match variants
        self.match = other.fullmatch(self.string)
        return self.match is not None

    def __getitem__(self, group):
        return self.match[group]

# Note the `as m` in the case specification
match regex_in(validated_string):
    case r'\d(\d)' as m:
        print(f'The second digit is {m[1]}')
        print(f'The whole match is {m.match}')
Clean solution
There is a clean solution to this problem. Just hoist the regexes out of the case-clauses where they aren't supported and into the match-clause which supports any Python object.
The combined regex will also give you better efficiency than could be had by having a series of separate regex tests. Also, the regex can be precompiled for maximum efficiency during the match process.
Example
Here's a worked out example for a simple tokenizer:
import re
from collections import namedtuple

# One capture group per token kind; the closing quote belongs inside group 4.
pattern = re.compile(r'(\d+\.\d+)|(\d+)|(\w+)|(".*")')
Token = namedtuple('Token', ('kind', 'value', 'position'))
env = {'x': 'hello', 'y': 10}

for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    mo = pattern.fullmatch(s)
    match mo.lastindex:  # index of the group that matched
        case 1:
            tok = Token('NUM', float(s), mo.span())
        case 2:
            tok = Token('NUM', int(s), mo.span())
        case 3:
            tok = Token('VAR', env[s], mo.span())
        case 4:
            tok = Token('TEXT', s[1:-1], mo.span())
        case _:
            raise ValueError(f'Unknown pattern for {s!r}')
    print(tok)
This outputs:
Token(kind='NUM', value=123, position=(0, 3))
Token(kind='NUM', value=123.45, position=(0, 6))
Token(kind='VAR', value='hello', position=(0, 1))
Token(kind='VAR', value=10, position=(0, 1))
Token(kind='TEXT', value='goodbye', position=(0, 9))
Better Example
The code can be improved by writing the combined regex in verbose format for intelligibility and ease of adding more cases. It can be further improved by naming the regex sub patterns:
pattern = re.compile(r"""(?x)
    (?P<float>\d+\.\d+) |
    (?P<int>\d+) |
    (?P<variable>\w+) |
    (?P<string>".*")
""")
That can be used in a match/case statement like this:
for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    mo = pattern.fullmatch(s)
    match mo.lastgroup:
        case 'float':
            tok = Token('NUM', float(s), mo.span())
        case 'int':
            tok = Token('NUM', int(s), mo.span())
        case 'variable':
            tok = Token('VAR', env[s], mo.span())
        case 'string':
            tok = Token('TEXT', s[1:-1], mo.span())
        case _:
            raise ValueError(f'Unknown pattern for {s!r}')
    print(tok)
Comparison to if/elif/else
Here is the equivalent code written using an if-elif-else chain:
for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    # Raw strings keep \d and \w from being read as string escapes.
    if (mo := re.fullmatch(r'\d+\.\d+', s)):
        tok = Token('NUM', float(s), mo.span())
    elif (mo := re.fullmatch(r'\d+', s)):
        tok = Token('NUM', int(s), mo.span())
    elif (mo := re.fullmatch(r'\w+', s)):
        tok = Token('VAR', env[s], mo.span())
    elif (mo := re.fullmatch(r'".*"', s)):
        tok = Token('TEXT', s[1:-1], mo.span())
    else:
        raise ValueError(f'Unknown pattern for {s!r}')
    print(tok)
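Compared to the match/case version, the if-elif-else chain is slower because it can run several regex matches per input and because the patterns are not precompiled (the re module caches compiled patterns internally, but every call still goes through a cache lookup). Also, it is less maintainable without the case names.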
Because all the regexes are separate we have to capture all the match objects separately with repeated use of assignment expressions with the walrus operator. This is awkward compared to the match/case example where we only make a single assignment.