Try with specifying the start and end rules in your regex:

re.compile(r'^test-\d+$')
Answer from hsz on Stack Overflow
🌐
Python documentation
docs.python.org β€Ί 3 β€Ί library β€Ί re.html
re β€” Regular expression operations
4 days ago - A regular expression (or RE) specifies a set of strings that matches it; the functions in this module let you check if a particular string matches a given regular expression (or if a given regular expression matches a particular string, which ...
🌐
W3Schools
w3schools.com β€Ί python β€Ί python_regex.asp
Python RegEx
Python has a built-in package called re, which can be used to work with Regular Expressions. ... You can add flags to the pattern when using regular expressions. A special sequence is a \ followed by one of the characters in the list below, and has a special meaning: A set is a set of characters inside a pair of square brackets [] with a special meaning: The findall() function returns a list containing all matches.
Discussions

python - How can I make a regex match the entire string? - Stack Overflow
Suppose I have a string like test-123. I want to test whether it matches a pattern like test- , where means one or more digit symbols. I tried this code: import re More on stackoverflow.com
🌐 stackoverflow.com
regex - How can I find all matches to a regular expression in Python? - Stack Overflow
When I use the re.search() function to find matches in a block of text, the program exits once it finds the first match in the block of text. How do I do this repeatedly where the program doesn't s... More on stackoverflow.com
🌐 stackoverflow.com
regex - Matching 2 regular expressions in Python - Stack Overflow
Is it possible to match 2 regular expressions in Python? For instance, I have a use-case wherein I need to compare 2 expressions like this: re.match('google\.com\/maps', 'google\.com\/maps2', re. More on stackoverflow.com
🌐 stackoverflow.com
Structural Pattern Matching Should Permit Regex String Matches - Ideas - Discussions on Python.org
I want to be able to match cases in a match statement against different regular expression strings. Essentially, I would like to be able to do something like what is described in this StackOverflow issue in python: matc… More on discuss.python.org
🌐 discuss.python.org
0
January 11, 2023
🌐
Python documentation
docs.python.org β€Ί 3 β€Ί howto β€Ί regex.html
Regular Expression HOWTO β€” Python 3.14.3 documentation
>>> print(re.match(r'From\s+', 'Fromage amk')) None >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998') <re.Match object; span=(0, 5), match='From '> Under the hood, these functions simply create a pattern object for you and call the appropriate method on it. They also store the compiled object in a cache, so future calls using the same RE won’t need to parse the pattern again and again. Should you use these module-level functions, or should you get the pattern and call its methods yourself? If you’re accessing a regex within a loop, pre-compiling it will save a few function calls.
🌐
GeeksforGeeks
geeksforgeeks.org β€Ί python β€Ί re-match-in-python
re.match() in Python - GeeksforGeeks
July 23, 2025 - import re # Checking if the string starts with "Hello" s = "Hello, World!" match = re.match("Hello", s) if match: print("Pattern found!") else: print("Pattern not found.") ... Pattern found!
Find elsewhere
🌐
Google
developers.google.com β€Ί google for education β€Ί python β€Ί python regular expressions
Python Regular Expressions | Python Education | Google for Developers
In Python a regular expression ... and searches for that pattern within the string. If the search is successful, search() returns a match object or None otherwise....
Top answer
1 of 8
16

Outside of the syntax clarification on re.match, I think I am understanding that you are struggling with taking two or more unknown (user input) regex expressions and classifying which is a more 'specific' match against a string.

Recall for a moment that a Python regex really is a type of computer program. Most modern forms, including Python's regex, are based on Perl. Perl's regex's have recursion, backtracking, and other forms that defy trivial inspection. Indeed a rogue regex can be used as a form of denial of service attack.

To see of this on your own computer, try:

>>> re.match(r'^(a+)+$','a'*24+'!')

That takes about 1 second on my computer. Now increase the 24 in 'a'*24 to a bit larger number, say 28. That take a lot longer. Try 48... You will probably need to CTRL+C now. The time increase as the number of a's increase is, in fact, exponential.

You can read more about this issue in Russ Cox's wonderful paper on 'Regular Expression Matching Can Be Simple And Fast'. Russ Cox is the Goggle engineer that built Google Code Search in 2006. As Cox observes, consider matching the regex 'a?'*33 + 'a'*33 against the string of 'a'*99 with awk and Perl (or Python or PCRE or Java or PHP or ...) Awk matches in 200 microseconds but Perl would require 1015 years because of exponential back tracking.

So the conclusion is: it depends! What do you mean by a more specific match? Look at some of Cox's regex simplification techniques in RE2. If your project is big enough to write your own libraries (or use RE2) and you are willing to restrict the regex grammar used (i.e., no backtracking or recursive forms), I think the answer is that you would classify 'a better match' in a variety of ways.

If you are looking for a simple way to state that (regex_3 < regex_1 < regex_2) when matched against some string using Python or Perl's regex language, I think that the answer is it is very very hard (i.e., this problem is NP Complete)

Edit

Everything I said above is true! However, here is a stab at sorting matching regular expressions based on one form of 'specific': How many edits to get from the regex to the string. The greater number of edits (or the higher the Levenshtein distance) the less 'specific' the regex is.

You be the judge if this works (I don't know what 'specific' means to you for your application):

import re

def ld(a,b):
    "Calculates the Levenshtein distance between a and b."
    n, m = len(a), len(b)
    if n > m:
        # Make sure n <= m, to use O(min(n,m)) space
        a,b = b,a
        n,m = m,n

    current = range(n+1)
    for i in range(1,m+1):
        previous, current = current, [i]+[0]*n
        for j in range(1,n+1):
            add, delete = previous[j]+1, current[j-1]+1
            change = previous[j-1]
            if a[j-1] != b[i-1]:
                change = change + 1
            current[j] = min(add, delete, change)      
    return current[n]

s='Mary had a little lamb'    
d={}
regs=[r'.*', r'Mary', r'lamb', r'little lamb', r'.*little lamb',r'\b\w+mb',
        r'Mary.*little lamb',r'.*[lL]ittle [Ll]amb',r'\blittle\b',s,r'little']

for reg in regs:
    m=re.search(reg,s)
    if m:
        print "'%s' matches '%s' with sub group '%s'" % (reg, s, m.group(0))
        ld1=ld(reg,m.group(0))
        ld2=ld(m.group(0),s)
        score=max(ld1,ld2)
        print "  %i edits regex->match(0), %i edits match(0)->s" % (ld1,ld2)
        print "  score: ", score
        d[reg]=score
        print
    else:
        print "'%s' does not match '%s'" % (reg, s)   

print "   ===== %s =====    === %s ===" % ('RegEx'.center(10),'Score'.center(10))

for key, value in sorted(d.iteritems(), key=lambda (k,v): (v,k)):
    print "   %22s        %5s" % (key, value) 

The program is taking a list of regex's and matching against the string Mary had a little lamb.

Here is the sorted ranking from "most specific" to "least specific":

   =====   RegEx    =====    ===   Score    ===
   Mary had a little lamb            0
        Mary.*little lamb            7
            .*little lamb           11
              little lamb           11
      .*[lL]ittle [Ll]amb           15
               \blittle\b           16
                   little           16
                     Mary           18
                  \b\w+mb           18
                     lamb           18
                       .*           22

This based on the (perhaps simplistic) assumption that: a) the number of edits (the Levenshtein distance) to get from the regex itself to the matching substring is the result of wildcard expansions or replacements; b) the edits to get from the matching substring to the initial string. (just take one)

As two simple examples:

  1. .* (or .*.* or .*?.* etc) against any sting is a large number of edits to get to the string, in fact equal to the string length. This is the max possible edits, the highest score, and the least 'specific' regex.
  2. The regex of the string itself against the string is as specific as possible. No edits to change one to the other resulting in a 0 or lowest score.

As stated, this is simplistic. Anchors should increase specificity but they do not in this case. Very short stings don't work because the wild-card may be longer than the string.

Edit 2

I got anchor parsing to work pretty darn well using the undocumented sre_parse module in Python. Type >>> help(sre_parse) if you want to read more...

This is the goto worker module underlying the re module. It has been in every Python distribution since 2001 including all the P3k versions. It may go away, but I don't think it is likely...

Here is the revised listing:

import re
import sre_parse

def ld(a,b):
    "Calculates the Levenshtein distance between a and b."
    n, m = len(a), len(b)
    if n > m:
        # Make sure n <= m, to use O(min(n,m)) space
        a,b = b,a
        n,m = m,n

    current = range(n+1)
    for i in range(1,m+1):
        previous, current = current, [i]+[0]*n
        for j in range(1,n+1):
            add, delete = previous[j]+1, current[j-1]+1
            change = previous[j-1]
            if a[j-1] != b[i-1]:
                change = change + 1
            current[j] = min(add, delete, change)      
    return current[n]

s='Mary had a little lamb'    
d={}
regs=[r'.*', r'Mary', r'lamb', r'little lamb', r'.*little lamb',r'\b\w+mb',
        r'Mary.*little lamb',r'.*[lL]ittle [Ll]amb',r'\blittle\b',s,r'little',
        r'^.*lamb',r'.*.*.*b',r'.*?.*',r'.*\b[lL]ittle\b \b[Ll]amb',
        r'.*\blittle\b \blamb$','^'+s+'$']

for reg in regs:
    m=re.search(reg,s)
    if m:
        ld1=ld(reg,m.group(0))
        ld2=ld(m.group(0),s)
        score=max(ld1,ld2)
        for t, v in sre_parse.parse(reg):
            if t=='at':      # anchor...
                if v=='at_beginning' or 'at_end':
                    score-=1   # ^ or $, adj 1 edit

                if v=='at_boundary': # all other anchors are 2 char
                    score-=2

        d[reg]=score
    else:
        print "'%s' does not match '%s'" % (reg, s)   

print
print "   ===== %s =====    === %s ===" % ('RegEx'.center(15),'Score'.center(10))

for key, value in sorted(d.iteritems(), key=lambda (k,v): (v,k)):
    print "   %27s        %5s" % (key, value) 

And soted RegEx's:

   =====      RegEx      =====    ===   Score    ===
        Mary had a little lamb            0
      ^Mary had a little lamb$            0
          .*\blittle\b \blamb$            6
             Mary.*little lamb            7
     .*\b[lL]ittle\b \b[Ll]amb           10
                    \blittle\b           10
                 .*little lamb           11
                   little lamb           11
           .*[lL]ittle [Ll]amb           15
                       \b\w+mb           15
                        little           16
                       ^.*lamb           17
                          Mary           18
                          lamb           18
                       .*.*.*b           21
                            .*           22
                         .*?.*           22
2 of 8
2

It depends on what kind of regular expressions you have; as @carrot-top suggests, if you actually aren't dealing with "regular expressions" in the CS sense, and instead have crazy extensions, then you are definitely out of luck.

However, if you do have traditional regular expressions, you might make a bit more progress. First, we could define what "more specific" means. Say R is a regular expression, and L(R) is the language generated by R. Then we might say R1 is more specific than R2 if L(R1) is a (strict) subset of L(R2) (L(R1) < L(R2)). That only gets us so far: in many cases, L(R1) is neither a subset nor a superset of L(R2), and so we might imagine that the two are somehow incomparable. An example, trying to match "mary had a little lamb", we might find two matching expressions: .*mary and lamb.*.

One non-ambiguous solution is to define specificity via implementation. For instance, convert your regular expression in a deterministic (implementation-defined) way to a DFA and simply count states. Unfortunately, this might be relatively opaque to a user.

Indeed, you seem to have an intuitive notion of how you want two regular expressions to compare, specificity-wise. Why not simple write down a definition of specificity, based on the syntax of regular expressions, that matches your intuition reasonably well?

Totally arbitrary rules follow:

  1. Characters = 1.
  2. Character ranges of n characters = n (and let's say \b = 5, because I'm not sure how you might choose to write it out long-hand).
  3. Anchors are 5 each.
  4. * divides its argument by 2.
  5. + divides its argument by 2, then adds 1.
  6. . = -10.

Anyway, just food for thought, as the other answers do a good job of outlining some of the issues you're facing; hope it helps.

🌐
Python.org
discuss.python.org β€Ί ideas
Structural Pattern Matching Should Permit Regex String Matches - Ideas - Discussions on Python.org
January 11, 2023 - I want to be able to match cases in a match statement against different regular expression strings. Essentially, I would like to be able to do something like what is described in this StackOverflow issue in python: match regex_in(some_string): case r'\d+': print('Digits') case r'\s+': print('Whitespaces') case _: print('Something else') The above syntax is just an example taken from the SO post - I don’t have strong feelings about what exactly it should look...
Top answer
1 of 7
29

Update

I condensed this answer into a python package to make matching as easy as pip install regex-spm,

import regex_spm

match regex_spm.fullmatch_in("abracadabra"):
  case r"\d+": print("It's all digits")
  case r"\D+": print("There are no digits in the search string")
  case _: print("It's something else")

Original answer

As Patrick Artner correctly points out in the other answer, there is currently no official way to do this. Hopefully the feature will be introduced in a future Python version and this question can be retired. Until then:

PEP 634 specifies that Structural Pattern Matching uses the == operator for evaluating a match. We can override that.

import re
from dataclasses import dataclass

# noinspection PyPep8Naming
@dataclass
class regex_in:
    string: str

    def __eq__(self, other: str | re.Pattern):
        if isinstance(other, str):
            other = re.compile(other)
        assert isinstance(other, re.Pattern)
        # TODO extend for search and match variants
        return other.fullmatch(self.string) is not None

Now you can do something like:

match regex_in(validated_string):
    case r'\d+':
        print('Digits')
    case r'\s+':
        print('Whitespaces')
    case _:
        print('Something else')

Caveat #1 is that you can't pass re.compile'd patterns to the case directly, because then Python wants to match based on class. You have to save the pattern somewhere first.

Caveat #2 is that you can't actually use local variables either, because Python then interprets it as a name for capturing the match subject. You need to use a dotted name, e.g. putting the pattern into a class or enum:

class MyPatterns:
    DIGITS = re.compile('\d+')

match regex_in(validated_string):
    case MyPatterns.DIGITS:
        print('This works, it\'s all digits')

Groups

This could be extended even further to provide an easy way to access the re.Match object and the groups.

# noinspection PyPep8Naming
@dataclass
class regex_in:
    string: str
    match: re.Match = None

    def __eq__(self, other: str | re.Pattern):
        if isinstance(other, str):
            other = re.compile(other)
        assert isinstance(other, re.Pattern)
        # TODO extend for search and match variants
        self.match = other.fullmatch(self.string)
        return self.match is not None

    def __getitem__(self, group):
        return self.match[group]

# Note the `as m` in in the case specification
match regex_in(validated_string):
    case r'\d(\d)' as m:
        print(f'The second digit is {m[1]}')
        print(f'The whole match is {m.match}')
2 of 7
23

Clean solution

There is a clean solution to this problem. Just hoist the regexes out of the case-clauses where they aren't supported and into the match-clause which supports any Python object.

The combined regex will also give you better efficiency than could be had by having a series of separate regex tests. Also, the regex can be precompiled for maximum efficiency during the match process.

Example

Here's a worked out example for a simple tokenizer:

pattern = re.compile(r'(\d+\.\d+)|(\d+)|(\w+)|(".*)"')
Token = namedtuple('Token', ('kind', 'value', 'position'))
env = {'x': 'hello', 'y': 10}

for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    mo = pattern.fullmatch(s)
    match mo.lastindex:
        case 1:
            tok = Token('NUM', float(s), mo.span())
        case 2:
            tok = Token('NUM', int(s), mo.span())
        case 3:
            tok = Token('VAR', env[s], mo.span())
        case 4:
            tok = Token('TEXT', s[1:-1], mo.span())
        case _:
            raise ValueError(f'Unknown pattern for {s!r}')
    print(tok) 

This outputs:

Token(kind='NUM', value=123, position=(0, 3))
Token(kind='NUM', value=123.45, position=(0, 6))
Token(kind='VAR', value='hello', position=(0, 1))
Token(kind='VAR', value=10, position=(0, 1))
Token(kind='TEXT', value='goodbye', position=(0, 9))

Better Example

The code can be improved by writing the combined regex in verbose format for intelligibility and ease of adding more cases. It can be further improved by naming the regex sub patterns:

pattern = re.compile(r"""(?x)
    (?P<float>\d+\.\d+) |
    (?P<int>\d+) |
    (?P<variable>\w+) |
    (?P<string>".*")
""")

That can be used in a match/case statement like this:

for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    mo = pattern.fullmatch(s)
    match mo.lastgroup:
        case 'float':
            tok = Token('NUM', float(s), mo.span())
        case 'int':
            tok = Token('NUM', int(s), mo.span())
        case 'variable':
            tok = Token('VAR', env[s], mo.span())
        case 'string':
            tok = Token('TEXT', s[1:-1], mo.span())
        case _:
            raise ValueError(f'Unknown pattern for {s!r}')
    print(tok)

Comparison to if/elif/else

Here is the equivalent code written using an if-elif-else chain:

for s in ['123', '123.45', 'x', 'y', '"goodbye"']:
    if (mo := re.fullmatch('\d+\.\d+', s)):
        tok = Token('NUM', float(s), mo.span())
    elif (mo := re.fullmatch('\d+', s)):
        tok = Token('NUM', int(s), mo.span())
    elif (mo := re.fullmatch('\w+', s)):
        tok = Token('VAR', env[s], mo.span())
    elif (mo := re.fullmatch('".*"', s)):
        tok = Token('TEXT', s[1:-1], mo.span())
    else:
        raise ValueError(f'Unknown pattern for {s!r}')
    print(tok)

Compared to the match/case, the if-elif-else chain is slower because it runs multiple regex matches and because there is no precompilation. Also, it is less maintainable without the case names.

Because all the regexes are separate we have to capture all the match objects separately with repeated use of assignment expressions with the walrus operator. This is awkward compared to the match/case example where we only make a single assignment.

🌐
Mimo
mimo.org β€Ί glossary β€Ί python β€Ί regex-regular-expressions
Python Regex: Master Regular Expressions in Python
The most common functions are re.search() to find the first match of a pattern anywhere in a string, and re.findall() to find all non-overlapping matches. ... Become a Python developer. Master Python from basics to advanced topics, including data structures, functions, classes, and error handling ...
🌐
Python.org
discuss.python.org β€Ί python help
Regular Expressions (RE) Module - Search and Match Comparison - Python Help - Discussions on Python.org
October 26, 2023 - Hello, I have a question regarding the regular expression compile. I created a code snippet to compare the different search and match results using different strings and using different patterns. Here is the test code snippet: import re s1 = 'bob has a birthday on Feb 25th' s2 = 'sara has a birthday on March 3rd' s3 = '12eup 586iu' s4 = '0turt' # '\w\w\w \d\d\w\w' bday1_re = re.compile('\w+ \d+\w+') # Also tried: '\w+ \d+\w+' bday2_re = re.comp...
🌐
Regex101
regex101.com
regex101: build, test, and debug regex
Regular expression tester with syntax highlighting, explanation, cheat sheet for PHP/PCRE, Python, GO, JavaScript, Java, C#/.NET, Rust.
🌐
Rexegg
rexegg.com β€Ί regex-python.php
Python Regex Tutorial
Python Regex Tutorial. Discusses the Python re and regex classes, provides working code for matching, replacing and splitting.
🌐
Automate the Boring Stuff
automatetheboringstuff.com β€Ί 3e β€Ί chapter9.html
Chapter 9 - Text Pattern Matching with Regular Expressions, Automate the Boring Stuff with Python, 3rd Ed
For example, the characters \d in a regex stand for a decimal numeral between 0 and 9. Python uses the regex string r'\d\d\d-\d\d\d-\d\d\d\d' to match the same text pattern the previous is_phone_number() function did: a string of three numbers, a hyphen, three more numbers, another hyphen, ...
Top answer
1 of 5
78

You could create a little class that returns the boolean result of calling match, and retains the matched groups for subsequent retrieval:

import re

class REMatcher(object):
    def __init__(self, matchstring):
        self.matchstring = matchstring

    def match(self,regexp):
        self.rematch = re.match(regexp, self.matchstring)
        return bool(self.rematch)

    def group(self,i):
        return self.rematch.group(i)


for statement in ("I love Mary", 
                  "Ich liebe Margot", 
                  "Je t'aime Marie", 
                  "Te amo Maria"):

    m = REMatcher(statement)

    if m.match(r"I love (\w+)"): 
        print "He loves",m.group(1) 

    elif m.match(r"Ich liebe (\w+)"):
        print "Er liebt",m.group(1) 

    elif m.match(r"Je t'aime (\w+)"):
        print "Il aime",m.group(1) 

    else: 
        print "???"

Update for Python 3 print as a function, and Python 3.8 assignment expressions - no need for a REMatcher class now:

import re

for statement in ("I love Mary",
                  "Ich liebe Margot",
                  "Je t'aime Marie",
                  "Te amo Maria"):

    if m := re.match(r"I love (\w+)", statement):
        print("He loves", m.group(1))

    elif m := re.match(r"Ich liebe (\w+)", statement):
        print("Er liebt", m.group(1))

    elif m := re.match(r"Je t'aime (\w+)", statement):
        print("Il aime", m.group(1))

    else:
        print()
2 of 5
33

Less efficient, but simpler-looking:

m0 = re.match("I love (\w+)", statement)
m1 = re.match("Ich liebe (\w+)", statement)
m2 = re.match("Je t'aime (\w+)", statement)
if m0:
  print("He loves", m0.group(1))
elif m1:
  print("Er liebt", m1.group(1))
elif m2:
  print("Il aime", m2.group(1))

The problem with the Perl stuff is the implicit updating of some hidden variable. That's simply hard to achieve in Python because you need to have an assignment statement to actually update any variables.

The version with less repetition (and better efficiency) is this:

pats = [
    ("I love (\w+)", "He Loves {0}" ),
    ("Ich liebe (\w+)", "Er Liebe {0}" ),
    ("Je t'aime (\w+)", "Il aime {0}")
 ]
for p1, p3 in pats:
    m = re.match(p1, statement)
    if m:
        print(p3.format(m.group(1)))
        break

A minor variation that some Perl folk prefer:

pats = {
    "I love (\w+)" : "He Loves {0}",
    "Ich liebe (\w+)" : "Er Liebe {0}",
    "Je t'aime (\w+)" : "Il aime {0}",
}
for p1 in pats:
    m = re.match(p1, statement)
    if m:
        print(pats[p1].format(m.group(1)))
        break

This is hardly worth mentioning except it does come up sometimes from Perl programmers.

🌐
Rexegg
rexegg.com β€Ί regex-quickstart.php
Regex Cheat Sheet
The next column, "Legend", explains what the element means (or encodes) in the regex syntax. The next two columns work hand in hand: the "Example" column gives a valid regular expression that uses the element, and the "Sample Match" column presents a text string that could be matched by the regular expression.