re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.
As the re.match documentation says:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Note: If you want to locate a match anywhere in string, use search() instead.
re.search searches the entire string, as the documentation says:
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
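So if you need to match at the beginning of the string, use match. It is faster, because it gives up as soon as the pattern fails at the start instead of trying later positions. To match the entire string, use match with a pattern that ends in $ (or re.fullmatch in Python 3.4+). Otherwise use search.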
The documentation has a specific section for match vs. search that also covers multiline strings:
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The "match" operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.
Now, enough talk. Time to see some example code:
# example code:
import re

string_with_newlines = """something
someotherthing"""

print(re.match('some', string_with_newlines))  # matches
print(re.match('someother', string_with_newlines))  # won't match
print(re.match('^someother', string_with_newlines, re.MULTILINE))  # also won't match
print(re.search('someother', string_with_newlines))  # finds something
print(re.search('^someother', string_with_newlines, re.MULTILINE))  # also finds something

m = re.compile('thing$', re.MULTILINE)
print(m.match(string_with_newlines))  # no match
print(m.match(string_with_newlines, pos=4))  # matches
# The flags are already part of the compiled pattern. The second positional
# argument of search() is pos, so re.MULTILINE must not be passed here.
print(m.search(string_with_newlines))  # also matches
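The advice above about matching the entire string can be sketched as follows. This is only an illustration: the variable names and the second pattern are assumptions, and re.fullmatch requires Python 3.4 or later.

```python
import re

text = "something\nsomeotherthing"

# match() only anchors the start; text after the matched part is allowed.
prefix = re.match(r"some", text)

# fullmatch() requires the pattern to consume the whole string.
whole_fail = re.fullmatch(r"some", text)
whole_ok = re.fullmatch(r"some\w+\nsome\w+", text)

print(prefix.group())   # some
print(whole_fail)       # None
print(whole_ok.span())  # (0, 24)
```

With match alone, a pattern has to end in $ (or \Z) to get a similar whole-string check.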
Answer from nosklo on Stack Overflow: python - What is the difference between re.search and re.match?
search: find something anywhere in the string and return a match object.
match: find something at the beginning of the string and return a match object.
I've been looking at tutorials and I can't figure out why I can't get it to work. I'm trying to see if a substring matches a certain format, but the output is incorrect.
re.match("[A-Z],[A-Z] $", s)
Is this not how to set it up?
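One possible explanation, offered as a sketch rather than a diagnosis: the question does not show s, so the sample strings below are assumptions. As written, the pattern needs one uppercase letter, a comma, one uppercase letter, and a space, followed by the end of the string, and match only tries it at position 0.

```python
import re

# Pattern from the question, written with straight quotes.
pattern = r"[A-Z],[A-Z] $"

# Hypothetical inputs; the original question does not show the value of s.
exact = re.match(pattern, "A,B ")        # whole string fits, trailing space included
no_space = re.match(pattern, "A,B")      # the required trailing space is missing
embedded = re.match(pattern, "id A,B ")  # match() only tries position 0
found = re.search(pattern, "id A,B ")    # search() scans forward to the substring

print(exact is not None)     # True
print(no_space)              # None
print(embedded)              # None
print(found.span())          # (3, 7)
```

If the format can appear anywhere inside s, search is the operation to reach for; if the whole of s must have that format, the stray space before $ is worth a second look.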
Use re.findall or re.finditer instead.
re.findall(pattern, string) returns a list of matching strings.
re.finditer(pattern, string) returns an iterator over MatchObject objects.
Example:
re.findall( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')
# Output: ['cats', 'dogs']
[x.group() for x in re.finditer( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')]
# Output: ['all cats are', 'all dogs are']
Another method (a bit in keeping with OP's initial spirit, albeit 13 years later) is to compile the pattern, call search() on the compiled pattern, and move along the string one start position at a time. This is a bit verbose, but if you don't want a lookahead etc., or you want to search over a string more explicitly, then you can use the following function. It needs Python 3.8+ for the := operator.
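import re

def find_all_matches(pattern, string, group=0):
    pat = re.compile(pattern)
    pos = 0
    out = []
    # Restart the search one character after the previous match's start,
    # so overlapping matches are found as well.
    while m := pat.search(string, pos):
        pos = m.start() + 1
        out.append(m[group])
    return out
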
pat = r'all (.*?) are'
s = 'all cats are smarter than dogs, all dogs are dumber than cats'
find_all_matches(pat, s) # ['all cats are', 'all dogs are']
find_all_matches(pat, s, group=1) # ['cats', 'dogs']
This works for overlapping matches too:
find_all_matches(r'(\w\w)', "hello") # ['he', 'el', 'll', 'lo']
import re
pattern = re.compile("^([A-Z][0-9]+)+$")
pattern.match(string)
One-liner: re.match(r"pattern", string) # No need to compile
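As a minimal sketch of the two forms side by side: the compiled pattern is the one from the snippet above, while the sample strings are assumptions added for illustration.

```python
import re

# Same pattern as in the snippet above: one or more "letter + digits" groups.
pattern = re.compile(r"^([A-Z][0-9]+)+$")

samples = ["A1", "A1B22", "A1b2", "A"]  # hypothetical inputs

# Compiled form: reuse one pattern object for every string.
compiled = [bool(pattern.match(s)) for s in samples]

# One-liner form: pass the pattern text to re.match each time.
one_liner = [bool(re.match(r"^([A-Z][0-9]+)+$", s)) for s in samples]

print(compiled)               # [True, True, False, False]
print(compiled == one_liner)  # True
```

Both forms give the same results; compiling mainly helps readability when the same pattern is reused many times.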
>>> import re
>>> if re.match(r"hello[0-9]+", 'hello1'):
... print('Yes')
...
Yes
You can evaluate it as bool if needed
>>> bool(re.match(r"hello[0-9]+", 'hello1'))
True