You need the first captured group:
a.group(1)
b.group(1)
...
Without a group number passed as an argument, group() returns the full match, which is what you're getting now.
Here's an example:
In [7]: import re
In [8]: string_one = 'file_record_transcript.pdf'
In [9]: re.search(r'^(file.*)\.pdf$', string_one).group()
Out[9]: 'file_record_transcript.pdf'
In [10]: re.search(r'^(file.*)\.pdf$', string_one).group(1)
Out[10]: 'file_record_transcript'
Answer from heemayl on Stack Overflow
You can also index the match object directly with match[index] (Python 3.6+):
a[0] => Full match (file_record_transcript.pdf)
a[1] => First group (file_record_transcript)
a[2] => Second group (if any)
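To put the indexing forms side by side, here is a minimal sketch that reuses the filename from the example above. The variable names `m`, `full`, `first`, and `all_groups` are only illustrative, and the None check is there because re.search returns None when nothing matches.

```python
import re

string_one = 'file_record_transcript.pdf'
m = re.search(r'^(file.*)\.pdf$', string_one)

# re.search returns None on no match, so guard before indexing.
if m is not None:
    full = m[0]               # same as m.group() or m.group(0)
    first = m[1]              # same as m.group(1)
    all_groups = m.groups()   # tuple containing every captured group
    print(full, first, all_groups)
```

Subscripting is just shorthand for group(), so either form can be used where the Python version allows it.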
When the pattern contains groups, findall returns just the captured groups:
>>> re.findall('abc(de)fg(123)', 'abcdefg123 and again abcdefg123')
[('de', '123'), ('de', '123')]
Relevant doc excerpt:
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
Use groups freely. The matches will be returned as a list of group-tuples:
>>> re.findall('(1(23))45', '12345')
[('123', '23')]
If you want the full match to be included, just enclose the entire regex in a group:
>>> re.findall('(1(23)45)', '12345')
[('12345', '23')]
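If wrapping the whole pattern in an extra group feels clumsy, two alternatives are sketched below: non-capturing groups, so that findall returns whole matches, or finditer, which yields match objects that carry both the full match and the groups. The names `text`, `whole`, and `pairs` are only illustrative.

```python
import re

text = 'abcdefg123 and again abcdefg123'

# With only non-capturing groups (?:...), findall returns whole matches as strings.
whole = re.findall(r'abc(?:de)fg(?:123)', text)

# finditer yields match objects, so the full match and the groups are both available.
pairs = [(m.group(0), m.groups()) for m in re.finditer(r'abc(de)fg(123)', text)]

print(whole)
print(pairs)
```

The finditer form is probably the more flexible choice when the groups are still needed alongside the full match.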
Hi
I have used Perl extensively in the past, but now I am moving to Python and I am probably doing regex stuff wrong...
I have a list of lines that I want to match against a regex and capture some variables if the regex matches (not all lines match).
I am doing this:
start_rx = re.compile(r"silence_start: ([\d\.]+)")
for line in lines:
    if start_rx.search(line):
        start_time = start_rx.search(line).group(1)
        print(start_time)
And that works, but what I don't like is that I call search() twice - once to see if the line matches at all (otherwise I get an exception on the lines that don't match) and once to retrieve the capture group.
Surely there is a better way to do this?
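One common way to avoid the second call is to store the result of search() and test it, since it is None when the line does not match. The sketch below assumes some made-up sample lines (the post does not show its input) and collects the captured values in a list instead of printing them in the loop.

```python
import re

# Hypothetical sample input; the original post does not show its lines.
lines = [
    "silence_start: 12.5",
    "some unrelated line",
    "silence_start: 40.25",
]

start_rx = re.compile(r"silence_start: ([\d.]+)")

start_times = []
for line in lines:
    m = start_rx.search(line)   # search once and keep the result
    if m:                       # None (falsy) when the line does not match
        start_times.append(m.group(1))

print(start_times)

# On Python 3.8+, an assignment expression does the test and the binding in one step.
start_times_walrus = [m.group(1) for line in lines if (m := start_rx.search(line))]
```

Both versions call search() exactly once per line; the second is only more compact and needs Python 3.8 or later.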
hey all, curious what would need to be done to have a regex with a capture group show all of the matches.
import re
var = 'Agent Alice and Agent Bob'
ourRegex = re.compile(r'Agent (\w)\w*')
print(ourRegex.findall(var))
This will output "['A', 'B']" and not "Agent Alice" and "Agent Bob"
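This happens because findall returns only the group contents once the pattern has a capture group. Two possible ways around it are sketched below: drop the group so findall returns whole matches, or keep the group and read the whole match from each match object via finditer. The names `full_matches`, `our_regex`, and `both` are only illustrative.

```python
import re

var = 'Agent Alice and Agent Bob'

# No capture group: findall returns each whole match.
full_matches = re.findall(r'Agent \w+', var)

# Keep the group, but take the whole match (group 0) from each match object.
our_regex = re.compile(r'Agent (\w)\w*')
both = [(m.group(0), m.group(1)) for m in our_regex.finditer(var)]

print(full_matches)
print(both)
```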