Drop the * from your regex (so it matches exactly one instance of your pattern). Then use either re.findall(...) or re.finditer (see here) to return all matches.
It sounds like you're essentially building a recursive descent parser. For relatively simple parsing tasks, it is quite common and entirely reasonable to do that by hand. If you're interested in a library solution (in case your parsing task may become more complicated later on, for example), have a look at pyparsing.
Answer from phooji on Stack OverflowDrop the * from your regex (so it matches exactly one instance of your pattern). Then use either re.findall(...) or re.finditer (see here) to return all matches.
It sounds like you're essentially building a recursive descent parser. For relatively simple parsing tasks, it is quite common and entirely reasonable to do that by hand. If you're interested in a library solution (in case your parsing task may become more complicated later on, for example), have a look at pyparsing.
The regex module fixes this, by adding a .captures method:
>>> m = regex.match(r"(..)+", "a1b2c3")
>>> m.captures(1)
['a1', 'b2', 'c3']
Videos
You can't use groups for this, I'm afraid. Each group can match only once, I believe all regexes work this way. A possible solution is to try to use findall() or similar.
r=re.compile(r'\w')
r.findall(x)
# 'a', 'b', 'c', 'd', 'e', 'f'
The regex module can do this.
> m = regex.match('(\w){6}', "abcdef")
> m.captures(1)
['a', 'b', 'c', 'd', 'e', 'f']
Also works with named captures:
> m = regex.match('(?P<letter>)\w)', "abcdef")
> m.capturesdict()
{'letter': ['a', 'b', 'c', 'd', 'e', 'f']}
The regex module is expected to replace the 're' module - it is a drop-in replacement that acts identically, except it has many more features and capabilities.
Hello all,
I'm trying to understand how to access a "symbolic group name" within regex groups.
https://docs.python.org/3.10/library/re.html?highlight=re#regular-expression-syntax
The documentation states:
"Similar to regular parentheses, but the substring matched by the group is accessible via the symbolic group name name. Group names must be valid Python identifiers, and each group name must be defined only once within a regular expression. A symbolic group is also a numbered group, just as if the group were not named.
Named groups can be referenced in three contexts. If the pattern is (?P<quote>['"]).*?(?P=quote) (i.e. matching a string quoted with either single or double quotes):"
I would have expected to do something like this:
import re
text_string = 'This is a nice string: we should use it some time\r\nThis is NOT a nice string: we should NEVER use it\r\n'
found = re.findall(r'(?P<name>.*?): (?P<value>.*?)\r\n', text_string)
print(f'Data Content:\t{found[0]("value")}')
If I'm correct and you can access a symbolic group name how do you do it? The documentation is not very clear on that section.
Kind regards