Join the list on the pipe character |, which separates alternatives in a regex.

import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "I love to have fun."

print(re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x))

Output: ['fun']

You cannot use match, because it only matches at the start of the string, and search returns only the first match. So use findall instead.

Also use lookahead if you have overlapping matches not starting at the same point.
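
To see what the lookahead changes, here is a minimal sketch. The word list and text are made up for illustration, chosen so that the matches overlap; re.escape is added on the assumption that the words should match literally.

```python
import re

words = ["aba", "bab"]   # hypothetical word list with overlapping hits
text = "ababa"

alternation = "|".join(map(re.escape, words))

# Plain alternation consumes each match, so overlapping hits are skipped.
plain = re.findall(alternation, text)

# A capturing group inside a zero-width lookahead consumes nothing,
# so the scan resumes one character later and finds overlapping hits.
overlapping = re.findall(r"(?=(" + alternation + r"))", text)

print(plain)        # ['aba']
print(overlapping)  # ['aba', 'bab', 'aba']
```

If your words never overlap, the plain alternation is enough and is easier to read.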

Answer from vks on Stack Overflow
2 of 5
25

The regex module has named lists (sets, actually):

#!/usr/bin/env python
import regex as re # $ pip install regex

p = re.compile(r"\L<words>", words=['fun', 'dum', 'sun', 'gum'])
if p.search("I love to have fun."):
    print('matched')

Here words is just a name; you can use anything you like instead.
The .search() method is used instead of putting .* before/after the named list.

To emulate named lists using stdlib's re module:

#!/usr/bin/env python
import re

words = ['fun', 'dum', 'sun', 'gum']
longest_first = sorted(words, key=len, reverse=True)
p = re.compile(r'(?:{})'.format('|'.join(map(re.escape, longest_first))))
if p.search("I love to have fun."):
    print('matched')

re.escape() is used to escape regex meta-characters such as .*? inside individual words (to match the words literally).
sorted() emulates the regex module's behavior by putting the longest words first among the alternatives. Compare:

>>> import re
>>> re.findall("(funny|fun)", "it is funny")
['funny']
>>> re.findall("(fun|funny)", "it is funny")
['fun']
>>> import regex
>>> regex.findall(r"\L<words>", "it is funny", words=['fun', 'funny'])
['funny']
>>> regex.findall(r"\L<words>", "it is funny", words=['funny', 'fun'])
['funny']

Discussions

How can use RegEx to match a list of strings in C#? - Stack Overflow
I need to find all the regex matches from a list of strings. For example, I need to be able to take the string "This foo is a foobar" and match any instances of either "foo" or "bar". What would the ...
November 11, 2011

c# - How to filter a list of strings matching a pattern - Stack Overflow
I have a list of strings (file names actually) and I'd like to keep only those that match a filter expression like: *_Test.txt. What would be the best to achieve this? Here is the answer that I ...

regex - Match list of words without the list of chars around - Stack Overflow
I have this regex (?:$|^| )(one|common|word|or|another)(?:$|^| ) which matches fine unless the two words are next to each other. One one's more word'word common word or another word more another ...

c# - Is it possible to write a regex that does one search then uses its results to do another search? - Software Engineering Stack Exchange
Basically, if an item matches the pattern in the first list, I want to search for that letter prefix and number suffix in the second list. Is there a term for using regex to do this kind of search? I know that I can just write my own search by using string manipulation to get the letter and ...

O'Reilly
3.10. Retrieve a List of All Matches - Regular Expressions Cookbook, 2nd Edition [Book]
August 27, 2012 - Construct a Regex object if you want to use the same regular expression with a large number of strings:

Dim RegexObj As New Regex("\d+")
Dim MatchList = RegexObj.Matches(SubjectString)

Authors: Jan Goyvaerts, Steven Levithan
Published: 2012
Pages: 609

Top answer
1 of 3
49

You probably want to use a regular expression for this if your patterns are going to be complex.

You could either use a proper regular expression as your filter (e.g. for your specific example it would be new Regex(@"^.*_Test\.txt$")) or you could apply a conversion algorithm.

Either way, you could then just use LINQ to apply the regex.

For example:

var myRegex=new Regex(@"^.*_Test\.txt$");
List<string> resultList=files.Where(myRegex.IsMatch).ToList();

Some people may think the above answer is incorrect, but you can use a method group instead of a lambda. If you want the full lambda, you would use:

var myRegex=new Regex(@"^.*_Test\.txt$");
List<string> resultList=files.Where(f => myRegex.IsMatch(f)).ToList();

or, without LINQ:

List<string> resultList=files.FindAll(delegate(string s) { return myRegex.IsMatch(s);});

If you were converting the filter, a simple conversion would be:

 var myFilter="*_Test.txt";
 var myRegex=new Regex("^" + myFilter.Replace("*",".*") +"$");

You could then also have filters like "*Test*.txt" with this method.

However, if you went down this conversion route, you would need to make sure you escaped all the special regular expression characters, e.g. "." becomes @"\.", "(" becomes @"\(", etc.

Edit: the example replace is too simple because it doesn't escape the ".", so it would also match "fish_Testxtxt". Escape at least the ".":

string myFilter="*_Test.txt";
foreach(char x in @"\+?|{[()^$.#") {
  myFilter = myFilter.Replace(x.ToString(),@"\"+x.ToString());
}
Regex myRegex=new Regex(string.Format("^{0}$",myFilter.Replace("*",".*")));
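
For comparison, the same "escape everything, then turn * into .*" idea can be sketched in Python. This is only an illustration of the conversion described above, not a port of the C# code; the helper names glob_to_regex and filter_names and the file names are made up.

```python
import re

def glob_to_regex(file_filter):
    """Compile a filter in which '*' means 'any run of characters'."""
    escaped = re.escape(file_filter)   # '.' becomes '\.', '*' becomes '\*', ...
    # Turn the escaped star back into a wildcard; everything else stays literal.
    return re.compile(escaped.replace(r"\*", ".*"))

def filter_names(names, file_filter):
    pattern = glob_to_regex(file_filter)
    # fullmatch anchors at both ends, like ^...$ in the regex above.
    return [n for n in names if pattern.fullmatch(n)]

files = ["unit_Test.txt", "fish_Testxtxt", "notes.txt"]   # made-up file names
kept = filter_names(files, "*_Test.txt")
print(kept)  # ['unit_Test.txt']
```

Escaping first and un-escaping only the star means you do not have to maintain your own list of special characters. Python's standard library also has fnmatch.translate for shell-style patterns, though its rules for ? and [] may differ from what your filter syntax intends.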
2 of 3
8

Have you tried LINQ:

List<string> resultList = files.Where(x => x.EndsWith("_Test.txt")).ToList();

or if you are running this on some old/legacy .NET version (< 3.5):

List<string> resultList = files.FindAll(delegate(string s) { 
    return s.EndsWith("_Test.txt"); 
});
Mozilla
Regular expressions - JavaScript - MDN Web Docs
For example, to match a single "a" followed by zero or more "b"s followed by "c", you'd use the pattern /ab*c/: the * after "b" means "0 or more occurrences of the preceding item." In the string "cbbabbbbcdebc", this pattern will match the substring "abbbbc". The following pages provide lists of the different special characters that fit into each category, along with descriptions and examples.
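
The same pattern can be tried quickly in Python, whose re module treats ab*c the same way for this simple case (a small sketch, not part of the MDN page):

```python
import re

m = re.search(r"ab*c", "cbbabbbbcdebc")
print(m.group())  # abbbbc
print(m.span())   # (3, 9)
```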
Top answer
1 of 1
4

Regular expressions work on strings, not on a "string list" and not on multiple string lists. Wherever you need to process more than one string, you will typically need some surrounding code to do the processing. For your example, this code has to apply the regex to every element of the first list, then collect the results and use these results to process the second list.

That said, the usual approach to applying a regexp to a list of strings is to concatenate them with a separator character such as a newline. To concatenate two lists and distinguish them, you would need at least a special "magic" character or word, which is not part of either list, to separate the first list from the second. Using such magic can cause maintenance headaches if you are not very careful; nevertheless, combined with backreferences, it can be used to solve your problem.

For example, numbered backreferences like \1 to \9 refer to capturing groups found earlier in the pattern. Let's assume you used "###" as the separator between the two lists; then a regexp along the lines of

  ^([A-Z])\W*([0-9]+)$.*###.*^(\1\W*\+\+\W*\2)$
    ^         ^          ^     ^           ^
    |         |          |     |           |
   first    second       |    backrefs to first
   group    group        |    or second group
                       lists
                       separator

might be a first approximation of what you are looking for (beware of bugs, I did not test it). Put this into a global regexp search, and it should produce all pairs of matches which fit your constraints.

As a final remark: the resulting code may be very compact, nevertheless harder to maintain (and probably slower) than a more explicit solution where you process the two lists individually.
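
The more explicit route could look roughly like the following Python sketch. It reuses the two halves of the regexp above (a letter and a number in the first list, the same letter and number around "++" in the second); the function name find_pairs and the sample data are invented for illustration, and the real list formats may differ.

```python
import re

# First half of the combined regexp: letter prefix, number suffix.
first_pattern = re.compile(r"^([A-Z])\W*([0-9]+)$")

def find_pairs(first_list, second_list):
    """Pair each matching item of first_list with its counterparts in second_list."""
    pairs = []
    for item in first_list:
        m = first_pattern.match(item)
        if m is None:
            continue
        letter, number = m.groups()
        # Second half: plain string building stands in for the backreferences.
        second_pattern = re.compile(
            r"^" + re.escape(letter) + r"\W*\+\+\W*" + re.escape(number) + r"$"
        )
        for candidate in second_list:
            if second_pattern.match(candidate):
                pairs.append((item, candidate))
    return pairs

first = ["A-12", "B 7", "xyz"]                   # made-up sample data
second = ["A++12", "B ++ 7", "C++3", "A++13"]
pairs = find_pairs(first, second)
print(pairs)  # [('A-12', 'A++12'), ('B 7', 'B ++ 7')]
```

No magic separator is needed here, and each of the two patterns can be tested on its own.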

Scaler
How to use regex in C?- Scaler Topics
May 4, 2023 - POSIX regular expressions are widely used in C, and their functions are declared in the regex.h header file. The table below lists various POSIX character classes, their equivalent character representations, and a description of what each class matches. ... [:cntrl:] - Matches control characters in the target string.
Reddit
r/regex on Reddit: match string only if part of a list
December 3, 2024 -

**** RESOLVED ****

Hi,

I’m not sure if this is possible:

I’m looking for specific strings that contain an "a" with this regex: (flavour is c# (.net))

([^\s]+?)a([^\s]+?)\b

but they should only match if the found word is part of a list. Some kind of opposite of negative lookbehind.

So the above regex captures all kind of strings with "a" in them, but it should only match if the string is part of

"fass" or "arbecht" as I need to replace the a by some other string.

example: it should match "verfassen" or "verarbeit" but not "passen"

Best regards,

Pascal

Edit: Solution:

These two versions work fine and credits and many thanks go to:

u/gumnos: \b(?=\S*(?:fass|arbeit))(\S*?)a(\S*)\b

u/rainshifter (with some editing to match what I really need): (?<=(?:\b(?=\w*(?:fass|arbeit))|\G(?<!^))\w*)(\S*?)a(\S*)\b
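
As a quick check of the first solution outside .NET, here is a small Python sketch. It assumes the pattern from u/gumnos behaves the same in Python's re module for this input, and it uses "X" as a stand-in for the replacement string. The second pattern relies on \G and a variable-length lookbehind, which Python's standard re module does not support.

```python
import re

# The lookahead only lets a word match if it contains "fass" or "arbeit".
pattern = re.compile(r"\b(?=\S*(?:fass|arbeit))(\S*?)a(\S*)\b")

text = "verfassen passen verarbeit"
replaced = pattern.sub(r"\1X\2", text)   # replace the first "a" of each listed word
print(replaced)  # verfXssen passen verXrbeit
```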

Quora
What is matching a list of regex to every string in a list one by one (Python)? - Quora
Answer (1 of 2): The answer is (as usual) you just do that. Suppose I have a list of regular expression objects made with re.compile called pats and a list of strings called strs. Then:

for s in strs:
    x = [p.match(s) for p in pats]
    ... do something with x

x is a list of Nones or ...
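
A runnable version of that loop might look like this. The patterns and strings are arbitrary examples, and collecting the matching pattern sources into a dict is just one possible "do something with x".

```python
import re

pats = [re.compile(p) for p in (r"\d+", r"[a-z]+", r"fun")]   # example patterns
strs = ["123abc", "fun times", "!?"]                           # example strings

results = {}
for s in strs:
    x = [p.match(s) for p in pats]   # one Match object or None per pattern
    # Keep the source text of every pattern that matched at the start of s.
    results[s] = [p.pattern for p, m in zip(pats, x) if m is not None]

for s, matched in results.items():
    print(repr(s), matched)
```

Note that match() only succeeds at the start of each string; swap in search() if a hit anywhere in the string should count.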
Top answer
1 of 2
12

There are quite a lot of regular expression packages, but yours seems to match the one in POSIX: regcomp() etc.

The two structures it defines in <regex.h> are:

  • regex_t containing at least size_t re_nsub, the number of parenthesized subexpressions.

  • regmatch_t containing at least regoff_t rm_so, the byte offset from start of string to start of substring, and regoff_t rm_eo, the byte offset from start of string of the first character after the end of substring.

Note that 'offsets' are not pointers but indexes into the character array.

The execution function is:

  • int regexec(const regex_t *restrict preg, const char *restrict string, size_t nmatch, regmatch_t pmatch[restrict], int eflags);

Your printing code should be:

for (int i = 0; i <= r.re_nsub; i++)
{
    int start = m[i].rm_so;
    int finish = m[i].rm_eo;
//  strcpy(matches[ind], ("%.*s\n", (finish - start), p + start));  // Based on question
    sprintf(matches[ind], "%.*s\n", (finish - start), p + start);   // More plausible code
    printf("Storing:  %.*s\n", (finish - start), matches[ind]);     // Print once
    ind++;
    printf("%.*s\n", (finish - start), p + start);                  // Why print twice?
}

Note that the code should be upgraded to ensure that the string copy (via sprintf()) does not overflow the target string — maybe by using snprintf() instead of sprintf(). It is also a good idea to mark the start and end of a string in the printing. For example:

    printf("<<%.*s>>\n", (finish - start), p + start);

This makes it a whole heap easier to see spaces etc.

[In future, please attempt to provide an MCVE (Minimal, Complete, Verifiable Example) or SSCCE (Short, Self-Contained, Correct Example) so that people can help more easily.]

This is an SSCCE that I created, probably in response to another SO question in 2010. It is one of a number of programs I keep that I call 'vignettes'; little programs that show the essence of some feature (such as POSIX regexes, in this case). I find them useful as memory joggers.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <regex.h>

#define tofind    "^DAEMONS=\\(([^)]*)\\)[ \t]*$"

int main(int argc, char **argv)
{
    FILE *fp;
    char line[1024];
    int retval = 0;
    regex_t re;
    regmatch_t rm[2];
    //this file has this line "DAEMONS=(sysklogd network sshd !netfs !crond)"
    const char *filename = "/etc/rc.conf";

    if (argc > 1)
        filename = argv[1];

    if (regcomp(&re, tofind, REG_EXTENDED) != 0)
    {
        fprintf(stderr, "Failed to compile regex '%s'\n", tofind);
        return EXIT_FAILURE;
    }
    printf("Regex: %s\n", tofind);
    printf("Number of captured expressions: %zu\n", re.re_nsub);

    fp = fopen(filename, "r");
    if (fp == 0)
    {
        fprintf(stderr, "Failed to open file %s (%d: %s)\n", filename, errno, strerror(errno));
        return EXIT_FAILURE;
    }

    while ((fgets(line, 1024, fp)) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';
        if ((retval = regexec(&re, line, 2, rm, 0)) == 0)
        {
            printf("<<%s>>\n", line);
            // Complete match
            printf("Line: <<%.*s>>\n", (int)(rm[0].rm_eo - rm[0].rm_so), line + rm[0].rm_so);
            // Match captured in (...) - the \( and \) match literal parenthesis
            printf("Text: <<%.*s>>\n", (int)(rm[1].rm_eo - rm[1].rm_so), line + rm[1].rm_so);
            char *src = line + rm[1].rm_so;
            char *end = line + rm[1].rm_eo;
            while (src < end)
            {
                size_t len = strcspn(src, " ");
                if (src + len > end)
                    len = end - src;
                printf("Name: <<%.*s>>\n", (int)len, src);
                src += len;
                src += strspn(src, " ");
            }
        }
    }
    fclose(fp);
    regfree(&re);
    return EXIT_SUCCESS;
}

This was designed to find a particular line starting DAEMONS= in a file /etc/rc.conf (but you can specify an alternative file name on the command line). You can adapt it to your purposes easily enough.

2 of 2
0

Since g++ regex is bugged until who knows when, you can use my code instead (License: AGPL, no warranty, your own risk, ...)

#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/**
 * regexp (License: AGPL3 or higher)
 * @param re extended POSIX regular expression
 * @param nmatch maximum number of matches
 * @param str string to match
 * @return An array of nmatch+1 char pointers, or NULL on error or no match. You have to free() the first element (string storage) and then the array itself. The second element is the string matching the full regex, then come the submatches.
*/
char **regexp(char *re, int nmatch, char *str) {
  char **result=NULL;
  char *storage=NULL;
  char *p;
  regex_t regex;
  regmatch_t *match;
  size_t len=strlen(str);
  int i;

  if (nmatch<1)
    return NULL;

  if (regcomp(&regex, re, REG_EXTENDED)!=0) {
    fprintf(stderr, "Failed to compile regex '%s'\n", re);
    return NULL;
  }

  match=malloc(nmatch*sizeof(*match));
  /* Each match is at most len bytes plus a terminating NUL */
  storage=malloc(nmatch*(len+1));
  result=malloc((nmatch+1)*sizeof(*result));
  if (!match || !storage || !result) {
    fprintf(stderr, "Out of memory !");
    goto fail;
  }

  if (regexec(&regex,str,nmatch,match,0)) {
#ifdef DEBUG
    fprintf(stderr, "String '%s' does not match regex '%s'\n",str,re);
#endif
    goto fail;
  }

  /* Copy each match into storage, so overlapping submatches stay intact */
  p=storage;
  for (i=0; i<nmatch; ++i) {
    size_t n=0;
    if (match[i].rm_so>=0) {
      n=(size_t)(match[i].rm_eo-match[i].rm_so);
      memcpy(p, str+match[i].rm_so, n);
    }
    p[n]=0;  /* unmatched groups become "" */
    result[i+1]=p;
#ifdef DEBUG
    printf("%s\n",p);
#endif
    p+=n+1;
  }

  result[0]=storage;
  free(match);
  regfree(&regex);
  return result;

fail:
  free(match);
  free(storage);
  free(result);
  regfree(&regex);
  return NULL;
}
Wikipedia
Regular expression - Wikipedia
February 28, 2026 - A match is made, not when all the atoms of the string are matched, but rather when all the pattern atoms in the regex have matched. The idea is to make a small pattern of characters stand for a large number of possible strings, rather than compiling a large list of all the literal possibilities.
Pyladiespdx
Get Comfortable with List Comprehensions and Regex | Pyladiespdx
group() : won’t change matching behavior, but gives you the ability to extract the pattern captured inside as a logical group ex: would give you the ability to extract either the email name or email host from the pattern match, where: ... findall() : a very useful re module function that finds all of the matches and returns them as a list of strings
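
A small sketch of both points, using a deliberately simplified email pattern (an assumption for illustration, not a full address validator) and made-up addresses:

```python
import re

# Group 1 captures the email name, group 2 the email host.
email = re.compile(r"([\w.+-]+)@([\w-]+(?:\.[\w-]+)+)")

text = "Write to ada@example.org or grace@mail.example.com."

m = email.search(text)
print(m.group(0))  # ada@example.org
print(m.group(1))  # ada
print(m.group(2))  # example.org

# When the pattern has capturing groups, findall returns one tuple of
# groups per match instead of plain strings.
found = email.findall(text)
print(found)  # [('ada', 'example.org'), ('grace', 'mail.example.com')]
```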
Top answer
1 of 6
282

Regular expressions actually aren't part of ANSI C. It sounds like you might be talking about the POSIX regular expression library, which comes with most (all?) *nixes. Here's an example of using POSIX regexes in C (based on this):

#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    regex_t regex;
    int reti;
    char msgbuf[100];

    /* Compile regular expression */
    reti = regcomp(&regex, "^a[[:alnum:]]", 0);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }

    /* Execute regular expression */
    reti = regexec(&regex, "abc", 0, NULL, 0);
    if (!reti) {
        puts("Match");
    }
    else if (reti == REG_NOMATCH) {
        puts("No match");
    }
    else {
        regerror(reti, &regex, msgbuf, sizeof(msgbuf));
        fprintf(stderr, "Regex match failed: %s\n", msgbuf);
        exit(1);
    }

    /* Free memory allocated to the pattern buffer by regcomp() */
    regfree(&regex);
    return 0;
}

Alternatively, you may want to check out PCRE, a library for Perl-compatible regular expressions in C. The Perl syntax is pretty much the same syntax used in Java, Python, and a number of other languages. The POSIX syntax is the syntax used by grep, sed, vi, etc.

2 of 6
22

This is an example of using REG_EXTENDED. This regular expression

"^(-)?([0-9]+)((,|\\.)([0-9]+))?\n$"

lets you match decimal numbers written in either the Spanish system (decimal comma) or the international one (decimal point). :)

#include <regex.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char const *argv[])
{
    regex_t regex;
    char msgbuf[100];
    int reti;

    /* Compile once, before the loop; "\\." matches a literal dot */
    reti = regcomp(&regex, "^(-)?([0-9]+)((,|\\.)([0-9]+))?\n$", REG_EXTENDED);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }

    /* Read lines until end of input */
    while (fgets(msgbuf, sizeof(msgbuf), stdin) != NULL) {
        /* Execute regular expression */
        printf("%s\n", msgbuf);
        reti = regexec(&regex, msgbuf, 0, NULL, 0);
        if (!reti) {
            puts("Match");
        }
        else if (reti == REG_NOMATCH) {
            puts("No match");
        }
        else {
            regerror(reti, &regex, msgbuf, sizeof(msgbuf));
            fprintf(stderr, "Regex match failed: %s\n", msgbuf);
            exit(1);
        }
    }

    /* Free memory allocated to the pattern buffer by regcomp() */
    regfree(&regex);
    return 0;
}
Tidyverse
Regular expressions • stringr
To match a literal “$” or “^”, you need to escape them, \$, and \^. For multiline strings, you can use regex(multiline = TRUE).
Cplusplus
std::regex_search
Object of a match_results type (such as cmatch or smatch) that is filled by this function with information about the match results and any submatches found.
Cplusplus
std::regex_match
Deleting the signature with a moving ... string object. An optional parameter, flags, allows to specify options on how to match the expression. The entire target sequence must match the regular expression for this function to return true (i.e., without any additional characters before or after the match). For a function that returns true when the match is only part of the sequence, see regex_sea...
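
The whole-sequence versus partial-match distinction has a close analogue in Python, shown here as a rough sketch. Python's re.fullmatch and re.search are not the C++ functions; they are used only to illustrate the difference.

```python
import re

pattern = re.compile(r"\d+")

whole = pattern.fullmatch("12345")      # entire string must match
rejected = pattern.fullmatch("abc123")  # extra characters before the digits
partial = pattern.search("abc123")      # a match anywhere is enough

print(bool(whole))     # True
print(bool(rejected))  # False
print(bool(partial))   # True
```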