regex list of words python

match a list of words in a line using regex in python

stackoverflow.com › questions › 7430188 › match-a-list-of-words-in-a-line-using-regex-in-python

Simple string operation:

mywords = ("xxx", "yyy", "zzz")
all(x in mystring for x in mywords)

If word boundaries are relevant (i.e. you want to match zzz but not Ozzzy):

import re
all(re.search(r"\b" + re.escape(word) + r"\b", mystring) for word in mywords)

Answer from Tim Pietzcker on Stack Overflow
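The two checks in this answer can be wrapped in a single helper. This is only a minimal sketch: the function name `contains_all_words`, the `whole_words` parameter, and the sample strings are hypothetical and do not come from the answer.

```python
import re

def contains_all_words(text, words, whole_words=True):
    """Return True if every entry of `words` occurs in `text`."""
    if not whole_words:
        # Plain substring test, as in the first snippet of the answer.
        return all(word in text for word in words)
    # \b anchors each word at word boundaries; re.escape keeps any
    # metacharacters in a word from being read as regex syntax.
    return all(
        re.search(r"\b" + re.escape(word) + r"\b", text) is not None
        for word in words
    )

mywords = ("xxx", "yyy", "zzz")
print(contains_all_words("zzz then yyy then xxx", mywords))
print(contains_all_words("Ozzzy yyy xxx", mywords))
print(contains_all_words("Ozzzy yyy xxx", mywords, whole_words=False))
```

With whole-word matching, "Ozzzy" does not count as an occurrence of "zzz"; with the substring test it does.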

Stack Overflow

stackoverflow.com › questions › 7430188 › match-a-list-of-words-in-a-line-using-regex-in-python

match a list of words in a line using regex in python - Stack Overflow

Top answer

1 of 2

2 of 2

I'd use all and re.search for finding matches.

>>> import re
>>> words = ('xxx', 'yyy', 'zzz')
>>> text = "sdfjhgdsf zzz sdfkjsldjfds yyy dfgdfgfd xxx"
>>> all(re.search(w, text) for w in words)
True
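One caveat with passing the words straight to re.search is that each word is interpreted as a pattern. The sketch below, with made-up sample data, shows how an unescaped "." can produce a false positive and how re.escape avoids it.

```python
import re

text = "version abc released"

# Unescaped: "." matches any character, so the pattern "a.c" matches "abc".
print(bool(re.search("a.c", text)))
# Escaped: only the literal three characters "a.c" would match.
print(bool(re.search(re.escape("a.c"), text)))

words = ("a.c", "released")
unescaped = all(re.search(w, text) for w in words)
escaped = all(re.search(re.escape(w), text) for w in words)
print(unescaped, escaped)
```

If the words are known to be plain alphanumeric strings, escaping changes nothing, so it is a cheap safeguard.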

Python documentation

docs.python.org › 3 › library › re.html

re — Regular expression operations — Python 3.14.3 ...

3 days ago - Source code: Lib/re/ This module provides regular expression matching operations similar to those found in Perl. Both patterns and strings to be searched can be Unicode strings ( str) as well as 8-...

Discussions

python - How can I create a regex from a list of words? - Stack Overflow

I have a dict of words (actually I have nested dicts of verb conjugations, but that isn't relevant) and I want to make a regex by combining them. { 'yo': 'hablaba', 'tú': 'hablabas', 'él': '

stackoverflow.com

regex - How to match any string from a list of strings in regular expressions in python? - Stack Overflow

Let's say I have a list of strings, string_lst = ['fun', 'dum', 'sun', 'gum'] I want to make a regular expression, where at a point in it, I can match any of the strings I have in that list, within a

stackoverflow.com

python - List of all words matching regular expression - Stack Overflow

Let's assume that I have some string: "Lorem ipsum dolor sit amet" I need a list of all words with length more than 3. Can I do it with regular expressions? e.g. pattern = re.compile(r'some pattern')

stackoverflow.com

November 19, 2011

regex - Python regular expression match multiple words anywhere - Stack Overflow

I'm trying to use python's regular expression to match a string with several words. For example, the string is "These are oranges and apples and pears, but not pinapples or .." The list of words I ...

stackoverflow.com

reddit.com › r/regex › python regex when string contains any word from a list of words and any word from another list

r/regex on Reddit: Python regex when string contains any word from a list of words AND any word from another list

December 8, 2022

I'd like to be able to match strings which meet these criteria:

['foo' OR 'bar' OR 'Python'] AND ['me' OR 'you' OR 'we']

Top answer

1 of 2

Use lookaheads. ^(?=.*foo|.*bar|.*Python)(?=.*me|.*you|.*we)

Add \b around the words (e.g. \bfoo\b) if you want them as isolated words; otherwise you get matches like fool.

https://regex101.com/r/dnqSjr/1
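The lookahead pattern can also be built from the two lists in Python rather than typed by hand. This is a sketch under assumptions: the helper name `any_of` and the test sentences are invented for illustration, and the word boundaries suggested above are included.

```python
import re

def any_of(words):
    """Build a lookahead that requires at least one whole word from `words`."""
    alternation = "|".join(re.escape(w) for w in words)
    return r"(?=.*\b(?:{})\b)".format(alternation)

list1 = ["foo", "bar", "Python"]
list2 = ["me", "you", "we"]
# Both lookaheads start at the beginning of the string and must each succeed.
pattern = re.compile("^" + any_of(list1) + any_of(list2))

print(bool(pattern.search("Python is fun for me")))
print(bool(pattern.search("foo is a fool's word")))
print(bool(pattern.search("you like fool")))
```

Because "." does not match newlines by default, this sketch only looks within the first line of the input; passing re.DOTALL would extend it to multi-line text.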

2 of 2

If you want to use regex you have to construct the regex string in your code from the lists.

You have to string all the words together using the regex '|' (or) operator, and that might not be the most efficient solution depending on the length of the word lists.

But let's start with the base regex:

\bAWORD\b

This will match "AWORD". \b means word boundary, meaning we don't match partial words. Instead of AWORD we can use a list here: (word1|word2|...etc).

You can construct this list with Python, like so:

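import re

word_list1 = ['foo', 'bar', 'Python']
word_list2 = ['me', 'you', 'we']

# Join each list into an alternation, escaping any regex metacharacters.
words1 = '|'.join(map(re.escape, word_list1))
words2 = '|'.join(map(re.escape, word_list2))
regex = r'\b(?:{})\b'
test_str = "foo is a me word"
# True only if at least one whole word from each list is present.
result = (re.search(regex.format(words1), test_str) is not None and
          re.search(regex.format(words2), test_str) is not None)
print(result)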

.format just inserts the '|'-separated words into the regex in place of '{}'. I am sure there is a more "pythonic" way of doing this, but this is the regex way. :)
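The same idea extends to any number of word lists by running one search per list. The following is a rough sketch, not part of the answer; the function name `matches_every_list` is hypothetical.

```python
import re

def matches_every_list(text, *word_lists):
    """Return True if `text` has at least one whole word from each list."""
    for words in word_lists:
        alternation = "|".join(map(re.escape, words))
        if re.search(r"\b(?:{})\b".format(alternation), text) is None:
            return False  # this list contributed no match
    return True

word_list1 = ["foo", "bar", "Python"]
word_list2 = ["me", "you", "we"]
print(matches_every_list("foo is a me word", word_list1, word_list2))
print(matches_every_list("foo is a word", word_list1, word_list2))
```

Running separate searches keeps each pattern small and easy to read, at the cost of scanning the text once per list.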

Stack Overflow

stackoverflow.com › questions › 14945553 › how-can-i-create-a-regex-from-a-list-of-words

python - How can I create a regex from a list of words? - Stack Overflow

Top answer

1 of 2

Yes, I believe this is possible.

To get you started, this is how I would break down the problem.

Calculate the root by finding the longest possible string that matches the start of all of the declined values:

>>> root = ''
>>> for c in hablar['yo']:
...     if all(v.startswith(root + c) for v in hablar.itervalues()):
...         root += c
...     else:
...         break
... 
>>> root
'habl'

Whatever's left of the words makes a list of endings.

>>> endings = [v[len(root):] for v in hablar.itervalues()]
>>> print endings
['abas', 'aba', 'abais', 'aba', '\xc3\xa1bamos', 'aban', 'abas']

You may then want to weed out the duplicates:

>>> unique_endings = set(endings)
>>> print unique_endings
set(['abas', 'abais', '\xc3\xa1bamos', 'aban', 'aba'])

Then join these endings together with pipes:

>>> conjoined_endings = '|'.join(unique_endings)
>>> print conjoined_endings
abas|abais|ábamos|aban|aba

Forming the regular expression is a simple matter combining the root and the conjoined_endings string in parentheses:

>>> final_regex = '{}({})'.format(root, conjoined_endings)
>>> print final_regex
habl(abas|abais|ábamos|aban|aba)
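The session above is Python 2. A possible Python 3 version of the same steps is sketched below. It swaps the hand-written loop for os.path.commonprefix (which compares strings character by character), uses a non-capturing group, and sorts the endings longest first; those choices are assumptions of this sketch rather than part of the answer. The hablar dict is the one from the question.

```python
import os.path
import re

hablar = {
    "yo": "hablaba",
    "tú": "hablabas",
    "él": "hablaba",
    "nosotros": "hablábamos",
    "vosotros": "hablabais",
    "ellos": "hablaban",
    "vos": "hablabas",
}

forms = list(hablar.values())
# Longest string that every form starts with.
root = os.path.commonprefix(forms)
# Deduplicate the endings and try longer ones first, so that "abas"
# is attempted before its own prefix "aba".
endings = sorted({form[len(root):] for form in forms}, key=len, reverse=True)
final_regex = "{}(?:{})".format(
    re.escape(root), "|".join(map(re.escape, endings))
)
pattern = re.compile(final_regex)

print(root)
print(all(pattern.fullmatch(form) for form in forms))
```

Using fullmatch checks that each conjugated form is matched in its entirety and not just as a prefix.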

2 of 2

I think you need to have a less clever approach

>>> x={
...   'yo': 'hablaba',
...   'tú': 'hablabas',
...   'él': 'hablaba',
...   'nosotros': 'hablábamos',
...   'vosotros': 'hablabais',
...   'ellos': 'hablaban',
...   'vos': 'hablabas',
... }
>>> x
{'t\xc3\xba': 'hablabas', 'yo': 'hablaba', 'vosotros': 'hablabais', '\xc3\xa9l': 'hablaba', 'nosotros': 'habl\xc3\xa1bamos', 'ellos': 'hablaban', 'vos': 'hablabas'}
>>> x.values
<built-in method values of dict object at 0x20e6490>
>>> x.values()
['hablabas', 'hablaba', 'hablabais', 'hablaba', 'habl\xc3\xa1bamos', 'hablaban', 'hablabas']
>>> "|".join(x.values())
'hablabas|hablaba|hablabais|hablaba|habl\xc3\xa1bamos|hablaban|hablabas'

If you just join the hash values with an alternation operator then it should do what you want

w3resource

w3resource.com › python-exercises › re › python-re-exercise-26.php

Python: Match if two words from a list of words starting with letter 'P' - w3resource

July 22, 2025 - Python Exercises, Practice and Solution: Write a Python program to match if two words from a list of words start with the letter 'P'.

Stack Overflow

stackoverflow.com › questions › 33406313 › how-to-match-any-string-from-a-list-of-strings-in-regular-expressions-in-python

regex - How to match any string from a list of strings in regular expressions in python? - Stack Overflow

Top answer

1 of 5

Join the list on the pipe character |, which represents different options in regex.

import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "I love to have fun."

print(re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x))

Output: ['fun']

You cannot use match, as it only matches at the start of the string, and search returns only the first match, so use findall instead.

Also use lookahead if you have overlapping matches not starting at the same point.
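To see what the lookahead changes, here is a small sketch with overlapping candidates. The word list and input string are made up for illustration.

```python
import re

words = ["aba", "bab"]
text = "ababa"
alternation = "|".join(map(re.escape, words))

# A plain alternation consumes the text it matches, so overlaps are skipped.
plain = re.findall(alternation, text)
# A lookahead consumes nothing: every start position is tried, and the
# capturing group inside it reports what would match there.
overlapping = re.findall("(?=({}))".format(alternation), text)

print(plain)
print(overlapping)
```

The plain pattern reports only the first "aba", while the lookahead version also reports the "bab" and the second "aba" that overlap it.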

2 of 5

regex module has named lists (sets actually):

#!/usr/bin/env python
import regex as re # $ pip install regex

p = re.compile(r"\L<words>", words=['fun', 'dum', 'sun', 'gum'])
if p.search("I love to have fun."):
    print('matched')

Here words is just a name; you can use anything you like instead.
The .search() method is used instead of .* before/after the named list.

To emulate named lists using stdlib's re module:

#!/usr/bin/env python
import re

words = ['fun', 'dum', 'sun', 'gum']
longest_first = sorted(words, key=len, reverse=True)
p = re.compile(r'(?:{})'.format('|'.join(map(re.escape, longest_first))))
if p.search("I love to have fun."):
    print('matched')

re.escape() is used to escape regex meta-characters such as .*? inside individual words (to match the words literally).
sorted() emulates the regex module's behavior by putting the longest words first among the alternatives; compare:

>>> import re
>>> re.findall("(funny|fun)", "it is funny")
['funny']
>>> re.findall("(fun|funny)", "it is funny")
['fun']
>>> import regex
>>> regex.findall(r"\L<words>", "it is funny", words=['fun', 'funny'])
['funny']
>>> regex.findall(r"\L<words>", "it is funny", words=['funny', 'fun'])
['funny']
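A related point, sketched here as an aside rather than as part of the answer: when the alternation is wrapped in word boundaries, the order of the alternatives stops mattering for whole-word matches, because the engine backtracks to a longer alternative when the trailing \b fails.

```python
import re

text = "it is funny"

# Without boundaries, the first alternative that matches wins.
unbounded = re.findall(r"fun|funny", text)
# With \b on both sides, "fun" fails at the trailing boundary and the
# engine falls back to "funny", whatever the order of the alternatives.
short_first = re.findall(r"\b(?:fun|funny)\b", text)
long_first = re.findall(r"\b(?:funny|fun)\b", text)

print(unbounded, short_first, long_first)
```

Sorting longest first is therefore mainly needed when partial-word matches are allowed.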

UI Bakery

uibakery.io › regex-library › match-words-regex-python

Regex match words Python

import re

# Validate words
words_pattern = "^\\b(?:\\w|-)+\\b$"
re.match(words_pattern, 'word') # Returns Match object
re.match(words_pattern, 'pet-friendly') # Returns Match object
re.match(words_pattern, 'not a word') # Returns None

# Extract words from a string
words_extract_pattern = "\\b(?:\\w|-)+\\b"
re.findall(words_extract_pattern, 'Hello, world!') # returns ['Hello', 'world']

W3Schools

w3schools.com › python › python_regex.asp

Python RegEx

RegEx can be used to check if a string contains the specified search pattern. Python has a built-in package called re, which can be used to work with Regular Expressions. ... You can add flags to the pattern when using regular expressions. A special sequence is a \ followed by one of the characters in the list below, and has a special meaning:

Stack Overflow

stackoverflow.com › questions › 4594161 › list-of-all-words-matching-regular-expression

python - List of all words matching regular expression - Stack Overflow

Top answer

1 of 4

>>> import re
>>> myre = re.compile(r"\w{4,}")
>>> myre.findall('Lorem, ipsum! dolor sit? amet...')
['Lorem', 'ipsum', 'dolor', 'amet']

Take note that in Python 3, where all strings are Unicode, this will also find words that use non-ASCII letters:

>>> import re
>>> myre = re.compile(r"\w{4,}")
>>> myre.findall('Lorem, ipsum! dolör sit? amet...')
['Lorem', 'ipsum', 'dolör', 'amet']

In Python 2, you'd have to use

>>> myre = re.compile(r"\w{4,}", re.UNICODE)
>>> myre.findall(u'Lorem, ipsum! dolör sit? amet...')
[u'Lorem', u'ipsum', u'dol\xf6r', u'amet']
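If the Python 3 Unicode behavior is not wanted, the re.ASCII flag restricts \w to ASCII letters, digits and underscore. A brief sketch using the same sample string:

```python
import re

text = 'Lorem, ipsum! dolör sit? amet...'

# Default in Python 3: \w matches Unicode word characters, including "ö".
unicode_words = re.findall(r"\w{4,}", text)
# With re.ASCII, "ö" is no longer a word character, so "dolör" splits into
# "dol" and "r", both shorter than four characters.
ascii_words = re.findall(r"\w{4,}", text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```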

2 of 4

That is a typical use case for list comprehensions in Python, which can be used for filtering:

import re
text = 'Lorem ipsum dolor sit amet'
pattern = re.compile(r'\w+')  # stands in for the question's 'some pattern'
result = [word for word in pattern.findall(text) if len(word) > 3]

Spark By {Examples}

sparkbyexamples.com › home › python › python regex list

Python regex list - Spark By {Examples}

May 31, 2024 - Using list comprehension, we iterate over each word in the words list and check if it matches the specified pattern using re.match(). If a word matches the pattern, it is included in the filtered_words list. ... The re.sub() method can be so useful in scenarios where you want to replace elements in a list. Let us look at an example:

import re

# list of strings to modify
strings = ["Python123", "PySpark456", "Scala789", "MongoDB123"]

# pattern to match and replace
pattern = r'\d+'  # Matches one or more digits

# replacement string
replacement = "NUM"

# iterate over the list and replace matching elements using re.sub()
modified_strings = [re.sub(pattern, replacement, string) for string in strings]

# print the modified list
print(modified_strings)

Regex Tester

regextester.com › 93690

Find any word in a list of words - Regex Tester/Debugger

Spark By {Examples}

sparkbyexamples.com › home › python › python regex search list

Python regex search list - Spark By {Examples}

May 31, 2024 - In this tutorial, we will be exploring how we can use regex in Python to search through a list. ... The question that might pop into your mind is why in the first place would you even bother to use regex to search a list? Below are the reasons why you should use regex for your list search: ... In this section, I will demonstrate how you can use the re.search() method to search a list. Here is an example:

import re

# list of names
names = ["Guido Van Rossum", "Brendan Eich", "Rasmus Lerdorf", "Bjarne Stroustrup", "Dennis Ritchie"]

# regex pattern to search for names starting with "B"
pattern = r"^B\w+"

# iterate over the list and search for matching names
for name in names:
    # calling the search() method
    match = re.search(pattern, name)
    if match:
        print("Match found:", name)

Stack Overflow

stackoverflow.com › questions › 26985228 › python-regular-expression-match-multiple-words-anywhere

regex - Python regular expression match multiple words anywhere - Stack Overflow

Top answer

1 of 2

You've got a few problems there.

First, matches are case-sensitive unless you use the IGNORECASE/I flag to ignore case. So, 'AND' doesn't match 'and'.

Also, unless you use the VERBOSE/X flag, those spaces are part of the pattern. So, you're checking for 'AND ', not 'AND'. If you wanted that, you probably wanted spaces on each side, not just those sides (otherwise, 'band leader' is going to match…), and really, you probably wanted \b, not a space (otherwise a sentence starting with 'And another thing' isn't going to match).

Finally, if you think you need .* before and after your pattern and $ and ^ around it, there's a good chance you wanted to use search, findall, or finditer, rather than match.

So:

>>> s = "These are oranges and apples and pears, but not pinapples or .."
>>> r = re.compile(r'\bAND\b | \bOR\b | \bNOT\b', flags=re.I | re.X)
>>> r.findall(s)
['and', 'and', 'not', 'or']
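When the keywords live in a list, the same pattern can be generated instead of written out. This is a sketch only; the variable name `keywords` is an assumption, and the sample sentence is the one from the question.

```python
import re

keywords = ["AND", "OR", "NOT"]
s = "These are oranges and apples and pears, but not pinapples or .."

# One shared pair of \b anchors around a non-capturing alternation,
# with IGNORECASE so that "AND" also matches "and".
pattern = re.compile(
    r"\b(?:{})\b".format("|".join(map(re.escape, keywords))),
    flags=re.IGNORECASE,
)
found = pattern.findall(s)
print(found)
```

The "or" inside "oranges" is not reported because no word boundary follows it.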

2 of 2

Try this:

>>> re.findall(r"\band\b|\bor\b|\bnot\b", "These are oranges and apples and pears, but not pinapples or ..")
['and', 'and', 'not', 'or']

a|b means match either a or b

\b represents a word boundary

re.findall(pattern, string) returns a list of all non-overlapping matches of pattern in string

Stack Overflow

stackoverflow.com › questions › 37543724 › python-regex-for-finding-all-words-in-a-string

Python regex for finding all words in a string - Stack Overflow

Top answer

1 of 1

Use word boundary \b

import re

shop="hello seattle what have you got"
regex = r'\b\w+\b'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']

or simply \w+ is enough

import re

shop="hello seattle what have you got"
regex = r'\w+'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']

Python documentation

docs.python.org › 3 › howto › regex.html

Regular Expression HOWTO — Python 3.14.3 documentation

Regular expressions (called REs, or regexes, or regex patterns) are essentially a tiny, highly specialized programming language embedded inside Python and made available through the re module. Using this little language, you specify the rules for the set of possible strings that you want to match; this set might contain English sentences, or e-mail addresses, or TeX commands, or anything you like.

Medium

medium.com › quantrium-tech › extracting-words-from-a-string-in-python-using-regex-dac4b385c1b8

Extracting Words from a string in Python using RegEx

October 6, 2020 - QUANTRIUM GUIDES Extracting Words from a string in Python using the “re” module Extract word from your text data using Python’s built in Regular Expression Module Regular expression (RegEx) is …

Guru99

guru99.com › home › python › python regex: re.match(), re.search(), re.findall() with example

Python RegEx: re.match(), re.search(), re.findall() with Example

August 13, 2025 - But if a match is found only in some other line, the Python re.match() function returns None. For example, consider the following code of Python re.match() function. The expressions “\w+” and “\W” will match the words starting with the letter ‘g’; anything that does not start with ‘g’ is not identified. To check match for each element in the list ...

Stack Overflow

stackoverflow.com › questions › 6750240 › how-to-do-re-compile-with-a-list-in-python

regex - how to do re.compile() with a list in python - Stack Overflow

Top answer

1 of 5

import re

fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
fruit = re.compile('|'.join(fruit_list))

As ridgerunner pointed out in comments, you will probably want to add word boundaries to the regex, otherwise the regex will match on words like plump since they have a fruit as a substring.

fruit = re.compile(r'\b(?:%s)\b' % '|'.join(fruit_list))

Lastly, if the strings in fruit_list could contain special characters, you will probably want to use re.escape.

'|'.join(map(re.escape, fruit_list))
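Putting the three steps of this answer together gives something like the sketch below. The test sentence is made up for illustration.

```python
import re

fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']

# Escape each entry, join with "|", and wrap the alternation in \b anchors.
fruit = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, fruit_list)))

matches = fruit.findall("a plump peach and a pineapple")
print(matches)
```

Here "plump" is rejected by the trailing word boundary, and "pineapple" is reported as a whole word, not as the "apple" it contains.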

2 of 5

As you want exact matches, no real need for regex imo...

fruits = ['apple', 'cherry']
sentences = ['green apple', 'yellow car', 'red cherry']
for s in sentences:
    if any(f in s for f in fruits):
        print(s, 'contains a fruit!')
# green apple contains a fruit!
# red cherry contains a fruit!

EDIT: If you need access to the strings that matched:

from itertools import compress

fruits = ['apple', 'banana', 'cherry']
s = 'green apple and red cherry'

list(compress(fruits, (f in s for f in fruits)))
# ['apple', 'cherry']

ZetCode

zetcode.com › python › regularexpressions

Python regular expressions - using regular expressions in Python

We look for matches with regex functions. The match, fullmatch, and search functions return a match object if they are successful. Otherwise, they return None. The match function returns a match object if zero or more characters at the beginning of string match the regular expression pattern. ...

#!/usr/bin/python

import re

words = ('book', 'bookworm', 'Bible', 'bookish', 'cookbook', 'bookstore', 'pocketbook')

pattern = re.compile(r'book')

for word in words:
    if re.match(pattern, word):
        print(f'The {word} matches')
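To make the difference between the three functions concrete, here is a short sketch that applies each of them to the same word tuple. The result variable names are hypothetical.

```python
import re

words = ('book', 'bookworm', 'Bible', 'bookish', 'cookbook', 'bookstore', 'pocketbook')
pattern = re.compile(r'book')

# match: the pattern must match at the beginning of the string.
at_start = [w for w in words if pattern.match(w)]
# search: the pattern may match anywhere in the string.
anywhere = [w for w in words if pattern.search(w)]
# fullmatch: the pattern must cover the entire string.
entire = [w for w in words if pattern.fullmatch(w)]

print(at_start)
print(anywhere)
print(entire)
```

Only search picks up 'cookbook' and 'pocketbook', and only 'book' itself survives fullmatch.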

Google

developers.google.com › google for education › python › python regular expressions

Python Regular Expressions | Python Education | Google for Developers

The re.findall(pat, str) function finds all matches of a pattern in a string and returns them as a list of strings or tuples, depending on whether the pattern contains capturing groups. Regular expressions are a powerful language for matching text patterns. This page gives a basic introduction to regular expressions themselves sufficient for our Python exercises and shows how regular expressions work in Python.
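The dependence of findall's return type on capturing groups is easy to overlook, so here is a small sketch; the sample text and variable names are invented for illustration.

```python
import re

text = "alice@example bob@test"

# No capturing group: a list of whole matches.
no_groups = re.findall(r"\w+@\w+", text)
# One capturing group: a list of that group's text.
one_group = re.findall(r"(\w+)@\w+", text)
# Several capturing groups: a list of tuples, one item per group.
two_groups = re.findall(r"(\w+)@(\w+)", text)

print(no_groups)
print(one_group)
print(two_groups)
```

When grouping is needed only for alternation, a non-capturing group (?:...) keeps findall returning whole matches.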