regex match any word in list

Match list of words without the list of chars around

stackoverflow.com › questions › 21448139 › match-list-of-words-without-the-list-of-chars-around

Since your capture groups define explicitly one character on either side of the common word, it's looking for space word space and then when it doesn't find another space, it fails.

In this case, since you don't want to match all the characters that word boundaries would catch (period, apostrophe, etc.), you need to use a bit of trickery with lookaheads, lookbehinds, and non-capture groups. Try this:

(?:^|(?<= ))(one|common|word|or|another)(?:(?= )|$)

http://regex101.com/r/cM9hD8

Word boundaries are still simpler to implement, so for reference's sake, you could also do this (though it would also match words next to ', ., etc.).

\b(one|common|word|or|another)\b
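
The difference between the two patterns above can be checked with a minimal Python sketch. The sample string and the variable names are assumptions made for illustration; only the two patterns come from the answer.

```python
import re

# Word list taken from the patterns above
WORDS = ["one", "common", "word", "or", "another"]
ALTERNATION = "|".join(map(re.escape, WORDS))

# Space-delimited: a word counts only at the string edges or next to a space
space_delimited = re.compile(r"(?:^|(?<= ))(" + ALTERNATION + r")(?:(?= )|$)")
# Word-boundary version: punctuation next to a word still counts as a boundary
word_boundary = re.compile(r"\b(" + ALTERNATION + r")\b")

sample = "one's common word."  # hypothetical sample text
space_hits = space_delimited.findall(sample)
boundary_hits = word_boundary.findall(sample)

print(space_hits)     # only words surrounded by spaces or string edges
print(boundary_hits)  # also words touching an apostrophe or a period
```

Because the lookarounds consume no characters, adjacent words separated by a single space can all match, which the original space-word-space pattern could not do.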

Answer from brandonscript on Stack Overflow


2 of 2

It will not match one's, someone, etc.

reddit.com › r/regex › python regex when string contains any word from a list of words and any word from another list

r/regex on Reddit: Python regex when string contains any word from a list of words AND any word from another list

December 8, 2022 -

I'd like to be able to match strings which meet these criteria:

['foo' OR 'bar' OR 'Python'] AND ['me' OR 'you' OR 'we']

Top answer

1 of 2

Use lookaheads. ^(?=.*foo|.*bar|.*Python)(?=.*me|.*you|.*we)

Add \b around the words (e.g. \bfoo\b) if you want them as isolated words; otherwise you get matches like fool.

https://regex101.com/r/dnqSjr/1
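
As a rough illustration of the lookahead approach with word boundaries added, here is a small Python sketch. The function name `has_both` and the test strings are assumptions, not part of the original answer.

```python
import re

# One lookahead per list; each must succeed from the start of the string.
# \b keeps "fool" from counting as "foo".
both_lists = re.compile(
    r"^(?=.*\b(?:foo|bar|Python)\b)(?=.*\b(?:me|you|we)\b)"
)

def has_both(text):
    """Return True if text has a whole word from each list (hypothetical helper)."""
    return both_lists.search(text) is not None

print(has_both("Python is for me"))
print(has_both("fool me once"))
```

Note that `.` does not cross newlines by default, so for multi-line input this sketch would need the `re.DOTALL` flag.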

2 of 2

If you want to use regex, you have to construct the regex string in your code from the lists.

You have to string all the words together using the regex '|' (or) operator, and that might not be the most efficient solution depending on the length of the word lists.

But let's start with the base regex:

\bAWORD\b

This will match "AWORD". \b means word boundary, meaning we don't match partial words. Instead of AWORD we can use a list here: (word1|word2|...etc).

You can construct this list in Python, like so:

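import re

word_list1 = ['foo', 'bar', 'Python']
word_list2 = ['me', 'you', 'we']

# Escape each word so regex metacharacters are matched literally
words1 = '|'.join(map(re.escape, word_list1))
words2 = '|'.join(map(re.escape, word_list2))
regex = r'\b(?:{})\b'

def matches_both(test_str):
    # True only if a whole word from each list appears in test_str
    return (re.search(regex.format(words1), test_str) is not None and
            re.search(regex.format(words2), test_str) is not None)

print(matches_both("foo is a me word"))  # True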

.format just inserts the '|'-separated words into the regex in place of '{}'. I am sure there is a more "pythonic" way of doing this, but this is the regex way. :)

Discussions

regular expressions: matching all words containing a specific list of letters - Emacs Stack Exchange

If the regexp looks puzzling, the idea is that when lookahead (the ?= part) matches it doesn't advance the parser, so it can match multiple times when looking ahead from the same place without consuming any input. ... You say that you "have a file containing a list of all words of a language" ... More on emacs.stackexchange.com

emacs.stackexchange.com

January 26, 2015

Regular Expression to find certain words in a document - regular-expressions - Drafts Community

I’m total newbie with regular expressions. What I want to do: This is to help solve NY Times Spelling Bee. In a Drafts doc, I have a running list of words that are useful in solving the bee. About 1200 words now. For the spelling bee, you want to make as many words as possible from 7 letters. More on forums.getdrafts.com

forums.getdrafts.com

October 17, 2021

regex - How to match any string from a list of strings in regular expressions in python? - Stack Overflow

Lets say I have a list of strings, string_lst = ['fun', 'dum', 'sun', 'gum'] I want to make a regular expression, where at a point in it, I can match any of the strings i have in that list, within a More on stackoverflow.com

stackoverflow.com

regex - Match any one item in a list - Stack Overflow

For a reddit bot I want to find comments that match a certain regex plus any word of a list. More on stackoverflow.com

stackoverflow.com

O'Reilly

oreilly.com › library › view › regular-expressions-cookbook › 9781449327453 › ch05s02.html

5.2. Find Any of Multiple Words - Regular Expressions Cookbook, 2nd Edition [Book]

August 27, 2012 - The simple solution is to alternate between the words you want to match: ... More complex examples of matching similar words are shown in Recipe 5.3. var subject = "One times two plus one equals three."; // Solution 1: var regex = ...

Authors Jan Goyvaerts, Steven Levithan

Published 2012

Pages 609
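
The alternation idea from the excerpt can be sketched in Python as follows. This is not the book's own solution code (the excerpt truncates it); the word list `one|two|three` and the variable names are assumptions chosen to fit the subject string shown.

```python
import re

subject = "One times two plus one equals three."  # subject string from the excerpt

# Assumed word list; the excerpt does not show which words the book matches
number_words = re.compile(r"\b(?:one|two|three)\b", re.IGNORECASE)

found = number_words.findall(subject)
print(found)
```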

Regex Tester

regextester.com › 93690

Find any word in a list of words - Regex Tester/Debugger

Regex Tester is a tool to learn, build, & test Regular Expressions (RegEx / RegExp). Results update in real-time as you type. Roll over a match or expression for details. Save & share expressions with others. Explore the Library for help & examples. Search for & rate Community patterns.

Google Support

support.google.com › administrators › gmail › advanced › examples of regular expressions

Examples of regular expressions | Advanced | Google Workspace Help

The following examples illustrate the use and construction of simple regular expressions. Each example includes the type of text to match, one or more regular expressions that match that text, and notes that explain the use of the special characters and formatting.

Super User

superuser.com › questions › 903168 › how-should-i-write-a-regex-to-match-a-specific-word

How should I write a regex to match a specific word? - Super User

Top answer

1 of 7

I suggest bookmarking the MSDN Regular Expression Quick Reference.

You want to achieve a case-insensitive match for the word "rocket" surrounded by non-alphanumeric characters. A regex that would work would be:

\W*((?i)rocket(?-i))\W*

What it will do is look for zero or more (*) non-alphanumeric (\W) characters, followed by a case insensitive version of rocket ( (?i)rocket(?-i) ), followed again by zero or more (*) non-alphanumeric characters (\W). The extra parentheses around the rocket-matching term assigns the match to a separate group. The word rocket will thus be in match group 1.

UPDATE 1: Matt said in the comment that this regex is to be used in python. Python has a slightly different syntax. To achieve the same result in python, use this regex and pass the re.IGNORECASE option to the compile or match function.

\W*(rocket)\W*

On Regex101 this can be simulated by entering "i" in the textbox next to the regex input.

UPDATE 2: Ismael has mentioned that the regex is not quite correct, as it might match "1rocket1". He posted a much better solution, namely:

(?:^|\W)rocket(?:$|\W)
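
A small Python sketch of this last pattern follows. The capture group around rocket and the helper name `find_rocket` are additions made here for illustration; the answer's pattern has no capture group, so its full match includes the neighbouring non-word characters.

```python
import re

# (?:^|\W) and (?:$|\W) consume the neighbouring character, so a capture
# group (added here as an assumption) isolates the word itself.
rocket = re.compile(r"(?:^|\W)(rocket)(?:$|\W)", re.IGNORECASE)

def find_rocket(text):
    """Return the matched word, or None if it is not delimited (hypothetical helper)."""
    m = rocket.search(text)
    return m.group(1) if m else None

print(find_rocket("Launch the Rocket!"))
print(find_rocket("1rocket1"))
print(repr(rocket.search("a rocket!").group(0)))  # full match keeps the delimiters
```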

2 of 7

I think the look-aheads are overkill in this case, and you would be better off using word boundaries with the ignorecase option,

\brocket\b

In other words, in python:

>>> import re
>>> x = "rocket's"
>>> y = "rocket1."
>>> c = re.compile(r"\brocket\b", re.I)  # with the ignorecase option
>>> c.findall(y)
[]
>>> c.findall(x)
['rocket']

Stack Exchange

emacs.stackexchange.com › questions › 7715 › regular-expressions-matching-all-words-containing-a-specific-list-of-letters

regular expressions: matching all words containing a specific list of letters - Emacs Stack Exchange

Top answer

1 of 5

You're looking for something that can be found by a regexp (a word), but which should additionally obey some constraint.

In this case the constraint is a form of subset-relation:

(require 'cl-lib) ; provides cl-every

(defun string-subset-p (s1 s2)
  "Return t, if S1 is a subset of S2, when viewed as char-sets."
  (let ((s2-chars (append s2 nil)))
    (cl-every (lambda (ch)
                (memq ch s2-chars))
              (append s1 nil))))

When put together (in the most trivial way):

(defun search-word-containg-chars-forward (chars)
  (interactive "sChars: ")
  (while (and (re-search-forward "\\w+")
              (not (string-subset-p chars (match-string 0))))))

More efficient implementations for the string-subset-p function are left as an exercise to the reader. Though, chances are, that it won't really matter.
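
For readers outside Emacs, the same "find words, then filter by a subset constraint" idea can be sketched in Python. The function name and sample text are assumptions; this is an analogue of the approach, not a translation of the Lisp above.

```python
import re

def words_containing_chars(text, chars):
    """Return the words in text that contain every character in chars."""
    required = set(chars)
    # \w+ finds candidate words; the set comparison applies the constraint
    return [w for w in re.findall(r"\w+", text) if required <= set(w)]

print(words_containing_chars("dollars solid dollar", "dls"))
```

Using sets makes the subset check independent of letter order and repetition, which is the same simplification the Lisp version makes by treating strings as char-sets.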

2 of 5

Here's one way to implement some equivalent to the "AND"ing of regexp needed for this specific application.

The characters of the word at point are first sorted in a temporary buffer, so that dollars becomes adllors. That sorted string is then matched against a pattern of any number of word characters followed by d, any number of word characters followed by l, any number of word characters followed by s, and any number of word characters. If the match succeeds, the word is highlighted.

To do this over the whole buffer, do M-x my/match-word-whole-buffer.

(defun my/match-word ()
  "Matches words containing all chars d, l, s in any order: dollars solid 
Match will fail if a word is missing any of those characters. e.g. dollar"
  (interactive)
  (let ((this-word (thing-at-point 'word)); get the word at point
        (match))
    (with-temp-buffer
      (insert this-word)
      (sort-regexp-fields nil "\\w" "\\&" (point-min) (point-max)) ; sort chars in word
      (beginning-of-buffer)
      ;; Now that the chars are sorted alphabetically, you can search for
      ;; the letters in alphabetical order: d, l, s
      (if (looking-at "\\w*[d]+\\w*[l]+\\w*[s]+\\w*")
          (setq match t)
        (setq match nil)))
    (when match
      (highlight-symbol-at-point))))

(defun my/match-word-whole-buffer ()
  (interactive)
  (beginning-of-buffer)
  (forward-word)
  (while (not (eobp))
    (when (string-match "\\w\\{3,\\}" (thing-at-point 'word))
      (my/match-word))
    (forward-word)))



Stack Overflow

stackoverflow.com › questions › 33406313 › how-to-match-any-string-from-a-list-of-strings-in-regular-expressions-in-python

regex - How to match any string from a list of strings in regular expressions in python? - Stack Overflow

Top answer

1 of 5

Join the list on the pipe character |, which represents different options in regex.

import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "I love to have fun."

print(re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x))

Output: ['fun']

You cannot use match, as it only matches at the start of the string. With search you get only the first match, so use findall instead.

Also use a lookahead if you have overlapping matches that do not start at the same point.

2 of 5

regex module has named lists (sets actually):

#!/usr/bin/env python
import regex as re # $ pip install regex

p = re.compile(r"\L<words>", words=['fun', 'dum', 'sun', 'gum'])
if p.search("I love to have fun."):
    print('matched')

Here words is just a name; you can use anything you like instead.
The .search() method is used instead of putting .* before/after the named list.

To emulate named lists using stdlib's re module:

#!/usr/bin/env python
import re

words = ['fun', 'dum', 'sun', 'gum']
longest_first = sorted(words, key=len, reverse=True)
p = re.compile(r'(?:{})'.format('|'.join(map(re.escape, longest_first))))
if p.search("I love to have fun."):
    print('matched')

re.escape() is used to escape regex meta-characters such as .*? inside individual words (to match the words literally).
sorted() emulates regex behavior and it puts the longest words first among the alternatives, compare:

>>> import re
>>> re.findall("(funny|fun)", "it is funny")
['funny']
>>> re.findall("(fun|funny)", "it is funny")
['fun']
>>> import regex
>>> regex.findall(r"\L<words>", "it is funny", words=['fun', 'funny'])
['funny']
>>> regex.findall(r"\L<words>", "it is funny", words=['funny', 'fun'])
['funny']

W3Schools

w3schools.com › python › python_regex.asp

Python RegEx

A special sequence is a \ followed by one of the characters in the list below, and has a special meaning: A set is a set of characters inside a pair of square brackets [] with a special meaning: The findall() function returns a list containing all matches.

Medium

medium.com › @qdangdo › regex-a-way-to-match-any-pattern-of-string-1dd327130fc6

Regex: A Way to Match Any Pattern of String | by Quang Do | Medium

October 5, 2021 - Looking at the first string, you can see that the purple line is before the characters “a” and “g”. That is because there is no word character before “a” and “g”. The second and third example shows the word boundary right after “c” and before “d”. The space and the hyphen between the two letters are not word characters thus a word boundary is established there. We will go over groupings in the next list. [] - Matches Characters in brackets [^ ] - Matches Characters NOT in brackets | - Either Or ( ) - Group

Max Planck Institute

mpi.nl › corpus › html › trova › ch01s04.html

1.4. Regular Expressions

Regular expressions allow users to create complicated queries. Below follows a list of most commonly used regular expressions together with explanations and some potential uses. [abc] means "a or b or c", e.g. query "[br]ang" will match both "adbarnirrang" and "bang"

Stack Overflow

stackoverflow.com › questions › 46114544 › match-any-one-item-in-a-list

regex - Match any one item in a list - Stack Overflow

You would usually match any words from a list of words [a, b, c] using alternation: (?:a|b|c).

DCode

dcode.fr › games and solvers › word search › regex word search

Regex Word Search - Regular Expression - Online Dictionary Finder

Searching for words using a regular ... and selecting all words that match this pattern. A regex can define specific criteria such as word length, the presence or position of certain letters, or even repetitions. Knowledge of the syntax specific to regular expressions is essential to using them effectively. To search for a word using a regular expression, start by entering a pattern describing the word's characteristics. dCode allows you to search the exhaustive list of words matching ...

NTU Singapore

www3.ntu.edu.sg › home › ehchua › programming › howto › Regexe.html

Regular Expression (Regex) Tutorial

Regex is supported in all the scripting ... as Java; and even word processors such as Word for searching texts. Getting started with regex may not be easy due to its geeky syntax, but it is certainly worth the investment of your time. This section is meant for those who need to refresh their memory. For novices, go to the next section to learn the syntax, before looking at these examples. Character: All characters, except those having special meaning in regex, matches ...

freeCodeCamp

forum.freecodecamp.org › python

Matching a list of regex to every string in a list one by one - Python - The freeCodeCamp Forum

Top answer

1 of 1

The problem with your code is your inner for loop will append a value (either a matched value or ‘None’) for each pattern checked. You don’t want that. Your code does correctly append a matched value one time and stops searching the rest of the regex expressions, but if there is no match, your else …
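
Since the asker's code is not shown, the following is only a guess at the intended behaviour: append exactly one result per string, either the first match found or None. All names and sample data are hypothetical.

```python
import re

def first_matches(strings, patterns):
    """For each string, return the first pattern match found, or None."""
    compiled = [re.compile(p) for p in patterns]
    results = []
    for s in strings:
        found = None
        for rx in compiled:
            m = rx.search(s)
            if m:
                found = m.group(0)
                break  # stop at the first pattern that matches
        # Append once per string, outside the inner loop
        results.append(found)
    return results

print(first_matches(["abc 123", "no digits", "x9"], [r"\d+", r"abc"]))
```

Moving the append outside the inner loop is what keeps the result list the same length as the input list.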

GeeksforGeeks

geeksforgeeks.org › python-check-if-string-matches-regex-list

Python | Check if string matches regex list | GeeksforGeeks

April 23, 2023 - Method : Using join regex + loop + re.match() This task can be performed using combination of above functions. In this, we create a new regex string by joining all the regex list and then match the string against it to check for match using match() with any of the element of regex list...

UI Bakery

uibakery.io › regex-library › match-words

Regex match words

...
var wordsRegex = /^\b(?:\w|-)+\b$/;

// Validate words
wordsRegex.test('word'); // Returns true
wordsRegex.test('pet-friendly'); // Returns true
wordsRegex.test('not a word'); // Returns false

// Extract words from a string
var wordsRegexG = /\b(?:\w|-)+\b/g;
'Hello, world!'.match(wordsRegexG);
...

Mozilla

developer.mozilla.org › en-US › docs › Web › JavaScript › Guide › Regular_expressions › Cheatsheet

Regular expression syntax cheat sheet - JavaScript | MDN

2 weeks ago - This page provides an overall cheat sheet of all the capabilities of RegExp syntax by aggregating the content of the articles in the RegExp guide. If you need more information on a specific topic, please follow the link on the corresponding heading to access the full article or head to the guide. Character classes distinguish kinds of characters such as, for example, distinguishing between letters and digits. Assertions include boundaries, which indicate the beginnings and endings of lines and words, and other patterns indicating in some way that a match is possible (including look-ahead, look-behind, and conditional expressions).

Stack Exchange

salesforce.stackexchange.com › questions › 309069 › regex-matching-multiple-words-from-the-list

apex - Regex matching multiple words from the LIST - Salesforce Stack Exchange

Top answer

1 of 2

A simple solution would be to just remove elements from the set as you find matches, and check if the set is empty at the end.

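// Iterate over a copy so that elements can be removed from the original set
for (String targetValue : new Set<String>(targetValues))
    if (result.containsIgnoreCase(targetValue))
        targetValues.remove(targetValue);
// Every target was found if nothing is left in the set
Boolean matchesAll = targetValues.isEmpty();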

2 of 2

As a regular expression, you could do this:

Pattern p = Pattern.compile('(?i)(one|two|three)');
Set<String> expected = new Set<String>{'one','two','three'};
Set<String> matches = new Set<String>();
Matcher m = p.matcher(result);
while(m.find()) {
  matches.add(m.group(0).toLowerCase());
}
if(matches == expected) {
  // All matches were found //
}
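
The same "collect the matches, then compare with the expected set" idea can be sketched in Python. The function name and sample strings are assumptions; like the Apex pattern above, this sketch does not add word boundaries.

```python
import re

def contains_all(text, expected):
    """Return True if every word in expected occurs in text, ignoring case."""
    pattern = re.compile("|".join(map(re.escape, expected)), re.IGNORECASE)
    # Normalise case so that "TWO" and "two" count as the same word
    found = {m.lower() for m in pattern.findall(text)}
    return found == {w.lower() for w in expected}

print(contains_all("One, TWO and three", ["one", "two", "three"]))
print(contains_all("one and two", ["one", "two", "three"]))
```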