Use str.contains to build a boolean mask, then pass it to numpy.where:

m = df['a'].str.contains('foo') & (df['b'] == 'bar')
print (m)
0     True
1    False
2    False
dtype: bool

df['new'] = np.where(m, 'yes', 'no')
print (df)
        a       b    c  new
0     foo     bar  baz  yes
1     bar     foo  baz   no
2  foobar  barfoo  baz   no

Or, if you also need to check column b for substrings:

m = df['a'].str.contains('foo') & df['b'].str.contains('bar')
df['new'] = np.where(m, 'yes', 'no')
print (df)
        a       b    c  new
0     foo     bar  baz  yes
1     bar     foo  baz   no
2  foobar  barfoo  baz  yes
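
The frame used above is never shown, so here is a minimal sketch that rebuilds it from the printed output. The extra row with a missing value is an assumption added for illustration: `str.contains` returns a missing value for such rows unless `na=False` is passed, and `regex=False` makes the pattern a literal substring rather than a regular expression.

```python
import numpy as np
import pandas as pd

# Reconstructed from the printed output above; the last row (None) is an added assumption.
df = pd.DataFrame({
    "a": ["foo", "bar", "foobar", None],
    "b": ["bar", "foo", "barfoo", "bar"],
    "c": ["baz", "baz", "baz", "baz"],
})

# regex=False: treat the pattern as a literal substring.
# na=False: rows with missing values count as "no match" instead of propagating NaN.
in_a = df["a"].str.contains("foo", regex=False, na=False)
in_b = df["b"].str.contains("bar", regex=False, na=False)

df["new"] = np.where(in_a & in_b, "yes", "no")
print(df)
```

If case should be ignored, `str.contains` also accepts `case=False`.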

If you need a custom function, which will be slower on a larger DataFrame:

def somefunction (row):
    if 'foo' in row['a'] and row['b'] == 'bar':
        return 'yes'
    return 'no'

print (df.apply(somefunction, axis=1))
0    yes
1     no
2     no
dtype: object

def somefunction (row):
    if 'foo' in row['a']  and  'bar' in row['b']:
        return 'yes'
    return 'no'

print (df.apply(somefunction, axis=1))
0    yes
1     no
2    yes
dtype: object

Timings:

df = pd.concat([df]*1000).reset_index(drop=True)

def somefunction (row):
    if 'foo' in row['a'] and row['b'] == 'bar':
        return 'yes'
    return 'no'

In [269]: %timeit df['new'] = df.apply(somefunction, axis=1)
10 loops, best of 3: 60.7 ms per loop

In [270]: %timeit df['new1'] = np.where(df['a'].str.contains('foo') & (df['b'] == 'bar'), 'yes', 'no')
100 loops, best of 3: 3.25 ms per loop

# Presumably rebuilt from the original 3-row df (30,000 rows), given the roughly 10x timings below
df = pd.concat([df]*10000).reset_index(drop=True)

def somefunction (row):
    if 'foo' in row['a'] and row['b'] == 'bar':
        return 'yes'
    return 'no'

In [272]: %timeit df['new'] = df.apply(somefunction, axis=1)
1 loop, best of 3: 614 ms per loop

In [273]: %timeit df['new1'] = np.where(df['a'].str.contains('foo') & (df['b'] == 'bar'), 'yes', 'no')
10 loops, best of 3: 23.5 ms per loop
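
The timings above come from IPython's %timeit. As a rough sketch of how to repeat the comparison in a plain script, the block below uses the standard `timeit` module and first checks that both approaches label every row identically. The 3-row base frame is an assumed reconstruction, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed reconstruction of the 3-row frame, repeated 1000 times as in the first timing run.
base = pd.DataFrame({
    "a": ["foo", "bar", "foobar"],
    "b": ["bar", "foo", "barfoo"],
    "c": ["baz", "baz", "baz"],
})
df = pd.concat([base] * 1000).reset_index(drop=True)

def somefunction(row):
    if "foo" in row["a"] and row["b"] == "bar":
        return "yes"
    return "no"

def with_apply():
    # Row-by-row Python function call
    return df.apply(somefunction, axis=1)

def with_where():
    # Vectorised mask plus numpy.where
    return np.where(df["a"].str.contains("foo") & (df["b"] == "bar"), "yes", "no")

# Both approaches should produce the same labels before their speed is compared.
same = with_apply().tolist() == with_where().tolist()
print("identical results:", same)

print("apply:", timeit.timeit(with_apply, number=5) / 5, "s per run")
print("where:", timeit.timeit(with_where, number=5) / 5, "s per run")
```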
Answer from jezrael on Stack Overflow
🌐
Note.nkmk.me
note.nkmk.me › home › python
String Comparison in Python (Exact/Partial Match, etc.) | note.nkmk.me
April 29, 2025 - This operation is case-sensitive, as are other comparison operators and methods. Case-insensitive comparisons are discussed later. ... To check for partial matches, use the in operator, which determines if one string contains another string.
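
A small sketch of the point made in the snippet above: `in` is case-sensitive, and a case-insensitive check can be done by normalising both sides first. The sample text is made up for illustration.

```python
text = "Partial String Matching in Python"   # hypothetical sample text

has_exact_case = "String" in text     # True: same case as in the text
has_lower_case = "string" in text     # False: the comparison is case-sensitive

# Case-insensitive check: normalise both sides before comparing.
has_any_case = "string".casefold() in text.casefold()   # True

print(has_exact_case, has_lower_case, has_any_case)
```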
2 of 2
1

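Your exception probably comes from writing

if row['a'].str.contains('foo')==True

Inside df.apply(..., axis=1), row['a'] is a plain Python string, which has neither a .str accessor nor a .contains method. Use the in operator instead:

if 'foo' in row['a']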
Discussions

Partial string matches in structural pattern matching - Ideas - Discussions on Python.org
Hi there, I’d love to be able to use partial string matches:

match text:
    case "prefix_" + cmd:  # checking prefix
        print("got", cmd)
    case "Hello " + last_name + ", " + first_name + "!":  # more complex e…

More on discuss.python.org
🌐 discuss.python.org
2
July 20, 2023
Python search strings with partial match
string1 = 'First/Second/Third/Fourth/Fifth'
string2 = 'Second/Third/SomethingElse/Etc'

Is there a native way in Python to search for partial matches starting at any point and ending at any point? In the examples above there is a partial match of 'Second/Third' between the two strings, then I ... More on forum.inductiveautomation.com
🌐 forum.inductiveautomation.com
1
0
March 9, 2023
python - How to retrieve partial matches from a list of strings - Stack Overflow
For approaches to retrieving partial matches in a numeric list, go to: How to return a subset of a list that matches a condition? Python: Find in list But if you're looking for how to retrieve pa... More on stackoverflow.com
🌐 stackoverflow.com
How to efficiently match strings between two big lists with python? (510.000.000 comparisons)
it results in 510.000.000 comparisons. This seems computational too expensive. Aside from all the ideas on how to do this this comment sticks out. 510_000_000 comparisons does not seem large. Is it something you do a lot? Does the data change frequently, is either set of data somewhat static? Just ran it on my machine, ran in 4 seconds on heavily duplicate data. More on reddit.com
🌐 r/learnpython
20
6
October 6, 2020
🌐
Reddit
reddit.com › r/rstats › what's the best method of partial string matching?
r/rstats on Reddit: What's the best method of partial string matching?
May 4, 2022 -

I have a dataframe with a few million rows of names and accompanying columns with relevant info. I want to narrow down the dataframe to only include names from a list of 2,000 names. What's the best method of going about this when I have middle names and states to help distinguish between duplicate names?

Here's an example of the list of names:

John Smith Alabama R
John Smith Alabama
Jeremy Smith Washington P

What I want to do is first match the name and state to the dataframe if there is a middle initial match (the last letter in the list name if there is a middle name). If not, then I would just like to match by the name and state.

Here's what I tried so far:

df2 <- df[grep(paste(list_of_names, collapse = "|"), df$name_state_middle_initial),]

However, I'm only getting complete string matches with the above code. Any help would be great!

🌐
Medium
chestadhingra25.medium.com › partial-string-matching-and-deduplication-using-python-1d79d67a0b7c
Partial String Matching and DeDuplication using Python | by Chesta Dhingra | Medium
January 30, 2023 - In this article we’ll be looking for finding the duplicates among textual data via exact or partial match. The idea for this article came when I was working on a project and have to find the dedupes of some company names or addresses present in a dataframe. That makes the data quite inconsistent. Manual cleaning on small data is easy to accomplish but when we deal with big data then it is not going to be an easy task. In that case some python functions and library like fuzzywuzzy, multiprocessing comes in handy.
🌐
Mindee
mindee.com › blog of mindee › partial string matching: jaccard, sub-string percentage, levenshtein distance and regex
Boost Data Precision with Partial String Matching Techniques in Python
July 8, 2025 - Using maximum length: A score of 100% is possible only when the two strings are exactly the same. Here is a python implementation of this method using difflib:

from difflib import SequenceMatcher

def longest_common_substring(s1: str, s2: str) -> str:
    """Computes the longest common substring of s1 and s2"""
    seq_matcher = SequenceMatcher(isjunk=None, a=s1, b=s2)
    match = seq_matcher.find_longest_match(0, len(s1), 0, len(s2))
    if match.size:
        return s1[match.a : match.a + match.size]
    else:
        return ""

def longest_common_substring_percentage(s1 : str, s2 : str) -> float:
    """Computes the longest common substring percentage of s1 and s2"""
    assert min(len(s1), len(s2)) > 0, "One of the given string is empty"
    return len(longest_common_substring(s1, s2))/min(len(s1), len(s2))
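
To see the difflib technique from this snippet on concrete input, the sketch below applies `SequenceMatcher.find_longest_match` to the two path-like strings from the forum question listed earlier. Passing `autojunk=False` is a choice made here so that no characters are skipped by the junk heuristic; it is not part of the quoted code.

```python
from difflib import SequenceMatcher

string1 = "First/Second/Third/Fourth/Fifth"
string2 = "Second/Third/SomethingElse/Etc"

# Find the longest block of characters that appears in both strings.
matcher = SequenceMatcher(None, string1, string2, autojunk=False)
match = matcher.find_longest_match(0, len(string1), 0, len(string2))

common = string1[match.a:match.a + match.size]
print(repr(common))   # 'Second/Third/' (the trailing slash is shared too)
```

This returns a single longest block only. Finding every shared block would need `matcher.get_matching_blocks()` instead.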
🌐
Python documentation
docs.python.org › 3 › library › re.html
re — Regular expression operations
3 days ago - Source code: Lib/re/ This module provides regular expression matching operations similar to those found in Perl. Both patterns and strings to be searched can be Unicode strings ( str) as well as 8-...
🌐
DataCamp
datacamp.com › tutorial › fuzzy-string-python
Fuzzy String Matching in Python Tutorial | DataCamp
February 6, 2019 - The partial_ratio() calculates ... in this instance. Therefore, to get a 100% similarity match, you would have to move the "K D" part (signifying my middle name) to the end of the string....
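
The idea behind a partial ratio can be approximated with the standard library alone. The sketch below is not the implementation of the partial_ratio() function mentioned above; `rough_partial_ratio` is a hypothetical helper that slides the shorter string across the longer one and keeps the best `difflib` similarity.

```python
from difflib import SequenceMatcher

def rough_partial_ratio(s1: str, s2: str) -> int:
    """Best similarity (0-100) of the shorter string against any
    same-length window of the longer string."""
    short, long_ = sorted((s1, s2), key=len)
    if not short:
        return 0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        best = max(best, SequenceMatcher(None, short, window).ratio())
    return round(best * 100)

print(rough_partial_ratio("foo", "xxfooxx"))   # 100: "foo" appears verbatim
```

The loop is quadratic in the worst case, so it is only suitable for short strings.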
🌐
GeeksforGeeks
geeksforgeeks.org › python › python-get-matching-substrings-in-string
Python | Get matching substrings in string - GeeksforGeeks
March 24, 2023 - Method #2: Using filter() + lambda. This task can also be performed using the filter function which performs the task of filtering out the resultant strings that is checked for existence using the lambda function. ...

# Python3 code to demonstrate working of
# Get matching substrings in string
# Using lambda and filter()

# initializing string
test_str = "GfG is good website"

# initializing potential substrings
test_list = ["GfG", "site", "CS", "Geeks", "Tutorial"]

# printing original string
print("The original string is : " + test_str)

# printing potential strings list
print("The original list is : " + str(test_list))

# using lambda and filter()
# Get matching substrings in string
res = list(filter(lambda x: x in test_str, test_list))

# printing result
print("The list of found substrings : " + str(res))
🌐
Finxter
blog.finxter.com › home › learn python blog › how to find a partial string in a python list?
How to Find a Partial String in a Python List? - Be on the Right Side of Change
September 10, 2022 - The most Pythonic way to find a list of partial matches of a given string query in a string list lst is to use the membership operator in and the list comprehension statement like so: [s for s in lst if query in s].
🌐
W3Schools
w3schools.com › python › python_regex.asp
Python RegEx
Print the part of the string where there was a match.
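
A minimal sketch of what "the part of the string where there was a match" means in practice, using `re.search` and the match object's `group()` and `span()` methods. The sample text and pattern are chosen for illustration.

```python
import re

txt = "The rain in Spain"            # sample text for illustration
m = re.search(r"\bS\w+", txt)        # first word that starts with an upper-case S

if m:
    print(m.group())   # Spain
    print(m.span())    # (12, 17)
```

`re.search` returns `None` when nothing matches, which is why the result is checked before calling `group()`.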
🌐
YouTube
youtube.com › codetube
python string partial match - YouTube
Download this code from https://codegive.com In this tutorial, we will explore the concept of partial string matching in Python. Partial string matching invo...
Published   December 24, 2023
Views   15
Top answer
1 of 5
53
  • startswith and in both return a Boolean.
  • The in operator is a test of membership.
  • This can be performed with a list-comprehension or filter.
  • Using a list-comprehension, with in, is the fastest implementation tested.
  • If case is not an issue, consider mapping all the words to lowercase.
    • l = list(map(str.lower, l)).
  • Tested with python 3.11.0

filter:

  • Using filter creates a filter object, so list() is used to show all the matching values in a list.
l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = list(filter(lambda x: x.startswith(wanted), l))

# using in
result = list(filter(lambda x: wanted in x, l))

print(result)
[out]:
['threes']

list-comprehension

l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = [v for v in l if v.startswith(wanted)]

# using in
result = [v for v in l if wanted in v]

print(result)
[out]:
['threes']

Which implementation is faster?

  • Tested in Jupyter Lab using the words corpus from nltk v3.7, which has 236736 words
  • Words with 'three'
    • ['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']
from nltk.corpus import words

%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
%timeit list(filter(lambda x: wanted in x, words.words()))
%timeit [v for v in words.words() if v.startswith(wanted)]
%timeit [v for v in words.words() if wanted in v]

%timeit results

62.8 ms ± 816 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
53.8 ms ± 982 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
56.9 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
47.5 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
2 of 5
10

A simple, direct answer:

test_list = ['one', 'two','threefour']
r = [s for s in test_list if s.startswith('three')]
print(r[0] if r else 'nomatch')

Result:

threefour

Not sure what you want to do in the non-matching case. r[0] is exactly what you asked for if there is a match, but indexing an empty list raises IndexError if there is no match. The print deals with this by checking r first, but you may want to handle it differently.
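
One way to handle the non-matching case without building the whole list first is `next()` with a default value. This is a sketch of an alternative, not part of the answer above; it stops at the first match.

```python
test_list = ['one', 'two', 'threefour']

# The generator is consumed only until the first hit; 'nomatch' is the fallback.
first = next((s for s in test_list if s.startswith('three')), 'nomatch')
print(first)     # threefour

missing = next((s for s in test_list if s.startswith('nine')), 'nomatch')
print(missing)   # nomatch
```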

🌐
Reddit
reddit.com › r/learnpython › how to efficiently match strings between two big lists with python? (510.000.000 comparisons)
r/learnpython on Reddit: How to efficiently match strings between two big lists with python? (510.000.000 comparisons)
October 6, 2020 -

I am facing the problem of a very long running for loop.

There are two python lists (A and B):

A contains around 170.000 strings with lengths between 1 and 100 characters. B contains around 3.000 strings with the same length variety.

Now I need to find items from list A which contain one item from list B.

Considering that each string from A needs to be compared with each string from B, it results in 510.000.000 comparisons. This seems computationally too expensive.

What possibilities are there to speed things up?
I don't want to stop after the first match as there could be more matches. The goal is to store all matches in some new variable/db.

Simplified code:

# Some sample content
A = [  # full list: 170.000 strings
    "This is some random text in which I want to find words",
    "It is just some random text",
    # ...
]
B = [  # full list: 3.000 strings
    "text",
    "some random text",
    # ...
]

for item in A:
    for element in B:
        if element in item:
            print("store the item which contains the element to db")
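
One possible way to cut down the work, sketched under the assumption that most items in A contain no element of B: compile all of B into a single regular expression and use it as a prefilter, then run the full inner check only on the items that hit. The third sample string is added for illustration. Whether this is actually faster depends on the data, so it is worth measuring.

```python
import re

A = [
    "This is some random text in which I want to find words",
    "It is just some random text",
    "Nothing relevant here",          # added sample without any match
]
B = ["text", "some random text"]

# One alternation answers "does this item contain ANY element of B?" in a single scan.
# re.escape keeps characters such as '.' or '+' in B from acting as regex syntax.
any_b = re.compile("|".join(re.escape(b) for b in B))

matches = {}
for item in A:
    if any_b.search(item):
        # Full check only for items that passed the prefilter, so every
        # matching element is recorded, including overlapping ones.
        matches[item] = [b for b in B if b in item]

for item, found in matches.items():
    print(item, "->", found)
```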
🌐
Real Python
realpython.com › python-string-contains-substring
How to Check if a Python String Contains a Substring – Real Python
December 1, 2024 - In this case, you get another match and not a ValueError. That means that the text contains the substring more than once. But how often is it in there? You can use .count() to get your answer quickly using descriptive and idiomatic Python code: ... You used .count() on the lowercase string and passed the substring "secret" as an argument.
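
A short sketch of the `.count()` approach described above, on a made-up text. Note that `.count()` counts non-overlapping occurrences.

```python
text = "It's a secret. A SECRET that nobody should know; keep the secret."  # hypothetical text

# Lower-casing first makes the count case-insensitive.
occurrences = text.lower().count("secret")
print(occurrences)          # 3

# Occurrences are counted without overlap.
print("aaaa".count("aa"))   # 2
```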
🌐
Stack Overflow
stackoverflow.com › questions › 68089019 › how-to-get-partial-match-of-a-substring-in-a-list-of-string
python - How to get partial match of a substring in a list of string? - Stack Overflow
I wanted to see if the second list contains any word from the first list (even partial matches as it contains dot and hyphen). How can I achieve this? ... This is practically a duplicate of Check if substring is in a list of strings?, you just need to add a loop over the first list.
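
The "add a loop over the first list" suggestion can be sketched with `any()`. Both lists below are hypothetical stand-ins for the ones in the question.

```python
keywords = ["alpha", "beta"]                              # hypothetical first list
candidates = ["pre-alpha.v2", "gamma-1", "x.beta-test"]   # hypothetical second list

# Entries of the second list that contain at least one keyword.
hits = [c for c in candidates if any(k in c for k in keywords)]
print(hits)

# The same check, grouped by keyword.
by_keyword = {k: [c for c in candidates if k in c] for k in keywords}
print(by_keyword)
```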
🌐
Analytics Vidhya
analyticsvidhya.com › home › learn how to check if a string contains a substring in python
Learn How to Check If a String Contains a Substring in Python - Analytics Vidhya
January 10, 2024 - For example, handle cases where the substring is an empty string, handle exceptions raised by the index() method and consider the behavior of the methods when dealing with special characters or escape sequences. Different methods for substring checking have different performance characteristics. The ‘in’ operator and the find() method are generally faster than the index() method and regular expressions. However, regular expressions provide more flexibility and advanced pattern-matching capabilities.
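
The edge cases mentioned above can be checked directly. This sketch uses a made-up sample string and shows how `in`, `find()` and `index()` behave for an empty substring and for a miss.

```python
s = "hello world"   # hypothetical sample string

empty_found = "" in s          # True: the empty string is always "contained"
empty_pos = s.find("")         # 0
miss_pos = s.find("xyz")       # -1: find() signals a miss with a sentinel value

try:
    s.index("xyz")             # index() raises instead of returning -1
    raised = False
except ValueError:
    raised = True

print(empty_found, empty_pos, miss_pos, raised)
```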
🌐
DigitalOcean
digitalocean.com › community › tutorials › python-string-comparison
Python Compare Strings - Methods & Best Practices | DigitalOcean
April 17, 2025 - Learn how to compare strings in Python using ==, !=, startswith(), endswith(), and more. Find the best approach for your use case with examples.
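
A brief sketch of the comparison tools listed above, on a hypothetical file name. `startswith()` and `endswith()` also accept a tuple, which tests several alternatives at once.

```python
name = "report_2024.csv"   # hypothetical file name

is_equal = name == "report_2024.csv"              # exact match
differs = name != "REPORT_2024.CSV"               # comparison is case-sensitive
has_prefix = name.startswith("report_")
has_suffix = name.endswith((".csv", ".tsv"))      # tuple: any of these suffixes

print(is_equal, differs, has_prefix, has_suffix)
```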
🌐
Medium
medium.com › @MehrazHossainRumman › how-to-filter-data-by-partial-match-in-python-38b194df943e
How to Filter Data by Partial Match in Python | by Mehraz Hossain Rumman | Medium
August 11, 2024 - The filter_by_partial_match function is a simple yet powerful tool for filtering lists of dictionaries by partial string matches. Whether you’re building a search feature or just need to sift through data, this function can help you streamline your code and improve efficiency.
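
The article's function is not reproduced in this snippet. The sketch below shows one possible shape for such a helper; the signature, the case-insensitive behaviour and the sample records are all assumptions made for illustration.

```python
def filter_by_partial_match(records, key, query):
    """Return the dicts whose value under `key` contains `query`, ignoring case.

    Hypothetical signature; records without the key are skipped.
    """
    needle = query.casefold()
    return [r for r in records if needle in str(r.get(key, "")).casefold()]

people = [
    {"name": "Alice Johnson"},
    {"name": "Bob Smith"},
    {"name": "alina"},
    {"id": 4},                 # no "name" key at all
]
result = filter_by_partial_match(people, "name", "ali")
print(result)
```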