If you're looking for exact matches, str.contains may not be the function you should be using. The output looks correct to me in that all of the strings in the output do contain your keyword. Answer from LeBob93 on reddit.com
r/learnpython on Reddit: How to use str.contains to get exact matches and not partial ones?
November 10, 2021 -

Hi, I don't get why when I use str.contains to get exact matches from a list of keywords, the output still contains partial matches. Here is an extract of what I have (I'm only including one keyword in the list for the example):

keyword= ['SE.TER.ENRL']

subset = df[df['Code'].str.contains('|'.join(keyword), case=False, na=False)]

Output: ['SE.TER.ENRL' 'SE.TER.ENRL.FE' 'SE.TER.ENRL.FE.ZS']

Does anyone know how to get around this?

Thanks!
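One possible way around this, sketched on the assumption that the goal is equality with the whole value in the 'Code' column: str.contains searches for the pattern anywhere in the string (and an unescaped '.' matches any character), while Series.isin and Series.str.fullmatch compare the entire value. The sample DataFrame below is made up for illustration.

```python
import re
import pandas as pd

# Hypothetical sample data mirroring the codes from the question.
df = pd.DataFrame(
    {"Code": ["SE.TER.ENRL", "SE.TER.ENRL.FE", "SE.TER.ENRL.FE.ZS", "se.ter.enrl"]}
)
keyword = ["SE.TER.ENRL"]

# Option 1: case-insensitive whole-value equality via isin.
wanted = {k.upper() for k in keyword}
subset_isin = df[df["Code"].str.upper().isin(wanted)]

# Option 2: regex full match; re.escape keeps '.' literal, and the
# non-capturing group keeps the alternation together.
pattern = "(?:" + "|".join(re.escape(k) for k in keyword) + ")"
subset_full = df[df["Code"].str.fullmatch(pattern, case=False, na=False)]

print(subset_isin["Code"].tolist())
print(subset_full["Code"].tolist())
```

If case does not matter and no regex features are needed, the isin variant is probably the simpler of the two.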

r/learnpython on Reddit: match exact strings from a list in a pandas string column
July 19, 2021 -

Is there a way to match a list of strings exactly against the strings in a pandas column, filtering out the rows that do not contain any of them?

Say, words = ['ab', 'ml']

df =

data
'example string ab'
'absolute value'

After filtering, I must get only the row with value 'example string ab' for it contains exact string 'ab' from the list 'words'.
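A minimal sketch of one way to get that result, assuming "exact" here means matching whole words: wrap the alternation of words in \b word boundaries before passing it to str.contains. The DataFrame reproduces the two example rows from the question.

```python
import re
import pandas as pd

words = ["ab", "ml"]
df = pd.DataFrame({"data": ["example string ab", "absolute value"]})

# \b marks a word boundary, so 'ab' inside 'absolute' is not a match.
# re.escape guards against regex metacharacters in the word list.
pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
filtered = df[df["data"].str.contains(pattern, regex=True, na=False)]

print(filtered["data"].tolist())
```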

pandas: Extract rows that contain specific strings from a DataFrame | note.nkmk.me
July 30, 2023 - ... print(df['state'].isin(['NY', ... # 2 Charlie 18 CA 70 ... By using str.contains(), you can generate a Series where elements that contain a given substring are True....
String Comparison in Python (Exact/Partial Match, etc.) | note.nkmk.me
April 29, 2025 - This article explains string comparisons in Python, covering topics such as exact matches, partial matches, forward/backward matches, and more. Exact match (equality comparison): ==, != Partial match: ...
Your Turn: Exact Match vs Found in String | LEARN.PARALLAX.COM
One way to solve this is to use the is equal to == operator to check if the string is an exact match. If it isn't, then use the string.find() method to check if the substring is anywhere in the string. Enter, name, and save comp_find_check_your_turn. Click the Send to micro:bit button.
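The check described above could be sketched in plain Python roughly as follows. The function name compare and its return strings are made up for illustration and are not taken from the tutorial.

```python
def compare(text: str, target: str) -> str:
    """Classify how target relates to text (hypothetical helper)."""
    if text == target:
        # Whole-string equality: an exact match.
        return "exact match"
    if text.find(target) != -1:
        # find() returns -1 when the substring is absent.
        return "found in string"
    return "not found"


print(compare("ab", "ab"))
print(compare("cab", "ab"))
print(compare("xyz", "ab"))
```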
Top answer
1 of 2
3

It sounds like the thing you're trying to do is somewhat insane. With 40k first names to search for, false positives are inevitable. At the same time, with only 40k names, false negatives are also inevitable. People's names are untidy; hopefully you have plans to accommodate. Even when you get correct matches for a "first" and "last" name, as your example email shows, there's no guarantee that they'll be the first and last names of the same person.

Maybe someone with experience in natural-language-processing AI would be able to solve your problem in a robust way. More likely you've resigned yourself to a solution that simply isn't robust. You still pretty definitely need case-sensitivity and "whole word" matching.

I'm not convinced by the example you give of a false positive. The pandas function you're using is regex-based. r'tero' does not match 't er o'; it does match 'interoperability'. With name lists as long as the ones you're using, it seems more likely that you overlooked some other match in the email in question. I would kinda expect just a few of the names to be responsible for the majority of false positives; outputting the matched text will help you identify them.

  • Case-sensitive matching is the default for pandas' regex-based string methods.
  • I think \b...\b as a regex pattern will give the kind of "whole word" matching you need.
  • Series.str.extract will do the capturing.

Given the size of your datasets, you may be a bit concerned with the performance. Or you may not, it's up to you.

I haven't tested this at all:

# Import datasets and create lists/variables
import re
import pandas as pd
from pandas import ExcelWriter
from typing import Iterable

# Document, sheet, and column names:
names_source_file = 'names.xlsx'
first_names_sheet = 'Alle Navne'
first_names_column = 'Names'
last_names_sheet = 'Frie Efternavne'
last_names_column = 'Frie Efternavne'
subject_file = 'Entreprise Beskeder.xlsx'
subject_sheet = 'dataark'
subject_column = 'Besked'
output_first_name = 'Navner'
output_last_name = 'Efternavner'
output_file = 'PythonExport.xlsx'

# Build (very large!) search patterns.
# Raw strings keep \b as a word boundary (not a backspace), the capture group
# is required by str.extract, and re.escape keeps metacharacters literal.
first_names_df = pd.read_excel(names_source_file, sheet_name=first_names_sheet)
first_names: Iterable[str] = first_names_df[first_names_column].dropna().astype(str)
first_names_regex = r'\b({})\b'.format('|'.join(map(re.escape, first_names)))
last_names_df = pd.read_excel(names_source_file, sheet_name=last_names_sheet)
last_names: Iterable[str] = last_names_df[last_names_column].dropna().astype(str)
last_names_regex = r'\b({})\b'.format('|'.join(map(re.escape, last_names)))

# Import dataset and drop NULLS:
data_frame = pd.read_excel(subject_file, sheet_name=subject_sheet)
data_frame = data_frame.dropna(subset=[subject_column])

# Add columns for found first and last names:
data_frame[output_first_name] = data_frame[subject_column].str.extract(
    first_names_regex,
    expand=False
)
data_frame[output_last_name] = data_frame[subject_column].str.extract(
    last_names_regex,
    expand=False
)

# Save the result (the context manager writes and closes the file)
with ExcelWriter(output_file) as writer:
    data_frame.to_excel(writer)

One obvious problem that I still haven't talked about is that there may be multiple name matches in a given subject. Assuming that you care about multiple matches, you can probably do something with extractall.
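A rough sketch of what that could look like with Series.str.extractall, which returns one row per match instead of only the first. The names and messages below are hypothetical stand-ins for the real spreadsheets, and collapsing the matches into one list per row is just one possible choice.

```python
import re
import pandas as pd

# Hypothetical names and messages, for illustration only.
names = ["Anna", "Bo"]
messages = pd.Series(["Anna met Bo", "Bob called", "no names here"])

pattern = r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"

# extractall yields one row per match, indexed by (original row, match number);
# column 0 holds the text captured by the single group.
matches = messages.str.extractall(pattern)[0]

# Collapse back to one list per original row; rows without matches get [].
per_row = matches.groupby(level=0).agg(list).reindex(messages.index)
per_row = per_row.apply(lambda v: v if isinstance(v, list) else [])

print(per_row.tolist())
```

Note that 'Bob' produces no match for 'Bo' here, because \b requires a word boundary after the name.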

2 of 2
2

To see what is being matched, use apply() with a python function:

import re

regex = re.compile(pat)

def search(item):
    mo = regex.search(item)
    if mo:
        return mo[0]
    else:
        return ''

df.msg.apply(search)

This will yield a Series with the names that matched or '' if there isn't a match.

r/learnpython on Reddit: Check if substring contains exact match in first 4 characters
March 8, 2019 -

Hi,

I have a dataframe with columns made up of strings and dates, and I'd like to create a new dataframe from it containing only the rows where column B has a specific string in its first four characters.

The dataframe looks like this:

ABC
['textstring1, 'textstring2',...,'textstringN']['1234-5678-9']2018-01-23
['textstring1, 'textstring2',...,'textstringN']['9876-5432-1]2018-02-12

And I wish to create a dataframe with the rows containing '1234' in the first four characters of the cells in column B.

The code I have so far looks like this (example)

import pandas as pd
df = pd.DataFrame(["['1234-9493']", "['1254-1234']", "['3838-1234']", "['1235-3845']"])
df_sorted = df[(df[0].str.contains('1234'))]
df_sorted

However... it doesn't take the position into account and the output looks like:

   0
0  1234-9493
1  1254-1234
2  3838-1234

How I'd like it to look:

   0
0  1234-9493

How can I change the code to take the position of the substring into account?
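One way this might be done, as a sketch using the example data from the post: strip the surrounding brackets and quotes first, then test only the start of each value with str.startswith instead of searching the whole string.

```python
import pandas as pd

df = pd.DataFrame(["['1234-9493']", "['1254-1234']", "['3838-1234']", "['1235-3845']"])

# Remove the leading "['" and trailing "']" so the digits come first.
cleaned = df[0].str.strip("[]'")

# Keep only rows whose cleaned value begins with '1234'.
df_sorted = df[cleaned.str.startswith("1234")]
print(df_sorted)
```

Comparing a slice, as in cleaned.str[:4] == '1234', should give the same rows.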

Check if Python String contains Substring [5 Methods] | GoLinuxCloud
January 9, 2024 - Raises an Exception for ‘Not ... additional exception handling logic. The str.count() method in Python is used to count the occurrences of a substring in a given string....
Check if String Contains Substring in Python - GeeksforGeeks
December 20, 2025 - The index() method works similarly ... It is useful when you need the exact position of the substring and want an explicit error if it’s missing. ... If 'Kingdom' does not exist, it raises a ValueError. Python - Test if string contains element from list...
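As a small illustration of that behaviour (the sample text and the helper name locate are assumptions, not taken from the article): str.index returns the position when the substring is present and raises ValueError when it is not.

```python
from typing import Optional


def locate(text: str, sub: str) -> Optional[int]:
    """Return the position of sub in text, or None if it is missing."""
    try:
        return text.index(sub)
    except ValueError:
        # index() raises instead of returning -1 like find() does.
        return None


text = "United Kingdom"  # hypothetical sample text
print(locate(text, "Kingdom"))
print(locate(text, "Republic"))
```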
How to Check if a String Contains a Substring in Python | Codecademy
Next, let’s go through an example that uses the .contains() method to check for a substring in a string: ... In all the methods that we’ve discussed so far, we’ve kept matching cases for strings and substrings. But what happens if the cases don’t match? Let’s discuss that in the next section. Python...
Pandas str.contains() (With Examples)
data.str.contains('a') - only returns True for elements where a appears in the exact case specified (lowercase a). data.str.contains('a', case=False) - ignores the case of a, thus matching both a and A in any element of the data Series.
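A short sketch of the difference, using a made-up Series named data to match the snippet above:

```python
import pandas as pd

# Hypothetical sample values chosen to show the effect of the case flag.
data = pd.Series(["apple", "AVON", "cherry", "BANANA"])

# Default: only a lowercase 'a' counts as a match.
case_sensitive = data.str.contains("a")

# case=False: both 'a' and 'A' count as a match.
case_insensitive = data.str.contains("a", case=False)

print(case_sensitive.tolist())
print(case_insensitive.tolist())
```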
In Python, how do you check for an exact string match while ignoring case sensitivity? - Quora
Answer (1 of 6): A few helpful ideas. Python's re module is very forgiving: just replace occurrences using re.sub, and there is no need to check for a match before any further processing. If it finds nothing to substitute, it won't throw an error; re simply skips it. And re.find...
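For the question in the title, a common approach that does not need re at all is to normalise both strings before comparing them. This is only a sketch; the helper name equals_ignore_case is made up for illustration.

```python
def equals_ignore_case(a: str, b: str) -> bool:
    """Whole-string equality that ignores case (hypothetical helper)."""
    # casefold() is a more aggressive lower() intended for caseless matching.
    return a.casefold() == b.casefold()


print(equals_ignore_case("Hello", "hELLO"))
print(equals_ignore_case("Hello", "Hello world"))
```

Because the comparison uses ==, a partial match such as 'Hello' inside 'Hello world' does not count.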