python str.contains exact match

How to use str.contains to get exact matches and not partial ones?

reddit.com › r › learnpython › comments › qqrq62 › how_to_use_strcontains_to_get_exact_matches_and

If you're looking for exact matches, str.contains may not be the function you should be using. The output looks correct to me in that all of the strings in the output do contain your keyword. Answer from LeBob93 on reddit.com

r/learnpython on Reddit: How to use str.contains to get exact matches and not partial ones?

November 10, 2021 -

Hi, I don't get why when I use str.contains to get exact matches from a list of keywords, the output still contains partial matches. Here is an extract of what I have (I'm only including one keyword in the list for the example):

keyword= ['SE.TER.ENRL']

subset = df[df['Code'].str.contains('|'.join(keyword), case=False, na=False)]

Output: ['SE.TER.ENRL' 'SE.TER.ENRL.FE' 'SE.TER.ENRL.FE.ZS']

Does anyone know how to get around this?

Thanks!

Top answer

1 of 3

If you're looking for exact matches, str.contains may not be the function you should be using. The output looks correct to me in that all of the strings in the output do contain your keyword.

2 of 3

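Adding ^ to the start of the regex and $ to the end will sort that out. When several keywords are joined with |, wrap the alternation in a group first, e.g. ^(?:A|B)$, so that the anchors apply to every keyword and not just the first and last.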
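
A minimal sketch of both suggestions above, using a made-up DataFrame shaped like the one in the question (the lower-case row is an assumption added to show the case-insensitive behaviour). The keywords are passed through re.escape because an unescaped "." in a regex matches any character.

```python
import re
import pandas as pd

# Hypothetical data modelled on the question's output.
df = pd.DataFrame(
    {"Code": ["SE.TER.ENRL", "SE.TER.ENRL.FE", "SE.TER.ENRL.FE.ZS", "se.ter.enrl"]}
)
keyword = ["SE.TER.ENRL"]

# Option 1: keep str.contains, but anchor the escaped alternation.
pattern = "^(?:" + "|".join(re.escape(k) for k in keyword) + ")$"
regex_hits = df[df["Code"].str.contains(pattern, case=False, na=False)]

# Option 2: skip regex entirely and compare lower-cased values.
wanted = {k.lower() for k in keyword}
isin_hits = df[df["Code"].str.lower().isin(wanted)]

print(regex_hits["Code"].tolist())
print(isin_hits["Code"].tolist())
```

Option 2 is usually the simpler choice when no pattern matching is needed at all.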

Stack Overflow

stackoverflow.com › questions › 33193792 › pandas-str-contains-for-exact-matches-of-partial-strings

python - Pandas str.contains for exact matches of partial strings - Stack Overflow

Top answer

1 of 1

You can pass regex=False to avoid confusion in the interpretation of the argument to str.contains:

>>> df.full_path.str.contains(ex)
0    False
1    False
2    False
3    False
4    False
5    False
Name: full_path, dtype: bool
>>> df.full_path.str.contains(ex, regex=False)
0    False
1    False
2    False
3    False
4    False
5     True
Name: full_path, dtype: bool

(Aside: your lambda x: ex in x should have worked. The NameError is a sign that you hadn't defined ex for some reason.)
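
To illustrate why regex=False can flip the result, here is a small sketch. The Series and the search string are invented for illustration and are not the asker's data; the point is only that characters such as [ ] and . have special meaning in a regex.

```python
import pandas as pd

# Hypothetical paths; "ex" contains regex metacharacters.
full_path = pd.Series(["/data/v[1].csv", "/data/v1.csv"])
ex = "v[1].csv"

# As a regex, "[1]" is a character class matching "1" and "." matches anything,
# so the pattern matches "v1.csv" but not the literal text "v[1].csv".
as_regex = full_path.str.contains(ex)

# As a literal substring, only the first path contains "v[1].csv".
as_literal = full_path.str.contains(ex, regex=False)

print(as_regex.tolist())
print(as_literal.tolist())
```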

Pandas

pandas.pydata.org › docs › reference › api › pandas.Series.str.contains.html

pandas.Series.str.contains — pandas 3.0.1 documentation

>>> s1.str.contains("house|dog", regex=True)
0    False
1     True
2     True
3    False
4    False
dtype: bool

reddit.com › r/learnpython › how do i find strings in a row that are an exact match using pandas str.match or str.contains

r/learnpython on Reddit: How do I find strings in a row that are an exact match using pandas str.match or str.contains

June 15, 2017 -

My problem is that using str.contains or str.match returns rows that contain even substrings of the string I am looking for:

new_dataframe = df[df['number'].str.match(number)]

I want only the rows that are an exact match for the string.

Top answer

1 of 2

I got it. Man, that was causing my code fits for a few strings :$

new_dataframe = df[df['number'] == number]

2 of 2

You should use regex for this. It will give you the exact matching part as well.
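
For comparison, a short sketch of the options discussed in this thread, on invented data. str.match only anchors the pattern at the start of each string, which is why it still returns longer values; str.fullmatch requires the whole string to match, and == does the same without any regex.

```python
import pandas as pd

# Hypothetical column of numeric strings.
df = pd.DataFrame({"number": ["123", "1234", "0123"]})
number = "123"

prefix_hits = df[df["number"].str.match(number)]     # anchored at the start only
full_hits = df[df["number"].str.fullmatch(number)]   # whole string must match
equal_hits = df[df["number"] == number]              # plain equality, no regex

print(prefix_hits["number"].tolist())
print(full_hits["number"].tolist())
print(equal_hits["number"].tolist())
```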

reddit.com › r/learnpython › match exact strings from a list in a pandas string column

r/learnpython on Reddit: match exact strings from a list in a pandas string column

July 19, 2021 -

Is there a way to match a list of strings exactly against the strings in a pandas column, so as to filter out the rows that do not contain any of them?

Say, words = ['ab', 'ml']

df =

data
'example string ab'
'absolute value'

After filtering, I must get only the row with value 'example string ab' for it contains exact string 'ab' from the list 'words'.

Top answer

1 of 1

df[df['data'].str.contains('|'.join(words))]
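
Note that a plain alternation such as 'ab|ml' also matches 'absolute value', because 'ab' occurs inside 'absolute'. A possible whole-word variant, sketched on the question's two example rows, wraps the escaped alternation in \b word boundaries:

```python
import re
import pandas as pd

words = ["ab", "ml"]
df = pd.DataFrame({"data": ["example string ab", "absolute value"]})

# Substring alternation: matches both rows.
substring_hits = df[df["data"].str.contains("|".join(words))]

# Whole-word alternation: \b requires a word boundary on each side.
pattern = r"\b(?:" + "|".join(map(re.escape, words)) + r")\b"
whole_word_hits = df[df["data"].str.contains(pattern)]

print(substring_hits["data"].tolist())
print(whole_word_hits["data"].tolist())
```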

Note.nkmk.me

note.nkmk.me › home › python › pandas

pandas: Extract rows that contain specific strings from a DataFrame | note.nkmk.me

July 30, 2023 - ... print(df['state'].isin(['NY', ... # 2 Charlie 18 CA 70 ... By using str.contains(), you can generate a Series where elements that contain a given substring are True....

Pandas

pandas.pydata.org › pandas-docs › version › 0.22 › generated › pandas.Series.str.match.html

pandas.Series.str.match — pandas 0.22.0 documentation

Series.str.match(pat, case=True, flags=0, na=nan, as_indexer=None)

Note.nkmk.me

note.nkmk.me › home › python

String Comparison in Python (Exact/Partial Match, etc.) | note.nkmk.me

April 29, 2025 - This article explains string comparisons in Python, covering topics such as exact matches, partial matches, forward/backward matches, and more. Exact match (equality comparison): ==, != Partial match: ...

Stack Overflow

stackoverflow.com › questions › 18632491 › how-do-i-check-for-an-exact-word-in-a-string-in-python

How do I check for an exact word or phrase in a string in Python ...

Top answer

1 of 8

You can use the word-boundaries of regular expressions. Example:

import re

s = '98787This is correct'
for words in ['This is correct', 'This', 'is', 'correct']:
    if re.search(r'\b' + words + r'\b', s):
        print('{0} found'.format(words))

That yields:

is found
correct found

For an exact match, replace the \b assertions with ^ and $ to restrict the match to the beginning and end of the line.
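
As a sketch of the exact-match variant, re.fullmatch can stand in for the ^...$ anchors; it succeeds only when the pattern covers the entire string. The helper name is_exact is hypothetical, and re.escape is added on the assumption that the phrase should be treated as literal text.

```python
import re

def is_exact(phrase: str, s: str) -> bool:
    """Return True only if s consists of phrase and nothing else."""
    return re.fullmatch(re.escape(phrase), s) is not None

print(is_exact("This is correct", "This is correct"))
print(is_exact("This is correct", "98787This is correct"))
print(is_exact("is", "This is correct"))
```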

2 of 8

Use the comparison operator == instead of in then:

if text == 'This is correct':
    print("Correct")

This will check to see if the whole string is just 'This is correct'. If it isn't, the comparison evaluates to False.

Parallax

learn.parallax.com › tutorials › robot › cyberbot › strings-characters-primer › compare-find-check › your-turn-exact-match-vs-found

Your Turn: Exact Match vs Found in String | LEARN.PARALLAX.COM

One way to solve this is to use the is equal to == operator to check if the string is an exact match. If it isn't, then use the string.find() method to check if the substring is anywhere in the string. Enter, name, and save comp_find_check_your_turn. Click the Send to micro:bit button.
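
The two-step check described above might look like the following sketch. The function name classify and the returned labels are made up for illustration; only == and str.find() come from the tutorial text.

```python
def classify(text: str, target: str) -> str:
    """Distinguish an exact match from a substring hit."""
    if text == target:
        return "exact match"
    # str.find returns -1 when the substring is absent.
    if text.find(target) != -1:
        return "found in string"
    return "not found"

print(classify("run", "run"))
print(classify("rerun the test", "run"))
print(classify("walk", "run"))
```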

Stack Exchange

codereview.stackexchange.com › questions › 247434 › python-use-a-list-of-names-to-find-exact-match-in-pandas-column-containing-ema

Python - use a list of names to find exact match in pandas column containing emails - Code Review Stack Exchange

Top answer

1 of 2

It sounds like the thing you're trying to do is somewhat insane. With 40k first names to search for, false positives are inevitable. At the same time, with only 40k names, false negatives are also inevitable. People's names are untidy; hopefully you have plans to accommodate. Even when you get correct matches for a "first" and "last" name, as your example email shows, there's no guarantee that they'll be the first and last names of the same person.

Maybe someone with experience in natural-language-processing AI would be able to solve your problem in a robust way. More likely you've resigned yourself to a solution that simply isn't robust. You still pretty definitely need case-sensitivity and "whole word" matching.

I'm not convinced by the example you give of a false positive. The pandas function you're using is regex-based. r'tero' does not match 't er o'; it does match 'interoperability'. With name lists as long as you're using, it seems more likely that you over-looked some other match in the email in question. I would kinda expect just a few of the names to be responsible for the majority of false-positives; outputting the matched text will help you identify them.

Case-sensitive regex matching should be the default.
I think \b...\b as a regex pattern will give the kind of "whole word" matching you need.
Series.str.extract will do the capturing.

Given the size of your datasets, you may be a bit concerned with the performance. Or you may not, it's up to you.

I haven't tested this at all:

# Import datasets and create lists/variables
import re
import pandas as pd
from pandas import ExcelWriter
from typing import Iterable

# Document, sheet, and column names:
names_source_file = 'names.xlsx'
first_names_sheet = 'Alle Navne'
first_names_column = 'Names'
last_names_sheet = 'Frie Efternavne'
last_names_column = 'Frie Efternavne'
subject_file = 'Entreprise Beskeder.xlsx'
subject_sheet = 'dataark'
subject_column = 'Besked'
output_first_name = 'Navner'
output_last_name = 'Efternavner'
output_file = 'PythonExport.xlsx'

# Build (very large!) search patterns. Raw strings keep \b as a word boundary
# (in a plain string it is a backspace). The capture group scopes the
# alternation and gives str.extract a group to return.
first_names_df = pd.read_excel(names_source_file, sheet_name=first_names_sheet)
first_names: Iterable[str] = first_names_df[first_names_column].dropna().astype(str)
first_names_regex = r'\b({})\b'.format('|'.join(map(re.escape, first_names)))
last_names_df = pd.read_excel(names_source_file, sheet_name=last_names_sheet)
last_names: Iterable[str] = last_names_df[last_names_column].dropna().astype(str)
last_names_regex = r'\b({})\b'.format('|'.join(map(re.escape, last_names)))

# Import dataset and drop rows whose subject column is NULL:
data_frame = pd.read_excel(subject_file, sheet_name=subject_sheet)
data_frame = data_frame.dropna(subset=[subject_column])

# Add columns for found first and last names:
data_frame[output_first_name] = data_frame[subject_column].str.extract(
    first_names_regex,
    expand=False
)
data_frame[output_last_name] = data_frame[subject_column].str.extract(
    last_names_regex,
    expand=False
)

# Save the result. The context manager writes and closes the file;
# newer pandas versions no longer provide writer.save().
with ExcelWriter(output_file) as writer:
    data_frame.to_excel(writer)

One obvious problem that I still haven't talked about is that there may be multiple name matches in a given subject. Assuming that you care about multiple matches, you can probably do something with extractall.
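
One way the extractall idea could be sketched, on invented messages and a two-name pattern (both are assumptions for illustration). Series.str.extractall returns one row per match, indexed by the original row and a match counter, so grouping on the first index level collects all names found in each message.

```python
import pandas as pd

# Hypothetical messages and a tiny name list.
msgs = pd.Series(["Anna met Bo and Anna", "no names here"])
pattern = r"\b(Anna|Bo)\b"

# Column 0 holds the single capture group; rows without a match are absent.
matches = msgs.str.extractall(pattern)[0]

# Collect every match per original row into a list.
per_row = matches.groupby(level=0).agg(list)

print(per_row.to_dict())
```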

2 of 2

To see what is being matched, use apply() with a python function:

import re

regex = re.compile(pat)

def search(item):
    mo = regex.search(item)
    if mo:
        return mo[0]
    else:
        return ''

df.msg.apply(search)

This will yield a Series with the names that matched or '' if there isn't a match.

reddit.com › r/learnpython › check if substring contains exact match in first 4 characters

r/learnpython on Reddit: Check if substring contains exact match in first 4 characters

March 8, 2019 -

Hi,

I have a dataframe with columns made up of strings and dates, from which I'd like to create a new dataframe containing only the rows where column B has a specific string in its first four characters.

The dataframe looks like this:

A	B	C
['textstring1, 'textstring2',...,'textstringN']	['1234-5678-9']	2018-01-23
['textstring1, 'textstring2',...,'textstringN']	['9876-5432-1]	2018-02-12

And I wish to create a dataframe with the rows containing '1234' in the first four characters of the cells in column B.

The code I have so far looks like this (example)

import pandas as pd
df = pd.DataFrame(["['1234-9493']", "['1254-1234']", "['3838-1234']", "['1235-3845']"])
df_sorted = df[(df[0].str.contains('1234'))]
df_sorted

However... it doesn't take the position into account and the output looks like:

	0
0	1234-9493
1	1254-1234
2	3838-1234

How I wish it would look like:

	0
0	1234-9493

How can I change the code to take the position of the substring into account?

Top answer

1 of 1

.startswith('1234')
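
A sketch of how that answer might be applied through the pandas .str accessor. It assumes the cells are plain strings such as '1234-9493'; if they literally include the surrounding [' and '] characters, as in the question's sample code, those would have to be stripped first, which the last lines show.

```python
import pandas as pd

# Assumed clean values in column B.
df = pd.DataFrame({"B": ["1234-9493", "1254-1234", "3838-1234", "1235-3845"]})

starts = df[df["B"].str.startswith("1234")]   # position-aware prefix test
sliced = df[df["B"].str[:4] == "1234"]        # equivalent: compare the first four characters

# If the values are wrapped like "['1234-9493']", strip the wrapper characters first.
raw = pd.Series(["['1234-9493']", "['1254-1234']"])
cleaned = raw.str.strip("[]'")
wrapped_hits = raw[cleaned.str.startswith("1234")]

print(starts["B"].tolist())
print(wrapped_hits.tolist())
```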

Johns Hopkins University

cs.jhu.edu › ~langmea › resources › lecture_notes › 03_strings_exact_matching_v2.pdf

Strings and Exact Matching Ben Langmead Department of Computer Science

GoLinuxCloud

golinuxcloud.com › home › python › check if python string contains substring [5 methods]

Check if Python String contains Substring [5 Methods] | GoLinuxCloud

January 9, 2024 - Raises an Exception for ‘Not ... additional exception handling logic. The str.count() method in Python is used to count the occurrences of a substring in a given string....

Stack Overflow

stackoverflow.com › questions › 69875045 › how-to-check-if-list-contains-exact-match-of-string

python - How to check if list contains exact match of string? - Stack Overflow

Top answer

1 of 5

fruitlist is a string, not a list.
fruitlist = str(sys.argv[2:]).upper() converts the whole sys.argv slice to a single str and then upper-cases it, so the in operator performs a substring test.
To avoid this, build a list instead:

fruitlist = [x.upper() for x in sys.argv[2:]]

full code:

import sys
fruitlist = [x.upper() for x in sys.argv[2:]]
print(sys.argv[1])
print(fruitlist)
if sys.argv[1].strip() in fruitlist:
        print(sys.argv[1], 'exact match found in list')

2 of 5

Your fruitlist isn't actually a list; it is a string. Here is the correct code, which makes it a list not a string:

import sys
fruitlist = [str(a).upper() for a in sys.argv[2:]]
print(sys.argv[1])
print(fruitlist)
if sys.argv[1].strip() in fruitlist:
        print(sys.argv[1], 'exact match found in list')
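
The difference between the two forms can be seen without the command line. In this sketch the args list stands in for sys.argv[2:] and is invented for illustration.

```python
# Stand-in for sys.argv[2:].
args = ["apple", "pineapple"]

as_string = str(args).upper()          # one string: "['APPLE', 'PINEAPPLE']"
as_list = [a.upper() for a in args]    # a real list of upper-cased items

# "in" on a str is a substring test; "in" on a list compares whole elements.
substring_hit = "APP" in as_string
element_hit = "APP" in as_list
exact_hit = "APPLE" in as_list

print(substring_hit, element_hit, exact_hit)
```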

GeeksforGeeks

geeksforgeeks.org › python › check-if-string-contains-substring-in-python

Check if String Contains Substring in Python - GeeksforGeeks

December 20, 2025 - The index() method works similarly ... It is useful when you need the exact position of the substring and want an explicit error if it’s missing. ... If 'Kingdom' does not exist, it raises a ValueError. Python - Test if string contains element from list...

Codecademy

codecademy.com › article › how-to-check-if-a-string-contains-a-substring-in-python

How to Check if a String Contains a Substring in Python | Codecademy

Next, let’s go through an example that uses the .contains() method to check for a substring in a string: ... In all the methods that we’ve discussed so far, we’ve kept matching cases for strings and substrings. But what happens if the cases don’t match? Let’s discuss that in the next section. Python...

Programiz

programiz.com › python-programming › pandas › methods › str-contains

Pandas str.contains() (With Examples)

data.str.contains('a') - only returns True for elements where a appears in the exact case specified (lowercase a). data.str.contains('a', case=False) - ignores the case of a, thus matching both a and A in any element of the data Series.

Stack Overflow

stackoverflow.com › questions › 44254816 › exact-match-of-string-in-pandas-python

regex - Exact match of string in pandas python - Stack Overflow

Top answer

1 of 3

You could simply use ==

string_a == string_b

It should return True if the two strings are equal. But this does not solve your issue.

Edit 2: You should use len(df1.index) instead of len(df1.columns). Indeed, len(df1.columns) will give you the number of columns, and not the number of rows.

Edit 3: After reading your second post, I've understood your problem. The solution you propose could lead to some errors. For instance, if you have:

ls=['[email protected]','[email protected]', '[email protected]']

the first and the third elements will both match str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]), which is unwanted behaviour.

You could add a check on the end of the string: str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]+r'(?:\s|$)')

Like this:

for i in range(len(ls)):
  df1 = df[df['A'].str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]+r'(?:\s|$)')]
  # Compare the row count with 0 (the closing parenthesis was misplaced)
  if len(df1.index) != 0:
      print (ls[i])

(Remove the parentheses in the "print" if you use Python 2.7)

2 of 3

Thanks for the help. But seems like I found a solution that is working as of now.

Must use str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]) This seems to solve the problem.

Although thanks to @IsaacDj for his help.

Quora

quora.com › In-Python-how-do-you-check-for-an-exact-string-match-while-ignoring-case-sensitivity

In Python, how do you check for an exact string match while ignoring case sensitivity? - Quora

Answer (1 of 6): A few helpful ideas. Python re is very forgiving just replace occurrences using re.sub then no need of a match before having to further process after a match. And if it finds no sub match or search it wont throw an error. If re doesn't find a sub it skips it anyways. And re.find...
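
For the question in the title, two common approaches are sketched below; neither is taken from the truncated answer above. The helper names are hypothetical. str.casefold() is a more aggressive form of lower-casing intended for caseless comparison, and re.fullmatch with re.IGNORECASE is the regex equivalent.

```python
import re

def equals_ignore_case(a: str, b: str) -> bool:
    """Exact comparison that ignores case, without regex."""
    return a.casefold() == b.casefold()

def fullmatch_ignore_case(literal: str, s: str) -> bool:
    """Regex variant: the whole of s must equal literal, ignoring case."""
    return re.fullmatch(re.escape(literal), s, flags=re.IGNORECASE) is not None

print(equals_ignore_case("Hello", "hELLO"))
print(fullmatch_ignore_case("hello", "HeLLo"))
print(fullmatch_ignore_case("hello", "hello world"))
```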