It's a combination of Boyer-Moore and Horspool.
You can view the C code here:
Fast search/count implementation, based on a mix between Boyer-Moore and Horspool, with a few more bells and whistles on the top. For some more background, see: https://web.archive.org/web/20201107074620/http://effbot.org/zone/stringlib.htm.
From the link above:
Answer from arshajii on Stack Overflow:
When designing the new algorithm, I used the following constraints:
- should be faster than the current brute-force algorithm for all test cases (based on real-life code), including Jim Hugunin’s worst-case test
- small setup overhead; no dynamic allocation in the fast path (O(m) for speed, O(1) for storage)
- sublinear search behaviour in good cases (O(n/m))
- no worse than the current algorithm in worst case (O(nm))
- should work well for both 8-bit strings and 16-bit or 32-bit Unicode strings (no O(σ) dependencies)
- many real-life searches should be good, very few should be worst case
- reasonably simple implementation
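To make the "mix of Boyer-Moore and Horspool" concrete, here is a pure-Python sketch of the Horspool bad-character skip idea. This is an illustration only, not CPython's actual C fastsearch code; the function name `horspool_find` is made up for this example.

```python
def horspool_find(s, p):
    """Return the index of the first occurrence of p in s, or -1.

    Illustrative Horspool sketch: when the window mismatches, skip
    ahead based on the character aligned with the pattern's last
    position, instead of shifting by one as brute force does.
    """
    n, m = len(s), len(p)
    if m == 0:
        return 0
    # Shift table over p[:-1]: distance from each character's last
    # occurrence to the end of the pattern. Characters absent from
    # the pattern allow a full shift of m.
    skip = {c: m - i - 1 for i, c in enumerate(p[:-1])}
    i = 0
    while i <= n - m:
        if s[i:i + m] == p:
            return i
        i += skip.get(s[i + m - 1], m)
    return -1
```

For average inputs the skip table lets the search jump up to `m` characters at a time, which is where the sublinear O(n/m) good-case behaviour comes from.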
Since 2021, CPython uses Crochemore and Perrin's Two-Way algorithm for larger n.
From the source code:
If the strings are long enough, use Crochemore and Perrin's Two-Way algorithm, which has worst-case O(n) runtime and best-case O(n/k). Also compute a table of shifts to achieve O(n/k) in more cases, and often (data dependent) deduce larger shifts than pure C&P can deduce. See stringlib_find_two_way_notes.txt in this folder for a detailed explanation.
See https://github.com/python/cpython/pull/22904
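The classic worst case for the older algorithm is a long run of one character searched with a needle that almost matches at every position. A quick sketch (sizes chosen arbitrarily for illustration) shows that on Python 3.10+ the Two-Way path handles this in roughly linear time; on older interpreters the same call can take dramatically longer:

```python
import time

# Pathological input: the needle nearly matches at every offset,
# which drives quadratic behaviour in naive/Horspool-style search.
haystack = "A" * 1_000_000
needle = "A" * 10_000 + "B"   # never actually matches

t0 = time.perf_counter()
result = haystack.find(needle)
elapsed = time.perf_counter() - t0

print(f"find returned {result} in {elapsed:.3f}s")
```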
Python String Addition Time Complexity
I am trying to determine the time complexity of the encode function below. It goes over every string in the input list, so that's O(n), where n is the length of the input list. But in each iteration, we add to the encodedStr. Since python creates a new string every time the add operation is performed, do we need to take the length of encodedStr into account for the time complexity of the function?
from typing import List

class Codec:
    def encode(self, strs: List[str]) -> str:
        # Encodes a list of strings to a single string.
        encodedStr = ''
        for s in strs:
            encodedStr += str(len(s)) + '#' + s
        return encodedStr

The complexity of in depends entirely on what L is. e in L will become L.__contains__(e).
See this time complexity document for the complexity of several built-in types.
Here is the summary for in:
- list - Average: O(n)
- set/dict - Average: O(1), Worst: O(n)
The O(n) worst case for sets and dicts is very uncommon, but it can happen if __hash__ is implemented poorly. This only happens if everything in your set has the same hash value.
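As an illustration of that degenerate case (the `BadHash` class here is hypothetical, invented for this example), a constant `__hash__` forces every element into the same bucket, so set lookups have to walk one long collision chain:

```python
class BadHash:
    """Every instance hashes to the same value, so a set of them
    degenerates into a single collision chain and membership tests
    hit the O(n) worst case."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return 42  # constant hash: all instances collide
    def __eq__(self, other):
        return isinstance(other, BadHash) and self.x == other.x

items = {BadHash(i) for i in range(1000)}
# Lookups still give correct answers, just slowly (linear scans).
print(BadHash(500) in items)   # found after walking the chain
print(BadHash(2000) in items)  # absent: walks the whole chain
```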
It depends entirely on the type of the container. Hashing containers (dict, set) use the hash and are essentially O(1). Typical sequences (list, tuple) are implemented as you guess and are O(n). Trees would be average O(log n). And so on. Each of these types would have an appropriate __contains__ method with its big-O characteristics.
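Since `in` is just a method call, any class can supply its own `__contains__` with whatever complexity it likes. A toy example (the `EvenNumbers` class is made up here to show the dispatch):

```python
class EvenNumbers:
    """A conceptually infinite 'container' whose membership test
    is O(1): x in EvenNumbers() dispatches to __contains__."""
    def __contains__(self, x):
        return isinstance(x, int) and x % 2 == 0

evens = EvenNumbers()
print(4 in evens)   # constant-time answer, no storage at all
print(7 in evens)
```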
The time complexity is O(N) on average, O(NM) worst case (N being the length of the longer string, M, the shorter string you search for). As of Python 3.10, heuristics are used to lower the worst-case scenario to O(N + M) by switching algorithms.
The same algorithm is used for str.index(), str.find(), str.__contains__() (the in operator) and str.replace(); it is a simplification of Boyer–Moore with ideas taken from the Boyer–Moore–Horspool and Sunday algorithms.
See the original stringlib discussion post, as well as the fastsearch.h source code; until Python 3.10, the base algorithm had not changed since its introduction in Python 2.5 (apart from some low-level optimisations and corner-case fixes).
The post includes a Python-code outline of the algorithm:
def find(s, p):
    # find first occurrence of p in s
    n = len(s)
    m = len(p)
    skip = delta1(p)[p[m-1]]
    i = 0
    while i <= n-m:
        if s[i+m-1] == p[m-1]:  # (boyer-moore)
            # potential match
            if s[i:i+m-1] == p[:m-1]:
                return i
            if s[i+m] not in p:
                i = i + m + 1  # (sunday)
            else:
                i = i + skip  # (horspool)
        else:
            # skip
            if s[i+m] not in p:
                i = i + m + 1  # (sunday)
            else:
                i = i + 1
    return -1  # not found
as well as speed comparisons.
In Python 3.10, the algorithm was updated to use an enhanced version of Crochemore and Perrin's Two-Way string searching algorithm for larger problems (with p and s longer than 100 and 2100 characters, respectively, and s at least 6 times as long as p), in response to a pathological edge case that was reported. The commit adding this change included a write-up on how the algorithm works.
The Two-Way algorithm has a worst-case time complexity of O(N + M), where O(M) is a cost paid up front to build a shift table from the search needle p. Once you have that table, the algorithm has a best-case performance of O(N/M).
In Python 3.4.2, it looks like they are resorting to the same function, but there may be a difference in timing nevertheless. For example, s.find first is required to look up the find method of the string and such.
The algorithm used is a mix between Boyer–Moore and Horspool.
Let's say I have an integer as a string value and I want to see if it is a certain string int value. For example:

my_int = "3"

# using `in` keyword
if my_int in "123789":
    print("accepted")
else:
    print("not accepted")

# using a set
if my_int in {'1', '2', '3', '7', '8', '9'}:
    print("accepted")
else:
    print("not accepted")

In these cases, is there a difference in time complexity? Using the in keyword seems like it might search through each character in some sort of loop, but maybe internally, python does something else to improve lookup?
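Not an authoritative benchmark, but one way to probe the question empirically (timings will vary by machine and interpreter; the variable names are chosen for this sketch):

```python
import timeit

my_int = "3"
digits_str = "123789"
digits_set = {'1', '2', '3', '7', '8', '9'}

in_str = my_int in digits_str   # substring scan: O(n*m) worst case
in_set = my_int in digits_set   # hash lookup: O(1) average

# For a six-element container both are effectively instant; the
# asymptotic difference only shows up as the container grows.
t_str = timeit.timeit('my_int in digits_str', globals=globals(), number=100_000)
t_set = timeit.timeit('my_int in digits_set', globals=globals(), number=100_000)
print(f"substring: {t_str:.4f}s  set: {t_set:.4f}s")
```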
Recently I was thinking about interview questions I got as an undergrad:
Things like "reverse a string" and "check if a string is a palindrome".
I did most of these in C++ with a loop and scrolling through the index using logic.
When I learned Python, I realized that I could "reverse a string" by simply going:
return mystring[::-1]
Likewise with "check if it is a palindrome" by doing:
return mystring == mystring[::-1]
The problem now is that, I don't know what kinda complexity it is.
From my point of view, it is constant, so O(1). But I am guessing that that is too good to be true as the string splicing is doing something behind the scenes.
Can anyone help me clarify?
The python page on time-complexity shows that slicing lists has a time-complexity of O(k), where "k" is the length of the slice. That's for lists, not strings, but the complexity can't be O(1) for strings since the slicing must handle more characters as the size is increased. At a guess, the complexity of slicing strings would also be O(k). We can write a little bit of code to test that guess:
import time

StartSize = 2097152
size = StartSize
for _ in range(10):
    # create string of size "size"
    s = '*' * size
    # now time reverse slice
    start = time.time()
    r = s[::-1]
    delta = time.time() - start
    print(f'Size {size:9d}, time={delta:.3f}')
    # double size of the string
    size *= 2
This uses a simple method of timing. Other tools exist, but this is simple. When run I get:
$ python3 test.py
Size 2097152, time=0.006
Size 4194304, time=0.013
Size 8388608, time=0.024
Size 16777216, time=0.050
Size 33554432, time=0.098
Size 67108864, time=0.190
Size 134217728, time=0.401
Size 268435456, time=0.808
Size 536870912, time=1.610
Size 1073741824, time=3.192
which shows the time doubles when doubling the size of the string for each reverse slice. So O(n) (k == n for whole-string slicing).
How difficult an algorithm is to write and how expensive it is to run are two separate things. Creating a reversed string with the slice shorthand still requires O(n) space and O(n) time. Keep in mind that, in most cases, creating a reversed copy isn't necessary: you can just iterate from the end toward the start, which is essentially what Python's reversed() function does.
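For instance, a palindrome check can compare characters from both ends in place, avoiding the reversed copy entirely; a small sketch (the function name is made up for this example):

```python
def is_palindrome(s):
    """Compare s from both ends inward: O(n) time, O(1) extra
    space, versus O(n) extra space for s == s[::-1]."""
    i, j = 0, len(s) - 1
    while i < j:
        if s[i] != s[j]:
            return False
        i += 1
        j -= 1
    return True

# The two-pointer version agrees with the slice one-liner,
# it just skips the intermediate reversed string.
print(is_palindrome("racecar"))
print(is_palindrome("python"))
```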