Short answer: str slices, in general, copy. That means that a function that takes a slice for each of your string's n suffixes is doing O(n^2) work. That said, you can avoid copies if you can work with bytes-like objects, using memoryviews to get zero-copy views of the original bytes data. See "How you can do zero copy slicing" below for how to make it work.
Long answer: (C)Python str does not slice by referencing a view of a subset of the data. There are exactly three modes of operation for str slicing:

1. Complete slice, e.g. mystr[:]: returns a reference to the exact same str (not just shared data, the same actual object, mystr is mystr[:]; since str is immutable, there is no risk in doing so)
2. The zero-length slice and (implementation-dependent) cached length-1 slices: the empty string is a singleton (mystr[1:1] is mystr[2:2] is ''), and low-ordinal strings of length one are cached singletons as well (on CPython 3.5.0, it looks like all characters representable in latin-1, that is, Unicode ordinals in range(256), are cached)
3. All other slices: the sliced str is copied at creation time, and is thereafter unrelated to the original str
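These three behaviours can be checked directly. Note that the identity checks below rely on CPython implementation details, not language guarantees:

```python
s = "hello world"

# Mode 1: a complete slice returns the very same object
assert s[:] is s

# Mode 2: empty slices all share CPython's interned empty string
empty = ''
assert s[3:3] is empty and s[5:5] is s[3:3]

# Mode 3: any other slice is a brand-new copy
t = s[1:]
assert t == "ello world" and t is not s
```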
The reason why #3 is the general rule is to avoid issues with large str being kept in memory by a view of a small portion of it. If you had a 1GB file, read it in and sliced it like so (yes, it's wasteful when you can seek, this is for illustration):
with open(myfile) as f:
    data = f.read()[-1024:]
then you'd have 1 GB of data being held in memory to support a view that shows the final 1 KB, a serious waste. Since slices are usually smallish, it's almost always faster to copy on slice instead of create views. It also means str can be simpler; it needs to know its size, but it need not track an offset into the data as well.
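You can see that a slice gets its own independent storage with a quick check (a sketch; the exact byte counts are CPython implementation details):

```python
import sys

big = "x" * 10**6
small = big[-1024:]          # copies 1 KB out of the 1 MB string

# The slice is a separate, much smaller object; dropping `big`
# would free the full megabyte even while `small` stays alive.
assert small == "x" * 1024
assert sys.getsizeof(small) < sys.getsizeof(big)
```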
How you can do zero copy slicing
There are ways to perform view based slicing in Python, and in Python 2, it will work on str (because str is bytes-like in Python 2, supporting the buffer protocol). With Py2 str and Py3 bytes (as well as many other data types like bytearray, array.array, numpy arrays, mmap.mmaps, etc.), you can create a memoryview that is a zero copy view of the original object, and can be sliced without copying data. So if you can use (or encode) to Py2 str/Py3 bytes, and your function can work with arbitrary bytes-like objects, then you could do:
def do_something_on_all_suffixes(big_string):
    # In Py3, may need to encode as latin-1 or the like
    remaining_suffix = memoryview(big_string)
    # Rather than an explicit loop index, just replace the view with a
    # one-shorter view on each iteration
    while remaining_suffix:  # Stop when we've sliced down to an empty view
        some_constant_time_operation(remaining_suffix)
        remaining_suffix = remaining_suffix[1:]
The slices of memoryviews do make new view objects (they're just ultra-lightweight with fixed size unrelated to the amount of data they view), just not any data, so some_constant_time_operation can store a copy if needed and it won't be changed when we slice it down later. Should you need a proper copy as Py2 str/Py3 bytes, you can call .tobytes() to get the raw bytes obj, or (in Py3 only it appears), decode it directly to a str that copies from the buffer, e.g. str(remaining_suffix[10:20], 'latin-1').
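A small demonstration of the zero-copy behaviour described above (the `.obj` attribute used here to verify sharing is part of the standard memoryview API):

```python
data = b"abcdefghij" * 100
mv = memoryview(data)

sub = mv[10:20]                # a new, tiny view object; no bytes copied
assert sub.obj is data         # the view still references the original buffer
assert sub.tobytes() == b"abcdefghij"        # explicit copy out as bytes
assert str(sub, 'latin-1') == "abcdefghij"   # decoding also copies out of the buffer
```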
Recently I was thinking about interview questions I got as an undergrad:
Things like "reverse a string" and "check if a string is a palindrome".
I did most of these in C++ with a loop and scrolling through the index using logic.
When I learned Python, I realized that I could "reverse a string" by simply going:
return mystring[::-1]
Likewise with "check if it is a palindrome" by doing:
return mystring == mystring[::-1]
The problem now is that I don't know what complexity that has.
From my point of view it looks constant, so O(1), but I'm guessing that's too good to be true; the string slicing must be doing something behind the scenes.
Can anyone help me clarify?
The Python wiki page on time complexity shows that slicing lists is O(k), where k is the length of the slice. That's for lists, not strings, but the complexity can't be O(1) for strings either, since slicing must handle more characters as the size increases. At a guess, slicing strings is also O(k). We can write a little code to test that guess:
import time

StartSize = 2097152
size = StartSize
for _ in range(10):
    # create string of size "size"
    s = '*' * size
    # now time the reverse slice
    start = time.time()
    r = s[::-1]
    delta = time.time() - start
    print(f'Size {size:9d}, time={delta:.3f}')
    # double the size of the string
    size *= 2
This uses a crude timing method; better tools exist, but it suffices here. When run I get:
$ python3 test.py
Size 2097152, time=0.006
Size 4194304, time=0.013
Size 8388608, time=0.024
Size 16777216, time=0.050
Size 33554432, time=0.098
Size 67108864, time=0.190
Size 134217728, time=0.401
Size 268435456, time=0.808
Size 536870912, time=1.610
Size 1073741824, time=3.192
which shows the time doubles when doubling the size of the string for each reverse slice. So O(n) (k == n for whole-string slicing).
How difficult an algorithm is to write and how expensive it is to run are two separate things. Creating a reversed string with the slice shorthand still takes O(n) space and O(n) time. Keep in mind that, in most cases, creating a reversed copy isn't necessary: you can just start at the end and walk down, which is essentially what Python's reversed() function does.
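For instance, the palindrome check can be written without building the reversed copy at all, still O(n) time but O(1) extra space (a sketch, not from the original answer):

```python
def is_palindrome(s):
    # Compare characters pairwise from both ends; no reversed string is built
    return all(s[i] == s[-1 - i] for i in range(len(s) // 2))

assert is_palindrome("racecar")
assert is_palindrome("abba")
assert not is_palindrome("python")
```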
It all depends on how big your slices are. I threw together the following two benchmarks. The first slices the entire string and the second only a little bit. Curve fitting with this tool gives
# s[1:-1]
y = 0.09 x^2 + 10.66 x - 3.25
# s[1:1000]
y = -0.15 x + 17.13706461
The first looks quite linear for strings up to 4 MB; I guess it really measures the time taken to construct the second string. The second is pretty constant, although it's so fast that the measurement probably isn't very stable.
import time

def go(n):
    start = time.time()
    s = "abcd" * n
    for _ in range(50000):
        # benchmark one
        a = s[1:-1]
        # benchmark two (enable one benchmark per run)
        # a = s[1:1000]
    end = time.time()
    return (end - start) * 1000

for n in range(1000, 100000, 5000):
    print(n / 1000.0, go(n))
The time complexity is O(N) on average, O(NM) worst case (N being the length of the string being searched, M the length of the substring you search for). As of Python 3.10, heuristics are used to lower the worst-case scenario to O(N + M) by switching algorithms.
The same algorithm is used for str.index(), str.find(), str.__contains__() (the in operator) and str.replace(); it is a simplification of the Boyer-Moore with ideas taken from the Boyer–Moore–Horspool and Sunday algorithms.
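Because all of those entry points share the same search core, they agree on their results:

```python
s = "abracadabra"

assert "cad" in s                  # str.__contains__
assert s.find("cad") == 4          # returns -1 when absent
assert s.index("cad") == 4         # raises ValueError when absent
assert s.find("xyz") == -1
assert s.replace("cad", "CAD") == "abraCADabra"
```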
See the original stringlib discussion post, as well as the fastsearch.h source code; until Python 3.10, the base algorithm has not changed since introduction in Python 2.5 (apart from some low-level optimisations and corner-case fixes).
The post includes a Python-code outline of the algorithm:
def find(s, p):
    # find first occurrence of p in s
    n = len(s)
    m = len(p)
    skip = delta1(p)[p[m-1]]
    i = 0
    while i <= n-m:
        if s[i+m-1] == p[m-1]:  # (boyer-moore)
            # potential match
            if s[i:i+m-1] == p[:m-1]:
                return i
            if s[i+m] not in p:
                i = i + m + 1  # (sunday)
            else:
                i = i + skip  # (horspool)
        else:
            # skip
            if s[i+m] not in p:
                i = i + m + 1  # (sunday)
            else:
                i = i + 1
    return -1  # not found
as well as speed comparisons.
In Python 3.10, the algorithm was updated to use an enhanced version of Crochemore and Perrin's Two-Way string-searching algorithm for larger problems (with p and s longer than 100 and 2100 characters, respectively, and s at least 6 times as long as p), in response to a pathological edge case someone reported. The commit adding this change included a write-up on how the algorithm works.
The Two-Way algorithm has a worst-case time complexity of O(N + M), where O(M) is a cost paid up front to build a shift table from the search needle p. Once you have that table, the algorithm also has a best-case performance of O(N/M).
In Python 3.4.2, it looks like they resort to the same function, but there may nevertheless be a difference in timing. For example, s.find first has to look up the find method on the string, and so on.
The algorithm used is a mix between Boyer-Moore and Horspool.
Your algorithm (in terms of number of comparisons) is O(n), where n is the length of the string. In the worst case the string and the pattern are identical, and then for every character in subStr you move to the next character of string; it is equivalent to a simple comparison of two strings.
However, your implementation may be O(n^2) in terms of other operations, and the reason for this, as you mentioned in your question, is the following line:
fn = fn[index+1 ::]
This is effectively copying the string (assuming the slice above is implemented as a copy). Considering the previous example again, for every character in the string you'd have to copy all remaining characters, which is O(n^2): you copy n-1 characters first, then n-2, n-3, and so on, and at the last iteration you copy just one character. The total number of items copied is then n-1+n-2+...+1, which, as an arithmetic progression, equals (n-1)*((n-1)+1)/2 = (n-1)*n/2 = O(n^2). For other inputs this generalises to O(m*n), where m is the length of the pattern.
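The arithmetic-series count can be verified numerically (a quick check, not part of the original answer):

```python
n = 1000

# Characters copied per iteration when repeatedly slicing off the front:
# n-1, then n-2, ..., down to 1
copied = sum(n - k for k in range(1, n))

# Matches the closed form (n-1)*n/2, which grows as O(n^2)
assert copied == (n - 1) * n // 2
assert copied == 499500
```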
What your friend probably meant was: your algorithm is linear, but your implementation is not. It can easily be fixed, though. Use the solution presented by @thkang, or something more transparent that gets rid of the hidden complexity, for example:
try:
    si = iter(string)
    for c in subStr:
        while c != next(si):
            pass
except StopIteration:
    print("no match")
else:
    print("match")
Sorry to say, but this is neither O(n) nor O(n+m). Both of those are better than O(n^2), and this algorithm is O(n^2). Why?
- You iterate over the source string, which is O(n)
- In each iteration you call index on the string, which is also O(n)
So the worst-case performance is bounded by O(n^2); this is in fact an implementation of the naive search algorithm. If you want O(n)/O(mn) behaviour, I advise checking out the Boyer-Moore algorithm.
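For reference, the naive algorithm the answer refers to looks like this (a sketch with hypothetical names, showing why it is O(n*m)):

```python
def naive_find(s, pat):
    # Try every alignment; each check compares up to len(pat) characters,
    # so the worst case is O(len(s) * len(pat))
    for i in range(len(s) - len(pat) + 1):
        if s[i:i + len(pat)] == pat:
            return i
    return -1

assert naive_find("abracadabra", "abra") == 0
assert naive_find("abracadabra", "dab") == 6
assert naive_find("abracadabra", "xyz") == -1
```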