There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:
import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]
If you want to find overlapping matches, lookahead will do that:
[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]
If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:
search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]
re.finditer returns a generator, so you could change the [] in the above to () to get a generator instead of a list which will be more efficient if you're only iterating through the results once.
Videos
Hi! This is entirely based on a pattern in LeetCode problems, but if I need to find ALL substrings/subsets/subarrays of a string/array, is there a way I can do it in less than O(n^2) time complexity? Thank you!
There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:
import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]
If you want to find overlapping matches, lookahead will do that:
[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]
If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:
search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]
re.finditer returns a generator, so you could change the [] in the above to () to get a generator instead of a list which will be more efficient if you're only iterating through the results once.
>>> help(str.find)
Help on method_descriptor:
find(...)
S.find(sub [,start [,end]]) -> int
Thus, we can build it ourselves:
def find_all(a_str, sub):
start = 0
while True:
start = a_str.find(sub, start)
if start == -1: return
yield start
start += len(sub) # use start += 1 to find overlapping matches
list(find_all('spam spam spam spam', 'spam')) # [0, 5, 10, 15]
No temporary strings or regexes required.
I'd like to find all occurrences of a substring while ignore some characters. How can I do it in Python?
Example:
long_string = 'this is a t`es"t. Does the test work?' small_string = "test" chars_to_ignore = ['"', '`'] print(find_occurrences(long_string, small_string))
should return [(10, 16), (27, 31)] because we want to ignore the presence of chars ` and ".
(10, 16)is the start and end index of t`es"t inlong_string,(27, 31)is the start and end index oftestinlong_string.