Brave Search

How to split a string on regex in Python [duplicate]

stackoverflow.com › questions › 33536583 › how-to-split-a-string-on-regex-in-python

Use re.split to split a string on a regex pattern.

import re
tokens = re.split(r'[.:]', ip)

Inside a character class, | matches a literal | symbol, so it does not act as alternation there. [.:] already matches either a dot or a colon.

So remove | from the character class; otherwise the string would also be split on the pipe character.

Alternatively, use str.split with a list comprehension.

>>> ip = '192.168.0.1:8080'
>>> [j for i in ip.split(':') for j in i.split('.')]
['192', '168', '0', '1', '8080']

Answer from Avinash Raj on Stack Overflow
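
The character-class point above can be checked with a short sketch. The `sample` string is a made-up input, not part of the original question; it only shows what happens when | is left inside the brackets.

```python
import re

ip = "192.168.0.1:8080"

# Inside [...] the dot and colon are literal characters, so this splits on either.
tokens = re.split(r"[.:]", ip)
print(tokens)  # ['192', '168', '0', '1', '8080']

# Hypothetical input containing a pipe, to show the effect of | in the class.
sample = "a|b.c:d"
without_pipe = re.split(r"[.:]", sample)
with_pipe = re.split(r"[.|:]", sample)
print(without_pipe)  # ['a|b', 'c', 'd']
print(with_pipe)     # ['a', 'b', 'c', 'd']
```

With | inside the class the pipe becomes a third delimiter, which is rarely what was intended.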

Python documentation

docs.python.org › 3 › library › re.html

re — Regular expression operations

4 days ago - split() splits a string into a list delimited by the passed pattern. The method is invaluable for converting textual data into data structures that can be easily read and modified by Python as demonstrated in the following example that creates ...

GeeksforGeeks

geeksforgeeks.org › python › re-split-in-python

re.split() in Python - GeeksforGeeks

July 23, 2025 - The re.split() method in Python is used to split a string by a pattern (using regular expressions). It is part of the re module, which provides support for working with regular expressions.

Discussions

How to split a string on regex in Python - Stack Overflow

How do you incorporate a regular expression into the Python string.split method? More on stackoverflow.com

stackoverflow.com

regex - Python re.split() vs split() - Stack Overflow

In my quest for optimization, I discovered that the built-in split() method is about 40% faster than the re.split() equivalent. A dummy benchmark (easily copy-pasteable): import re, time, random... More on stackoverflow.com

stackoverflow.com

Split string based on RegEx in Python?

Post the input string you are regexing and what format it follows so we can help you come up with a regex. More on reddit.com

r/learnpython

January 2, 2023

python - Difference between re.split(" ", string) and re. ...

I'm currently studying regular expressions and have come across an inquiry. So the title of the question is what I'm trying to find out. I thought since \s represents a white space, re.split("... More on stackoverflow.com

stackoverflow.com

Videos

03:54

YouTube

Python standard library: Splitting strings with re.split - YouTube

September 19, 2019

03:39

YouTube

#77 How to Split String in Python Using Regex | Regular Expression ...

July 16, 2022

02:14

YouTube

Split Strings with Regex in Python - A Quick re.split() Guide! ...

September 7, 2024

02:03

YouTube

Python Regex: How To Split a String On Multiple Characters - YouTube

October 1, 2020

Codecademy

codecademy.com › docs › python › regular expressions › re.split()

Python | Regular Expressions | re.split() | Codecademy

July 2, 2023 - The .split() method of the re module divides a string into substrings at each occurrence of the specified character(s). This method is a good alternative to the default .split() string method for instances that require matching multiple characters.

PYnative

pynative.com › home › python › regex › python regex split string using re.split()

Python Regex Split String using re.split() – PYnative

July 27, 2021 - The Python regex split() method splits the string by the occurrences of the regex pattern and returns a list of the resulting substrings.

Stack Overflow

stackoverflow.com › questions › 7501609 › python-re-split-vs-split

regex - Python re.split() vs split() - Stack Overflow

Top answer

1 of 3

re.split is expected to be slower, as the usage of regular expressions incurs some overhead.

Of course if you are splitting on a constant string, there is no point in using re.split().

2 of 3

When in doubt, check the source code. You can see that Python s.split() is optimized for whitespace and inlined. But s.split() is for fixed delimiters only.

For the speed tradeoff, a re.split regular expression based split is far more flexible.

>>> re.split(':+',"One:two::t h r e e:::fourth field")
['One', 'two', 't h r e e', 'fourth field']
>>> "One:two::t h r e e:::fourth field".split(':')
['One', 'two', '', 't h r e e', '', '', 'fourth field']
# would require an additional step to find the empty fields...
>>> re.split(r'[:\d]+',"One:two:2:t h r e e:3::fourth field")
['One', 'two', 't h r e e', 'fourth field']
# try that without a regex split in an understandable way...

That re.split() is only 29% slower (or that s.split() is only 40% faster) is what should be amazing.
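
For readers who want to measure the gap themselves, here is a minimal timing sketch using the standard timeit module. The sample line and iteration count are arbitrary assumptions, and the ratio will differ by machine and Python version, so no figures are claimed.

```python
import re
import timeit

line = "One:two:three:four"  # hypothetical sample data
colon = re.compile(":")      # precompiled so pattern lookup is not part of the timing


def with_str_split():
    return line.split(":")


def with_re_split():
    return colon.split(line)


# For a constant delimiter both approaches return the same pieces.
same_result = with_str_split() == with_re_split()
print(same_result)

# Time each approach over the same number of calls.
t_str = timeit.timeit(with_str_split, number=100_000)
t_re = timeit.timeit(with_re_split, number=100_000)
print(f"str.split: {t_str:.4f}s  re.split: {t_re:.4f}s")
```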

W3Schools

w3schools.com › python › python_regex.asp

Python RegEx

W3Schools offers free online tutorials, references and exercises in all the major languages of the web. Covering popular subjects like HTML, CSS, JavaScript, Python, SQL, Java, and many, many more.

reddit.com › r/learnpython › split string based on regex in python?

r/learnpython on Reddit: Split string based on RegEx in Python?

January 2, 2023 -

I have a fairly big regex matching various types of common mail headers which are used between replies. I'm trying to use re.split to separate each reply as follows:

r = re.compile(r'((?:^ *Original Message processed by david.+?$\\n{,7})(?:.*\\n){,3}(?:(?:^|\\n)[* ]*(?:Von|An|Cc)(?:\\s{,2}).*){2,})|^(?!Am.*Am\\s.+?schrieb.*:)(Am\\s(?:.+?\\s?)schrieb\\s(?:.+?\\s?.+?):)$|((?:(?:^|\\n)[* ]*(?:From|Sent|To|Subject|Date|Cc):[ *]*(?:\\s{,2}).*){2,}(?:\\n.*){,1})|^(?!On[.\\s]*On\\s(.+?\\s?.+?)\\swrote:)(On\\s(?:.+?\\s?.+?)\\swrote:)$|(?:(?:^|\\n)[* ]*(Von|Gesendet|An|Betreff|Datum):[ *]*(?:\\s{,2}).*){2,}|(^(> *))', flags=re.MULTILINE)
r.split(text)

However I'm getting back a lot of None and mix between matches and the reply body content. Not so sure why – any idea? How I would imagine re.split to work:

[
    'Latest reply.', 
    'Am So., 1. Jan. 2023 um 17:22 Uhr schrieb John Doe <\nnoreply@github.com>:\n\n\nSecond reply\n', 
    ...
]

Sample data and regex: https://regex101.com/r/cC1FUo/1

Top answer

1 of 5

Post the input string you are regexing and what format it follows so we can help you come up with a regex.

2 of 5

None means a capturing group that had no content.

>>> re.split(r"(\n)|(\s)", 'line\n\nanotherline', flags=re.MULTILINE)
['line', '\n', None, '', '\n', None, 'anotherline']
>>> re.split(r"\n|\s", 'line\n\nanotherline', flags=re.MULTILINE)
['line', '', 'anotherline']

You could make all those groups non-capturing, but it seems much easier just to filter out the Nones after the fact.

result = list(filter(None, result))
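
The two options mentioned in that answer can be compared side by side. This is only a sketch on the same toy input. One detail to keep in mind is that filter(None, ...) drops empty strings as well as None, which may or may not be wanted.

```python
import re

text = "line\n\nanotherline"

# Capturing groups: every match adds one entry per group, None for the unused one.
captured = re.split(r"(\n)|(\s)", text)
print(captured)  # ['line', '\n', None, '', '\n', None, 'anotherline']

# Non-capturing groups: separators are not returned, so no None appears.
non_capturing = re.split(r"(?:\n)|(?:\s)", text)
print(non_capturing)  # ['line', '', 'anotherline']

# filter(None, ...) removes None and empty strings alike.
filtered = list(filter(None, captured))
print(filtered)  # ['line', '\n', '\n', 'anotherline']

# To drop only None and keep empty strings, test for None explicitly.
only_none_removed = [piece for piece in captured if piece is not None]
print(only_none_removed)  # ['line', '\n', '', '\n', 'anotherline']
```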

Stack Overflow

stackoverflow.com › questions › 65438868 › difference-between-re-split-string-and-re-split-s-string

python - Difference between re.split(" ", string) and re. ...

Top answer

1 of 3

These only look similar because of your example.

A split on ' ' (a single space) does exactly that - it splits on a single space. Consecutive spaces will lead to empty "matches" when you split.

A split on '\s+' also treats a run of several whitespace characters as one separator, and it matches other whitespace characters than "pure spaces":

import re

a = re.split(" ", "Why    is this  \t \t  wrong")
b = re.split(r"\s+", "Why    is this  \t \t  wrong")

print(a)
print(b)

Output:

# re.split(" ",data)
['Why', '', '', '', 'is', 'this', '', '\t', '\t', '', 'wrong']

# re.split(r"\s+",data)
['Why', 'is', 'this', 'wrong']

Documentation:

\s
Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v]. (https://docs.python.org/3/howto/regex.html#matching-characters)

2 of 3

It is about whitespace characters. '\s' matches any whitespace character (space, \t, \n, \r, \f, \v), and '+' means one or more of them in a row, for example " \n \r \t \v". In my opinion, if plain string operations are enough for the separation, you should use standard methods like my_string.split(). Otherwise you should use regex. The regex engine has a cost, and the developer should be able to anticipate it.
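
One difference the answers above do not spell out is how leading and trailing whitespace is handled. The sketch below uses a made-up padded string to compare str.split() with no argument against re.split(r"\s+", ...).

```python
import re

s = "  Why    is this  \t \t  wrong "  # hypothetical input with padding on both ends

# str.split() with no argument splits on whitespace runs and ignores the ends.
plain = s.split()
print(plain)  # ['Why', 'is', 'this', 'wrong']

# re.split keeps an empty string where the input starts or ends with a match.
with_regex = re.split(r"\s+", s)
print(with_regex)  # ['', 'Why', 'is', 'this', 'wrong', '']

# Stripping first makes the two agree.
stripped = re.split(r"\s+", s.strip())
print(stripped)  # ['Why', 'is', 'this', 'wrong']
```

When only whitespace splitting is needed, the argument-free str.split() is usually the simpler choice.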

GeeksforGeeks

geeksforgeeks.org › python › python-regex-split

Python - Regex split() - GeeksforGeeks

April 18, 2025 - re.split() method in Python is generally used to split a string by a specified pattern. It works similarly to the standard split() function but adds more functionality.

Note.nkmk.me

note.nkmk.me › home › python

Split a String in Python (Delimiter, Line Breaks, Regex) | note.nkmk.me

May 4, 2025 - This article explains how to split strings in Python using delimiters, line breaks, regular expressions, or a number of characters. Split a string by delimiter: split(). Specify the delimiter: sep. Limit ...

Interactive Chaos

interactivechaos.com › en › python › function › resplit

re.split | Interactive Chaos

May 18, 2021 - The re.split function splits the text string, considering as separator the occurrences of the regular expression pattern.

FavTutor

favtutor.com › blogs › python-split-regex

Python Split Regex: How to use re.split() function?

February 28, 2023 - Python's re module includes a split function for separating a text-based string based on a pattern.

Educative

educative.io › answers › how-to-use-resplit-in-python

How to use re.split() in Python

Python has a module named re that provides methods for working with regular expressions. The re.split() method in Python splits a text based on a regular expression pattern.

Python documentation

docs.python.org › 3 › howto › regex.html

Regular Expression HOWTO — Python 3.14.3 documentation

Regular expressions are also commonly used to modify strings in various ways, using the following pattern methods: The split() method of a pattern splits a string apart wherever the RE matches, returning a list of the pieces.
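
As a small illustration of the pattern-object form described in that snippet, the sketch below compiles a pattern once and calls its split() method. The pattern and the sentence are illustrative choices, not taken from the search result.

```python
import re

# Compile once, then reuse the pattern object's split() method.
non_word = re.compile(r"\W+")

pieces = non_word.split("This is a test, short and sweet.")
print(pieces)  # ['This', 'is', 'a', 'test', 'short', 'and', 'sweet', '']

# The method also accepts maxsplit to stop after a given number of splits.
limited = non_word.split("This is a test", maxsplit=2)
print(limited)  # ['This', 'is', 'a test']
```

The trailing empty string in the first result comes from the final period, which matches the pattern at the end of the input.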

Imperial College London

python.pages.doc.ic.ac.uk › lessons › regex › 07-groups › 06-split.html

Advanced Lesson 1: Regular Expressions > re.split()

You can also limit the maximum number of splits. Let’s say we only want to split the string at a maximum of 3 points.

>>> re.split(pattern, string, maxsplit=3)
['doe', 'a', 'deer', 'a female deer.']
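
The snippet does not show how `pattern` and `string` were defined in the lesson, so the values below are assumptions chosen to reproduce the output shown. maxsplit is passed by keyword because passing it positionally is deprecated as of Python 3.13.

```python
import re

# Hypothetical values; the lesson's actual definitions are not in the snippet.
string = "doe, a deer, a female deer."
pattern = r"[,\s]+"  # one or more commas and/or whitespace characters

unlimited = re.split(pattern, string)
print(unlimited)  # ['doe', 'a', 'deer', 'a', 'female', 'deer.']

# Stop after three splits; the remainder is returned as the last item.
limited = re.split(pattern, string, maxsplit=3)
print(limited)  # ['doe', 'a', 'deer', 'a female deer.']
```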

reddit.com › r/learnpython › split string preserving separator

r/learnpython on Reddit: Split string preserving separator

January 2, 2024 -

I want to split a string p (a paragraph) into sentences. If I do p.split(".") I get the sentences of p without the final dot. I want the final dot too. Is there any other solution besides these two?

use a regular expression instead
just re-add the dot to every single sentence

Thanks

Top answer

1 of 4

sentences = [sentence + "." for sentence in p.split(".") if sentence]

Not beautiful, but it works. The if clause skips the empty piece that split() leaves after a trailing dot. A regex findall would be prettier. If you want to make this better, for example to correctly handle sentences that don't end in a full stop and to handle ellipses correctly, you could look at the nltk module for natural language processing.

2 of 4

If the sentences always end with a dot followed by a space, then I'd replace the space in ". " (dot space) with a string that I can be certain does not occur in the text (such as "¬¬"). The string can then be split the normal Python way. As a one-liner:

sentences = p.replace(". ", ".¬¬").split("¬¬")

A more robust solution:

def split_paragraph(text: str) -> list[str]:
    # magic_string must not be present in the text.
    magic_string = "¬¬"
    if magic_string in text:
        raise ValueError(f"Invalid substring {magic_string}.")
    return text.replace(". ", f".{magic_string}").split(magic_string)

If the sentences may end with other characters, such as "?" or "!", or ".\n", then I'd use regex:

sentences = re.split(r'(?<=[.!?])\s+', p)

?<=     Special sequence for "look behind"
[.!?]   Match any of "." or "!" or "?"
\s+     followed by one or more whitespace characters.

Be aware that there may be edge cases, for example "Franklin D. Roosevelt", "19th c. industrial development", "C.C.C.P. was a German synthpop group". To handle these kinds of edge cases correctly you would probably need to use a specialist natural language processing library.
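
The regex approaches mentioned in this thread can be compared on a toy paragraph. The text below is a made-up example, and none of these variants handles abbreviations such as "D." correctly.

```python
import re

p = "First one. Second one! Third one? Last."  # hypothetical paragraph

# Lookbehind: split on whitespace that follows ., ! or ?; punctuation stays attached.
by_lookbehind = re.split(r"(?<=[.!?])\s+", p)
print(by_lookbehind)  # ['First one.', 'Second one!', 'Third one?', 'Last.']

# Capturing group: the separator itself comes back as its own list item.
with_separators = re.split(r"([.!?])", "One. Two!")
print(with_separators)  # ['One', '.', ' Two', '!', '']

# findall: match each sentence together with its terminator, then trim spaces.
by_findall = [s.strip() for s in re.findall(r"[^.!?]+[.!?]", p)]
print(by_findall)  # ['First one.', 'Second one!', 'Third one?', 'Last.']
```

The lookbehind form tends to be the most direct when the separator should stay attached to the preceding sentence.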

W3Schools

w3schools.com › python › ref_string_split.asp

Python String split() Method

The split() method splits a string into a list....

Educative

educative.io › answers › what-is-the-resplit-function-in-python

What is the re.split() function in Python?

The first parameter, pattern, denotes the string/pattern where the split will take place. The second parameter, string, denotes the string on which the re.split() operation will take place.

Tutorialspoint

tutorialspoint.com › python › python_re_split_method.htm

Python re split() method

The Python re.split() method is used to split a string by occurrences of a specified regular expression pattern. It returns a list of sub-strings.