The first argument to re.sub is treated as a regular expression, so the square brackets get a special meaning and don't match literally.
You don't need a regular expression for this replacement at all though (and you also don't need the loop counter i):
for name_line in name_lines:
text = text.replace(name_line, '')
Answer from Thomas on Stack OverflowRegular Expressions re.sub
Replacement argument in re.sub()
Regex re.sub("^\s*(.-)\s*$", "%1", s).replace("\\n", "\n") explained
Module re: add possibility to return matches during substitution
Videos
The first argument to re.sub is treated as a regular expression, so the square brackets get a special meaning and don't match literally.
You don't need a regular expression for this replacement at all though (and you also don't need the loop counter i):
for name_line in name_lines:
text = text.replace(name_line, '')
Try to use re.sub to replace the match:
import re
text = """\
Questions and Answers
Operator [1]
Shannon Siemsen Cross, Cross Research LLC - Co-Founder, Principal & Analyst [2]
I hope everyone is well. Tim, you talked about seeing some improvement in the second half of April. So I was wondering if you could just talk maybe a bit more on the segment and geographic basis what you're seeing in the various regions that you're selling in and what you're hearing from your customers. And then I have a follow-up.
Timothy D. Cook, Apple Inc. - CEO & Director [3]"""
text = re.sub(r".*\d]", "", text)
print(text)
Prints:
Questions and Answers
I hope everyone is well. Tim, you talked about seeing some improvement in the second half of April. So I was wondering if you could just talk maybe a bit more on the segment and geographic basis what you're seeing in the various regions that you're selling in and what you're hearing from your customers. And then I have a follow-up.
Hi. I am currently building my very first python project. I need to get rid of all the Characts inside and including the <>. In the following example, the text "like the side of a house" should be left over.
<span class="idiom_proverb">like the side of a <strong class="tilde">house
</strong> </span>
I want to use re.sub with regular expressions on this one and my current code looks like this:
new_text = re.sub(r"(?:\<.*\>)*", "", old_text)
The problem is, when using re.sub this way, I am essentially deleting everything between the first < and the very last >, loosing all the information in between. Is there a way to make sure that my regular expressions refer to the very next > once a < is found?
This way it would only replace <spand class="idiom\_proverb">, <strong class="tilde">, </strong> and </span>?
I would really appreciate any help. Thank you!
I am working on a code I found on Github and I am not familiar with regex. Can someone please explain the meaning of this re.sub("^\s*(.-)\s*$", "%1", s).replace("\\n", "\n")
s is the variable containing the string of interest.
You can use re.sub:
>>> import re
>>> text = "<a> text </a> <c> code </c>"
>>> new_text = re.sub(r'<c>.*?</c>', '', text)
>>> new_text
<a> text </a>
import re
text = "<a> text </a> <c> code </c>"
rg = r"<c>.*<\/c>"
for match in re.findall(rg, text):
text = text.replace(match, "")