You can use htmlmin to minify your HTML:
import htmlmin
html = """
<!DOCTYPE html>
<html lang="en">
<head>
<title>Bootstrap Case</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.1.1/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
</head>
<body>
<div class="container">
<h2>Well</h2>
<div class="well">Basic Well</div>
</div>
</body>
</html>
"""
# On Python 3, html is already a str, so no .decode("utf-8") call is needed
minified = htmlmin.minify(html, remove_empty_space=True)
print(minified)
Answer from user7161360 on Stack Overflow
htmlmin and html_slimmer are two simple HTML minifying tools for Python. I have millions of HTML pages stored in my database, and running htmlmin reduces the page size by between 5% and 50%. Neither of them does an optimal job of complete HTML minification (e.g. the font color #000000 could be reduced to #000), but it's a good start. I use a try/except block that runs htmlmin first and falls back to html_slimmer if that fails: htmlmin seems to provide better compression, but it does not support non-ASCII characters.
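Example Code:
import htmlmin
from slimmer import html_slimmer  # or xhtml_slimmer, css_slimmer

try:
    # htmlmin usually compresses better, so try it first
    html = htmlmin.minify(html, remove_comments=True, remove_empty_space=True)
except Exception:
    # fall back to html_slimmer, e.g. when htmlmin fails on non-ASCII input
    html = html_slimmer(html.strip().replace('\n', ' ').replace('\t', ' ').replace('\r', ' '))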
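The hex color shortening that neither tool performs can be sketched separately with the standard library. This is only a rough illustration, not part of htmlmin or slimmer: the name `shorten_hex_colors` is hypothetical, and the function assumes it is applied to CSS text (a `style` block or attribute), since running it over a whole page could also rewrite things like `href="#aabbcc"` fragments.

```python
import re

# Matches #rrggbb where each channel is a doubled hex digit, e.g. #ffcc00.
# The negative lookahead leaves longer hex runs (such as 8-digit #rrggbbaa) alone.
_SHORTENABLE_HEX = re.compile(
    r"#([0-9a-fA-F])\1([0-9a-fA-F])\2([0-9a-fA-F])\3(?![0-9a-fA-F])"
)


def shorten_hex_colors(css_text: str) -> str:
    """Collapse six-digit hex colors with doubled digits to three digits."""
    return _SHORTENABLE_HEX.sub(r"#\1\2\3", css_text)


print(shorten_hex_colors("color: #000000; background: #FFCC00; border-color: #123456"))
# prints: color: #000; background: #FC0; border-color: #123456
```

Colors such as #123456 are left alone because their digits are not doubled, so the shortening never changes the rendered color.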
Good Luck!
» pip install minify-html
» pip install htmlmin
» pip install django-minify-html
So I recently saw this post here where this guy got absolutely wrecked for posting his shitty code here, and not gonna lie, that was pretty funny.
So here is my turn!
Since the most popular Python lib for minifying HTML (htmlmin) is way too slow, I decided to make my own, and honestly I'm surprised how short it is. It can't be that simple, right? It is also about 10x faster than htmlmin. I challenge you to break this code so that it produces incorrect HTML:
from html.parser import HTMLParser

import rcssmin
import rjsmin


class MinifyHTMLParser(HTMLParser):
    """Rebuilds the document from parser events, trimming whitespace in text."""

    def __init__(self):
        # With the default convert_charrefs=True, text reaches handle_data already
        # unescaped, and handle_entityref/handle_charref are not called.
        super().__init__()
        self.minified_html = ""

    def handle_decl(self, decl: str) -> None:
        self.minified_html += f"<!{decl}>"

    def unknown_decl(self, data: str) -> None:
        self.minified_html += f"<!{data}>"

    def handle_pi(self, data):
        self.minified_html += f"<?{data}>"

    def handle_startendtag(self, tag, attrs):
        self.add_tag(tag, attrs, "/>")

    def handle_entityref(self, name):
        self.minified_html += f"&{name};"

    def handle_charref(self, name):
        # name is "62" for &#62; and "x3E" for &#x3E;, so no extra "x" is added
        self.minified_html += f"&#{name};"

    def handle_starttag(self, tag, attrs):
        self.add_tag(tag, attrs, ">")

    def add_tag(self, tag, attrs, end_tag):
        # Shared by handle_starttag and handle_startendtag
        self.minified_html += f"<{tag}"
        for attr in attrs:
            self.minified_html += f' {attr[0]}'
            if attr[1] is not None:
                self.minified_html += f'="{attr[1]}"'
        self.minified_html += end_tag

    def handle_endtag(self, tag):
        self.minified_html += f"</{tag}>"

    def handle_data(self, data):
        # lasttag is the most recently opened start tag.
        # The post called undefined minify_css/minify_js helpers; rcssmin.cssmin
        # and rjsmin.jsmin are assumed to be what was meant.
        if self.lasttag == "style":
            self.minified_html += rcssmin.cssmin(data).strip()
        elif self.lasttag == "script":
            self.minified_html += rjsmin.jsmin(data).strip()
        elif self.lasttag in ["textarea", "pre", "code"]:
            self.minified_html += data
        else:
            self.minified_html += data.strip()


# The def line was missing from the post; the name minify_html is assumed.
def minify_html(html_text: str) -> str:
    parser = MinifyHTMLParser()
    parser.feed(html_text)
    parser.close()  # flush any text the parser is still buffering
    return parser.minified_html
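One place to start on the challenge is character references. As far as I can tell from the standard library docs, `HTMLParser` defaults to `convert_charrefs=True`, and in that mode `handle_entityref` and `handle_charref` are never called: text arrives in `handle_data` already unescaped. The sketch below only logs parser events so the two modes can be compared. `EventLogger` and `events_for` are hypothetical names introduced for illustration, not part of the post.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Records which text-related handler fires (illustrative only)."""

    def __init__(self, convert_charrefs):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))


def events_for(markup, convert_charrefs):
    parser = EventLogger(convert_charrefs)
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return parser.events


sample = "<p>1 &lt; 2 &#62; 0</p>"
default_events = events_for(sample, convert_charrefs=True)
raw_events = events_for(sample, convert_charrefs=False)

print(default_events)  # only "data" events; the text is already unescaped
print(raw_events)      # separate "entityref" and "charref" events appear
```

If this reading is right, a minifier that writes `handle_data` text back out unchanged in the default mode would turn `&lt;` into a literal `<`. With `convert_charrefs=False`, the text is split around each reference, so stripping every chunk would drop the spaces next to it. Either case looks worth testing against the parser above.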