Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

The tidy-html5 fork has since become the official version. Here is its GitHub repository.

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:

tidy inputfile.html
Answer from jonjbar on Stack Overflow
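A minimal sketch of a fuller invocation than the bare command above, assuming `tidy` is on the PATH. The file names and sample markup are made up for illustration. The flags used are standard Tidy options (`-i` indents, `-q` suppresses non-essential output, `-w 0` disables line wrapping, `-o` names the output file, `--tidy-mark no` omits the generator meta tag), though behavior may differ between Tidy versions.

```shell
#!/bin/sh
# Hypothetical sample input with unclosed <p> tags for Tidy to repair.
cat > sample.html <<'EOF'
<html><head><title>Demo</title></head><body><h1>Title</h1><p>First<p>Second</body></html>
EOF

if command -v tidy >/dev/null 2>&1; then
    # Capture the exit status without aborting: Tidy returns 1 when it only
    # has warnings, which is the common case for real-world HTML.
    status=0
    tidy -i -q -w 0 --tidy-mark no -o sample.tidy.html sample.html || status=$?

    # Exit status: 0 = clean, 1 = warnings, 2 = errors.
    if [ "$status" -ge 2 ]; then
        echo "tidy reported errors in sample.html" >&2
    else
        cat sample.tidy.html
    fi
else
    echo "tidy is not installed; skipping" >&2
fi
```

Writing to a separate output file keeps the original intact; Tidy's `-m` flag modifies the input file in place instead.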
Reddit
reddit.com › r/commandline › looking for html beautifier
r/commandline on Reddit: looking for html beautifier
April 4, 2021

I've been getting into making some basic bash scripts that pull in html through curl and parsing it with sed and grep. Some html I come across needs to be beautified so I use https://beautifytools.com/html-beautifier.php and it works perfectly. I would love a utility for the command line that could replicate the results so I could just pipe into it. I've tried pup and tidy but the results have been lackluster. Appreciate suggestions for a cli html beautifier. Thanks

Edit: I was able to format the specific html I was working with with an additional sed command to my script. Still interested in any beautifiers though!

Html-validate
html-validate.org › usage › cli.html
HTML-validate - Using CLI
Custom formatters can be used by specifying a package name: --formatter my-custom-formatter.
GitHub
github.com › ngeor › html-fmt
GitHub - ngeor/html-fmt: HTML formatter (pretty printer) supporting custom tags.
HTML formatter (pretty printer) supporting custom tags. Available as a CLI app and a Visual Studio Code extension.
Starred by 4 users
Forked by 2 users
Languages   TypeScript 89.7% | JavaScript 10.3%
JetBrains
jetbrains.com › help › phpstorm › command-line-formatter.html
Format files from the command line | PhpStorm Documentation
phpstorm64.exe format -s C:\Data\settings.xml -m *.xml,*.html C:\Data\src · PhpStorm includes a script for running the command-line code formatter.
JetBrains
jetbrains.com › help › idea › command-line-formatter.html
Format files from the command line | IntelliJ IDEA Documentation
idea64.exe format -s C:\Data\settings.xml -m *.xml,*.html C:\Data\src · IntelliJ IDEA includes a script for running the command-line code formatter.
Linux Man Pages
linux.die.net › man › 1 › tidy
tidy(1) - Linux man page
Tidy reads HTML, XHTML and XML files and writes cleaned up markup. For HTML variants, it detects and corrects many common coding errors and strives to produce visually equivalent markup that is both W3C compliant and works on most browsers. A common use of Tidy is to convert plain HTML to XHTML.
JetBrains
jetbrains.com › help › webstorm › command-line-formatter.html
Format files from the command line | WebStorm Documentation
webstorm64.exe format -s C:\Data\settings.xml -m *.xml,*.html C:\Data\src · WebStorm includes a script for running the command-line code formatter.
R-lib
cli.r-lib.org › articles › semantic-cli.html
Building a Semantic CLI • cli
The formatting of each element is specified separately, in one or more cli themes. cli comes with a builtin theme, and if you are satisfied with that, then you never need to worry about formatting. A semantic cli is similar to how HTML and CSS work together to create a web site.
Stillat
stillat.com › antlers-toolbox › formatting-cli
Formatting CLI » Antlers Toolbox » Stillat
The format command accepts the ... The htmlOptions object may be used to set the HTML formatting options used by the Antlers formatter....
GitHub
github.com › kristoff-it › superhtml
GitHub - kristoff-it/superhtml: HTML Validator, Formatter, LSP, and Templating Language Library · GitHub
The tool can be used either directly (for example by running it on save), or through an LSP client implementation.

$ superhtml
Usage: superhtml COMMAND [OPTIONS]

Commands:
  check      Check documents for errors.
  fmt        Format documents.
  lsp        Start the Language Server.
  help       Show this menu and exit.
  version    Print the version and exit.

General Options:
  --help, -h       Print command specific usage.
  --syntax-only    Disable HTML element and attribute validation.
Starred by 1.2K users
Forked by 58 users
Languages   Zig 97.4% | C 1.2%
GitHub
github.com › sibprogrammer › xq
GitHub - sibprogrammer/xq: Command-line XML and HTML beautifier and content extractor
HTML content can be formatted and highlighted as well (using -m flag): xq -m test/data/html/formatted.html · Format multiple files at once: xq test/data/xml/unformatted.xml test/data/xml/unformatted2.xml · In place formatting is supported as well (using -i flag): xq -i test/data/xml/unformatted.xml ·
Starred by 1.1K users
Forked by 33 users
Languages   Go 92.6% | HTML 4.7% | Shell 2.2% | Dockerfile 0.5%
Top answer
1 of 4
2

Ok, for anyone else in need of this, I'm recording the suggestions made in this awesome thread (in case that link goes down, as per StackExchange guidelines):

  • HTB 2.0 - DOS based - http://www.digital-mines.com/htb/

  • Tabifier - supports CSS, HTML and C style syntax (including Javascript) - http://tools.arantius.com/tabifier

  • HTML-Kit - a full-featured free HTML editor running on Windows. You need to configure the Tidy options [Tools / Check code using Tidy / Add new config]: uncheck all switches except "Output only the body content" and "Convert non-breaking space to entities", then go to Actions / Tools / HTML Tidy / Indent Tags or beautify - http://www.chami.com/html-kit/

  • SCREEM - only for Linux -

  • NetBeans - "After opening an html file with NetBeans, click Source then select Format. That's it." -

  • WebmasterGate's HTML / XHTML Beautifier - Online tool - http://www.webmastergate.com/html-beautifier/

  • Aptana Studio (Version 2.0.4) - "Select Edit > Format or press Ctrl-Shift F to format the html code. The format function can be configured from Windows > Preferences, then select Aptana > Editors > HTML > Formatting, click Edit to add tags which should not take a new line then save it as a new preference." -

  • UniversalIndentGUI - Uses HTB Beautifier internally - While running Notepad++, go to Plugins > Plugin Manager > Show Plugin Manager, select UniversalIndentGUI from the available list to install it.

  • tidy with these options:


[HTML, XHTML, XML Options]
anchor-as-name:no
doctype:omit
drop-empty-paras:no
fix-backslash:no
fix-bad-comments:no
fix-uri:no
input-xml:yes
join-styles:no
lower-literals:no
preserve-entities:yes
quote-ampersand:no
quote-nbsp:no

[Diagnostics Options]
show-warnings:no

[Pretty Print Options]
indent:yes
indent-spaces:3
tab-size:3

[Miscellaneous Options]
quiet:yes

I have yet to try out these options (the input-xml: yes and force-output: yes config suggestions for HTML Tidy mentioned at https://stackoverflow.com/questions/7151180/use-html-tidy-to-just-indent-html-code work for my immediate purpose); I will update this answer if I do.
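As a rough sketch of how options like these could be applied, the snippet below writes a small subset of them to a config file and passes it to Tidy with `-config`. It assumes `tidy` is on the PATH; the file names `tidy.conf`, `fragment.html` and `fragment.out.html` are made up for illustration. The bracketed section headings from the listing are left out, on the assumption that they are labels rather than config syntax.

```shell
#!/bin/sh
# Subset of the options listed above, plus force-output from the linked
# Stack Overflow question. One "name: value" pair per line.
cat > tidy.conf <<'EOF'
input-xml: yes
force-output: yes
show-warnings: no
indent: yes
indent-spaces: 3
tab-size: 3
quiet: yes
EOF

# Hypothetical well-formed fragment; input-xml makes Tidy parse it as XML,
# so it only indents the markup and does not wrap it in <html>/<body>.
printf '%s\n' '<div><p>Hello</p><p>World</p></div>' > fragment.html

if command -v tidy >/dev/null 2>&1; then
    # "|| status=$?" keeps a warnings-only exit status from aborting the script.
    status=0
    tidy -config tidy.conf -o fragment.out.html fragment.html || status=$?
    cat fragment.out.html
else
    echo "tidy is not installed; skipping" >&2
fi
```

Because `input-xml: yes` skips Tidy's HTML repair logic, this approach suits markup that is already well-formed and only needs indenting.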

2 of 4
0

Run the file through HTML Tidy.

For example:

curl http://superuser.com | tidy -i | less

-i indents the output.

Shopware
developer.shopware.com › docs › products › cli › formatter.html
Formatter | Shopware Documentation
Shopware CLI includes a built-in code formatter for PHP, JavaScript, CSS, SCSS, and Admin Twig files. Use it to apply the Shopware Coding Standard automatically and keep your project consistent.
SysTutorials
systutorials.com › docs › linux › man › 1-tidy
tidy: check, correct, and pretty-print HTML(5) files - Linux Manuals (1)
Tidy reads HTML, XHTML, and XML files and writes cleaned-up markup. For HTML variants, it detects, reports, and corrects many common coding errors and strives to