Have a look at the HTML Tidy Project: http://www.html-tidy.org/
The granddaddy of HTML tools, with support for modern standards.
There used to be a fork called tidy-html5, which has since become the official version. Here is its GitHub repository.
Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.
For your needs, here is the command line to call Tidy:
tidy inputfile.html
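Called with just a file name, Tidy prints the cleaned markup to standard output together with its diagnostics. A minimal sketch of a wrapper that writes an indented copy to a separate file instead; the function name tidy_pretty and the file names are hypothetical, and the flags used (-q, -i, -o, --wrap, --tidy-mark) are assumed to be available in the installed Tidy 5.x build:

```shell
#!/bin/sh
# Hypothetical helper: pretty-print IN into OUT without touching IN.
# Usage: tidy_pretty IN.html OUT.html
tidy_pretty() {
  in=$1
  out=$2
  # -q              suppress non-essential output
  # -i              indent element content
  # --wrap 0        do not wrap long lines
  # --tidy-mark no  do not add Tidy's generator <meta> tag
  # -o FILE         write the result to FILE instead of stdout
  tidy -q -i --wrap 0 --tidy-mark no -o "$out" "$in"
  status=$?
  # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors;
  # treat warnings as success.
  [ "$status" -le 1 ]
}
```

For example, `tidy_pretty inputfile.html pretty.html` leaves inputfile.html as it is and returns a non-zero status only when Tidy reports errors.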
Answer from jonjbar on Stack Overflow
Update 2018: The homebrew/dupes tap is now deprecated; tidy-html5 can be installed directly.
brew install tidy-html5
Original reply:
The Tidy that ships with OS X doesn't support HTML5, but there is an experimental branch on GitHub that does.
To get it:
brew tap homebrew/dupes
brew install tidy --HEAD
brew untap homebrew/dupes
That's it! Have fun!
I've been getting into making some basic bash scripts that pull in html through curl and parsing it with sed and grep. Some html I come across needs to be beautified so I use https://beautifytools.com/html-beautifier.php and it works perfectly. I would love a utility for the command line that could replicate the results so I could just pipe into it. I've tried pup and tidy but the results have been lackluster. Appreciate suggestions for a cli html beautifier. Thanks
Edit: I was able to format the specific html I was working with with an additional sed command to my script. Still interested in any beautifiers though!
I want a tool that formats HTML files from the command line
– I recommend js-beautify, as suggested in this answer.
I prefer to install it globally:
npm install --global js-beautify
I typically want to format all *.html files in the current folder. 1
js-beautify *.html --type html --replace --indent-size 1 --max-preserve-newlines 0
The above fulfills all eight requirements of the question, including the three optional ones.
References
- The documentation of the NPM package js-beautify, version 1.14.9
- Answer using js-beautify
1 The command can be shortened to:
js-beautify --type html -r -s 1 -m 0 *.html
This formats the HTML to my liking. Check out the Options to find your favorite settings.
Try html tidy.
Tidy is a console application for macOS, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors.
It has many options and is available on many platforms, and online.
Installing it on Windows and integrating its output into an editor such as TextPad was kind of tricky, but it can be done. https://www.google.com/search?q=html+tidy
I am not sure about your requirement "indent 1 character", but I think this can be worked around by indenting with tabs and then globally substituting each tab with 1 blank.
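The tab workaround can be sketched as a small filter. This is only an illustration: the function name is hypothetical, and instead of a fully global substitution it rewrites only the tabs at the start of each line, so that tabs inside the content (for example in a pre block) survive. How the tab-indented input is produced is left open; with a recent Tidy, an option such as indent-with-tabs is assumed to do it.

```shell
#!/bin/sh
# Hypothetical filter: turn every leading tab into exactly one space.
# Reads stdin, writes stdout; tabs after the first non-blank character are kept.
leading_tabs_to_spaces() {
  tab=$(printf '\t')
  # Loop: replace the first tab that follows only spaces at the start of
  # the line, then retry (ta) until no leading tab is left.
  sed -e ':a' -e "s/^\( *\)$tab/\1 /" -e 'ta'
}

# Usage sketch (assumes a Tidy build that supports indent-with-tabs):
#   tidy -q -i --indent-with-tabs yes page.html | leading_tabs_to_spaces
```

A line nested two levels deep, indented with two tabs, comes out indented with two spaces.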
If it's about pretty-printing a file as properly indented XML, xmllint, as suggested by @warl0ck, is nice. Here is what I tried and what I got:
$ cat some.xml
<myRoot> <my-element><my-subelem myAttr="value"/></my-element></myRoot>
$ xmllint --format some.xml
<?xml version="1.0"?>
<myRoot>
  <my-element>
    <my-subelem myAttr="value"/>
  </my-element>
</myRoot>
Try the xmllint program.
In Vim you can format the whole file in place, e.g. :1,$!xmllint --format --recover -
If you like vim you can invoke its action from the command-line.
echo -e "G=gg\n:wq\n" | vim ./myfile.php
Warning: the above command will modify your file without prompting. Make a backup first.
It's possible to find examples integrated with find to accomplish the same work on a bunch of files [0].
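One way to combine the backup advice with find is sketched below. The helper name format_tree and the .bak suffix are hypothetical choices, not something taken from the linked examples; the formatter is whatever command you pass in, and it is expected to rewrite the file given as its last argument.

```shell
#!/bin/sh
# Hypothetical helper: back up every matching file, then run a formatter on it.
# Usage: format_tree DIR PATTERN FORMATTER [ARGS...]
format_tree() {
  dir=$1
  pattern=$2
  shift 2
  # For each match, find starts a small shell that receives the file name
  # as $1 and the formatter command line as the remaining arguments.
  find "$dir" -type f -name "$pattern" -exec sh -c '
    file=$1
    shift
    # Format only if the backup copy was written successfully.
    cp -p "$file" "$file.bak" && "$@" "$file"
  ' sh {} "$@" \;
}

# Usage sketch, reusing the vim invocation style from this thread:
#   format_tree . "*.php" vim -c "execute 'normal! gg=G'" -c "wq"
```

Every formatted file keeps its untouched copy next to it as file.bak, so a bad run can be undone by copying the backups back.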
Beyond that, there are many utilities built for this, and their number will keep growing; you can search the internet for their latest versions, starting for example with:
- Artistic Style [1], astyle, for the C, C++, C++/CLI, Objective‑C, C# and Java programming languages.
- tidy [2] for HTML.
- IDE solutions invoked from the command line, starting with kate [3], for which a plug-in exists; you can build your own indentation scripts [4] too, continuing with UniversalIndentGUI [5], eclipse [5]...
vim -c "execute 'normal! =G' | :wq! out.js" input.js
You can also use the alternative syntax + instead of -c. It's the same.
vim +"execute 'normal! =G' | :wq! out.js" input.js
It executes the normal command =G, which will autoformat/indent (=) every line until the end of the file (G). Then :wq! out.js writes it to a file and quits vim. If you just want to overwrite the same file, then remove the out.js.
Also, you need to have this line in your ~/.vimrc in order for the autoformat indent plugin to work:
filetype plugin indent on
Ok, for anyone else in need of this, I'm recording the suggestions made in this awesome thread (in case that link goes down, as per StackExchange guidelines):
HTB 2.0 - DOS based - http://www.digital-mines.com/htb/
Tabifier - supports CSS, HTML and C style syntax (including Javascript) - http://tools.arantius.com/tabifier
HTML-Kit - a full-featured free HTML editor running on Windows. You need to configure the Tidy options [Tools /Check code using Tidy /Add new config], uncheck all switches except "Output only the body content" and "Convert non-breaking space to entities", then go to Actions /Tools /HTML Tidy /Indent Tags or beautify - http://www.chami.com/html-kit/
SCREEM - only for Linux -
NetBeans - " After openining an html file with NetBeans, click Source then select Format. That's it. " -
WebmasterGate's HTML / XHTML Beautifier - Online tool - http://www.webmastergate.com/html-beautifier/
Aptana Studio (Version 2.0.4) - "Select Edit > Format or press Ctrl-Shift F to format the html code. The format function can be configured from Windows > Preferrences, then select Aptana > Editors > HTML > Formatting, click Edit to add tags which should not take a new line then save it as a new preferrence." -
UniversalIndentGUI - Uses HTB Beautifier internally - While running Notepad++, go to Plugins > Plugin Manager > Show Plugin Manager, select UniversalIndentGUI from the available list to install it.
tidy with these options:
[HTML, XHTML, XML Options]
anchor-as-name:no
doctype:omit
drop-empty-paras:no
fix-backslash:no
fix-bad-comments:no
fix-uri:no
input-xml:yes
join-styles:no
lower-literals:no
preserve-entities:yes
quote-ampersand:no
quote-nbsp:no
[Diagnostics Options]
show-warnings:no
[Pretty Print Options]
indent:yes
indent-spaces:3
tab-size:3
[Miscellaneous Options]
quiet:yes
I have yet to try out these options (the input-xml: yes and force-output: yes config suggestions for HTML Tidy mentioned at https://stackoverflow.com/questions/7151180/use-html-tidy-to-just-indent-html-code work for my immediate purpose); I will update this answer if I do.
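For command-line Tidy, a subset of these options could be kept in a configuration file and passed with -config. The sketch below is untested against the full list above: the function name and file path are hypothetical, and it assumes that Tidy's own configuration files take one "option: value" pair per line, without the bracketed section headings shown in the list.

```shell
#!/bin/sh
# Hypothetical helper: write a small Tidy configuration file to the given path.
# Usage: write_tidy_config PATH
write_tidy_config() {
  cat > "$1" <<'EOF'
indent: yes
indent-spaces: 3
tab-size: 3
quiet: yes
show-warnings: no
input-xml: yes
force-output: yes
EOF
}

# Usage sketch (file names are placeholders):
#   write_tidy_config ./tidy-indent.conf
#   tidy -config ./tidy-indent.conf page.html > page.indented.html
```

Keeping the options in a file makes it easy to try the two settings from the linked question first and add the others one at a time.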
Run the file through HTML Tidy.
For example:
curl http://superuser.com | tidy -i | less
-i tells Tidy to indent its output.
js-beautify also works on HTML.
npm install js-beautify
js-beautify --type html file.html
Note that all this beautifying makes the file size increase substantially. The indentation is great for reviewing and editing, but not so much for hosting. For that reason, you might find html-minifier equally useful.
Maybe what you are looking for is Prettier. It also has a CLI, and you can give it a config file; see the complete documentation here: Prettier CLI
I hope this helps.
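As a rough sketch of the Prettier config idea: the helper below writes a .prettierrc file into a project directory. The function name and the chosen values are only illustrative assumptions; tabWidth and printWidth are standard Prettier options, and the commented command assumes Node.js and npx are available.

```shell
#!/bin/sh
# Hypothetical helper: create a minimal Prettier config in the given directory.
# Usage: write_prettierrc DIR
write_prettierrc() {
  cat > "$1/.prettierrc" <<'EOF'
{
  "tabWidth": 1,
  "printWidth": 100
}
EOF
}

# Usage sketch: format every HTML file below the current directory in place.
#   write_prettierrc .
#   npx prettier --write "**/*.html"
```

Prettier picks up .prettierrc from the project directory, so the same settings apply whether it is run from the command line or from an editor integration.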