Use jsdom and jQuery (server-side).
With jQuery you can remove scripts, styles, templates, and other non-content elements, then extract the remaining text.
Example
(Tested only in Chrome, not with jsdom and Node.)
jQuery('script, noscript, style, template').remove()
jQuery('body').text().replace(/\s{2,}/g, ' ')
Answer from hgoebl on Stack Overflow
» npm install html-to-text
As another answer suggested, use JSDOM, but you don't need jQuery. Try this:
JSDOM.fragment(sourceHtml).textContent
» npm install @types/html-to-text
» npm install text_to_html
» npm install string-to-html
The following will wrap all parts that are separated by more than one newline in paragraphs (<p>...</p>) and insert breaks (<br>) where there is just one newline. A text block without any newlines will simply be wrapped in a paragraph.
template = '<p>' + template.replace(/\n{2,}/g, '</p><p>').replace(/\n/g, '<br>') + '</p>';
So for example, it will take this:
Title

First line.
Second line.

Footer
And convert it to this:
<p>Title</p><p>First line.<br>Second line.</p><p>Footer</p>
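If the text comes from users, the one-liner above will pass any markup in it straight through. A minimal sketch of one way to harden it, assuming you want the input treated as plain text: escape the HTML special characters, normalize Windows and old Mac line endings, and trim so there are no empty leading or trailing paragraphs. The names escapeHtml and textToParagraphs are hypothetical helpers, not part of any library.

```javascript
// Escape the characters that would otherwise be parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: the same two-step replace as above,
// applied to escaped and normalized input.
function textToParagraphs(text) {
  const normalized = escapeHtml(text)
    .replace(/\r\n?/g, '\n') // treat CRLF and lone CR as LF
    .trim();                 // avoid empty leading/trailing paragraphs
  return '<p>' +
    normalized.replace(/\n{2,}/g, '</p><p>').replace(/\n/g, '<br>') +
    '</p>';
}

const sample = 'Title\n\nFirst line.\nSecond line.\n\nFooter';
console.log(textToParagraphs(sample));
// <p>Title</p><p>First line.<br>Second line.</p><p>Footer</p>
```

Escaping has to happen before the tags are inserted, otherwise the generated <p> and <br> would be escaped as well.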
The simplest solution is to replace each newline character with <br>.
Try
text.split('\n').join('\n<br>\n')
then you are done.
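Note that splitting on '\n' alone leaves a stray '\r' at the end of each line when the text uses Windows line endings. A small sketch of a variant that splits on any common line ending; newlinesToBreaks is a hypothetical name, and this version puts the <br> at the end of each line instead of on a line of its own. It does not escape HTML in the input, so that is assumed to be handled elsewhere.

```javascript
// Hypothetical helper: insert a <br> at every line break,
// whatever the line-ending style (CRLF, CR, or LF).
function newlinesToBreaks(text) {
  return text.split(/\r\n|\r|\n/).join('<br>\n');
}

console.log(newlinesToBreaks('First line.\r\nSecond line.'));
// First line.<br>
// Second line.
```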
» npm install convert-rich-text
» npm install @portabletext/to-html
» npm install js-to-html
» npm install html2plaintext
» npm install @wcj/markdown-to-html
» npm install markdown-to-html