EDIT (2024): A newer method based on WebView2 and Edge: https://weblog.west-wind.com/posts/2024/Mar/26/Html-to-PDF-Generation-using-the-WebView2-Control
This requires Windows.
I tested it in an ASP.NET Core application with net8.0-windows as the target framework, which this NuGet package needs. I also installed the WebView2 runtime.
The package's author also lists the Windows Desktop Runtime as a dependency.
Install the NuGet package.
/// <summary>
/// Return raw data as PDF
/// </summary>
/// <returns></returns>
[HttpGet("rawpdfex")]
public async Task<IActionResult> RawPdf()
{
    var file = Path.GetFullPath("./HtmlSampleFile-SelfContained.html");
    var pdf = new HtmlToPdfHost();
    var pdfResult = await pdf.PrintToPdfStreamAsync(file,
        new WebViewPrintSettings { PageRanges = "1-10" });

    if (pdfResult == null || !pdfResult.IsSuccess)
    {
        Response.StatusCode = 500;
        return new JsonResult(new
        {
            isError = true,
            // pdfResult may be null on this path, so guard the dereference
            message = pdfResult?.Message
        });
    }

    return new FileStreamResult(pdfResult.ResultStream, "application/pdf");
}
EDIT: New Suggestion HTML Renderer for PDF using PdfSharp
(After trying wkhtmltopdf and suggesting to avoid it)
HtmlRenderer.PdfSharp is fully managed C# code, easy to use, thread safe and, most importantly, FREE (New BSD License).
Usage
Download the HtmlRenderer.PdfSharp NuGet package.
Use the example method.
public static Byte[] PdfSharpConvert(String html)
{
    Byte[] res = null;
    using (MemoryStream ms = new MemoryStream())
    {
        var pdf = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf(html, PdfSharp.PageSize.A4);
        pdf.Save(ms);
        res = ms.ToArray();
    }
    return res;
}
A very good alternative is a free version of iTextSharp
Until version 4.1.6, iTextSharp was licensed under the LGPL, and versions up to 4.1.6 (and possibly forks of them) are available as packages and can be used freely. Of course, you can also use the continued, paid 5+ versions.
I tried to integrate wkhtmltopdf solutions into my project and ran into a bunch of hurdles.
I personally would avoid wkhtmltopdf-based solutions in hosted enterprise applications for the following reasons.
- First of all, wkhtmltopdf is implemented in C++, not C#, and you will run into various problems embedding it in your C# code, especially when switching between 32-bit and 64-bit builds of your project. I had to try several workarounds, including conditional project building, just to avoid "invalid format exceptions" on different machines.
- If you manage your own virtual machine, it's OK. But if your project runs in a constrained environment such as Azure (where it is actually impossible, as mentioned by the TuesPechkin author) or Elastic Beanstalk, it's a nightmare to configure that environment just so wkhtmltopdf can work.
- wkhtmltopdf creates files on your server, so you have to manage user permissions and grant "write" access to the location where wkhtmltopdf is running.
- wkhtmltopdf runs as a standalone application, so it's not managed by your IIS application pool. You either have to host it as a service on another machine, or you will see processing spikes and high memory consumption on your production server.
- It uses temp files to generate the PDF, and in cases like AWS EC2, which has really slow disk I/O, this is a big performance problem.
- The most hated "Unable to load DLL 'wkhtmltox.dll'" error, reported by many users.
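The 32-bit/64-bit problem from the first bullet usually surfaces as a BadImageFormatException when the bitness of the process and of the native wkhtmltox.dll do not match. Below is a minimal sketch of one workaround, assuming a hypothetical folder layout with x86 and x64 subfolders that each hold their own copy of the DLL. NativeLibraryLocator and the folder names are illustrative and not part of any wrapper library.

```csharp
using System;
using System.IO;

public static class NativeLibraryLocator
{
    // Assumed (hypothetical) layout:
    //   <baseDir>/x86/wkhtmltox.dll
    //   <baseDir>/x64/wkhtmltox.dll
    public static string GetWkhtmltoxPath(string baseDir, bool is64BitProcess)
    {
        // Pick the subfolder that matches the bitness of the running process.
        string archFolder = is64BitProcess ? "x64" : "x86";
        return Path.Combine(baseDir, archFolder, "wkhtmltox.dll");
    }

    // Convenience overload that asks the runtime for the current process bitness.
    public static string GetWkhtmltoxPath(string baseDir)
    {
        return GetWkhtmltoxPath(baseDir, Environment.Is64BitProcess);
    }
}
```

The returned path would then be handed to whatever loading mechanism the chosen wrapper exposes; that part is wrapper-specific and not shown here.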
--- PRE Edit Section ---
For anyone who wants to generate PDF from HTML in simpler applications or environments, I leave my old post as a suggestion.
TuesPechkin
https://www.nuget.org/packages/TuesPechkin/
or, especially for MVC web applications (though I think you can use it in any .NET application)
Rotativa
https://www.nuget.org/packages/Rotativa/
They both use the wkhtmltopdf binary to convert HTML to PDF. It uses the WebKit engine to render pages, so it can also parse CSS style sheets.
They provide easy-to-use, seamless integration with C#.
Rotativa can also generate PDFs directly from any Razor view.
Additionally, for real-world web applications, they also manage thread safety etc.
Answer from Anestis Kivranoglou on Stack Overflow
Last Updated: October 2020
This is the list of options for HTML to PDF conversion in .NET that I have put together (some free, some paid):
GemBox.Document
- https://www.nuget.org/packages/GemBox.Document/
- Free (up to 20 paragraphs)
- $680 - https://www.gemboxsoftware.com/document/pricelist
- https://www.gemboxsoftware.com/document/examples/c-sharp-convert-html-to-pdf/307
PDF Metamorphosis .Net
- https://www.nuget.org/packages/sautinsoft.pdfmetamorphosis/
- $1078 - https://www.sautinsoft.com/products/pdf-metamorphosis/order.php
- https://www.sautinsoft.com/products/pdf-metamorphosis/convert-html-to-pdf-dotnet-csharp.php
HtmlRenderer.PdfSharp
- https://www.nuget.org/packages/HtmlRenderer.PdfSharp/1.5.1-beta1
- BSD-UNSPECIFIED License
PuppeteerSharp
- https://www.puppeteersharp.com/examples/index.html
- MIT License
- https://github.com/kblok/puppeteer-sharp
EO.Pdf
- https://www.nuget.org/packages/EO.Pdf/
- $799 - https://www.essentialobjects.com/Purchase.aspx?f=3
WnvHtmlToPdf_x64
- https://www.nuget.org/packages/WnvHtmlToPdf_x64/
- $1600 - http://www.winnovative-software.com/Buy.aspx
- demo - http://www.winnovative-software.com/demo/default.aspx
IronPdf
- https://www.nuget.org/packages/IronPdf/
- $1599 - https://ironpdf.com/licensing/
- https://ironpdf.com/examples/using-html-to-create-a-pdf/
Spire.PDF
- https://www.nuget.org/packages/Spire.PDF/
- Free (up to 10 pages)
- $1799 - https://www.e-iceblue.com/Buy/Spire.PDF.html
- https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/Convert-HTML-to-PDF-Customize-HTML-to-PDF-Conversion-by-Yourself.html
Aspose.Html
- https://www.nuget.org/packages/Aspose.Html/
- $1797 - https://purchase.aspose.com/pricing/html/net
- https://docs.aspose.com/html/net/html-to-pdf-conversion/
EvoPDF
- https://www.nuget.org/packages/EvoPDF/
- $1200 - http://www.evopdf.com/buy.aspx
ExpertPdfHtmlToPdf
- https://www.nuget.org/packages/ExpertPdfHtmlToPdf/
- $1200 - https://www.html-to-pdf.net/Pricing.aspx
Zetpdf
- https://zetpdf.com
- $599 - https://zetpdf.com/pricing/
- Is not a well-known or well-supported library; see the question "ZetPDF - Does anyone know the background of this Product?"
PDFtron
- https://www.pdftron.com/documentation/samples/cs/HTML2PDFTes
- $4000/year - https://www.pdftron.com/licensing/
WkHtmlToXSharp
- https://github.com/pruiz/WkHtmlToXSharp
- Free
- Concurrent conversion is implemented as processing queue.
SelectPDF
- https://www.nuget.org/packages/Select.HtmlToPdf/
- Free (up to 5 pages)
- $799 - https://selectpdf.com/pricing/
- https://selectpdf.com/pdf-library-for-net/
If none of the options above help you, you can always search the NuGet packages:
https://www.nuget.org/packages?q=html+pdf
pandoc is a great command-line tool for file format conversion.
The disadvantage is that for PDF output you'll need LaTeX. The usage is:
pandoc test.html -t latex -o test.pdf
If you don't have LaTeX installed, then I recommend htmldoc.
Cited from Creating a PDF
By default, pandoc will use LaTeX to create the PDF, which requires that a LaTeX engine be installed.
Alternatively, pandoc can use ConTeXt, pdfroff, or any of the following HTML/CSS-to-PDF-engines, to create a PDF: wkhtmltopdf, weasyprint or prince. To do this, specify an output file with a .pdf extension, as before, but add the --pdf-engine option or -t context, -t html, or -t ms to the command line (-t html defaults to --pdf-engine=wkhtmltopdf).
You can also try wkhtmltopdf; its usage and installation are pretty straightforward.
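For callers on .NET, the pandoc command line described above can be driven through System.Diagnostics.Process. The sketch below assumes pandoc and the chosen PDF engine are installed and on the PATH. PandocRunner and its method names are hypothetical; only the --pdf-engine and -o options quoted above are used.

```csharp
using System;
using System.Diagnostics;

public static class PandocRunner
{
    // Builds the equivalent of:
    //   pandoc <inputHtml> --pdf-engine=<pdfEngine> -o <outputPdf>
    public static ProcessStartInfo BuildStartInfo(string inputHtml, string outputPdf, string pdfEngine)
    {
        var psi = new ProcessStartInfo("pandoc")
        {
            UseShellExecute = false,
            RedirectStandardError = true
        };
        // ArgumentList passes each argument separately, so no manual quoting is needed.
        psi.ArgumentList.Add(inputHtml);
        psi.ArgumentList.Add("--pdf-engine=" + pdfEngine);
        psi.ArgumentList.Add("-o");
        psi.ArgumentList.Add(outputPdf);
        return psi;
    }

    // Runs pandoc, echoes its error output on failure, and returns the exit code.
    public static int Run(ProcessStartInfo psi)
    {
        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException("Could not start pandoc.");
        string errors = process.StandardError.ReadToEnd();
        process.WaitForExit();
        if (process.ExitCode != 0)
        {
            Console.Error.WriteLine(errors);
        }
        return process.ExitCode;
    }
}
```

A call such as PandocRunner.Run(PandocRunner.BuildStartInfo("test.html", "test.pdf", "wkhtmltopdf")) would then sidestep the LaTeX requirement, provided wkhtmltopdf itself is installed.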
Some Pretext:
HTML to PDF conversion is as it sounds, converting HTML to PDF.
By PDF generation I mean generating a PDF directly, without that conversion step.
Context:
In my company, we allow users to export their data as PDFs containing user data and images. This process is slow for users who have a lot of data (I have seen exports run for 2-3 hrs for a PDF of some 100 pages).
There might be multiple reasons for this; the ones I suspect are (the DB is not the issue here):
Fetching user images from S3 takes time.
We first generate the HTML from user data and then use Puppeteer to convert that HTML to PDF.
Sometimes an admin can request data for multiple users, which means this whole generation and conversion process runs for each user.
The Puppeteer code is written with asyncio (Python), which uses the async/await paradigm. But we still wait for each task to complete.
We have to use Puppeteer because we currently have some custom templates, CSS and font faces for the PDF template.
While looking for solutions I came across a reddit post https://www.reddit.com/r/webdev/comments/18zm20r/html_to_pdf_with_performances/ where a user had a similar issue. I tested wkhtmltopdf, and its styling was a little off compared to Puppeteer.
Since we are already using headless Chromium here (which according to the post is the fastest), is there anything else that can speed this up? In the above post, one user mentioned a "layout engine on top of a 2D graphics API", which I guess is a layout engine on top of something like OpenGL (the only one I've used, so no idea). I am not sure I'll be able to develop a whole layout engine since my C++ is rusty. Anything else that I can do to improve performance?
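One possible direction for the "we still wait for the task to complete" point is to run the per-user (or per-image) work concurrently with an upper bound, instead of awaiting each job one after another. The sketch below is written in C# to stay consistent with the other examples on this page; the same shape maps to asyncio.Semaphore plus asyncio.gather in Python. BoundedExport and the work delegate are hypothetical stand-ins for the real fetch-and-render step, and this is an untested idea for this workload, not a measured fix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedExport
{
    // Runs 'work' for every item with at most 'maxConcurrency' jobs in flight.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        Func<TItem, Task<TResult>> work,
        int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = items.Select(async item =>
        {
            // Wait for a free slot before starting this job.
            await gate.WaitAsync();
            try
            {
                return await work(item);
            }
            finally
            {
                // Free the slot even if the job throws.
                gate.Release();
            }
        }).ToList();

        return await Task.WhenAll(tasks);
    }
}
```

The bound matters because each headless browser page is expensive; an unbounded fan-out could trade slow exports for memory pressure on the server.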
We currently create xml from our data and transform it to PDF with Apache Formatting Objects Processor.
I would like to know what the current state of the art is for generating PDFs in .NET 6/7/8.
In addition, besides the PDF output, I also need an HTML output that looks exactly like the PDF.
Maybe you can write the HTML first and generate the PDF from it?