Monday 12 May 2014

Converting HTML files to PDF with wkhtmltopdf on a Linux VPS

What is it?

wkhtmltopdf and wkhtmltoimage are open source (LGPL) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service.
There is also a C library, if you're into that kind of thing.

How do I use it?

  1. Download a precompiled binary or build from source
  2. Create your HTML document that you want to turn into a PDF (or image)
  3. Run your HTML document through the tool.
    For example, if I really like the treatment Google has done to their logo today and want to capture it forever as a PDF:
    wkhtmltopdf http://google.com google.pdf
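
The three steps above can be sketched as a short shell session using a local file instead of a URL. The file names hello.html and hello.pdf are placeholders chosen for this example, and the sketch assumes wkhtmltopdf is on the PATH; if it is not, the script only reports that and stops.

```shell
#!/bin/sh
# Step 2: write a minimal UTF-8 HTML document.
# hello.html is a placeholder name used only in this example.
cat > hello.html <<'EOF'
<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Hello</title></head>
<body>
<h1>Hello, PDF</h1>
<p>Rendered by wkhtmltopdf.</p>
</body>
</html>
EOF

# Step 3: run the document through the tool, but only if it is installed.
if command -v wkhtmltopdf >/dev/null 2>&1; then
    wkhtmltopdf hello.html hello.pdf
else
    echo "wkhtmltopdf not found on PATH; install it first (step 1)" >&2
fi
```

A local file path and a URL are interchangeable as the input argument, so the same command shape works for both.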

Additional options

That's great, I've always wanted to turn Google's homepage into a PDF, but I want a table of contents as well.
There are plenty of command line options. Check out the auto-generated wkhtmltopdf manual.
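
As a rough illustration of those options, the sketch below adds a table of contents and sets the page size, orientation and top margin. It uses a small local document with headings (report.html, a placeholder name), since a table of contents is built from the headings in the page. The option syntax has varied between releases, so treat this as an assumption to check against `wkhtmltopdf --help` for the installed version.

```shell
#!/bin/sh
# A small document with headings, so the table of contents has entries.
# report.html and report.pdf are placeholder names for this example.
cat > report.html <<'EOF'
<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Report</title></head>
<body>
<h1>Report</h1>
<h2>Introduction</h2>
<p>Some introductory text.</p>
<h2>Results</h2>
<p>Some results.</p>
</body>
</html>
EOF

if command -v wkhtmltopdf >/dev/null 2>&1; then
    # Global options come first, then the "toc" object, then the input page,
    # and the output file is always the last argument.
    wkhtmltopdf --page-size A4 --orientation Portrait --margin-top 15mm \
        toc report.html report.pdf
else
    echo "wkhtmltopdf not found on PATH" >&2
fi
```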

from http://wkhtmltopdf.org/
--------------

An HTML-to-PDF conversion tool: wkhtmltopdf

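Sometimes we need to convert the content of an HTML page into a PDF. You could, of course, take a screenshot, paste it into Excel and convert that to PDF. This post instead introduces wkhtmltopdf, a dedicated HTML-to-PDF tool (http://code.google.com/p/wkhtmltopdf/ ). It runs on Linux, Windows and other systems.
Take Windows as an example. First download the tool, named wkhtmltopdf-<version>.exe. Put the downloaded file in a folder of your own, for example D:\tool\htmltopdf. Then open a command prompt and type the command there to run the conversion.
Using the CSDN home page as an example: wkhtmltopdf-0.8.3.exe www.csdn.com myhomepage.pdf
Note: if the output contains garbled characters, change the encoding of the page. In Dreamweaver this is done under Edit > Page Properties.
Here wkhtmltopdf-0.8.3.exe is the name of the downloaded program, www.csdn.com is the URL of the HTML page to output as PDF, and myhomepage.pdf is the name of the PDF file to write.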


On Linux the command is: wkhtmltopdf www.myhomepage.com myhomepage.pdf
You can also run wkhtmltopdf --help to see the other options and the built-in help.

Here is how to call it from C#:
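using System;
using System.Diagnostics;
using System.IO;
using System.Web;

    /// <summary>
    /// Generates a PDF from an HTML page.
    /// </summary>
    /// <param name="url">Address of the page to convert</param>
    /// <param name="path">Path where the PDF is saved</param>
    /// <returns>true if wkhtmltopdf exited with code 0; otherwise false</returns>
    public static bool HtmlToPdf(string url, string path)
    {
        try
        {
            if (string.IsNullOrEmpty(url) || string.IsNullOrEmpty(path))
                return false;
            // Map the relative name against the directory of the current request.
            string exe = HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
            if (!File.Exists(exe))
                return false;
            using (Process p = new Process())
            {
                p.StartInfo.FileName = exe;
                // Quote both arguments so URLs and paths containing spaces survive.
                p.StartInfo.Arguments = "\"" + url + "\" \"" + path + "\"";
                p.StartInfo.UseShellExecute = false;
                // The output streams are left unredirected: a redirected stream
                // that is never read can fill its buffer and block the child.
                p.StartInfo.CreateNoWindow = true;
                p.Start();
                // Wait for the conversion to finish instead of sleeping a fixed 500 ms.
                p.WaitForExit();
                // A non-zero exit code is treated as failure.
                return p.ExitCode == 0;
            }
        }
        catch (Exception ex)
        {
            HttpContext.Current.Response.Write(ex);
        }
        return false;
    }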
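Usage: HtmlToPdf("page URL", Server.MapPath("PDF output path"));
Note: when using wkhtmltopdf, the folder where the PDF is saved must not contain non-ANSI characters such as Chinese or Japanese. Also, pages in gb2312, Korean, Japanese or other charsets that are neither UTF-8 nor ANSI come out garbled when converted.
If a local file comes out garbled, use Dreamweaver (DW) to change the HTML encoding to UTF-8, then open the file in Notepad and choose Save As with the encoding set to UTF-8.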
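
On Linux, a local file can also be re-encoded from the command line rather than in an editor. The sketch below is one possible approach, not part of the original instructions: it builds a tiny GB2312 sample page (the names page-gb2312.html, page-utf8.html and page.pdf are placeholders), converts it to UTF-8 with iconv, updates the charset declaration with sed, and then passes wkhtmltopdf's --encoding option as an extra hint. It assumes the installed iconv supports GB2312.

```shell
#!/bin/sh
# Build a sample page encoded as GB2312. The octal escapes are the
# GB2312 bytes D6 D0 CE C4, i.e. the two characters for "Chinese text".
printf '<html><head><meta http-equiv="Content-Type" content="text/html; charset=gb2312"></head><body><p>\326\320\316\304</p></body></html>\n' > page-gb2312.html

# Re-encode the bytes to UTF-8 and make the charset declaration match.
iconv -f GB2312 -t UTF-8 page-gb2312.html \
    | sed 's/charset=gb2312/charset=utf-8/' > page-utf8.html

if command -v wkhtmltopdf >/dev/null 2>&1; then
    # --encoding sets the default text encoding for the input page.
    wkhtmltopdf --encoding utf-8 page-utf8.html page.pdf
else
    echo "wkhtmltopdf not found on PATH" >&2
fi
```

Converting the bytes and fixing the declared charset together matters: a UTF-8 file that still declares gb2312 would be decoded with the wrong charset and come out garbled again.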