Note: this page is a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/15693520/

Time: 2020-08-29 06:58:38  Source: igfitidea

How to convert a simple html to pdf using wkhtmltopdf?

Tags: html, pdf, wkhtmltopdf, html-to-pdf

Asked by mark

Here is what I did:

  1. Created a Linux virtual machine in the Amazon cloud.
  2. Followed the instructions from https://code.google.com/p/wkhtmltopdf/wiki/compilation to download and compile the source code of wkhtmltopdf-qt and of wkhtmltopdf. In the end I had a static build of wkhtmltopdf.
  3. Took this html (http://jsfiddle.net/mark69_fnd/8CtjB/):

    <html> <head> <style type="text/css">p{font-family: sans-serif;};</style> </head> <body> <p>Let's Test</p> </body> </html>

  4. Ran wkhtmltopdf test.html test.pdf

  5. Copied test.pdf to my Windows desktop, opened it, and got this: https://docs.google.com/file/d/0B2pbsdBJxJI3MV8zby14cGk5VWs/edit?usp=sharing

I followed the guide closely; the qt configuration options were taken from ../wkhtmltopdf/static_qt_conf_base and ../wkhtmltopdf/static_qt_conf_linux, as the guide suggests.

Needless to say, I am a bit disappointed with the result. Can anyone explain what I am doing wrong?

P.S.

In reality I need to convert a much more complex HTML page, but there is no point in discussing it while I cannot even convert a trivial one.

EDIT

I wish to emphasize that I do not work on Linux; I only open a terminal to an Amazon-hosted Linux box. This means I do not have an X11 environment.

This is what I get when I try using the predefined wkhtmltopdf package:

ubuntu@ip-10-245-78-162:~$ which wkhtmltopdf
ubuntu@ip-10-245-78-162:~$ /usr/bin/wkhtmltopdf
-bash: /usr/bin/wkhtmltopdf: No such file or directory
ubuntu@ip-10-245-78-162:~$ sudo apt-get install wkhtmltopdf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  wkhtmltopdf
0 upgraded, 1 newly installed, 0 to remove and 120 not upgraded.
Need to get 0 B/104 kB of archives.
After this operation, 303 kB of additional disk space will be used.
Selecting previously unselected package wkhtmltopdf.
(Reading database ... 36679 files and directories currently installed.)
Unpacking wkhtmltopdf (from .../wkhtmltopdf_0.9.9-3_amd64.deb) ...
Processing triggers for man-db ...
Setting up wkhtmltopdf (0.9.9-3) ...
ubuntu@ip-10-245-78-162:~$ l test.*
-rw-r--r-- 1 ubuntu ubuntu 123 Mar 30 12:46 test.html
ubuntu@ip-10-245-78-162:~$ cat test.html
<html> <head> <style type="text/css">p{font-family: sans-serif;};</style> </head> <body> <p>Let's Test</p> </body> </html>
ubuntu@ip-10-245-78-162:~$ /usr/bin/wkhtmltopdf test.html test.pdf
wkhtmltopdf: cannot connect to X server
ubuntu@ip-10-245-78-162:~$

EDIT2

  1. I have downloaded ftp://rpmfind.net/linux/fedora/linux/development/rawhide/x86_64/os/Packages/u/urw-fonts-2.4-14.fc19.noarch.rpm
  2. Followed instructions from http://www.howtogeek.com/howto/ubuntu/install-an-rpm-package-on-ubuntu-linux/ to convert the rpm to a deb format.
  3. Installed the deb
  4. Produced pdf, but still seeing just the squares.

Here is the transcript:

ubuntu@ip-10-245-78-162:~$ sudo alien urw-fonts-2.4-14.fc19.noarch.rpm --scripts
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
urw-fonts_2.4-15_all.deb generated
ubuntu@ip-10-245-78-162:~$ sudo dpkg -i urw-fonts_2.4-15_all.deb
Selecting previously unselected package urw-fonts.
(Reading database ... 38529 files and directories currently installed.)
Unpacking urw-fonts (from urw-fonts_2.4-15_all.deb) ...
Setting up urw-fonts (2.4-15) ...
Processing triggers for fontconfig ...
ubuntu@ip-10-245-78-162:~$  ./wkhtmltopdf/bin/wkhtmltopdf test.html test.pdf
Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done
ubuntu@ip-10-245-78-162:~$

EDIT3

I have installed the xvfb-run package, and now the default version (/usr/bin/wkhtmltopdf) can be run through it. It does convert the simple test.html to pdf; however, it fails for a complex html page with JavaScript code. It appears as though /usr/bin/wkhtmltopdf is unable to run any JavaScript code on the page being converted.
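For readers in the same headless situation, here is a minimal sketch of how the xvfb-run wrapping could be scripted. The names render_prefix and html_to_pdf are hypothetical helpers, not part of wkhtmltopdf, and the sketch assumes a wkhtmltopdf build that needs an X server, like the packaged 0.9.9 one above.

```shell
#!/usr/bin/env bash
# render_prefix: hypothetical helper that prints the wrapper command (if any)
# needed to run an X11-dependent wkhtmltopdf build.
render_prefix() {
  if [ -n "${DISPLAY:-}" ]; then
    # An X server is already configured; no wrapper is needed.
    echo ""
  elif command -v xvfb-run >/dev/null 2>&1; then
    # -a asks xvfb-run to pick a free server number automatically.
    echo "xvfb-run -a"
  else
    echo "error: no DISPLAY set and xvfb-run is not installed" >&2
    return 1
  fi
}

# html_to_pdf INPUT OUTPUT: hypothetical helper that runs wkhtmltopdf,
# wrapped in xvfb-run when no X server is available.
html_to_pdf() {
  local prefix
  prefix=$(render_prefix) || return 1
  # $prefix is intentionally unquoted so "xvfb-run -a" splits into two words.
  $prefix wkhtmltopdf "$1" "$2"
}
```

With these defined, html_to_pdf test.html test.pdf should behave like the xvfb-run invocation described above on a box without X11.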

I am still puzzled why the compiled version does not work.

EDIT4

I was unfair to the default wkhtmltopdf version. It is capable of running JavaScript in the page; it successfully converts the following html:

<html>
  <head>
    <style type="text/css">
      body {
        font-family: sans-serif;
      }
    </style>
  </head>
  <body id='body'>
    <script>
      document.getElementById('body').innerHTML = 'Hello world!';
    </script>
  </body>
</html>

I will try to understand why it fails with a real page, but I do not know how to troubleshoot it other than by throwing away pieces of the original page until I get a minimal failing one.

EDIT5

OK, here is the minimal example that does not work with the default wkhtmltopdf version:

<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>

The created pdf is empty. Here is the transcript:

ubuntu@ip-10-245-78-162:~$ cat test2.html
<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>
ubuntu@ip-10-245-78-162:~$ xvfb-run /usr/bin/wkhtmltopdf test2.html test2.pdf ; l test2.pdf
Loading page (1/2)
Printing pages (2/2)
Done
-rw-r--r-- 1 ubuntu ubuntu 1266 Mar 31 11:16 test2.pdf
ubuntu@ip-10-245-78-162:~$ cat test2.html |sed 6d | xvfb-run /usr/bin/wkhtmltopdf - test2.pdf ; l test2.pdf
Loading page (1/2)
Printing pages (2/2)
Done
-rw-r--r-- 1 ubuntu ubuntu 4284 Mar 31 11:16 test2.pdf
ubuntu@ip-10-245-78-162:~$

Notice how removing the 6th line (height: 100%;) changes the size of the created pdf file.
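The same "remove one line and compare sizes" experiment can be repeated for every line of the file. The following is only a sketch: drop_line and probe_lines are hypothetical helper names, and it assumes xvfb-run and wkhtmltopdf are on the PATH and that wkhtmltopdf reads HTML from stdin when given `-`, as in the transcript above.

```shell
#!/usr/bin/env bash
# drop_line FILE N: print FILE with its Nth line removed (what `sed Nd` does).
drop_line() {
  sed "${2}d" "$1"
}

# probe_lines FILE: hypothetical helper. Renders FILE once per line, each time
# with that one line removed, and prints "<line number> <pdf size in bytes>".
probe_lines() {
  local file=$1 total n dir
  total=$(grep -c '' "$file")   # number of lines in FILE
  dir=$(mktemp -d)
  for n in $(seq 1 "$total"); do
    rm -f "$dir/out.pdf"
    drop_line "$file" "$n" | xvfb-run -a wkhtmltopdf - "$dir/out.pdf" >/dev/null 2>&1
    if [ -f "$dir/out.pdf" ]; then
      echo "$n $(wc -c < "$dir/out.pdf")"
    else
      echo "$n failed"
    fi
  done
  rm -rf "$dir"
}
```

Running probe_lines test2.html prints one size per removed line; lines whose removal changes the size noticeably are the ones worth looking at.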

EDIT6

The custom version is linked statically against Qt, whereas the default one depends on quite a few shared libraries, including Qt and QtWebKit:

The custom version:

ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$ l wkhtmltopdf
-rwxr-xr-x 1 ubuntu ubuntu 35020224 Mar 31 22:26 wkhtmltopdf
ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$ ldd !$
ldd wkhtmltopdf
        linux-vdso.so.1 =>  (0x00007fff195ff000)
        libXrender.so.1 => /usr/lib/x86_64-linux-gnu/libXrender.so.1 (0x00007fefc06db000)
        libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007fefc03a7000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fefc01a2000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fefbff9a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fefbfd7d000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fefbfa7c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fefbf780000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fefbf56a000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fefbf1aa000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fefc08ef000)
        libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007fefbef8c000)
        libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007fefbed88000)
        libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007fefbeb82000)
ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$

Now the default version:

ubuntu@ip-10-245-78-162:/usr/bin$ l wkhtmltopdf
-rwxr-xr-x 1 root root 233512 May  7  2011 wkhtmltopdf
ubuntu@ip-10-245-78-162:/usr/bin$ ldd wkhtmltopdf
        linux-vdso.so.1 =>  (0x00007fff031ff000)
        libQtWebKit.so.4 => /usr/lib/x86_64-linux-gnu/libQtWebKit.so.4 (0x00007f28a33bc000)
        libQtGui.so.4 => /usr/lib/x86_64-linux-gnu/libQtGui.so.4 (0x00007f28a26ee000)
        libQtNetwork.so.4 => /usr/lib/x86_64-linux-gnu/libQtNetwork.so.4 (0x00007f28a23a1000)
        libQtCore.so.4 => /usr/lib/x86_64-linux-gnu/libQtCore.so.4 (0x00007f28a1ecf000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f28a1bcf000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f28a19b8000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f28a15f9000)
        libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f28a1356000)
        libXrender.so.1 => /usr/lib/x86_64-linux-gnu/libXrender.so.1 (0x00007f28a114b000)
        libgstapp-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstapp-0.10.so.0 (0x00007f28a0f3f000)
        libgstinterfaces-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstinterfaces-0.10.so.0 (0x00007f28a0d2d000)
        libgstpbutils-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstpbutils-0.10.so.0 (0x00007f28a0b09000)
        libgstvideo-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstvideo-0.10.so.0 (0x00007f28a08ed000)
        libgstbase-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstbase-0.10.so.0 (0x00007f28a069a000)
        libgstreamer-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0 (0x00007f28a03b2000)
        libgobject-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0 (0x00007f28a0163000)
        libglib-2.0.so.0 => /lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x00007f289fe6e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f289fc50000)
        libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f289f91c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f289f620000)
        libfontconfig.so.1 => /usr/lib/x86_64-linux-gnu/libfontconfig.so.1 (0x00007f289f3e9000)
        libaudio.so.2 => /usr/lib/x86_64-linux-gnu/libaudio.so.2 (0x00007f289f1d1000)
        libpng12.so.0 => /lib/x86_64-linux-gnu/libpng12.so.0 (0x00007f289efa9000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f289ed91000)
        libfreetype.so.6 => /usr/lib/x86_64-linux-gnu/libfreetype.so.6 (0x00007f289eaf5000)
        libSM.so.6 => /usr/lib/x86_64-linux-gnu/libSM.so.6 (0x00007f289e8ed000)
        libICE.so.6 => /usr/lib/x86_64-linux-gnu/libICE.so.6 (0x00007f289e6d2000)
        libXi.so.6 => /usr/lib/x86_64-linux-gnu/libXi.so.6 (0x00007f289e4c3000)
        libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f289e2b2000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f289e0ad000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f289dea5000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f28a517e000)
        liborc-0.4.so.0 => /usr/lib/x86_64-linux-gnu/liborc-0.4.so.0 (0x00007f289dc29000)
        libgmodule-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgmodule-2.0.so.0 (0x00007f289da25000)
        libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f289d6ca000)
        libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f289d4c1000)
        libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f289d284000)
        libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f289d065000)
        libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f289ce3b000)
        libXt.so.6 => /usr/lib/x86_64-linux-gnu/libXt.so.6 (0x00007f289cbd5000)
        libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f289c9d1000)
        libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f289c7cc000)
        libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f289c5c5000)
ubuntu@ip-10-245-78-162:/usr/bin$
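To compare the two binaries without reading the full ldd listings, one can count how many shared Qt libraries each one pulls in. This is a small sketch; count_qt_libs is a hypothetical helper name, and it simply assumes that shared Qt libraries show up in ldd output with names starting with libQt, as they do in the listing above.

```shell
#!/usr/bin/env bash
# count_qt_libs: read `ldd` output on stdin and print how many lines
# mention a shared Qt library (name containing "libQt").
count_qt_libs() {
  # grep -c prints 0 but exits non-zero when nothing matches; keep status 0.
  grep -c 'libQt' || true
}
```

For example, ldd /usr/bin/wkhtmltopdf | count_qt_libs reports the shared Qt libraries of the packaged binary, while a build with Qt linked in statically should report 0.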

EDIT7

Guys, I do not understand how wkhtmltopdf works for you. I have started completely from scratch:

  1. Created a brand new Ubuntu Amazon micro instance (free tier)
  2. sudo apt-get update
  3. sudo apt-get upgrade
  4. sudo apt-get install libx11-dev
  5. sudo apt-get install libfontconfig1-dev
  6. wget https://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
  7. tar xjf wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
  8. Created test2.html with the contents from EDIT5 (see the EDIT5 transcript)
  9. Ran wkhtmltopdf-amd64 on test2.html. The produced pdf is empty!
  10. Removed line 6 or 7 from test2.html (the CSS property height or overflow) and suddenly it works!
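For anyone retracing these steps, the test file can be recreated with a small shell helper. This is only a sketch: write_test2 is a hypothetical name, and the heredoc reproduces the test2.html contents from EDIT5 with the same line layout, so that line 6 is the height rule and line 7 is the overflow rule.

```shell
#!/usr/bin/env bash
# write_test2 PATH: hypothetical helper that recreates test2.html from EDIT5.
write_test2() {
  cat > "$1" <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>
EOF
}
```

After write_test2 test2.html, step 9 is a plain run of wkhtmltopdf-amd64 on that file, and the two variants from step 10 can be produced with sed 6d test2.html or sed 7d test2.html.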

Can anyone retrace my steps and confirm it?

EDIT8

Installed CentOS 6.4 in a VMware VM on my laptop. Same results: wkhtmltopdf does not work on the aforementioned trivial html file.

Answered by sepulchered

Try adding a charset declaration to the head tag of your html, like this:

<head>
  <meta charset="utf-8">
  ...
</head>
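If the html is generated and cannot easily be edited by hand, the declaration could be injected just before conversion. The sketch below is one possible way, not something from the answer itself: ensure_charset is a hypothetical helper, and it assumes the page contains a literal lowercase <head> tag without attributes.

```shell
#!/usr/bin/env bash
# ensure_charset: read HTML on stdin and print it with a utf-8 charset
# declaration added right after <head>, unless a meta charset already exists.
ensure_charset() {
  local html
  html=$(cat)
  if printf '%s\n' "$html" | grep -qi '<meta[^>]*charset'; then
    # A charset declaration is already present; pass the page through.
    printf '%s\n' "$html"
  else
    printf '%s\n' "$html" | sed 's/<head>/<head><meta charset="utf-8">/'
  fi
}
```

The result can be piped straight into the converter, for example ensure_charset < test.html | wkhtmltopdf - test.pdf, using the same stdin form that appears in the EDIT5 transcript.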