HTML: How to save HTML pages as one file?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/16169744/

Time: 2020-08-29 07:47:19  Source: igfitidea

How to save html pages as one file?

html save archive webarchive

Asked by Dimitri Vorontzov

I want to be able to save / archive HTML pages as one file (without those pesky external folders).

I want the resulting file to contain all styles, images, and links (videos and Flash would be nice, too, but not as crucial).

I want the resulting file to be searchable, and editable.

Microsoft's MHT is one such format, but unfortunately it's not searchable under Linux. MHT is good, but I don't want to be locked into one operating system or one company. What would be a good alternative, or is there perhaps some entirely different solution I haven't thought of?

Thank you in advance for your suggestions!

Accepted answer by Alexis Tyler

Answer by banb

Current versions of Google Chrome can view and create MHTML files once you toggle the "Save Page as MHTML" option on the chrome://flags page.

Type chrome://flags into your URL bar.

However, enabling this experimental option disables saving pages as HTML-only or HTML Complete files. From the chrome://flags page:

Answer by zTrix

The SingleFile Chrome extension is a good solution.

I have also written my own Python tool to solve this problem, which I would recommend trying: https://github.com/zTrix/webpage2html

Answer by afeique

Extending upon zTrix's answer, I would suggest avoiding the Chrome extension (which did not work for me at all) and instead going with one of these options:

  • Node.js: remy's inliner
    • Easy to install using npm
    • Many options, including flags for disabling minification/compression, maintaining external images, skipping videos, and more.
    • Caveat: (22 September 2017) fails to maintain styling and JavaScript functionality when compiling Slate builds. This won't affect most people directly, but it means that inliner will probably have issues with other pages. See this issue
    • Caveat: no options to "leave things alone": will either minify/uglify CSS/JS or beautify, but will not simply embed original source into HTML.
  • Python 2: zTrix's webpage2html
    • More conservative than inliner; works well for most cases.
    • zTrix fixed a bug (one that inliner also seems to have); the fix preserves JavaScript/CSS functionality when compiling Slate builds. See this issue. (updated 29 September 2017)
    • Can be converted to Python 3 relatively painlessly
    • Caveat: cannot handle CSS @import

Answer by Sunshine

You can use this tool: https://github.com/Y2Z/monolith. It seems to do exactly what you need.

There's also a Chrome browser extension built directly from that program; it can be found here: https://chrome.google.com/webstore/detail/monolith/koalogomkahjlabefiglodpnhhkokekg

Answer by Cyril CCT

Usually, it's possible to create a single HTML file that contains all of its usual dependent files (css, jpg, js, svg, ...).
You must rewrite the HTML file: replace the values of "src" attributes and "url()" functions, and insert the external content inline using HTML tags such as "<script></script>" for JavaScript files, "<style></style>" for CSS files, and "<svg></svg>" for SVG images.
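
As a rough illustration of the rewriting described above, here is a minimal Python sketch that replaces external script and stylesheet references with inline "<script>" and "<style>" tags. The function name `inline_assets` and the `read_text` callback (which returns a file's contents for a given path or URL) are hypothetical names made up for this example, not part of any tool mentioned in this thread. It uses simple regular expressions, so it is only an assumption-laden sketch: it drops other attributes on the script tag and would break on a script whose source contains "</script>".

```python
import re
from typing import Callable

# <script ... src="..."></script> with an empty body
SCRIPT_RE = re.compile(
    r'<script\b[^>]*?\bsrc=["\']([^"\']+)["\'][^>]*>\s*</script>',
    re.IGNORECASE,
)
# Any <link ...> tag; we check rel/href inside the callback
LINK_RE = re.compile(r'<link\b[^>]*>', re.IGNORECASE)
HREF_RE = re.compile(r'\bhref=["\']([^"\']+)["\']', re.IGNORECASE)
REL_CSS_RE = re.compile(r'\brel=["\']stylesheet["\']', re.IGNORECASE)


def inline_assets(html: str, read_text: Callable[[str], str]) -> str:
    """Embed external JS and CSS into the HTML (hypothetical helper)."""

    def script_repl(match: "re.Match[str]") -> str:
        # Replace the external reference with the script's source text.
        return "<script>" + read_text(match.group(1)) + "</script>"

    def link_repl(match: "re.Match[str]") -> str:
        tag = match.group(0)
        href = HREF_RE.search(tag)
        # Leave non-stylesheet links (icons, canonical URLs, ...) untouched.
        if not REL_CSS_RE.search(tag) or href is None:
            return tag
        return "<style>" + read_text(href.group(1)) + "</style>"

    html = SCRIPT_RE.sub(script_repl, html)
    return LINK_RE.sub(link_repl, html)
```

For example, with the assets held in a dictionary, `inline_assets(page, files.__getitem__)` returns the page with each referenced file embedded. In a real tool `read_text` would download or read the file and resolve relative URLs, which this sketch leaves out.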

For example, take a GIF image file referenced in CSS by the "url()" function.

  1. Download the image from its URL.
  2. Encode the image as Base64.
  3. Replace "url('https://en.wikipedia.org/wiki/File:TPB_Magnet_Icon.gif')" with "url('')", placing between the quotes the Base64-encoded GIF image prefixed by "data:image/gif;base64,".

You can do the same thing for the value of a "src" attribute. This approach also works for other binary files; you must adapt the "data:" prefix (the MIME type) to match the encoded object.
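
The three steps above can be sketched in Python roughly as follows. The names `to_data_uri`, `inline_css_urls`, and the `fetch` callback (which returns the raw bytes and MIME type for a URL) are hypothetical and only serve this illustration; how the file is actually downloaded is left to the caller. The regular expression handles plain `url(...)` references with or without quotes and is an assumption about typical CSS, not a full CSS parser.

```python
import base64
import re
from typing import Callable, Tuple


def to_data_uri(data: bytes, mime: str) -> str:
    """Step 2: Base64-encode the bytes and add the "data:" prefix."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# url(...) with optional matching single or double quotes
CSS_URL_RE = re.compile(r"url\(\s*(['\"]?)([^'\")]+)\1\s*\)")


def inline_css_urls(css: str, fetch: Callable[[str], Tuple[bytes, str]]) -> str:
    """Step 3: replace every url(...) target with an embedded data URI."""

    def repl(match: "re.Match[str]") -> str:
        target = match.group(2).strip()
        if target.startswith("data:"):
            return match.group(0)  # already embedded, leave as-is
        data, mime = fetch(target)  # step 1: caller downloads the file
        return f"url('{to_data_uri(data, mime)}')"

    return CSS_URL_RE.sub(repl, css)
```

For instance, `to_data_uri(b"GIF89a", "image/gif")` gives `data:image/gif;base64,R0lGODlh`. The same `to_data_uri` helper could be reused for "src" attribute values by passing the matching MIME type, such as `image/png` or `image/jpeg`.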

Answer by Kingsley

Late answer: The "SingleFile" Plugin for Firefox works well. Even big documents like Project Gutenberg books with images save well.

Ref: https://addons.mozilla.org/en-US/firefox/addon/single-file/
