Sunday 9 April 2023

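SingleFile: save a complete web page as a single HTML file

SingleFile is a web extension for Firefox and Chrome that saves a complete web page into a single HTML file.

We all know that right-clicking a page and choosing "Save as" stores it locally. But if you save only the HTML, every resource the page references is still an external link.

People usually save a page because they worry it may become unreachable later, whether because of an unstable network or for some other reason. Otherwise a bookmark would be enough.

Saving the complete web page does download all of the page's resources, but the result is awkward to carry around and to store.

It usually comes out as an HTML file plus a folder, which is unsuitable in many situations. To share it with someone, for example, you first have to pack it into an archive.

A single HTML file, by contrast, can be sent by email, uploaded to your own server or put on a cloud drive, and whoever downloads it can open it directly.

Some readers may ask: why not just save the page as a PDF? PDF does not work well when the page has a tree-style table of contents or anchor links.

SingleFile produces one HTML page that also contains the styles, images and other static resources, so the saved page displays intact offline.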
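One common way to fold an image into an HTML file is a base64 `data:` URI. The sketch below illustrates that general technique only. It is not taken from SingleFile's source, and `to_data_uri` and `inline_image` are hypothetical helper names.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data: URI that can stand in for an external URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image(html: str, src: str, data: bytes, mime_type: str) -> str:
    """Replace one src="..." reference with the embedded resource."""
    return html.replace(f'src="{src}"', f'src="{to_data_uri(data, mime_type)}"')


if __name__ == "__main__":
    page = '<img src="logo.png">'
    # Placeholder bytes standing in for the contents of a real image file.
    print(inline_image(page, "logo.png", b"\x89PNG", "image/png"))
```

After this step the page no longer needs `logo.png` next to it, which is what makes a one-file archive possible.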

https://github.com/gildas-lormeau/SingleFile

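The same author also develops SingleFileZ, an advanced variant of SingleFile. It saves the page as an .html file that opens directly in Google Chrome and can also be unzipped.

Open the file with an archive tool and it extracts into much the same files you would get from the browser's "save complete web page" option.

( The SingleFileZ browser extension

SingleFileZ is a browser extension for Chrome and Firefox that packs a complete web page into a compressed HTML file. It packs the page into a zip archive, wraps that archive in a single HTML file, and adds a script that lets the browser extract it by itself. The HTML file generated by SingleFileZ can also be opened directly with an archive tool. SingleFileZ is released under the AGPL v3 open source license.

Note: Chrome must allow the extension to access file URLs, or be started with the --allow-file-access-from-files switch. JavaScript must also be allowed to run.

https://github.com/gildas-lormeau/SingleFileZ )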

---------------------------------------------------------

 Web Extension and CLI tool for saving a faithful copy of an entire web page in a single HTML file.

SingleFile

SingleFile is a Web Extension (and a CLI tool) compatible with Chrome, Firefox (Desktop and Mobile), Microsoft Edge, Safari, Vivaldi, Brave, Waterfox, Yandex browser, and Opera. It helps you to save a complete web page into a single HTML file.

Install

SingleFile can be installed on:

You can also download the zip file (https://github.com/gildas-lormeau/SingleFile/archive/master.zip) of the project and install it manually by unzipping it somewhere on your disk and following these instructions:

Getting started

  • Click on the SingleFile button in the extension toolbar to save the page.
  • You can click again on the button to cancel the action when processing a page.

Additional notes

  • Open the context menu by right-clicking the SingleFile button in the extension toolbar or on the webpage. It allows you to save:
    • the current tab,
    • the selected content,
    • the selected frame.
  • You can also process multiple tabs in one click and save:
    • the selected tabs,
    • the unpinned tabs,
    • all the tabs.
  • Select "Annotate and save the page..." in the context menu to:
    • highlight text,
    • add notes,
    • remove content.
  • The context menu also allows you to activate the auto-save of:
    • the current tab,
    • the unpinned tabs,
    • all the tabs.
  • With auto-save active, pages are automatically saved every time after being loaded (or before being unloaded if not).
  • Right-click on the SingleFile button and select "Manage extension" (Firefox) / "Options" (Chrome) to open the options page.
  • Enable the option "Destination > save to Google Drive" or "Destination > upload to GitHub" to upload pages to Google Drive or GitHub respectively.
  • Enable the option "Misc. > add proof of existence" to prove the existence of saved pages by linking the SHA256 of the pages into the blockchain.
  • You can use the customizable shortcut Ctrl+Shift+Y to save the current tab or the selected tabs. Go to about:addons and select "Manage extension shortcuts" in the cogwheel menu to change it in Firefox. Go to chrome://extensions/shortcuts to change it in Chrome.
  • The default save folder is the download folder configured in your browser, cf. about:addons in Firefox and chrome://settings in Chrome.
  • See the extension help in the options page for more detailed information about the options and technical notes.
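The "proof of existence" note above relies on the SHA256 of the saved page. How the extension links that hash into a blockchain is not described here, so the sketch below covers only the hashing step: it computes the SHA256 of a saved file so you can compare it with a recorded value. `sha256_of_file` and the file name `saved_page.html` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    saved = Path("saved_page.html")  # hypothetical file name
    if saved.exists():
        print(sha256_of_file(saved))
```

Any change to the file, even a single byte, produces a different digest, which is why a published digest can serve as evidence that a given copy existed at a given time.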

FAQ

See https://github.com/gildas-lormeau/SingleFile/blob/master/faq.md

Release notes

See https://addons.mozilla.org/firefox/addon/single-file/versions/

Known Issues

  • All browsers:
    • For security reasons, you cannot save pages hosted on https://chrome.google.com, https://addons.mozilla.org and some other Mozilla domains. When this happens, 🛇 is displayed on top of the SingleFile icon.
    • For security reasons, SingleFile is sometimes unable to save the image representation of canvas and snapshots of video elements.
    • The last saved path cannot be remembered by default. To circumvent this limitation, disable the option "Misc > save pages in background".
    • The following characters are replaced with _ in file names: ~, +, \, ?, %, *, :, |, ", <, >
  • Chromium-based browsers:
    • You must enable the option "Allow access to file URLs" in the extension page to display the infobar when viewing a saved page, and to save or to annotate a page stored on the filesystem.
    • If the file name of a saved page looks like "56833935-156b-4d8c-a00f-19599c6513d3.html", disable the option "Misc > save pages in background". Reinstalling the browser may also fix this issue. You can find more info about this bug here.
    • Disabling the option "File name > open the "Save as" dialog to confirm the file name" will work if and only if the option "Ask where to save each file before downloading" is disabled in chrome://settings/downloads.
  • Firefox:
    • The "File name > file name conflict resolution" option does not work if set to "prompt for a name"
    • Sometimes, SingleFile is unable to save the contents of sandboxed iframes because of this bug.
    • When processing a page from the filesystem, external resources (e.g. images, stylesheets, fonts etc.) will not be embedded into the saved page. You can find more info about this bug here. This bug has been closed by Mozilla as "WontFix". But there is a simple workaround proposed here.
  • Waterfox Classic:
    • User interface elements displayed in the page (progress bar, logs panel) won't be displayed unless dom.webcomponents.enabled is enabled in about:config.
    • When opening pages saved with the option "Images > group duplicate images together" enabled, some duplicate images might not be displayed. It is recommended to disable this option.
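The file-name rule listed under "All browsers" above can be mimicked in a few lines, for instance to predict the name a saved page will get. This is a sketch of the rule as stated in that note, not the extension's own code, and `sanitize_file_name` is a hypothetical helper.

```python
# Characters the known-issues note says are replaced with "_" in file names.
REPLACED_CHARS = '~+\\?%*:|"<>'

# Translation table mapping each listed character to an underscore.
_TABLE = str.maketrans({ch: "_" for ch in REPLACED_CHARS})


def sanitize_file_name(name: str) -> str:
    """Replace every listed character with an underscore."""
    return name.translate(_TABLE)


if __name__ == "__main__":
    print(sanitize_file_name('Q&A: what is "SingleFile"?.html'))
```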

Troubleshooting unknown issues

Please follow these steps if you find an unknown issue:

  • Save the page in incognito.
  • If saving the page in incognito did not fix the issue, reset the SingleFile options.
  • If resetting options did not fix the issue, restart the browser.
  • If restarting the browser did not fix the issue, try to disable all other extensions to see if there is a conflict.
  • If there is a conflict then try to determine against which extension(s).
  • Please report the issue with a short description on how to reproduce it here: https://github.com/gildas-lormeau/SingleFile/issues.

Command Line Interface (SingleFile CLI)

You can save web pages to HTML from the command line interface. See here for more info: https://github.com/gildas-lormeau/single-file-cli.
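For scripted use, the CLI can be driven from Python. The sketch below assumes that a `single-file` executable is installed and on your PATH, and that it accepts a URL followed by an output file name. Check the repository above for the options your version actually supports. `build_command` and `save_page` are hypothetical helpers.

```python
import shutil
import subprocess


def build_command(url: str, output: str, executable: str = "single-file") -> list[str]:
    """Build the argument list: executable, page URL, output file (assumed order)."""
    return [executable, url, output]


def save_page(url: str, output: str) -> None:
    """Run the CLI once; raises if the tool is missing or exits with an error."""
    if shutil.which("single-file") is None:
        raise FileNotFoundError("single-file executable not found on PATH")
    subprocess.run(build_command(url, output), check=True)


if __name__ == "__main__":
    # Only prints the command, so the example runs without the CLI installed.
    print(build_command("https://example.com", "example.html"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.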

Integration with user scripts

You can execute a user script just before (and after) SingleFile saves a page. For more info, see https://github.com/gildas-lormeau/SingleFile/wiki/How-to-execute-a-user-script-before-a-page-is-saved.

SingleFileZ

SingleFileZ is a fork of SingleFile that allows you to save a webpage as a self-extracting HTML file. This HTML file is also a valid ZIP file which contains the resources (images, fonts, stylesheets and frames) of the saved page. This ZIP file can be unzipped on the filesystem in order, for example, to view the page in a browser that would not support pages saved with SingleFileZ.

More info here: https://github.com/gildas-lormeau/SingleFileZ
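Because a file saved with SingleFileZ is also a valid ZIP file, a standard ZIP library should be able to list what it contains. The sketch below uses Python's `zipfile` module, which tolerates extra data placed before the archive. The demo file is a stand-in built in memory (some HTML bytes followed by a small ZIP), and the exact layout of real SingleFileZ files may differ. `list_embedded_files` and `make_demo_file` are hypothetical helpers.

```python
import io
import zipfile


def list_embedded_files(source) -> list[str]:
    """Return the names stored in a ZIP archive, even if other data precedes it."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def make_demo_file() -> bytes:
    """Build a stand-in file: HTML bytes followed by a small ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.html", "<p>hello</p>")
        archive.writestr("images/logo.txt", "placeholder")
    return b"<!DOCTYPE html><!-- loader -->\n" + buffer.getvalue()


if __name__ == "__main__":
    print(list_embedded_files(io.BytesIO(make_demo_file())))
```

To try it on a real saved page, pass the path of the .html file to `list_embedded_files` instead of the in-memory stand-in.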

File format comparison


Formats compared: HTML (SingleFile), HTML (SingleFileZ), MAFF, MHTML, Webarchive (Safari), HTML+folder.

Criteria:

  • Pages are saved as a single file
  • HTML and styles are minified
  • Unused HTML and styles are removed from files
  • Binary resources are not encoded in base 64
  • Files are compressed
  • Files can be viewed without installing any extension: ✓¹ ✓² ✓³
  • Files can be viewed without running JavaScript
  • Files can be unzipped to extract resources and view pages: n/a
  • Files contain the text of the page (plain or formatted) which can be indexed: ✓⁴

Footnotes:

¹ A switch must be passed from the command line in Chromium-based browsers, and an option must be enabled in Safari.

² Only in Chromium-based browsers, and Internet Explorer.

³ Only in Safari.

⁴ An option must be enabled in the extension.

Privacy Policy

See https://github.com/gildas-lormeau/SingleFile/blob/master/privacy.md

from  https://github.com/gildas-lormeau/SingleFile 

------------------------------------------------------------------

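Saving a page for offline use with right-click "Save as" does not handle images and other assets well, and the saved copy sometimes shows all sorts of problems when opened.

As it happens, I found an open source browser extension on GitHub, SingleFile, that solves this problem very well.

Project overview

SingleFile is a browser extension that supports Chrome, Firefox, Edge and other common browsers.

With one click it saves the current page as a single file, and it handles images and other assets properly when the file is opened offline.

SingleFile currently has 8K+ stars, and the Chrome Web Store shows more than 100,000 users.

Usage

1. Save a page

After installation, right-click on the page to save it as an HTML file, which is downloaded automatically to your local disk.

2. Save several pages at once

Hold Ctrl and click each tab you want to save.

Then click the SingleFile icon, click the three dots, and choose "Save tabs" > "Save selected tabs".

The extension also supports advanced features such as auto-save, scheduled saving and custom settings, which you can explore on your own.

Summary

Overall, SingleFile is simple to operate and easy to pick up. One click covers archiving, bookmarking and search.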