
What's the best "file format" for saving complete web pages (images, etc.) in a single archive? [closed]

I'm working on a project which stores single images and text files in one place, like a time capsule. Now, almost every kind of document can be saved as one file, like DOC, PPT, and ODF. But complete web pages can't -- they're saved as a separate HTML file and data folder. I want to save a web page in a single archive, and while there are several solutions, there's no "standard". Which is the best format for HTML archives?

  • Microsoft has MHTML -- basically a file encoded exactly as a MIME HTML email message. It's based on an existing standard, and MHTML itself was proposed as RFC 2557. This is a great idea and it's been around forever, except it's been a "proposed standard" since 1999. Plus, implementations other than IE's are cumbersome: IE and Opera support it natively, while Firefox and Safari need an extension.

  • Mozilla has Mozilla Archive Format -- basically a ZIP file with the markup and images, with metadata saved as RDF. It's an awesome idea -- Winamp does this for skins, and ODF and OOXML do it for their embedded images. I love this, except that 1. nobody except Mozilla uses it, and 2. the only extension supporting it hasn't been updated since Firefox 1.5.

  • Data URIs are becoming more popular. Instead of referencing an external location a la MHTML or MAF, you encode the file straight into the HTML markup as base64. Depending on your view, it's streamlined since the files are right where the markup is. However, support is still somewhat weak. Firefox, Opera, and Safari support it without gaffes; IE, the market leader, only started supporting it at IE8, and even then with limits.

  • Then of course, there's "Save complete webpage" where the HTML markup is saved as "savedpage.html" and the files in a separate "savedpage_files" folder. Afaik, everyone does this. It's well supported. But having to handle two separate elements is not simple and streamlined at all. My project needs to have them in a single archive.
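To make the MHTML option a bit more concrete, here is a minimal sketch that uses Python's standard `email` package to build a `multipart/related` message, with the HTML as the first part and each image attached under a `Content-Location` header. The `build_mhtml` helper, the subject line, and the assumption that every image is a PNG are all hypothetical, and browsers may expect more headers than this sketch sets.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(html, page_url, images):
    """Pack HTML and images into one multipart/related (MHTML-style) message.

    `images` maps each image URL, as referenced in the HTML, to its raw bytes.
    Hypothetical helper: assumes every image is a PNG.
    """
    root = MIMEMultipart("related", type="text/html")
    root["Subject"] = "Saved page"

    # The HTML document is the first (root) part of the message.
    html_part = MIMEText(html, "html", "utf-8")
    html_part["Content-Location"] = page_url
    root.attach(html_part)

    # Each resource is attached under the URL the markup refers to.
    for url, data in images.items():
        image_part = MIMEImage(data, _subtype="png")
        image_part["Content-Location"] = url
        root.attach(image_part)

    return root.as_bytes()
```

Writing the returned bytes to a file with an `.mht` extension would give a single-file archive in this style. Whether a given browser opens it cleanly is something to check per browser.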

Keeping in mind browser support and ease of editing the page, what do you think's the best way to save web pages in a single archive? What would be best as a "standard"? Or should I just buckle down and deal with the HTML file and separate folder? For the sake of my project, I could support that, but I'd rather avoid it.

Marco asked Nov 03 '08 21:11





2 Answers

My favourite is the ZIP format, because:

  • It is very well suited for the purpose
  • It is well documented
  • There are a lot of implementations available for creating or reading ZIP files
  • A user can easily extract single files, change them and put them back in the archive
  • Almost every major operating system (Windows, Mac and most Linux distributions) has a ZIP program built in
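As a rough illustration of how little code the ZIP route needs, here is a minimal sketch using Python's standard `zipfile` module. The `pack_page` and `read_member` helpers are hypothetical names, and the layout (an HTML file next to its assets folder, as browsers produce with "Save complete webpage") is an assumption.

```python
import zipfile
from pathlib import Path


def pack_page(html_file, assets_dir, archive):
    """Store an HTML file and its assets folder in a single ZIP archive."""
    html_path = Path(html_file)
    assets_path = Path(assets_dir)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(html_path, arcname=html_path.name)
        for file in sorted(assets_path.rglob("*")):
            if file.is_file():
                # Keep the folder name so relative links in the HTML still resolve.
                arcname = file.relative_to(assets_path.parent).as_posix()
                zf.write(file, arcname=arcname)


def read_member(archive, name):
    """Return the bytes of one file from the archive without extracting the rest."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)
```

For example, `pack_page("savedpage.html", "savedpage_files", "savedpage.zip")` would produce one archive holding `savedpage.html` and everything under `savedpage_files/`.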

The alternatives all have some flaw:

  • With MHTML, you cannot easily edit the contents.
  • With data URIs, I don't know how difficult the implementation would be. (With ZIP, even I could do it in PHP, 3 years ago...)
  • The option to store things as separate files just has far too many things that could go wrong and mess up your archive.
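On the data URI point, the encoding step itself may be fairly small. The sketch below uses Python's standard `base64` and `mimetypes` modules. The `to_data_uri` and `inline_images` helpers are hypothetical, and the plain string replacement only handles double-quoted `src` attributes, so a real implementation would probably want an HTML parser.

```python
import base64
import mimetypes


def to_data_uri(data, filename):
    """Encode raw bytes as a base64 data URI, guessing the MIME type from the name."""
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def inline_images(html, images):
    """Replace each src="path" in the markup with the matching data URI.

    `images` maps the path used in the HTML to the file's raw bytes.
    Naive sketch: only exact, double-quoted src attributes are rewritten.
    """
    for path, data in images.items():
        html = html.replace(f'src="{path}"', f'src="{to_data_uri(data, path)}"')
    return html
```

Base64 makes the embedded data roughly a third larger than the original file, which is worth weighing against the convenience of a single HTML file.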
Treb answered Sep 23 '22 16:09



PDFs are supported on nearly all browsers on nearly all platforms and store content and images in a single file. They can be edited with the right tools. This is almost definitely not ideal, but it's an option to consider.

Joel Anair answered Sep 23 '22 16:09
