Short form:

.zip is an archive format using, usually, the Deflate compression method. The .gz gzip format is for single files, also using the Deflate compression method. Often gzip is used in combination with tar to make a compressed archive format, .tar.gz. The zlib library provides Deflate compression and decompression code for use by zip, gzip, png (which uses the zlib wrapper on deflate data), and many other applications.

Long form:

The ZIP format was developed by Phil Katz as an open format with an open specification, where his implementation, PKZIP, was shareware. It is an archive format that stores files and their directory structure, where each file is individually compressed. The file type is .zip. The files, as well as the directory structure, can optionally be encrypted.

The ZIP format supports several compression methods:

    0 - The file is stored (no compression)
    1 - The file is Shrunk
    2 - The file is Reduced with compression factor 1
    3 - The file is Reduced with compression factor 2
    4 - The file is Reduced with compression factor 3
    5 - The file is Reduced with compression factor 4
    6 - The file is Imploded
    7 - Reserved for Tokenizing compression algorithm
    8 - The file is Deflated
    9 - Enhanced Deflating using Deflate64(tm)
   10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
   11 - Reserved by PKWARE
   12 - File is compressed using BZIP2 algorithm
   13 - Reserved by PKWARE
   14 - LZMA
   15 - Reserved by PKWARE
   16 - IBM z/OS CMPSC Compression
   17 - Reserved by PKWARE
   18 - File is compressed using IBM TERSE (new)
   19 - IBM LZ77 z Architecture 
   20 - deprecated (use method 93 for zstd)
   93 - Zstandard (zstd) Compression 
   94 - MP3 Compression 
   95 - XZ Compression 
   96 - JPEG variant
   97 - WavPack compressed data
   98 - PPMd version I, Rev 1
   99 - AE-x encryption marker (see APPENDIX E)

Methods 1 to 7 are historical and are not in use. Methods 9 through 98 are relatively recent additions and are in varying, small amounts of use. The only method in truly widespread use in the ZIP format is method 8, Deflate, and to some smaller extent method 0, which is no compression at all. Virtually every .zip file that you will come across in the wild will use exclusively methods 8 and 0, likely just method 8. (Method 8 also has a means to effectively store the data with no compression and relatively little expansion, and Method 0 cannot be streamed whereas Method 8 can be.)
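
As a minimal sketch of how those method numbers show up in practice, the following uses Python's standard zipfile module to write one stored and one deflated entry, then reads back the method code recorded for each. The entry names and the payload are made up for illustration.

```python
import io
import zipfile

# Illustrative payload: repetitive text that Deflate compresses well.
payload = b"the quick brown fox jumps over the lazy dog\n" * 200

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Each entry carries its own compression method.
    zf.writestr("stored.txt", payload, compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", payload, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buf) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    sizes = {info.filename: info.compress_size for info in zf.infolist()}

for name in methods:
    print(name, "method", methods[name], "compressed size", sizes[name])
```

The module's constants ZIP_STORED and ZIP_DEFLATED correspond to methods 0 and 8 in the list above.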

The ISO/IEC 21320-1:2015 standard for file containers defines a restricted zip format, such as is used in Java archive files (.jar), Office Open XML files (Microsoft Office .docx, .xlsx, .pptx), OpenDocument Format files (.odt, .ods, .odp), and EPUB files (.epub). That standard limits the compression methods to 0 and 8 and imposes other constraints, such as no encryption or signatures.

Around 1990, the Info-ZIP group wrote portable, free, open-source implementations of zip and unzip utilities, supporting compression with the Deflate format, and decompression of that and the earlier formats. This greatly expanded the use of the .zip format.

In the early '90s, the gzip format was developed as a replacement for the Unix compress utility, derived from the Deflate code in the Info-ZIP utilities. Unix compress was designed to compress a single file or stream, appending a .Z to the file name. compress uses the LZW compression algorithm, which at the time was under patent and its free use was in dispute by the patent holders. Though some specific implementations of Deflate were patented by Phil Katz, the format was not, and so it was possible to write a Deflate implementation that did not infringe on any patents. That implementation has not been so challenged in the last 20+ years. The Unix gzip utility was intended as a drop-in replacement for compress, and in fact is able to decompress compress-compressed data (assuming that you were able to parse that sentence). gzip appends a .gz to the file name. gzip uses the Deflate compressed data format, which compresses quite a bit better than Unix compress, has very fast decompression, and adds a CRC-32 as an integrity check for the data. The header format also permits the storage of more information than the compress format allowed, such as the original file name and the file modification time.
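
To make the header and trailer fields mentioned above concrete, here is a small sketch that writes a gzip stream with Python's standard gzip module and picks the fields apart by hand. The file name notes.txt and the timestamp are arbitrary values chosen for the example, and the parsing assumes the header has no optional extra field (true for what this module writes here).

```python
import gzip
import io
import struct
import zlib

data = b"hello, gzip\n" * 50
mtime = 1_000_000_000  # arbitrary fixed timestamp, chosen for illustration

buf = io.BytesIO()
# "notes.txt" is a hypothetical original file name stored in the header.
with gzip.GzipFile(filename="notes.txt", mode="wb", fileobj=buf, mtime=mtime) as gz:
    gz.write(data)
blob = buf.getvalue()

# Fixed 10-byte header: magic (2), method (1), flags (1), mtime (4), xfl (1), os (1).
magic, method, flags, stored_mtime = struct.unpack("<2sBBI", blob[:8])
has_name = bool(flags & 0x08)  # FNAME flag
# The NUL-terminated original file name follows the fixed header.
name = blob[10:blob.index(b"\x00", 10)].decode("latin-1") if has_name else None

# 8-byte trailer: CRC-32 of the uncompressed data, then its length mod 2**32.
crc, isize = struct.unpack("<II", blob[-8:])

print(magic.hex(), method, name, stored_mtime, hex(crc), isize)
```

The method byte is 8, the same Deflate method number that the ZIP format uses.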

Though compress only compresses a single file, it was common to use the tar utility to create an archive of files, their attributes, and their directory structure into a single .tar file, and then compress it with compress to make a .tar.Z file. In fact, the tar utility had and still has the option to do the compression at the same time, instead of having to pipe the output of tar to compress. This all carried forward to the gzip format, and tar has an option to compress directly to the .tar.gz format. The .tar.gz format compresses better than the .zip approach, since the compression of a .tar can take advantage of redundancy across files, especially many small files. .tar.gz is the most common archive format in use on Unix due to its very high portability, but there are more effective compression methods in use as well, so you will often see .tar.bz2 and .tar.xz archives.
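
The cross-file redundancy point can be checked with a rough experiment. The sketch below packs the same set of small, similar files into a .zip and a .tar.gz in memory using Python's standard library and compares the sizes. The synthetic file set is an assumption picked to make the effect obvious; real data will show a smaller or larger gap.

```python
import io
import tarfile
import zipfile

# Synthetic input: 200 small files with very similar content (an assumption
# chosen to make cross-file redundancy obvious).
files = {
    f"rec/{i:03d}.json": f'{{"id": {i}, "status": "ok", "tags": ["a", "b"]}}\n'.encode()
    for i in range(200)
}

# .zip: every entry is deflated on its own.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, body in files.items():
        zf.writestr(name, body)

# .tar.gz: the whole tar stream is deflated as one unit.
tgz_buf = io.BytesIO()
with tarfile.open(fileobj=tgz_buf, mode="w:gz") as tf:
    for name, body in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(body)
        tf.addfile(info, io.BytesIO(body))

zip_size = len(zip_buf.getvalue())
tgz_size = len(tgz_buf.getvalue())
print("zip:", zip_size, "bytes; tar.gz:", tgz_size, "bytes")
```

For inputs like this, the per-entry headers and the independently compressed entries in the .zip dominate its size, while the tar headers and repeated content compress away in the single gzip stream.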

Unlike .tar, .zip has a central directory at the end, which provides a list of the contents. Together with the per-entry compression, that provides random access to the individual entries in a .zip file. A compressed .tar file, by contrast, has to be decompressed and scanned from start to end in order to build a directory, which is how the contents of a .tar file are listed.
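
A short sketch of that random access, using Python's standard zipfile module: it lists the archive from the central directory and extracts a single entry without touching the others. The entry names are invented for the example, and the check on the last 22 bytes assumes a small archive with no archive comment and no Zip64 records.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for i in range(5):
        zf.writestr(f"part{i}.txt", f"contents of part {i}\n" * 100)
raw = buf.getvalue()

# With no archive comment, the end-of-central-directory record is the last
# 22 bytes of the file and starts with the signature PK\x05\x06.
eocd_signature = raw[-22:-18]

with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    listing = zf.namelist()          # read from the central directory
    only_one = zf.read("part3.txt")  # seeks to and inflates just this entry

print(eocd_signature, listing, len(only_one))
```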

Shortly after the introduction of gzip, around the mid-1990s, the same patent dispute called into question the free use of the .gif image format, very widely used on bulletin boards and the World Wide Web (a new thing at the time). So a small group created the PNG losslessly compressed image format, with file type .png, to replace .gif. That format also uses the Deflate format for compression, which is applied after filters on the image data expose more of the redundancy. In order to promote widespread usage of the PNG format, two free code libraries were created: libpng and zlib. libpng handled all of the features of the PNG format, and zlib provided the compression and decompression code for use by libpng, as well as for other applications. zlib was adapted from the gzip code.

All of the mentioned patents have since expired.

The zlib library supports Deflate compression and decompression, and three kinds of wrapping around the deflate streams. Those are no wrapping at all ("raw" deflate), zlib wrapping, which is used in the PNG format data blocks, and gzip wrapping, to provide gzip routines for the programmer. The main difference between zlib and gzip wrapping is that the zlib wrapping is more compact, six bytes vs. a minimum of 18 bytes for gzip, and the integrity check, Adler-32, runs faster than the CRC-32 that gzip uses. Raw deflate is used by programs that read and write the .zip format, which is another format that wraps around deflate compressed data.
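
The three wrappers can be compared directly through Python's zlib module, which exposes zlib's windowBits parameter as wbits. This is a minimal sketch; the helper name deflate and the sample data are made up for the example.

```python
import zlib

data = b"wrapping example " * 100

def deflate(payload: bytes, wbits: int) -> bytes:
    """Compress payload with the wrapper selected by wbits."""
    comp = zlib.compressobj(level=6, wbits=wbits)
    return comp.compress(payload) + comp.flush()

raw = deflate(data, -15)      # negative wbits: raw deflate, no header or trailer
zwrapped = deflate(data, 15)  # 8..15: zlib header plus Adler-32 trailer
gwrapped = deflate(data, 31)  # 16 + 15: gzip header plus CRC-32 and length trailer

zlib_overhead = len(zwrapped) - len(raw)
gzip_overhead = len(gwrapped) - len(raw)
print("zlib adds", zlib_overhead, "bytes; gzip adds", gzip_overhead, "bytes")
```

The same deflate stream sits inside both wrapped outputs; only the bytes around it differ.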

zlib is now in wide use for data transmission and storage. For example, most HTTP transactions by servers and browsers compress and decompress the data using zlib. Specifically, the HTTP header Content-Encoding: deflate means the deflate compression method wrapped inside the zlib data format.
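
As an illustration of that naming, here is a sketch of a body decoder keyed on the Content-Encoding value. The function decode_body is a hypothetical helper, not part of any HTTP library, and it only covers the two Deflate-based encodings plus the unencoded case.

```python
import gzip
import zlib

def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decode an HTTP body for the two Deflate-based Content-Encoding values."""
    encoding = content_encoding.strip().lower()
    if encoding == "gzip":
        return zlib.decompress(body, wbits=31)  # gzip wrapper
    if encoding == "deflate":
        return zlib.decompress(body, wbits=15)  # zlib wrapper, not raw deflate
    if encoding in ("", "identity"):
        return body
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")

page = b"<html><body>hello</body></html>" * 20
print(len(decode_body(zlib.compress(page), "deflate")))
print(len(decode_body(gzip.compress(page), "gzip")))
```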

Different implementations of deflate can result in different compressed output for the same input data, as evidenced by the existence of selectable compression levels that allow trading off compression effectiveness for CPU time. zlib and PKZIP are not the only implementations of deflate compression and decompression. Both the 7-Zip archiving utility and Google's zopfli library have the ability to use much more CPU time than zlib in order to squeeze out the last few bits possible when using the deflate format, reducing compressed sizes by a few percent as compared to zlib's highest compression level. The pigz utility, a parallel implementation of gzip, includes the option to use zlib (compression levels 1-9) or zopfli (compression level 11), and somewhat mitigates the time impact of using zopfli by splitting the compression of large files over multiple processors and cores.
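
The effect of compression levels is easy to observe with a single implementation. The sketch below compresses one synthetic input at several zlib levels through Python's zlib module and confirms that every result inflates to the same bytes. The input is an arbitrary example, and the exact sizes printed depend on the zlib build in use.

```python
import zlib

text = b"".join(b"line %d: the same words keep coming back\n" % i for i in range(2000))

# Compress the same input at several zlib levels (0 = stored blocks only).
outputs = {level: zlib.compress(text, level) for level in (0, 1, 6, 9)}

for level, blob in outputs.items():
    print(f"level {level}: {len(blob)} bytes")

# Every variant is a valid zlib stream that inflates to the identical input.
roundtrips = all(zlib.decompress(blob) == text for blob in outputs.values())
print("all round-trip:", roundtrips)
```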

Answer from Mark Adler on Stack Overflow

2 of 3
65

ZIP is a file format used for storing an arbitrary number of files and folders together with lossless compression. It makes no strict assumptions about the compression methods used, but is most frequently used with DEFLATE.

Gzip is both a compression utility built on DEFLATE, which was less encumbered by patents than earlier methods, and a file format for storing a single compressed file. Combined with tar, it supports compressing an arbitrary number of files and folders. The resulting file has an extension of .tgz or .tar.gz and is commonly called a tarball.

zlib is a library of functions encapsulating DEFLATE in its most common LZ77 incarnation.

Top answer
1 of 1
17

I had a use case where I needed to pack a bunch of files into one

Ah, you need an archive of files

And all above commands does the same.

Not at all! Some are archivers, some are compressors, some are decompressors, some a combination.

  • ar: very archaic, use cases are very specific. Pretty certain you don't ever want to use ar yourself.
  • gzip / gunzip: Not an archiver. Can take a single stream of data and compress it (or decompress it, in the case of gunzip). You can use this together with an archiver. Gzip is very old, slow and inefficient; there are alternatives that achieve much higher compression or higher speed, or any mixture of the two (e.g., zstd, lz4).
  • tar: Short for tape archiver; a very common archiver that you can also tell to compress stuff. For example:
tar cf archive.tar file1 file2 file3

creates an uncompressed archive containing file1, file2 and file3. However, adding the z option to the create command (I know, tar's syntax is hellish):

tar czf archive.tar.gz file1 file2 file3

will make tar use gzip internally and create a tar archive that's been compressed.
You can also just pipe the result through any compressor of your choice to get compressed archives, e.g.

tar cf - file1 file2 file3 | gzip > archive.tar.gz # or
tar cf - file1 file2 file3 | zstd > archive.tar.zst # or
tar cf - file1 file2 file3 | lz4 > archive.tar.lz4 # or
tar cf - file1 file2 file3 | xz > archive.tar.xz

You get the idea.
As common as tar is, it's a very old program and format(s), and it leaves a lot to be desired. But it does correctly deal with Linux file ownership, permissions, links, special files…

  • zip is a compressing archiver. Works very nicely with windows, as well, but can not deal with file permissions. Hence, not usable for backups!
  • 7z is like zip, a compressing archiver, which cannot deal with user and permission information. Hence, not usable for backups!
  • mksquashfs is a kind of an archiver, meant for very neatly packed archives, that can also be used like normal file systems. It can use modern, on request very fast or very strong compression.
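
The tar entry above says the format keeps ownership, permissions and links. A minimal sketch with Python's standard tarfile module shows those fields surviving a round trip through an in-memory archive. The member names, the numeric ids and the user and group names are all invented for the example.

```python
import io
import tarfile

body = b"#!/bin/sh\necho backup\n"

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    info = tarfile.TarInfo("bin/run.sh")  # hypothetical member name
    info.size = len(body)
    info.mode = 0o750                     # rwxr-x---
    info.uid, info.gid = 1000, 1000       # made-up numeric owner and group
    info.uname, info.gname = "alice", "staff"
    tf.addfile(info, io.BytesIO(body))

    link = tarfile.TarInfo("bin/run")     # a symbolic link entry
    link.type = tarfile.SYMTYPE
    link.linkname = "run.sh"
    tf.addfile(link)

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tf:
    members = {m.name: m for m in tf.getmembers()}

script = members["bin/run.sh"]
print(oct(script.mode), script.uid, script.uname, members["bin/run"].linkname)
```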

Now some would say you would save some time while transferring files on network using compression but unzipping and decompression compensates for the time that I would have saved in transfer.

And those people would be right! If you use a modern, speed-optimized compression, you'd be faster than reading or writing from an SSD with decompression. And much, much faster than your network would ever be (unless you are looking at datacenter-grade networking).

So, if speed is your concern, use something that makes use of a fast compressor. As said, gzip is probably not the compressor of choice in 2023, so

tar cf - srces/ | zstd -1 > archive.tar.zst

achieves an archival rate of roughly 3 Gb/s (in case you planned to put this through network, and thought the compressor would be a bottleneck) in my test that uses a mixture of source code, binary files. It makes 1.4 GB out of the original 4.97 GB. Using -2 instead of -1 makes the result another 10% smaller, and reduces the speed to 2.5 Gb/s. Which is still faster than most SATA SSDs could write. And this was single-threaded. Use zstd -2 -T0 to make use of all CPU cores, and my humble PC does 6.5 Gb/s; zstd -4 -T0 still does 2.5 Gb/s, so more than most of my network cards could do, and gets the size down to 1.2 GB :)

So:

  • Need to archive files, but fast, for sending them to other people who might not have the same software as you? tar cf - files… | zstd -4 -T0 > archive.tar.zst is what you want.
  • Need to archive files, but strongly compressed, for sending them to other people who might not have the same software as you? tar cf - files… | zstd -13 -T0 > archive.tar.zst is slower, but already gives very high compression ratios.
  • Need to archive files, want to read them later on, without having to un-archive things? mksquashfs files… archive.squashfs -comp zstd; add -Xcompression-level 4 to the end for higher speed at the expense of size.

The resulting archive.tar.zst files can be directly unarchived with modern GNU tar, tar xf archive.tar.zst; the archive.squashfs can either be mounted directly with udisksctl loop-setup -f archive.squashfs and used like a DVD (i.e., you can directly browse the files on it), or de-archived using unsquashfs archive.squashfs.
