xz.txt   xz.txt 
XZ(1) XZ Utils XZ(1) XZ(1) XZ Utils XZ(1)
NAME NAME
xz, unxz, xzcat, lzma, unlzma, lzcat - Compress or decompress .xz and xz, unxz, xzcat, lzma, unlzma, lzcat - Compress or decompress .xz and
.lzma files .lzma files
SYNOPSIS SYNOPSIS
xz [option]... [file]... xz [option...] [file...]
COMMAND ALIASES
unxz is equivalent to xz --decompress. unxz is equivalent to xz --decompress.
xzcat is equivalent to xz --decompress --stdout. xzcat is equivalent to xz --decompress --stdout.
lzma is equivalent to xz --format=lzma. lzma is equivalent to xz --format=lzma.
unlzma is equivalent to xz --format=lzma --decompress. unlzma is equivalent to xz --format=lzma --decompress.
lzcat is equivalent to xz --format=lzma --decompress --stdout. lzcat is equivalent to xz --format=lzma --decompress --stdout.
When writing scripts that need to decompress files, it is recommended When writing scripts that need to decompress files, it is recommended
to always use the name xz with appropriate arguments (xz -d or xz -dc) to always use the name xz with appropriate arguments (xz -d or xz -dc)
instead of the names unxz and xzcat. instead of the names unxz and xzcat.
skipping to change at line 317 skipping to change at line 318
is the default, since it is slightly better than CRC32 at is the default, since it is slightly better than CRC32 at
detecting damaged files and the speed difference is negligible. detecting damaged files and the speed difference is negligible.
sha256 Calculate SHA-256. This is somewhat slower than CRC32 sha256 Calculate SHA-256. This is somewhat slower than CRC32
and CRC64. and CRC64.
Integrity of the .xz headers is always verified with CRC32. It Integrity of the .xz headers is always verified with CRC32. It
is not possible to change or disable it. is not possible to change or disable it.
--ignore-check
Don't verify the integrity check of the compressed data when
decompressing. The CRC32 values in the .xz headers will still
be verified normally.
Do not use this option unless you know what you are doing.
Possible reasons to use this option:
o Trying to recover data from a corrupt .xz file.
o Speeding up decompression. This matters mostly with SHA-256
or with files that have compressed extremely well. It's
recommended to not use this option for this purpose unless the
file integrity is verified externally in some other way.
-0 ... -9 -0 ... -9
Select a compression preset level. The default is -6. If multiple Select a compression preset level. The default is -6. If multiple
preset levels are specified, the last one takes effect. preset levels are specified, the last one takes effect.
If a custom filter chain was already specified, setting a compression If a custom filter chain was already specified, setting a compression
preset level clears the custom filter chain. preset level clears the custom filter chain.
The differences between the presets are more significant than The differences between the presets are more significant than
with gzip(1) and bzip2(1). The selected compression settings with gzip(1) and bzip2(1). The selected compression settings
determine the memory requirements of the decompressor, thus determine the memory requirements of the decompressor, thus
using a too high preset level might make it painful to decom- using a too high preset level might make it painful to decom-
skipping to change at line 396 skipping to change at line 412
o CompMem contains the compressor memory requirements in the o CompMem contains the compressor memory requirements in the
single-threaded mode. It may vary slightly between xz versions. single-threaded mode. It may vary slightly between xz versions.
Memory requirements of some of the future multi- Memory requirements of some of the future multi-
threaded modes may be dramatically higher than that of the threaded modes may be dramatically higher than that of the
single-threaded mode. single-threaded mode.
o DecMem contains the decompressor memory requirements. That o DecMem contains the decompressor memory requirements. That
is, the compression settings determine the memory requirements is, the compression settings determine the memory requirements
of the decompressor. The exact decompressor memory of the decompressor. The exact decompressor memory
usage is slighly more than the LZMA2 dictionary size, but the usage is slightly more than the LZMA2 dictionary size, but the
values in the table have been rounded up to the next full values in the table have been rounded up to the next full
MiB. MiB.
-e, --extreme -e, --extreme
Use a slower variant of the selected compression preset level Use a slower variant of the selected compression preset level
(-0 ... -9) to hopefully get a little bit better compression (-0 ... -9) to hopefully get a little bit better compression
ratio, but with bad luck this can also make it worse. Decompressor ratio, but with bad luck this can also make it worse. Decompressor
memory usage is not affected, but compressor memory memory usage is not affected, but compressor memory
usage increases a little at preset levels -0 ... -3. usage increases a little at preset levels -0 ... -3.
Since there are two presets with dictionary sizes 4 MiB and Since there are two presets with dictionary sizes 4 MiB and
skipping to change at line 436 skipping to change at line 452
-6, -5e, and -6e. -6, -5e, and -6e.
--fast --fast
--best These are somewhat misleading aliases for -0 and -9, respectively. --best These are somewhat misleading aliases for -0 and -9, respectively.
These are provided only for backwards compatibility These are provided only for backwards compatibility
with LZMA Utils. Avoid using these options. with LZMA Utils. Avoid using these options.
--block-size=size --block-size=size
When compressing to the .xz format, split the input data into When compressing to the .xz format, split the input data into
blocks of size bytes. The blocks are compressed independently blocks of size bytes. The blocks are compressed independently
from each other. from each other, which helps with multi-threading and makes
limited random-access decompression possible. This option is
typically used to override the default block size in multi-threaded
mode, but this option can be used in single-threaded mode too.
In multi-threaded mode about three times size bytes will be
allocated in each thread for buffering input and output. The
default size is three times the LZMA2 dictionary size or 1 MiB,
whichever is more. Typically a good value is 2-4 times the size
of the LZMA2 dictionary or at least 1 MiB. Using size less than
the LZMA2 dictionary size is waste of RAM because then the LZMA2
dictionary buffer will never get fully used. The sizes of the
blocks are stored in the block headers, which a future version
of xz will use for multi-threaded decompression.
In single-threaded mode no block splitting is done by default.
Setting this option doesn't affect memory usage. No size
information is stored in block headers, thus files created in
single-threaded mode won't be identical to files created in
multi-threaded mode. The lack of size information also means that a
future version of xz won't be able to decompress the files in
multi-threaded mode.
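The sizing rules quoted above for the newer version can be sketched in Python. This is only an illustration of the stated arithmetic (default block size of three times the LZMA2 dictionary size or 1 MiB, whichever is more, and roughly three times size bytes of buffering per thread). The function names are hypothetical and are not part of xz or liblzma.

```python
MIB = 1024 * 1024

def default_block_size(dict_size: int) -> int:
    """Default multi-threaded block size as described in the text:
    three times the LZMA2 dictionary size or 1 MiB, whichever is more."""
    return max(3 * dict_size, MIB)

def approx_buffer_memory(block_size: int, threads: int) -> int:
    """Rough input/output buffering estimate: about three times the block
    size per thread. Encoder memory itself is not included here."""
    return 3 * block_size * threads

if __name__ == "__main__":
    size = default_block_size(8 * MIB)
    print(size // MIB, "MiB block size")
    print(approx_buffer_memory(size, 4) // MIB, "MiB of buffers for 4 threads")
```

The estimate covers only the buffering mentioned in the text. The memory used by the LZMA2 encoder in each thread comes on top of it.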
--block-list=sizes --block-list=sizes
When compressing to the .xz format, start a new block after the When compressing to the .xz format, start a new block after the
given intervals of uncompressed data. given intervals of uncompressed data.
The uncompressed sizes of the blocks are specified as a comma- The uncompressed sizes of the blocks are specified as a comma-
separated list. Omitting a size (two or more consecutive commas) separated list. Omitting a size (two or more consecutive commas)
is a shorthand to use the size of the previous block. A is a shorthand to use the size of the previous block.
special value of 0 may be used as the last value to indicate If the input file is bigger than the sum of sizes, the last
that the rest of the file should be encoded as a single block. value in sizes is repeated until the end of the file. A special
value of 0 may be used as the last value to indicate that the
Currently this option is badly broken if used together with rest of the file should be encoded as a single block.
--block-size or with multithreading.
If one specifies sizes that exceed the encoder's block size
(either the default value in threaded mode or the value
specified with --block-size=size), the encoder will create additional
blocks while keeping the boundaries specified in sizes. For
example, if one specifies --block-size=10MiB
--block-list=5MiB,10MiB,8MiB,12MiB,24MiB and the input file is
80 MiB, one will get 11 blocks: 5, 10, 8, 10, 2, 10, 10, 4, 10,
10, and 1 MiB.
In multi-threaded mode the sizes of the blocks are stored in the
block headers. This isn't done in single-threaded mode, so the
encoded output won't be identical to that of the multi-threaded
mode.
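The 11-block example can be reproduced with a short Python sketch. It models only what the text states: the last value in sizes is repeated until the end of the input, and each interval is further split at the encoder's block size. The function name is hypothetical, sizes are plain integers (MiB in the usage below), and omitted sizes and the special value 0 are not modeled.

```python
def split_blocks(input_size, block_list, block_size):
    """Return the block sizes produced for input_size units of input.

    block_list: positive interval sizes, the last one repeating.
    block_size: the encoder's maximum block size.
    """
    if not block_list or any(s <= 0 for s in block_list) or block_size <= 0:
        raise ValueError("sizes must be positive in this sketch")
    blocks = []
    remaining = input_size
    index = 0
    while remaining > 0:
        # Repeat the last listed size once the list is exhausted.
        interval = block_list[min(index, len(block_list) - 1)]
        index += 1
        interval = min(interval, remaining)
        remaining -= interval
        # Keep the boundary from the list, but split oversized intervals.
        while interval > 0:
            piece = min(interval, block_size)
            blocks.append(piece)
            interval -= piece
    return blocks

if __name__ == "__main__":
    print(split_blocks(80, [5, 10, 8, 12, 24], 10))
    # prints [5, 10, 8, 10, 2, 10, 10, 4, 10, 10, 1]
```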
--flush-timeout=timeout
When compressing, if more than timeout milliseconds (a positive
integer) has passed since the previous flush and reading more
input would block, all the pending input data is flushed from
the encoder and made available in the output stream. This can
be useful if xz is used to compress data that is streamed over a
network. Small timeout values make the data available at the
receiving end with a small delay, but large timeout values give
better compression ratio.
This feature is disabled by default. If this option is
specified more than once, the last one takes effect. The special
timeout value of 0 can be used to explicitly disable this
feature.
This feature is not available on non-POSIX systems.
This feature is still experimental. Currently xz is unsuitable
for decompressing the stream in real time due to how xz does
buffering.
--memlimit-compress=limit --memlimit-compress=limit
Set a memory usage limit for compression. If this option is Set a memory usage limit for compression. If this option is
specified multiple times, the last one takes effect. specified multiple times, the last one takes effect.
If the compression settings exceed the limit, xz will adjust the If the compression settings exceed the limit, xz will adjust the
settings downwards so that the limit is no longer exceeded and settings downwards so that the limit is no longer exceeded and
display a notice that automatic adjustment was done. Such display a notice that automatic adjustment was done. Such
adjustments are not made when compressing with --format=raw or adjustments are not made when compressing with --format=raw or
if --no-adjust has been specified. In those cases, an error is if --no-adjust has been specified. In those cases, an error is
displayed and xz will exit with exit status 1. displayed and xz will exit with exit status 1.
The limit can be specified in multiple ways: The limit can be specified in multiple ways:
o The limit can be an absolute value in bytes. Using an integer o The limit can be an absolute value in bytes. Using an integer
suffix like MiB can be useful. Example: --memlimit-compress=80MiB suffix like MiB can be useful. Example: --memlimit-compress=80MiB
o The limit can be specified as a percentage of total physical o The limit can be specified as a percentage of total physical
memory (RAM). This can be useful especially when setting the memory (RAM). This can be useful especially when setting the
XZ_DEFAULTS environment variable in a shell initialization XZ_DEFAULTS environment variable in a shell initialization
script that is shared between different computers. That way script that is shared between different computers. That way
the limit is automatically bigger on systems with more memory. the limit is automatically bigger on systems with more memory.
Example: --memlimit-compress=70% Example: --memlimit-compress=70%
o The limit can be reset back to its default value by setting o The limit can be reset back to its default value by setting
it to 0. This is currently equivalent to setting the limit it to 0. This is currently equivalent to setting the limit
to max (no memory usage limit). Once multithreading support to max (no memory usage limit). Once multithreading support
has been implemented, there may be a difference between 0 and has been implemented, there may be a difference between 0 and
max for the multithreaded case, so it is recommended to use 0 max for the multithreaded case, so it is recommended to use 0
instead of max until the details have been decided. instead of max until the details have been decided.
See also the section Memory usage. See also the section Memory usage.
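The three ways of writing a limit can be illustrated with a small Python sketch. It is not how xz parses the option. The function name and the total_ram parameter are hypothetical, only the KiB, MiB and GiB suffixes are handled, and no range check is done on the percentage.

```python
import re

# Suffix multipliers handled by this sketch (an assumption, not xz's full list).
SUFFIXES = {"": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

def resolve_memlimit(spec: str, total_ram: int):
    """Turn a limit string into bytes, or None for 'no limit'.

    '0' and 'max' both mean no limit, which the text says is
    currently equivalent.
    """
    if spec in ("0", "max"):
        return None
    if spec.endswith("%"):
        percent = int(spec[:-1])
        return total_ram * percent // 100
    match = re.fullmatch(r"(\d+)(KiB|MiB|GiB)?", spec)
    if match is None:
        raise ValueError(f"unsupported limit: {spec!r}")
    return int(match.group(1)) * SUFFIXES[match.group(2) or ""]

if __name__ == "__main__":
    ram = 8 * 1024 ** 3  # pretend the machine has 8 GiB of RAM
    for spec in ("80MiB", "70%", "0", "max"):
        print(spec, "->", resolve_memlimit(spec, ram))
```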
--memlimit-decompress=limit --memlimit-decompress=limit
Set a memory usage limit for decompression. This also affects Set a memory usage limit for decompression. This also affects
the --list mode. If the operation is not possible without the --list mode. If the operation is not possible without
exceeding the limit, xz will display an error and decompressing exceeding the limit, xz will display an error and decompressing
the file will fail. See --memlimit-compress=limit for possible the file will fail. See --memlimit-compress=limit for possible
ways to specify the limit. ways to specify the limit.
-M limit, --memlimit=limit, --memory=limit -M limit, --memlimit=limit, --memory=limit
This is equivalent to specifying --memlimit-compress=limit This is equivalent to specifying --memlimit-compress=limit
--memlimit-decompress=limit. --memlimit-decompress=limit.
--no-adjust --no-adjust
Display an error and exit if the compression settings exceed the Display an error and exit if the compression settings exceed the
memory usage limit. The default is to adjust the settings downwards memory usage limit. The default is to adjust the settings downwards
so that the memory usage limit is not exceeded. Automatic so that the memory usage limit is not exceeded. Automatic
adjusting is always disabled when creating raw streams (--format=raw). adjusting is always disabled when creating raw streams (--format=raw).
-T threads, --threads=threads -T threads, --threads=threads
Specify the number of worker threads to use. Setting threads to Specify the number of worker threads to use. Setting threads to
a special value 0 makes xz use as many threads as there are CPU a special value 0 makes xz use as many threads as there are CPU
cores on the system. The actual number of threads can be less cores on the system. The actual number of threads can be less
than threads if the input file is not big enough for threading than threads if the input file is not big enough for threading
with the given settings or if using more threads would exceed with the given settings or if using more threads would exceed
the memory usage limit. the memory usage limit.
Currently the only threading method is to split the input into Currently the only threading method is to split the input into
blocks and compress them independently from each other. The blocks and compress them independently from each other. The
default block size depends on the compression level and can be default block size depends on the compression level and can be
overridden with the --block-size=size option. overridden with the --block-size=size option.
It is possible that the details of this option change before the
next stable XZ Utils release. This may include the meaning of
the special value 0.
Custom compressor filter chains Custom compressor filter chains
A custom filter chain allows specifying the compression settings in A custom filter chain allows specifying the compression settings in
detail instead of relying on the settings associated to the preset levels. detail instead of relying on the settings associated to the presets.
When a custom filter chain is specified, the compression preset When a custom filter chain is specified, preset options (-0 ... -9 and
level options (-0 ... -9 and --extreme) are silently ignored. --extreme) earlier on the command line are forgotten. If a preset
option is specified after one or more custom filter chain options, the
new preset takes effect and the custom filter chain options specified
earlier are forgotten.
A filter chain is comparable to piping on the command line. When compressing, A filter chain is comparable to piping on the command line. When compressing,
the uncompressed input goes to the first filter, whose output the uncompressed input goes to the first filter, whose output
goes to the next filter (if any). The output of the last filter gets goes to the next filter (if any). The output of the last filter gets
written to the compressed file. The maximum number of filters in the written to the compressed file. The maximum number of filters in the
chain is four, but typically a filter chain has only one or two filters. chain is four, but typically a filter chain has only one or two filters.
Many filters have limitations on where they can be in the filter chain: Many filters have limitations on where they can be in the filter chain:
some filters can work only as the last filter in the chain, some only some filters can work only as the last filter in the chain, some only
skipping to change at line 548 skipping to change at line 619
A custom filter chain is specified by using one or more filter options A custom filter chain is specified by using one or more filter options
in the order they are wanted in the filter chain. That is, the order in the order they are wanted in the filter chain. That is, the order
of filter options is significant! When decoding raw streams (--format=raw), of filter options is significant! When decoding raw streams (--format=raw),
the filter chain is specified in the same order as it was the filter chain is specified in the same order as it was
specified when compressing. specified when compressing.
Filters take filter-specific options as a comma-separated list. Extra Filters take filter-specific options as a comma-separated list. Extra
commas in options are ignored. Every option has a default value, so commas in options are ignored. Every option has a default value, so
you need to specify only those you want to change. you need to specify only those you want to change.
To see the whole filter chain and options, use xz -vv (that is, use
--verbose twice). This works also for viewing the filter chain options
used by presets.
--lzma1[=options] --lzma1[=options]
--lzma2[=options] --lzma2[=options]
Add LZMA1 or LZMA2 filter to the filter chain. These filters Add LZMA1 or LZMA2 filter to the filter chain. These filters
can be used only as the last filter in the chain. can be used only as the last filter in the chain.
LZMA1 is a legacy filter, which is supported almost solely due LZMA1 is a legacy filter, which is supported almost solely due
to the legacy .lzma file format, which supports only LZMA1. to the legacy .lzma file format, which supports only LZMA1.
LZMA2 is an updated version of LZMA1 to fix some practical LZMA2 is an updated version of LZMA1 to fix some practical
issues of LZMA1. The .xz format uses LZMA2 and doesn't support issues of LZMA1. The .xz format uses LZMA2 and doesn't support
LZMA1 at all. Compression speed and ratios of LZMA1 and LZMA2 LZMA1 at all. Compression speed and ratios of LZMA1 and LZMA2
are practically the same. are practically the same.
LZMA1 and LZMA2 share the same set of options: LZMA1 and LZMA2 share the same set of options:
preset=preset preset=preset
Reset all LZMA1 or LZMA2 options to preset. Preset consists Reset all LZMA1 or LZMA2 options to preset. Preset consists
of an integer, which may be followed by single-letter of an integer, which may be followed by single-letter
preset modifiers. The integer can be from 0 to 9, preset modifiers. The integer can be from 0 to 9,
matching the command line options -0 ... -9. The only matching the command line options -0 ... -9. The only
supported modifier is currently e, which matches supported modifier is currently e, which matches
--extreme. The default preset is 6, from which the --extreme. If no preset is specified, the default values
default values for the rest of the LZMA1 or LZMA2 options of LZMA1 or LZMA2 options are taken from the preset 6.
are taken.
dict=size dict=size
Dictionary (history buffer) size indicates how many bytes Dictionary (history buffer) size indicates how many bytes
of the recently processed uncompressed data is kept in of the recently processed uncompressed data is kept in
memory. The algorithm tries to find repeating byte memory. The algorithm tries to find repeating byte
sequences (matches) in the uncompressed data, and replace sequences (matches) in the uncompressed data, and replace
them with references to the data currently in the dictionary. them with references to the data currently in the dictionary.
The bigger the dictionary, the higher is the The bigger the dictionary, the higher is the
chance to find a match. Thus, increasing dictionary size chance to find a match. Thus, increasing dictionary size
usually improves compression ratio, but a dictionary bigger usually improves compression ratio, but a dictionary bigger
than the uncompressed file is waste of memory. than the uncompressed file is waste of memory.
Typical dictionary size is from 64 KiB to 64 MiB. The Typical dictionary size is from 64 KiB to 64 MiB. The
minimum is 4 KiB. The maximum for compression is currently minimum is 4 KiB. The maximum for compression is currently
1.5 GiB (1536 MiB). The decompressor already supports 1.5 GiB (1536 MiB). The decompressor already supports
dictionaries up to one byte less than 4 GiB, which dictionaries up to one byte less than 4 GiB, which
is the maximum for the LZMA1 and LZMA2 stream formats. is the maximum for the LZMA1 and LZMA2 stream formats.
Dictionary size and match finder (mf) together determine Dictionary size and match finder (mf) together determine
the memory usage of the LZMA1 or LZMA2 encoder. The same the memory usage of the LZMA1 or LZMA2 encoder. The same
(or bigger) dictionary size is required for decompressing (or bigger) dictionary size is required for decompressing
that was used when compressing, thus the memory usage of that was used when compressing, thus the memory usage of
the decoder is determined by the dictionary size used the decoder is determined by the dictionary size used
when compressing. The .xz headers store the dictionary when compressing. The .xz headers store the dictionary
size either as 2^n or 2^n + 2^(n-1), so these sizes are size either as 2^n or 2^n + 2^(n-1), so these sizes are
somewhat preferred for compression. Other sizes will get somewhat preferred for compression. Other sizes will get
rounded up when stored in the .xz headers. rounded up when stored in the .xz headers.
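The rounding described in the last sentences can be sketched as follows. The function only finds the smallest value of the form 2^n or 2^n + 2^(n-1) that is not below the requested size. Its name is hypothetical, and it does not model the actual header encoding or the minimum and maximum dictionary sizes.

```python
MIB = 1024 * 1024

def stored_dict_size(size: int) -> int:
    """Smallest value of the form 2^n or 2^n + 2^(n-1) that is >= size."""
    if size < 1:
        raise ValueError("size must be positive")
    n = 1
    while True:
        # Candidates in ascending order: 2, 3, 4, 6, 8, 12, 16, 24, ...
        for candidate in (2 ** n, 2 ** n + 2 ** (n - 1)):
            if candidate >= size:
                return candidate
        n += 1

if __name__ == "__main__":
    for requested in (8 * MIB, 10 * MIB, 13 * MIB, 1536 * MIB):
        print(requested // MIB, "MiB ->", stored_dict_size(requested) // MIB, "MiB")
```

Under this rule a 10 MiB dictionary would be recorded as 12 MiB, while 8 MiB and 1536 MiB (2^30 + 2^29) are stored unchanged.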
lc=lc Specify the number of literal context bits. The minimum lc=lc Specify the number of literal context bits. The minimum
is 0 and the maximum is 4; the default is 3. In addition, is 0 and the maximum is 4; the default is 3. In addition,
the sum of lc and lp must not exceed 4. the sum of lc and lp must not exceed 4.
All bytes that cannot be encoded as matches are encoded All bytes that cannot be encoded as matches are encoded
as literals. That is, literals are simply 8-bit bytes as literals. That is, literals are simply 8-bit bytes
that are encoded one at a time. that are encoded one at a time.
The literal coding makes an assumption that the highest The literal coding makes an assumption that the highest
lc bits of the previous uncompressed byte correlate with lc bits of the previous uncompressed byte correlate with
the next byte. E.g. in typical English text, an upper-case the next byte. E.g. in typical English text, an upper-case
letter is often followed by a lower-case letter, and letter is often followed by a lower-case letter, and
a lower-case letter is usually followed by another lower-case a lower-case letter is usually followed by another lower-case
letter. In the US-ASCII character set, the highest letter. In the US-ASCII character set, the highest
three bits are 010 for upper-case letters and 011 for three bits are 010 for upper-case letters and 011 for
lower-case letters. When lc is at least 3, the literal lower-case letters. When lc is at least 3, the literal
coding can take advantage of this property in the uncompressed coding can take advantage of this property in the uncompressed
data. data.
The default value (3) is usually good. If you want maximum The default value (3) is usually good. If you want maximum
compression, test lc=4. Sometimes it helps a little, compression, test lc=4. Sometimes it helps a little,
and sometimes it makes compression worse. If it makes it and sometimes it makes compression worse. If it makes it
worse, test e.g. lc=2 too. worse, test e.g. lc=2 too.
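The US-ASCII property mentioned above is easy to check. The following Python sketch extracts the highest lc bits of each letter. The helper name is hypothetical, and it only demonstrates the bit pattern, not the literal coder itself.

```python
import string

def top_bits(byte: int, lc: int = 3) -> int:
    """Return the highest lc bits of an 8-bit byte."""
    return byte >> (8 - lc)

upper = {format(top_bits(ord(c)), "03b") for c in string.ascii_uppercase}
lower = {format(top_bits(ord(c)), "03b") for c in string.ascii_lowercase}

print(upper, lower)  # prints {'010'} {'011'}
```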
lp=lp Specify the number of literal position bits. The minimum lp=lp Specify the number of literal position bits. The minimum
is 0 and the maximum is 4; the default is 0. is 0 and the maximum is 4; the default is 0.
Lp affects what kind of alignment in the uncompressed Lp affects what kind of alignment in the uncompressed
data is assumed when encoding literals. See pb below for data is assumed when encoding literals. See pb below for
more information about alignment. more information about alignment.
pb=pb Specify the number of position bits. The minimum is 0 pb=pb Specify the number of position bits. The minimum is 0
and the maximum is 4; the default is 2. and the maximum is 4; the default is 2.
Pb affects what kind of alignment in the uncompressed Pb affects what kind of alignment in the uncompressed
data is assumed in general. The default means four-byte data is assumed in general. The default means four-byte
alignment (2^pb=2^2=4), which is often a good choice when alignment (2^pb=2^2=4), which is often a good choice when
there's no better guess. there's no better guess.
When the alignment is known, setting pb accordingly may When the alignment is known, setting pb accordingly may
reduce the file size a little. E.g. with text files having reduce the file size a little. E.g. with text files having
one-byte alignment (US-ASCII, ISO-8859-*, UTF-8), one-byte alignment (US-ASCII, ISO-8859-*, UTF-8),
setting pb=0 can improve compression slightly. For setting pb=0 can improve compression slightly. For
UTF-16 text, pb=1 is a good choice. If the alignment is UTF-16 text, pb=1 is a good choice. If the alignment is
an odd number like 3 bytes, pb=0 might be the best an odd number like 3 bytes, pb=0 might be the best
choice. choice.
Even though the assumed alignment can be adjusted with pb Even though the assumed alignment can be adjusted with pb
and lp, LZMA1 and LZMA2 still slightly favor 16-byte and lp, LZMA1 and LZMA2 still slightly favor 16-byte
alignment. It might be worth taking into account when alignment. It might be worth taking into account when
designing file formats that are likely to be often compressed designing file formats that are likely to be often compressed
with LZMA1 or LZMA2. with LZMA1 or LZMA2.
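The constraints on lc, lp and pb stated above can be collected into one small check. This Python sketch is illustrative only: the function name is hypothetical, and it returns the alignment in bytes that a given pb assumes (2^pb), as in the four-byte example in the text.

```python
def check_literal_options(lc: int = 3, lp: int = 0, pb: int = 2) -> int:
    """Validate lc, lp and pb against the documented ranges.

    Returns the alignment in bytes assumed by pb, that is 2**pb.
    """
    for name, value in (("lc", lc), ("lp", lp), ("pb", pb)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")
    if lc + lp > 4:
        raise ValueError("the sum of lc and lp must not exceed 4")
    return 2 ** pb

if __name__ == "__main__":
    print(check_literal_options())             # defaults: four-byte alignment
    print(check_literal_options(pb=0))         # one-byte alignment, e.g. UTF-8 text
    print(check_literal_options(lc=0, lp=2, pb=1))
```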
mf=mf Match finder has a major effect on encoder speed, memory mf=mf Match finder has a major effect on encoder speed, memory
usage, and compression ratio. Usually Hash Chain match usage, and compression ratio. Usually Hash Chain match
finders are faster than Binary Tree match finders. The finders are faster than Binary Tree match finders. The
default depends on the preset: 0 uses hc3, 1-3 use hc4, default depends on the preset: 0 uses hc3, 1-3 use hc4,
and the rest use bt4. and the rest use bt4.
The following match finders are supported. The memory The following match finders are supported. The memory
usage formulas below are rough approximations, which are usage formulas below are rough approximations, which are
closest to the reality when dict is a power of two. closest to the reality when dict is a power of two.
hc3 Hash Chain with 2- and 3-byte hashing hc3 Hash Chain with 2- and 3-byte hashing
Minimum value for nice: 3 Minimum value for nice: 3
Memory usage: Memory usage:
dict * 7.5 (if dict <= 16 MiB); dict * 7.5 (if dict <= 16 MiB);
dict * 5.5 + 64 MiB (if dict > 16 MiB) dict * 5.5 + 64 MiB (if dict > 16 MiB)
hc4 Hash Chain with 2-, 3-, and 4-byte hashing hc4 Hash Chain with 2-, 3-, and 4-byte hashing
Minimum value for nice: 4 Minimum value for nice: 4
skipping to change at line 692 skipping to change at line 766
dict * 9.5 + 64 MiB (if dict > 16 MiB) dict * 9.5 + 64 MiB (if dict > 16 MiB)
bt4 Binary Tree with 2-, 3-, and 4-byte hashing bt4 Binary Tree with 2-, 3-, and 4-byte hashing
Minimum value for nice: 4 Minimum value for nice: 4
Memory usage: Memory usage:
dict * 11.5 (if dict <= 32 MiB); dict * 11.5 (if dict <= 32 MiB);
dict * 10.5 (if dict > 32 MiB) dict * 10.5 (if dict > 32 MiB)
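The two formulas that appear in full in this excerpt (hc3 and bt4) can be evaluated with a short Python sketch. The function name is hypothetical, the results are the same rough approximations as the formulas themselves, and the match finders whose formulas are elided here are deliberately left out.

```python
MIB = 1024 * 1024

def mf_memory_estimate(mf: str, dict_size: int) -> float:
    """Approximate encoder memory in bytes for the hc3 and bt4 match finders."""
    if mf == "hc3":
        if dict_size <= 16 * MIB:
            return dict_size * 7.5
        return dict_size * 5.5 + 64 * MIB
    if mf == "bt4":
        if dict_size <= 32 * MIB:
            return dict_size * 11.5
        return dict_size * 10.5
    raise ValueError(f"no formula in this sketch for {mf!r}")

if __name__ == "__main__":
    for mf, dict_mib in (("hc3", 8), ("hc3", 32), ("bt4", 8), ("bt4", 64)):
        estimate = mf_memory_estimate(mf, dict_mib * MIB) / MIB
        print(f"{mf} with dict={dict_mib} MiB: about {estimate:.0f} MiB")
```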
mode=mode mode=mode
Compression mode specifies the method to analyze the data Compression mode specifies the method to analyze the data
produced by the match finder. Supported modes are fast produced by the match finder. Supported modes are fast
and normal. The default is fast for presets 0-3 and normal and normal. The default is fast for presets 0-3 and normal
for presets 4-9. for presets 4-9.
Usually fast is used with Hash Chain match finders and Usually fast is used with Hash Chain match finders and
normal with Binary Tree match finders. This is also what normal with Binary Tree match finders. This is also what
the presets do. the presets do.
nice=nice nice=nice
Specify what is considered to be a nice length for a Specify what is considered to be a nice length for a
match. Once a match of at least nice bytes is found, the match. Once a match of at least nice bytes is found, the
algorithm stops looking for possibly better matches. algorithm stops looking for possibly better matches.
Nice can be 2-273 bytes. Higher values tend to give better Nice can be 2-273 bytes. Higher values tend to give better
compression ratio at the expense of speed. The compression ratio at the expense of speed. The
default depends on the preset. default depends on the preset.
depth=depth depth=depth
Specify the maximum search depth in the match finder. Specify the maximum search depth in the match finder.
The default is the special value of 0, which makes the The default is the special value of 0, which makes the
compressor determine a reasonable depth from mf and nice. compressor determine a reasonable depth from mf and nice.
Reasonable depth for Hash Chains is 4-100 and 16-1000 for Reasonable depth for Hash Chains is 4-100 and 16-1000 for
Binary Trees. Using very high values for depth can make Binary Trees. Using very high values for depth can make
the encoder extremely slow with some files. Avoid setting the encoder extremely slow with some files. Avoid setting
the depth over 1000 unless you are prepared to the depth over 1000 unless you are prepared to
interrupt the compression in case it is taking far too interrupt the compression in case it is taking far too
long. long.
When decoding raw streams (--format=raw), LZMA2 needs only the When decoding raw streams (--format=raw), LZMA2 needs only the
dictionary size. LZMA1 needs also lc, lp, and pb. dictionary size. LZMA1 needs also lc, lp, and pb.
--x86[=options] --x86[=options]
--powerpc[=options] --powerpc[=options]
--ia64[=options] --ia64[=options]
--arm[=options] --arm[=options]
--armthumb[=options] --armthumb[=options]
--sparc[=options] --sparc[=options]
Add a branch/call/jump (BCJ) filter to the filter chain. These
filters can be used only as a non-last filter in the filter chain.
A BCJ filter converts relative addresses in the machine code to
their absolute counterparts. This doesn't change the size of the
data, but it increases redundancy, which can help LZMA2 to produce
a 0-15 % smaller .xz file. The BCJ filters are always reversible,
so using a BCJ filter for the wrong type of data doesn't cause any
data loss, although it may make the compression ratio slightly
worse.
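The reversibility claim can be checked with a short sketch using
Python's standard lzma module. The input below is deliberately not x86
machine code, and both the data and preset 6 are assumptions for this
example.

```python
import lzma

# Deterministic non-executable data standing in for the "wrong type of data".
data = bytes(range(256)) * 256

# The BCJ filter is a non-last filter; LZMA2 closes the chain.
chain = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

compressed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)
restored = lzma.decompress(compressed)
```

The decoder reads the filter chain from the .xz headers, so no filter
arguments are needed when decompressing.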
It is fine to apply a BCJ filter on a whole executable; there's no
need to apply it only on the executable section. Applying a BCJ
filter on an archive that contains both executable and
non-executable files may or may not give good results, so it
generally isn't good to blindly apply a BCJ filter when compressing
binary packages for distribution.
These BCJ filters are very fast and use an insignificant amount of
memory. If a BCJ filter improves the compression ratio of a file,
it can improve decompression speed at the same time. This is
because, on the same hardware, the decompression speed of LZMA2 is
roughly a fixed number of bytes of compressed data per second.
These BCJ filters have known problems related to the compression
ratio:
o Some types of files containing executable code (e.g. object
  files, static libraries, and Linux kernel modules) have the
  addresses in the instructions filled with filler values. These
  BCJ filters will still do the address conversion, which will
  make the compression worse with these files.
o Applying a BCJ filter on an archive containing multiple similar
  executables can make the compression ratio worse than not using
  a BCJ filter. This is because the BCJ filter doesn't detect the
  boundaries of the executable files, and doesn't reset the
  address conversion counter for each executable.
Both of the above problems will be fixed in the future in a new
filter. The old BCJ filters will still be useful in embedded
systems, because the decoder of the new filter will be bigger and
use more memory.
Different instruction sets have different alignment:
Filter      Alignment   Notes
x86             1       32-bit or 64-bit x86
PowerPC         4       Big endian only
ARM             4       Little endian only
ARM-Thumb       2       Little endian only
IA-64          16       Big or little endian
SPARC           4       Big or little endian
Since the BCJ-filtered data is usually compressed with LZMA2, the
compression ratio may be improved slightly if the LZMA2 options are
set to match the alignment of the selected BCJ filter. For example,
with the IA-64 filter, it's good to set pb=4 with LZMA2 (2^4=16).
The x86 filter is an exception; it's usually good to stick to
LZMA2's default four-byte alignment when compressing x86
executables.
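The IA-64 example can be written as a filter chain with Python's
standard lzma module. The input here is placeholder data rather than
real IA-64 code, so the sketch only shows how pb=4 is passed alongside
the BCJ filter, not any gain in ratio.

```python
import lzma

# Placeholder input; real IA-64 executables are needed to see a benefit.
data = bytes(range(256)) * 64

chain = [
    {"id": lzma.FILTER_IA64},
    # pb=4 matches the 16-byte alignment of the IA-64 filter (2^4 = 16).
    {"id": lzma.FILTER_LZMA2, "preset": 6, "pb": 4},
]

compressed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)
restored = lzma.decompress(compressed)
```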
All BCJ filters support the same options:
start=offset
Specify the start offset that is used when converting between
relative and absolute addresses. The offset must be a multiple of
the alignment of the filter (see the table above). The default is
zero. In practice, the default is good; specifying a custom offset
is almost never useful.
--delta[=options]
Add the Delta filter to the filter chain. The Delta filter can be
only used as a non-last filter in the filter chain.
Currently only simple byte-wise delta calculation is supported. It
can be useful when compressing e.g. uncompressed bitmap images or
uncompressed PCM audio. However, special purpose algorithms may
give significantly better results than Delta + LZMA2. This is true
especially with audio, which compresses faster and better e.g. with
flac(1).
Supported options:
dist=distance
Specify the distance of the delta calculation in bytes. distance
must be 1-256. The default is 1.
For example, with dist=2 and eight-byte input A1 B1 A2 B3 A3 B5 A4
B7, the output will be A1 B1 01 02 01 02 01 02.
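The dist=2 example can be reproduced with a minimal byte-wise delta
sketch. The function names delta_encode and delta_decode are
hypothetical and are not part of xz or liblzma. The code assumes each
output byte is the input byte minus the byte dist positions earlier,
modulo 256, with the first dist bytes passed through unchanged.

```python
def delta_encode(data: bytes, dist: int = 1) -> bytes:
    """Subtract the byte `dist` positions back from each byte (mod 256)."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(len(data))
    for i, value in enumerate(data):
        previous = data[i - dist] if i >= dist else 0
        out[i] = (value - previous) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, dist: int = 1) -> bytes:
    """Reverse delta_encode by adding back the already decoded byte."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(len(data))
    for i, value in enumerate(data):
        previous = out[i - dist] if i >= dist else 0
        out[i] = (value + previous) & 0xFF
    return bytes(out)


sample = bytes.fromhex("A1B1A2B3A3B5A4B7")
encoded = delta_encode(sample, dist=2)
print(encoded.hex(" ").upper())  # A1 B1 01 02 01 02 01 02
```

The small repeating differences (01 02 01 02 ...) are what make the
filtered data easier for LZMA2 to compress.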
Other options
-q, --quiet
Suppress warnings and notices. Specify this twice to suppress
errors too. This option has no effect on the exit status. That is,
even if a warning was suppressed, the exit status to indicate a
warning is still used.
-v, --verbose
Be verbose. If standard error is connected to a terminal, xz will
display a progress indicator. Specifying --verbose twice will give
even more verbose output.
The progress indicator shows the following information:
o Completion percentage is shown if the size of the input file is
  known. That is, the percentage cannot be shown in pipes.
o Amount of compressed data produced (compressing) or consumed
  (decompressing).
o Amount of uncompressed data consumed (compressing) or produced
  (decompressing).
o Compression ratio, which is calculated by dividing the amount of
  compressed data processed so far by the amount of uncompressed
  data processed so far.
o Compression or decompression speed. This is measured as the
  amount of uncompressed data consumed (compression) or produced
  (decompression) per second. It is shown after a few seconds have
  passed since xz started processing the file.
o Elapsed time in the format M:SS or H:MM:SS.
o Estimated remaining time is shown only when the size of the
  input file is known and a couple of seconds have already passed
  since xz started processing the file. The time is shown in a
  less precise format which never has any colons, e.g. 2 min 30 s.
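Two of the fields above, the compression ratio and the elapsed time,
can be sketched in a few lines. The helper names are hypothetical and
this is not xz's own code. It only mirrors the definitions given in the
list and assumes whole seconds and a non-zero uncompressed size.

```python
def compression_ratio(compressed_bytes: int, uncompressed_bytes: int) -> float:
    """Compressed size divided by uncompressed size, as defined in the list.

    Assumes uncompressed_bytes is greater than zero.
    """
    return compressed_bytes / uncompressed_bytes


def format_elapsed(seconds: int) -> str:
    """Format whole seconds as M:SS, or H:MM:SS once an hour has passed."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


print(compression_ratio(25, 100))  # 0.25
print(format_elapsed(150))         # 2:30
print(format_elapsed(3725))        # 1:02:05
```

A ratio below 1.0 means the compressed data is smaller than the
uncompressed data.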
When standard error is not a terminal, --verbose will make xz print
the filename, compressed size, uncompressed size, compression
ratio, and possibly also the speed and elapsed time on a single
line to standard error after compressing or decompressing the file.
The speed and elapsed time are included only when the operation
took at least a few seconds. If the operation didn't finish, e.g.
due to user interruption, also the completion percentage is printed
if the size of the input file is known.
-Q, --no-warn
Don't set the exit status to 2 even if a condition worth a warning
was detected. This option doesn't affect the verbosity level, thus
both --quiet and --no-warn have to be used to not display warnings
and to not alter the exit status.
--robot
Print messages in a machine-parsable format. This is intended to
ease writing frontends that want to use xz instead of liblzma,
which may be the case with various scripts. The output with this
option enabled is meant to be stable across xz releases. See the
section ROBOT MODE for details.
--info-memory
Display, in human-readable format, how much physical memory (RAM)
xz thinks the system has and the memory usage limits for
compression and decompression, and exit successfully.
-h, --help
Display a help message describing the most commonly used options,
and exit successfully.
-H, --long-help
Display a help message describing all features of xz, and exit
successfully.
-V, --version
Display the version number of xz and liblzma in human readable
format. To get machine-parsable output, specify --robot before
--version.
ROBOT MODE
The robot mode is activated with the --robot option. It makes the
output of xz easier to parse by other programs. Currently --robot is
supported only together with --version, --info-memory, and --list. It
will be supported for compression and decompression in the future.
Version
xz --robot --version will print the version number of xz and liblzma
in the following format:
XZ_VERSION=XYYYZZZS
LIBLZMA_VERSION=XYYYZZZS
X Major version.
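A frontend could split the XYYYZZZS value by position, as in the sketch
below. Only the meaning of X is stated in this excerpt. Reading YYY as
the minor version, ZZZ as the patch level, and S as a stability digit
is an assumption taken from the full xz(1) manual. The function name
and the sample value are hypothetical.

```python
def parse_robot_version(line: str) -> dict:
    """Split a line such as 'XZ_VERSION=XYYYZZZS' into its digit fields.

    The field names other than 'major' are assumptions; see the lead-in.
    """
    name, _, digits = line.strip().partition("=")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"unexpected version string: {line!r}")
    return {
        "name": name,
        "major": int(digits[0]),        # X
        "minor": int(digits[1:4]),      # YYY (assumed meaning)
        "patch": int(digits[4:7]),      # ZZZ (assumed meaning)
        "stability": int(digits[7]),    # S (assumed meaning)
    }


# Hypothetical sample value, not captured from a real xz run.
info = parse_robot_version("XZ_VERSION=50020022")
print(info["major"], info["minor"], info["patch"])  # 5 2 2
```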
skipping to change at line 1210 skipping to change at line 1283
NOTES
Compressed output may vary
The exact compressed output produced from the same uncompressed input
file may vary between XZ Utils versions even if compression options
are identical. This is because the encoder can be improved (faster or
better compression) without affecting the file format. The output can
vary even between different builds of the same XZ Utils version, if
different build options are used.
The above means that once --rsyncable has been implemented, the
resulting files won't necessarily be rsyncable unless both old and new
files have been compressed with the same xz version. This problem can
be fixed if a part of the encoder implementation is frozen to keep
rsyncable output stable across xz versions.
Embedded .xz decompressors
Embedded .xz decompressor implementations like XZ Embedded don't
necessarily support files created with integrity check types other
than none and crc32. Since the default is --check=crc64, you must use
--check=none or --check=crc32 when creating files for embedded
systems.
Outside embedded systems, all .xz format decompressors support all the
check types, or at least are able to decompress the file without
verifying the integrity check if the particular check is not
supported.
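When files for embedded targets are produced from a script rather than
the command line, the same choice of check can be made through Python's
standard lzma module. This is a minimal sketch with placeholder data.
It shows how to select CRC32 and confirm which check the stream uses.

```python
import lzma

# Placeholder payload standing in for a firmware image or similar.
data = b"firmware image placeholder " * 512

# Equivalent in spirit to xz --check=crc32.
compressed = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)

# The decompressor reports which integrity check the stream carries.
decoder = lzma.LZMADecompressor()
restored = decoder.decompress(compressed)
used_check = decoder.check
```

Comparing used_check against lzma.CHECK_CRC32 before shipping a file is
a cheap way to catch a stream that was created with the default check.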
skipping to change at line 1310 skipping to change at line 1385
-0 ... -9 and --extreme are useful when customizing LZMA2 presets.
Here are the relevant parts collected from those two tables:
Preset   CompCPU
-0          0
-1          1
-2          2
-3          3
-4          4
-5          5
-6          6
-5e         7
-6e         8
If you know that a file requires somewhat big dictionary (e.g. 32 MiB)
to compress well, but you want to compress it quicker than xz -8 would
do, a preset with a low CompCPU value (e.g. 1) can be modified to use
a bigger dictionary:
xz --lzma2=preset=1,dict=32MiB foo.tar
With certain files, the above command may be faster than xz -6 while
skipping to change at line 1398 skipping to change at line 1473
number of bytes per pixel.
SEE ALSO
xzdec(1), xzdiff(1), xzgrep(1), xzless(1), xzmore(1), gzip(1),
bzip2(1), 7z(1)
XZ Utils: <http://tukaani.org/xz/>
XZ Embedded: <http://tukaani.org/xz/embedded.html>
LZMA SDK: <http://7-zip.org/sdk.html>
Tukaani 2012-07-03 XZ (1) Tukaani 2014-12-16 XZ (1)
 End of changes. 85 change blocks. 
314 lines changed or deleted 437 lines changed or added
