lossy compression, which is used for graphics files or streaming video and audio files, does result in
information loss, though that loss is often imperceptible to our senses.
Most data-compression techniques use a code, which is a mapping of the basic units (or symbols) in the
source to a code alphabet. For example, all the spaces in a text file could be replaced by a single code word
and the number of spaces. A compression algorithm is used to set up the mapping and then create a new file
using the code alphabet; the compressed file will be smaller than the original and thus easier to transmit or
store. Here are some of the categories into which common lossless-compression techniques fall:
• Run-length compression, which replaces repeating characters with a code and a value representing the
number of repetitions of that character (examples: PackBits and PCX).
• Minimum redundancy coding or simple entropy coding, which assigns codes on the basis of probability, with
the most frequent symbols receiving the shortest codes (examples: Huffman coding and arithmetic coding).
• Dictionary coders, which use a dynamically updated symbol dictionary to represent patterns (examples:
Lempel-Ziv, Lempel-Ziv-Welch, and DEFLATE).
• Block-sorting compression, which reorganizes characters rather than using a code alphabet; run-length
compression can then be used to compress the repeating strings (example: Burrows-Wheeler transform).
• Prediction by partial matching, which uses a set of previous symbols in the uncompressed file to predict
the next symbol in the file.
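
Two of the categories above, run-length compression and block-sorting, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the `$` end marker is assumed not to occur in the input, and the transform is the naive sort-all-rotations version rather than what PackBits, PCX, or any production compressor actually does.

```python
from itertools import groupby


def rle_encode(text):
    """Run-length encode: collapse each run into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Invert rle_encode by expanding each (character, count) pair."""
    return "".join(ch * count for ch, count in pairs)


def bwt(text, end="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Assumes the end marker does not occur in text (an illustrative choice).
    """
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, end="$"):
    """Rebuild the original text by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    # The row that ends with the marker is the original text plus the marker.
    original = next(row for row in table if row.endswith(end))
    return original[:-1]


if __name__ == "__main__":
    print(rle_encode("aaabccdddd"))
    transformed = bwt("banana")
    print(transformed, rle_encode(transformed))
    print(inverse_bwt(transformed))
```

The transform does not shrink the text by itself; it only tends to move identical characters next to one another, which is why a run-length pass is applied afterward. Both steps are reversible, so no information is lost.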
4. Murray Gell-Mann, "What Is Complexity?" in Complexity, vol. 1 (New York: John Wiley and Sons, 1995).
5. The human genetic code has approximately six billion (about 10^10) bits, not considering the possibility of
compression. So the 10^27 bits that theoretically can be stored in a one-kilogram rock is greater than the
genetic code by a factor of 10^17. See note 57 below for a discussion of genome compression.
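
As a quick check of the arithmetic in this note, rounding six billion up to the nearest power of ten as the note does:

```latex
\[
6 \times 10^{9} \approx 10^{10},
\qquad
\frac{10^{27}}{10^{10}} = 10^{27-10} = 10^{17}.
\]
```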
6. Of course, a human, who is also composed of an enormous number of particles, contains an amount of
information comparable to a rock of similar weight when we consider the properties of all the
particles. As with the rock, the bulk of this information is not needed to characterize the state of the person. On
the other hand, much more information is needed to characterize a person than a rock.
7. See note 175 in chapter 5 for an algorithmic description of genetic algorithms.
8. Humans, chimpanzees, gorillas, and orangutans are all included in the scientific classification of hominids
(family