Block Arithmetic Coding for Source Compression

Charles G. Boncelet, Jr.

IEEE Transactions on Information Theory, vol. 39, no. 5, September 1993
Abstract - We introduce "Block Arithmetic Coding" (BAC), a technique for entropy coding that combines many of the advantages of ordinary stream arithmetic coding with the simplicity of block codes. The code is variable length in to fixed out (V to F), unlike Huffman coding, which is fixed in to variable out (F to V). We develop two versions of the coder: 1) an optimal encoder based on dynamic programming arguments, and 2) a suboptimal heuristic based on arithmetic coding. The optimal coder is optimal over all V to F complete and proper block codes. We show that the suboptimal coder achieves compression that is within a constant of a perfect entropy coder for independent and identically distributed inputs. BAC is easily implemented, even with large codebooks, because the algorithms for coding and decoding are regular. For instance, codebooks with 2^32 entries are feasible. BAC also does not suffer catastrophic failure in the presence of channel errors. Decoding errors are confined to the block in question. The encoding is in practice reasonably efficient. With i.i.d. binary inputs with P(1) = 0.95 and 16 bit codes, entropy arguments indicate at most 55.8 bits can be encoded; the BAC heuristic achieves 53.0 and the optimal BAC achieves 53.5. Finally, BAC appears to be much faster than ordinary arithmetic coding.

Index Terms - Arithmetic codes, block codes, variable to fixed codes, entropy compression.

I. INTRODUCTION

For F to V encoders, each input symbol requires on average at least H(X) output bits; for V to F encoders, each output bit can on average represent at most 1/H(X) input symbols.

The most important entropy compressor is Huffman coding [9]. Huffman codes are F to V block codes. They block the input into strings of q letters and encode these strings with variable length output strings. It is well known that the number of output bits of a Huffman code is bounded above by qH(X) + 1 for all block sizes. In special cases, this bound can be tightened [5], [17]. As a result, the relative efficiency of Huffman codes can be made as high as desired by taking the block size to be large enough. In practice, however, large block sizes are difficult to implement. Codebooks are usually stored and must be searched for each input string. This storage and searching limits block sizes to small values. (However, see Jakobsson [11].)

Huffman codes can be implemented adaptively, but it is difficult to do so. The process of generating Huffman code trees is sufficiently cumbersome that adapting the tree on a symbol by symbol basis is rarely computationally efficient.

In recent years, a class of entropy coders called arithmetic coders has been studied and developed [8], [13], [18]-[21], [25]. Arithmetic coders are stream coders. They take in an arbitrarily long input and output a corresponding output stream.
Other V to F entropy coders are runlength [7] and Ziv-Lempel [14], [24], [27]. Runlength codes count the runs of single symbols and encode the runs by transmitting the number in each run. They work well when the probabilities are highly skewed or when there is significant positive correlation between symbols. Runlength codes are often used for small alphabets, especially binary inputs. Then the runlength code is particularly simple since, by necessity, the runs alternate between 1's and 0's. Runlength codes are occasionally combined with Huffman to form a variable in to variable out (V to V) scheme. For example, consider the CCITT facsimile code [10] or the more general V to V runlength codes of Elias [6].

Ziv-Lempel codes work on different principles than BAC codes or the other codes considered so far. They find substrings of input symbols that appear frequently and encode each with a fixed length codeword. Ziv-Lempel codes work particularly well on English text, as they can discover complex dependencies between the symbols.

One advantage of most fixed output length block coders, including BAC, is that the effects of transmission errors (symbols received incorrectly) are confined to the block in question. The decoder does not lose synchronism with the incoming bit stream. While the corrupted block may be decoded incorrectly, remaining blocks will be decoded correctly. However, there may be insertions or deletions in the decoded stream. These insertions and deletions can usually be dealt with at the application level. In contrast, Huffman codes and arithmetic codes can lose synchronism and suffer catastrophic failure in the presence of channel errors. (It is often argued that Huffman codes are self-synchronizing, but this is not guaranteed. Various techniques have been proposed to combat this problem. See, e.g., [16].)

One subtle point is that fixed output length codes can still suffer catastrophic failure if implemented in an adaptive fashion. If the probability estimates depend on previously transmitted symbols, then a single error can affect other blocks. In particular, Ziv-Lempel codes are usually implemented in an adaptive fashion and are susceptible to catastrophic failure if their dictionaries get corrupted. BAC codes will be vulnerable to the same catastrophic failure if the probabilities are estimated adaptively based on prior symbols.

Finally, there is a growing recognition that V to F codes can outperform F to V codes with strong dependencies between symbols [15], [26]. The idea is that, for a fixed number of symbols, V to F codes can match long input strings. For example, a runlength coder with K codewords can match a run of up to K - 1 symbols. This run can be encoded with approximately log K bits, resulting in an exponential gain. In contrast, Huffman codes only give arithmetic gains. For K large enough, the V to F coder is more efficient.

This paper is organized as follows: Section II develops the optimal and heuristic BAC's. In Section III, we present the principal theoretical result of this paper: that BAC's are within an additive constant of the entropy for all code sizes. Section IV presents computations illustrating the efficiency of BAC's. Section V discusses extensions to time varying and Markov symbols, Section VI discusses the EOF problem and proposes two new solutions, and Section VII presents conclusions and discussion.

II. DEVELOPMENT OF BLOCK ARITHMETIC CODING

The BAC encoder parses the input into variable length input strings and encodes each with a single fixed length output codeword.

The following definitions are taken from Jelinek and Schneider [12]: Let W(K) = {wi : 1 <= i <= K} be a K element collection of words, wi, with each word a substring of X. W(K) is proper if, for any two words, wi and wj with i != j, wi is not a prefix of wj. W(K) is complete if every infinite length input string has a prefix in W. The characterization in [15] is succinct: a code is complete and proper if every infinite length input string has one and only one prefix in W(K).

It is useful to think of complete and proper from the viewpoint of parsing trees. Proper means that the codewords are all leaves of the tree; complete means that all the nodes in the tree are either leaves or internal nodes with m children. The number of codewords in complete and proper sets is 1 + L(m - 1) for L = 1, 2, ... [12]. This result is also easy to derive from the viewpoint of trees. The root has m children (L = 1). Each time a leaf node is replaced by m children, a net gain of m - 1 leaves results. Below, we will allow L = 0 so that sets consisting of a single codeword are allowed.

We assume that to each output codeword is assigned an integer index i with i = 1, 2, ..., Ks. Let S be the set of output codewords. Consider a subset of codewords with contiguous indices. Denote the first index by A and the last by B. The number of codewords in the subset is K = B - A + 1.

BAC proceeds by recursively splitting S into disjoint subsets. With each input symbol, the current subset is split into m nonempty, disjoint subsets, one for each possible input letter. The new subset corresponding to the actual letter is chosen and the process continues recursively. When the subset has fewer than m codewords, BAC stops splitting, outputs any of the codewords in the subset, reinitializes, and continues. The ideal situation is that each final subset contain only 1 codeword; otherwise, the "extra" codewords are wasted.

Denote the number of codewords in each of the m disjoint subsets by Kl(K), where K is the number of codewords in the current set. Usually, we will suppress the dependence on K and denote the number simply by Kl.

The question of how the Kl should be selected is central to the remainder of this paper. We can identify five criteria that are necessary or desirable:

Q1) Kl > 0.
Q2) The sum of the Kl over l = 1, ..., m is at most K.
Q3) The sum of the Kl over l = 1, ..., m equals K.
Q4) Kl = 1 + Ll(m - 1) for some Ll = 0, 1, 2, ....
Q5) If pj > pi, then Kj >= Ki.

Criteria Q1 and Q2 are necessary to ensure that the parsing is complete and proper and that the encoding can be decoded. Each subset must have at least one codeword, and the sum of all the subsets cannot exceed the total available. Criteria Q3, Q4, and Q5 are desirable in that meeting each results in a more efficient encoding. In effect, Q3 says that one ought to use all the codewords that are available. Q4 assures that the number of codewords in each subset is equal to one (a leaf) or is the number in a complete and proper set.
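To make the splitting procedure concrete, the following is a minimal C sketch of the encoder loop for a binary source (m = 2). The proportional split and the names KTOTAL, split(), and encode(), as well as the flush of a partially selected final codeword, are our own illustrative choices under the stated assumptions; they are not the paper's exact routines, and either the optimal or the heuristic allocation of Section II would replace split().

    /*
     * Minimal sketch of the BAC subset-splitting loop for a binary
     * source (m = 2).  The proportional split below and the names
     * KTOTAL, split(), and encode() are illustrative assumptions,
     * not the paper's exact routines.
     */
    #include <stdio.h>

    #define KTOTAL 65536u            /* Ks = 2^16 codewords, 16-bit outputs */

    static const double P1 = 0.95;   /* Pr(input bit = 1) */

    /* Split the current K codewords between the letters 0 and 1 in rough
       proportion to their probabilities, keeping both shares nonempty
       (criteria Q1 and Q3). */
    static void split(unsigned K, unsigned *K0, unsigned *K1)
    {
        *K1 = (unsigned)(P1 * (double)K + 0.5);
        if (*K1 == 0) *K1 = 1;
        if (*K1 >= K) *K1 = K - 1;
        *K0 = K - *K1;
    }

    /* Encode the bits in[0..n-1], emitting one 16-bit codeword per block. */
    static void encode(const int *in, int n)
    {
        unsigned A = 0, K = KTOTAL;           /* current subset is [A, A+K-1] */
        for (int i = 0; i < n; i++) {
            unsigned K0, K1;
            split(K, &K0, &K1);
            if (in[i]) { A += K0; K = K1; }   /* letter 1 takes the upper part */
            else       { K = K0; }
            if (K < 2) {                      /* fewer than m codewords: emit, reset */
                printf("codeword %u\n", A);
                A = 0;
                K = KTOTAL;
            }
        }
        if (K < KTOTAL)                       /* flush a partially selected codeword */
            printf("codeword %u\n", A);
    }

    int main(void)
    {
        int bits[] = {1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1};
        encode(bits, (int)(sizeof bits / sizeof bits[0]));
        return 0;
    }

A decoder performs the same split at each step and selects the letter whose subinterval contains the received codeword.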
By independence, Pr(wi) = Pr(a1,i) Pr(wi').

The good heuristic for determining the Ll's (with Kl = 1 + Ll(m - 1)) sorts the symbols so that p1 <= p2 <= ... <= pm, initializes q = 1.0, and then, for l = 1, ..., m, computes the conditional fraction fl = pl/q and assigns Kl from it.
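Since the body of the heuristic is only partially legible here, the following C sketch gives one plausible reading of it: allocate to each symbol, in order of increasing probability, a share of the remaining codewords proportional to its conditional fraction fl = pl/q, quantized to the admissible sizes 1 + Ll(m - 1) of criterion Q4. The function name allocate(), the rounding rule, and the treatment of the last symbol are assumptions for illustration, not the paper's exact procedure.

    /*
     * Illustrative reading of the partially legible heuristic above:
     * allocate, in order of increasing probability, a share of the
     * remaining codewords proportional to the conditional fraction
     * f = p[l]/q, quantized to the admissible sizes 1 + L(m-1)
     * (criterion Q4).  The rounding and the handling of the last
     * symbol are assumptions, not the paper's exact rule.
     */
    #include <stdio.h>

    static void allocate(int m, const double *p, unsigned K, unsigned *Kl)
    {
        double   q   = 1.0;   /* probability mass not yet allocated */
        unsigned rem = K;     /* codewords not yet allocated        */

        for (int l = 0; l < m; l++) {
            unsigned share;
            if (l == m - 1) {
                share = rem;                       /* last symbol takes the rest */
            } else {
                double   f  = p[l] / q;            /* conditional fraction       */
                unsigned Ll = (unsigned)(f * (double)rem / (double)(m - 1));
                share = 1u + Ll * (unsigned)(m - 1);        /* criterion Q4      */
                /* leave at least one codeword per remaining symbol, without
                   breaking the 1 + L(m-1) form */
                unsigned cap = rem - (unsigned)(m - 1 - l);
                if (share > cap)
                    share = 1u + ((cap - 1u) / (unsigned)(m - 1)) * (unsigned)(m - 1);
            }
            Kl[l] = share;
            rem  -= share;
            q    -= p[l];
        }
    }

    int main(void)
    {
        const double p[3] = {0.1, 0.3, 0.6};   /* m = 3, sorted ascending    */
        unsigned Kl[3];
        allocate(3, p, 1001u, Kl);             /* K = 1001 = 1 + 500*(3 - 1) */
        for (int l = 0; l < 3; l++)
            printf("K_%d = %u\n", l + 1, Kl[l]);
        return 0;
    }

With these example inputs the three shares sum to K and each has the form 1 + L(m - 1), so criteria Q3 and Q4 are met.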
Using the fast heuristic, the encoder requires only two multiplications and a fixed number of additions and comparisons per input symbol. The decoder does the same computations of Kl (or Ll) as the encoder, but also has to search for the proper interval. This searching can be done in O(log m) operations. Thus, the encoder can be implemented in O(1) or O(m) operations per input symbol and the decoder in O(log m) or O(m) operations, depending on whether the fast or the good heuristic is used. In the special case of binary inputs, there is little difference, both in complexity and performance, between the two heuristics. After the Kl's have been computed, the optimal BAC can be implemented with no multiplications and a number of additions similar to the fast heuristic. However, the one time cost of computing the optimal Kl's may be high (a brute force search requires approximately O(m^Ls) operations if m is large, where Ks = 1 + Ls(m - 1)).

In some applications, it is convenient to combine the optimal and heuristic approaches in one hybrid approach: use the optimal sizes for small values of K and use the heuristic for large values. As we argue in Section IV, both encoders incur their greatest inefficiency for small values of K. For instance, an encoder can begin with the heuristic until 256-1024 codewords are left and then switch to the optimal.

Letting Nh(K) represent the expected number of input symbols encoded with K output codewords under the heuristic, the expression for N(K) becomes

    Nh(K) = 1 + sum over l = 1, ..., m of pl Nh(floor(pl K)).    (14)

For the moment, if we ignore the quantizing above, relax the boundary conditions, and consider N(.) to be a function of a continuous variable, say s, we get the following equation:

    Ne(s) = 1 + sum over l = 1, ..., m of pl Ne(pl s),    (15)

where the subscript e refers to "entropy." The entropy rate for V to F codes is attained by its solution,

    Ne(s) = log s / H(X).    (16)
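The expected block length defined by (14) is easy to evaluate numerically. The sketch below does so for a binary source, using floor(pl K) subset sizes as in the analysis and Nh(K) = 0 for K < 2, which follows the stopping rule of Section II; the memo table and names are our own illustrative choices.

    /*
     * Numerical evaluation of recursion (14) for a binary source:
     *   Nh(K) = 1 + p0*Nh(floor(p0*K)) + p1*Nh(floor(p1*K)),
     * with Nh(K) = 0 for K < 2 (the stopping rule of Section II).
     * The memo table assumes a single fixed p1 per run.
     */
    #include <stdio.h>

    #define KMAX 65536u               /* K = 2^16, as in Section IV */

    static double memo[KMAX + 1];
    static int    have[KMAX + 1];

    static double Nh(unsigned K, double p1)
    {
        if (K < 2) return 0.0;        /* fewer than m = 2 codewords */
        if (have[K]) return memo[K];
        unsigned K1 = (unsigned)(p1 * (double)K);          /* floor */
        unsigned K0 = (unsigned)((1.0 - p1) * (double)K);  /* floor */
        double v = 1.0 + (1.0 - p1) * Nh(K0, p1) + p1 * Nh(K1, p1);
        have[K] = 1;
        memo[K] = v;
        return v;
    }

    int main(void)
    {
        printf("Nh(2^16) = %.2f input bits per 16-bit codeword\n",
               Nh(KMAX, 0.95));
        return 0;
    }

For p = 0.95 the computed value lies in the neighborhood of the heuristic figure quoted in the abstract.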
(See, e.g., Fig. 3.) In this section, we will take all "logs" to be natural logs to the base e.

Theorem 2: For independent and identically distributed inputs,

    log K - H(X)Nh(K) <= C,    (17)

for all K and some constant C. (C depends on the probabilities, the quantization, and on the size of the alphabet, but not on K.)

Proof: We will actually show a stronger intermediate result, namely that

    log K - H(X)Nh(K) <= G(K) = C - D/K,    (18)

where D > 0. Clearly, C - D/K <= C.

The proof is inductive. There are three constants to be determined: C and D, and a splitting constant, Kc. First, select any Kc such that Kc > 2m/p1, where p1 > 0 is the minimum of the pl's.

The basis is as follows: For all K <= Kc and for any D > 0, we can choose C so that

    C >= log K - H(X)Nh(K) + D/K;

it suffices that

    C >= log Kc - 0 + D/1,    (19)

since, clearly, Nh(K) >= 0 and D/K <= D. At this point, Kc has been determined and C has been specified in terms of D, and the proposition holds for all K <= Kc.

For the inductive step, we now assume that (18) applies for all K <= K' for some K' >= Kc. Then, we find a condition on D so that (18) applies for all K. Let K = K' + 1; then

    log K - H(X)Nh(K) = log K - H(X) - sum over l of pl H(X)Nh(floor(pl K)).    (20)

Writing the subset sizes as pl K + dl, where dl denotes the quantization error, the inductive hypothesis is applied to each Nh(pl K + dl); the cases dl >= 0 and dl < 0 are bounded separately. (By construction, pl K + dl > 0.)
The first term of (20) can be bounded as follows:

    sum over l of pl (log(pl K) - log(pl K + dl)) = - sum over l of pl log(1 + dl/(pl K)),    (25)

which can be expanded in powers of 1/K with coefficients alj that are polynomial functions of the dl's; they do not otherwise depend on K. Since the dl sum to zero, the expansion can be collected as

    sum over j of Pj K^(-j),    (26)

where the Pj = sum over l of dl alj are again polynomial functions of the dl's and do not otherwise depend on K. Now we can bound each term separately. By construction, 1 + di/(pi K) > 1/2, which gives the bound (27) on the first term of (26). The second term of (26) can be bounded as follows:

    |sum over j of Pj K^(-j)| <= (m - 1) K^(-2) max over j of |Pj|.    (28)

Combining (21), (27), and (28) and letting D' = (m - 1) 2^m max over j of |Pj|, we get the following:

    log K - H(X)Nh(K) <= G(K - 1) + D' K^(-2) <= G(K).    (29)

Substituting in for G(.), we need to select D so that the inequality in (29) is valid:

    C - D/(K - 1) + D' K^(-2) <= C - D/K,    (30)

or, equivalently,

    -DK + D'(K - 1)/K <= -DK + D,    (31)

which is valid for all D > D'. This completes the proof.

Clearly, the optimal approach is better than the heuristic and worse than entropy. So, combining with the theorem above, we have the following ordering:

    log K / H(X) = Ne(K) >= No(K) >= Nh(K) >= (log K - C) / H(X).    (32)
IV. COMPUTATIONAL RESULTS

In this section, we compute BAC efficiencies for the heuristic and optimal implementations and compare these with entropy values. We also compare an implementation of BAC and the arithmetic coder of Witten, Neal, and Cleary [25], both operating on a binary input, and show that BAC is much faster. Source code for the tests presented here can be found in [4].

In Fig. 4, we present calculations of N(K) for a hypothetical entropy coder, the optimal BAC, the good heuristic BAC, and a simple V to F runlength coder versus log K = 0, 1, ..., 16. The input is binary with p = Pr(x = 1) = 0.95.

Fig. 4. Calculated N(K) versus log K for binary inputs with p = 0.95.

Note that the optimal and heuristic curves closely follow the entropy line. The maximum difference between both curves and entropy is 4.4 and occurs at log K = 3. In contrast, for log K = 16, the optimal is within 2.3 input symbols of the entropy and the heuristic within 2.8. This behavior seems to be consistent over a wide range of values for p. The maximum difference occurs for a small value of log K. The coders improve slightly in an absolute sense as log K grows. At some point the difference seems to level off. The relative efficiency increases as the number of codewords grows.
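As a check on these figures, the entropy bound for this example can be worked out directly; the arithmetic below is ours, not taken from the paper:

    H(X) = -0.95 log2(0.95) - 0.05 log2(0.05) ~ 0.2864 bits per input bit,
    Ne(2^16) = 16 / H(X) ~ 16 / 0.2864 ~ 55.8 input bits per 16-bit codeword.

Subtracting the differences of 2.3 and 2.8 read from Fig. 4 then gives roughly 55.8 - 2.3 = 53.5 input bits for the optimal BAC and 55.8 - 2.8 = 53.0 for the heuristic, the values quoted in the abstract.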
TABLE II
EXECUTION TIMES FOR BAC AND WNC FOR A 1 MILLION BIT FILE

    Version               Time (secs)
    BAC Encoder               4.2
    BAC Decoder               3.7
    WNC Encoder              21.6
    WNC Decoder              29.5
    WNC Encoder (fast)       22.2
    WNC Decoder (fast)       29.8
    IO Speed (encoder)        2.0
    IO Speed (decoder)        1.7
One can see that both the heuristic and optimal BAC's are efficient across the whole range. For instance, with p = 0.80, both exceed 98.7% of entropy; with p = 0.95, both exceed 95.0%; and with p = 0.99, both exceed 91.5%.

The computational results support an approximation. For all K large enough, H(X)Nh(K) is approximately log K - C. In general, the constant observed in the computations is smaller than the C of Theorem 2. Also in Table I are computed values of log K - H(X)Nh(K) for K = 2^16.

As another experiment, we implemented BAC and the coder of Witten, Neal, and Cleary (referred to as WNC) [25] to assess the relative speeds. Source code for WNC appears in their paper, so it makes a good comparison. Speed comparisons are admittedly tricky. To make the best comparison, both BAC and WNC were optimized for a binary input, and the encoders for both share the same input routine and the decoders the same output routine. There are two versions of WNC, a straightforward C version and an optimized C version. We implemented both and found that, after optimizing for binary inputs, the straightforward version was actually slightly faster than the optimized version. Both were tested on the same input file of 2^20 independent and identically distributed bits.

V. EXTENSIONS TO TIME VARYING AND MARKOV SOURCES

For a source with time varying probabilities p(l, j) = Pr(x(j) = al), the recursion for the expected number of input symbols becomes

    N(K(j), j) = 1 + sum over l = 1, ..., m of p(l, j) N(Kl(j+1), j+1),    (34)

where Kl(j) is the number of codewords assigned to al at time j. The heuristic is easy: Choose Kl(j+1) = floor(p(l, j) K(j)). We can also present an optimal dynamic programming solution:

    No(K(j), j) = 1 + max over the Kl(j+1) of sum over l of p(l, j) No(Kl(j+1), j+1),    (35)

where the maximum is over allocations satisfying the criteria of Section II. This problem can be solved backward in time, from a maximum j = K - m + 1 to a minimum j = 1.

As another example, consider a time invariant, first order Markov source. Let p(i|l) = Pr(x(j) = ai | x(j-1) = al). Then the recursion for N(K) splits into two parts. The first is for the first input symbol; the second is for all other input symbols:

    N(K) = 1 + sum over i = 1, ..., m of pi N(Ki | i),    (36)

    N(K | l) = 1 + sum over i = 1, ..., m of p(i|l) N(Ki | i),    (37)
where N(K|l) is the number of input symbols encoded using K codewords given that the current input is al. The heuristic again is easy: Choose Ki = floor(p(i|l) K). The optimal solution is harder, although it can be done with dynamic programming. For every l, the optimal No(K|l) can be found because each Ki < K. In [3], we show a similar optimality result to Theorem 2 for a class of first order Markov sources.

In some applications it is desirable to adaptively estimate the probabilities. As with stream arithmetic coding, BAC encodes the input in a first-in-first-out fashion. The only requirement is that the adaptive formula depend only on previous symbols, not on present or future symbols.
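To illustrate this requirement, the sketch below shows an adaptive probability estimate that an encoder and decoder can maintain in lockstep, because it is a function of previously coded symbols only. The count-based update rule and the names Model, p1(), and update() are our own assumptions, not the paper's.

    /*
     * Sketch of an adaptive probability estimate that both the BAC
     * encoder and decoder can maintain in lockstep, since it depends
     * only on previously coded symbols.  The count-based rule is an
     * illustrative assumption.
     */
    #include <stdio.h>

    typedef struct {
        unsigned ones;    /* number of 1's seen so far     */
        unsigned total;   /* number of symbols seen so far */
    } Model;

    static double p1(const Model *m)      /* estimate used for the NEXT split */
    {
        return (m->ones + 1.0) / (m->total + 2.0);   /* Laplace smoothing */
    }

    static void update(Model *m, int bit) /* called AFTER coding the symbol */
    {
        m->ones  += (unsigned)bit;
        m->total += 1u;
    }

    int main(void)
    {
        Model enc = {0, 0}, dec = {0, 0};
        int bits[] = {1, 1, 0, 1, 1, 1, 0, 1};
        for (int i = 0; i < 8; i++) {
            /* encoder splits with p1(&enc), decoder with p1(&dec);
               the two estimates agree at every step               */
            printf("step %d: p1 = %.3f\n", i, p1(&enc));
            update(&enc, bits[i]);
            update(&dec, bits[i]);   /* decoder updates with the decoded bit */
        }
        return 0;
    }

A single channel error would still desynchronize the two estimates, which is the catastrophic failure caveat raised for adaptive coders in Section I.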
VI. THE EOF PROBLEM

One practical problem that BAC and other arithmetic coders have is denoting the end of the input sequence. In particular, the last codeword may be only partially selected.

The simplest solution to this problem is to count the number of symbols to be encoded and to send this count before encoding any. It is an easy matter for the decoder to count the number of symbols decoded and stop when the appropriate number is reached. However, this scheme requires that the encoder process the data twice and incurs a transmission overhead to send the count.

The EOF solution proposed in the stream arithmetic coder of Witten, Neal, and Cleary [25] and used in FIXARI [22] is to create an artificial letter with minimal probability. This letter is only transmitted once, denoting the end of the input. When the decoder decodes this special letter, it stops. We computed BAC's lost efficiency for binary inputs for a variety of probabilities and codebook sizes and found that it averages about 7%. For the English text example, the loss averaged about 3.5% over a wide range of codebook sizes.

We propose two alternatives to the simple schemes above (see also [2]). The first assumes the channel can tell the decoder that no more codewords remain and that the decoder is able to "look ahead" a modest number of bits. The idea is for the encoder to append to the input sequence as many least probable symbols as necessary (possibly zero) to flush out the last codeword. After transmitting the last codeword, the encoder transmits the number of appended symbols to the decoder. Even with 2^32 codewords, at most 31 extra input symbols are needed. This number can be encoded with 5 bits. The decoder looks ahead 5 bits until it detects that no more symbols are present. It then discards the appropriate number of appended symbols. The overhead caused by this scheme is modest: 5 bits at the end and one possibly wasteful codeword.

The second scheme assumes the channel cannot tell the decoder that no more codewords remain. We suggest devoting one codeword to specify the end of the codeword sequence. (Note, we are not suggesting an extra input letter, but one of the K codewords.) Then append the extra number of bits as above. The inefficiency here includes the 5 bits above and the loss due to one less codeword. Using the approximation discussed in Section IV above, one computes the relative inefficiency as follows:

    [(log K - C') - (log(K - 1) - C')] / (log K - C') is approximately 1 / (K (log K - C')).    (38)

For K large enough, this overhead is negligible.

VII. CONCLUSIONS AND DISCUSSION

We believe that BAC represents an interesting alternative in entropy coding. BAC is simple to implement, even for large block sizes, because the algorithm is top-down and regular. In contrast, Huffman's and Tunstall's algorithms are not top-down and generally require storing and searching a codebook.

BAC is efficient, though probably slightly less efficient than ordinary arithmetic coding. The comparison with Huffman is a little more complicated. The asymptotic result for BAC is, on the surface, weaker than for Huffman. The BAC bound uses C, which must be computed on a case by case basis, while the Huffman bound is 1. Both coders can be made as relatively efficient as desired by selecting the block size large enough. However, BAC can use much larger block sizes than is practical for Huffman. The BAC asymptotic result is much stronger than that for Tunstall's coder. For BAC, the difference between entropy and the heuristic rates is bounded. Tunstall shows only that the ratio of rates of entropy and his coder approaches 1 as K grows without bound.

In the proof for Theorem 2, we have shown that a constant, C, exists that bounds the difference between entropy and the heuristic BAC for all K. We have made no effort to actually evaluate the constant from the conditions given in the proof. This is because such an evaluation would be worthless in evaluating the performance of BAC. As shown in Section IV, the constant in practice is quite small. One important need for future research is to provide tight bounds for C and, perhaps, to characterize the difference between the entropy rate and BAC more accurately.

One advantage of BAC compared to Huffman and stream arithmetic coding is that BAC uses fixed length output codewords. In the presence of channel errors, BAC will not suffer catastrophic failure. The other two might.

We have also argued that BAC can accommodate more complicated situations. Certainly the heuristic can handle time varying and Markov probabilities. It can estimate the probabilities adaptively. It remains for future work to prove optimality results for these more complicated situations.

REFERENCES

[1] R. E. Blahut, Principles and Practice of Information Theory. Reading, MA: Addison-Wesley, 1987.
[2] C. G. Boncelet, Jr., "Extensions to block arithmetic coding," in Proc. 1992 Conf. Inform. Sci. and Syst., Princeton, NJ, Mar. 1992, pp. 691-695.
[3] —, "Block arithmetic coding for Markov sources," in Proc. 1993 Inform. Theory Symp., San Antonio, TX, Jan. 1993.
[4] —, "Experiments in block arithmetic coding," Univ. Delaware, Dep. Elec. Eng., Tech. Rep. 93-2-1, 1993.
[5] R. M. Capocelli and A. De Santis, "New bounds on the redundancy of Huffman codes," IEEE Trans. Inform. Theory, vol. 37, pp. 1095-1104, July 1991.
[6] P. Elias, "Universal codeword sets and representations of the integers," IEEE Trans. Inform. Theory, vol. IT-21, pp. 194-203, Mar. 1975.
[7] S. W. Golomb, "Run-length encodings," IEEE Trans. Inform. Theory, vol. IT-12, pp. 399-401, July 1966.
[8] M. Guazzo, "A general minimum-redundancy source-coding algorithm," IEEE Trans. Inform. Theory, vol. IT-26, pp. 15-25, 1980.
[9] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proc. IRE, pp. 1098-1101, Sept. 1952.
[10] R. Hunter and A. H. Robinson, "International digital facsimile coding standards," Proc. IEEE, vol. 68, pp. 854-867, July 1980.
[11] M. Jakobsson, "Huffman coding in bit-vector compression," Inform. Process. Lett., vol. 7, no. 6, pp. 304-307, Oct. 1978.
[12] F. Jelinek and K. Schneider, "On variable-length-to-block coding," IEEE Trans. Inform. Theory, vol. IT-18, pp. 765-774, Nov. 1972.
[13] G. G. Langdon, "An introduction to arithmetic coding," IBM J. Res. Develop., vol. 28, pp. 135-149, Mar. 1984.
[14] A. Lempel and J. Ziv, "On the complexity of finite sequences," IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976.
[15] N. Merhav and D. L. Neuhoff, "Variable-to-fixed length codes provide better large deviations performance than fixed-to-variable length codes," IEEE Trans. Inform. Theory, vol. 38, pp. 135-140, Jan. 1992.
[16] B. L. Montgomery and J. Abraham, "Synchronization of binary source codes," IEEE Trans. Inform. Theory, vol. IT-32, pp. 849-854, Nov. 1986.
[17] —, "On the redundancy of optimal binary prefix-condition codes for finite and infinite sources," IEEE Trans. Inform. Theory, vol. IT-33, pp. 156-160, Jan. 1987.
[18] W. B. Pennebaker, J. L. Mitchell, G. G. Langdon, Jr., and R. B. Arps, "An overview of the basic principles of the Q-coder adaptive binary arithmetic coder," IBM J. Res. Develop., vol. 32, no. 6, pp. 717-726, Nov. 1988.
[19] J. Rissanen, "Generalized Kraft inequality and arithmetic coding," IBM J. Res. Develop., pp. 198-203, May 1976.
[20] J. Rissanen and G. G. Langdon, "Arithmetic coding," IBM J. Res. Develop., vol. 23, no. 2, pp. 149-162, Mar. 1979.
[21] J. Rissanen and K. M. Mohiuddin, "A multiplication-free multialphabet arithmetic code," IEEE Trans. Commun., vol. 37, pp. 93-98, Feb. 1989.
[22] J. Teuhola and T. Raita, "Piecewise arithmetic coding," in Proc. Data Compression Conf. 1991, J. A. Storer and J. H. Reif, Eds., Snow Bird, UT, Apr. 1991, pp. 33-42.
[23] B. P. Tunstall, "Synthesis of noiseless compression codes," Ph.D. dissertation, Georgia Inst. Technol., 1967.
[24] T. A. Welch, "A technique for high-performance data compression," IEEE Comput. Mag., pp. 8-19, June 1984.
[25] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Commun. ACM, vol. 30, pp. 520-540, June 1987.
[26] J. Ziv, "Variable-to-fixed length codes are better than fixed-to-variable length codes for Markov sources," IEEE Trans. Inform. Theory, vol. 36, pp. 861-863, July 1990.
[27] J. Ziv and A. Lempel, "Compression of individual sequences via variable-rate coding," IEEE Trans. Inform. Theory, vol. IT-24, pp. 530-536, Sept. 1978.