
Mobile Netw Appl (2018) 23:318–325

https://doi.org/10.1007/s11036-017-0949-z

Maximum a Posteriori Decoding for KMV-Cast Pseudo-Analog Video Transmission

Xiao-Wei Tang¹ · Xiao-Ning Huan¹ · Xin-Lin Huang¹

Published online: 14 October 2017


© Springer Science+Business Media, LLC 2017

Abstract Noise in video/image signals not only reduces visual quality but also adversely affects subsequent processing such as compression, encoding, transmission and storage. Hence, denoising technology for video/image is significant for the whole media industry. In this paper, a maximum a posteriori (MAP) decoding method for KMV-Cast pseudo-analog video transmission is proposed to further eliminate the residual noise in the received video/image. First, a noise decomposition model based on a multidimensional plane is proposed. Then, the residual noise in the KMV-Cast scheme is shown to obey a Gaussian distribution. Finally, the estimate of the residual noise is derived for the purpose of maximizing the PSNR of the reconstructed video/image. The simulation results show that the proposed decoding method performs best compared with the two other schemes, KMV-Cast and SoftCast.

Keywords KMV-Cast · Video transmission · MAP decoding · Denoising

Xin-Lin Huang
[email protected]

Xiao-Wei Tang
[email protected]

Xiao-Ning Huan
[email protected]

¹ Department of Information and Communication Engineering, Tongji University, Shanghai 201804, China

1 Introduction

Nowadays, with the rapid development of the mobile Internet, a huge number of video applications are emerging. According to the latest report from the Cisco Visual Networking Index Mobile Forecast 2017 [1], IP video traffic will be 82% of all consumer Internet traffic by 2021, up from 73% in 2016 globally. Internet video traffic will grow fourfold from 2016 to 2021, and Internet video-to-TV traffic will be 26% of consumer Internet video traffic by 2021, up from 24% in 2016. Video/image is gradually becoming the main way for people to get information and communicate with each other. However, the video/image signals we obtain are often mixed with noise. As a result, video/image applications need to eliminate the noise [2]. As the pre-processing part of video/image applications, video/image denoising technology has a direct impact on the quality of video.

Video/image denoising methods can be classified into several types according to the signal transformation method adopted, such as spatial domain, frequency domain, wavelet domain and time domain [3–7]. Spatial domain denoising contains two typical algorithms, mean filtering and median filtering [8]. Mean filtering and median filtering aim at two kinds of common noise in video/image, namely additive white Gaussian noise and impulse noise [9, 10]. Mean filtering is a typical linear filtering algorithm which is simple and fast. However, the mean filtering algorithm causes serious blur in the processed image, especially at the edges and details of the image [9]. Median filtering is a nonlinear filtering technology with wide applications. It does not need to compute the statistical features of video/image. However, its filtering
performance is affected by the shape and size of the selected filter window, which may destroy the structure of the image and the spatial neighborhood information [10]. In order to further improve the filtering effect, modified median filtering algorithms were proposed, such as weighted median filtering and switching median filtering [11–15]. By exploiting the distinguishing features between video/image and noise, transform domain filtering algorithms were proposed to separate the signal and noise in the transformed domain. According to the transformation used, these denoising methods can be classified into frequency domain and wavelet domain methods [16, 17]. The filtering method in the frequency domain is similar to that in the spatial domain. Wavelet domain denoising first transforms the video/image from the spatial domain to the wavelet domain, and then tries to remove the noise in the wavelet domain.

In video/image transmission systems, the above denoising methods can be used to remove noise effectively. However, the decoding model can be further considered to understand the features of residual noise in the video/image. Recently, a pseudo-analog mobile video transmission scheme called KMV-Cast was proposed in [18]. Different from traditional video transmission schemes, KMV-Cast can make full use of the correlated information with cloud support to reconstruct the video at the receiver with higher quality [18, 19]. However, noise in the reconstructed video/image is not completely removed. In this paper, a MAP decoding algorithm is proposed for KMV-Cast to further remove the noise from the reconstructed video/image.

The rest of the paper is organized as follows. In Section 2, we briefly review the KMV-Cast framework with cloud support. In Section 3, we focus on further eliminating the noise in the reconstructed video/image signals for the KMV-Cast scheme. In Section 4, simulation results are presented and compared with two other schemes, KMV-Cast [18] and SoftCast [20], in terms of the PSNR of the reconstructed video/image. In Section 5, we conclude the paper.

2 Brief review of KMV-Cast scheme

In the scenario of wireless video broadcasting, the digital video transmission system has some drawbacks in scalability and robustness. In order to overcome these flaws and take full advantage of the large amount of correlated information stored in the cloud, KMV-Cast was proposed in the previous work [18].

KMV-Cast was proposed as a knowledge-enhanced wireless video transmission scheme based on pseudo-analog video transmission technology. It can greatly improve the quality of the reconstructed video by making full use of the correlated information.

At the base station, we evenly divide each video frame into blocks. Afterwards, we use the 2D-DCT to compress each block. The DCT coefficients close to zero are discarded because of the limited bandwidth budget. The remaining DCT coefficients are reshaped and scaled into an m × 1 normalized vector θ. The original DCT coefficients can be represented as αθ, where α is the amplitude of the pixel block. Since most of the video energy concentrates in the low frequencies, burst interference will reduce the video transmission quality dramatically. Hence, the scaled DCT coefficients can be multiplied by an m × m unitary matrix φ to overcome burst interference and reduce the peak-to-average power ratio. The video signals are transmitted through a channel with additive white Gaussian noise. The received signal y using pseudo-analog modulation can be represented as

$$y = \alpha\phi\theta + v \tag{1}$$

where v is an m × 1 vector which obeys a zero-mean Gaussian distribution with known variance $\sigma_0^2$, and its components are independent and identically distributed.

With the help of the designed hierarchical Bayesian model and the prior knowledge extraction model, the reconstructed DCT coefficients $\hat{\theta}$ of θ are represented as

$$\hat{\theta} = \rho\theta + \frac{\alpha^2\sigma_0^{-2}\, r\, \phi^T v}{Cr+1} + \frac{\alpha^2\sigma_0^{-2}\, \vec{\theta_i}\vec{\theta_i}^T(\phi^T v)}{(Cr+1)(Cr+C+1)} \tag{2}$$

where $C \triangleq \alpha^2\sigma_0^{-2}$ and ρ is a scaling factor. The parameters r and t are given by

$$r = \frac{-(C+1) - \sqrt{(C+1)^2 + 4Ct}}{2C} \tag{3}$$

$$t = \frac{1}{-1 - \left[\frac{(m-1)K^2}{1-K^2}\right]^{\frac{1}{4}}} \tag{4}$$

where $K = \vec{\theta_i}^T\theta$ is the correlation coefficient and $\vec{\theta_i}$ represents the DCT coefficients of the correlated information stored in the cloud.

Since the cloud information $\vec{\theta_i}$ is known at the receiver, left-multiplying Eq. 2 by $\vec{\theta_i}^T$ (with $\vec{\theta_i}^T\vec{\theta_i} = 1$) gives

$$\vec{\theta_i}^T(\phi^T v) = \frac{\vec{\theta_i}^T\hat{\theta} - \rho K}{\dfrac{\alpha^2\sigma_0^{-2}\, r}{Cr+1} + \dfrac{\alpha^2\sigma_0^{-2}}{(Cr+1)(Cr+C+1)}} \tag{5}$$

Then, the third term in Eq. 2 can be removed from the decoded signals. However, there is still noise in the reconstructed video/image.

Compared with traditional video transmission schemes, the KMV-Cast system connects the large amount of correlated information in the cloud to the transmitted signal, which can greatly improve the quality of the reconstructed video/image.
3 The proposed MAP decoding

In this Section, a noise decomposition model based on the three-dimensional plane (as an example) is considered first. Then, we show that the residual noise in Eq. 2 obeys a Gaussian distribution. Finally, the estimate of the residual noise is derived.

3.1 Noise decomposition model

In the KMV-Cast scheme described in Section 2, we can improve the quality of the reconstructed video/image by removing the third term in Eq. 2. However, one can see from Eq. 2 that the noise cannot be eliminated totally. Next, we further eliminate the residual noise for the KMV-Cast system. According to Eq. 5, $\vec{\theta_i}^T(\phi^T v)$ is known as a constant at the receiver. For the second term in Eq. 2, we define

$$T \triangleq \frac{\alpha^2\sigma_0^{-2}\, r\, \vec{\theta_i}^T(\phi^T v)}{Cr+1} \tag{6}$$

where T is a constant. For simplicity, we define W as

$$W \triangleq \frac{\alpha^2\sigma_0^{-2}\, r\, (\phi^T v)}{Cr+1} \tag{7}$$

Therefore, Eq. 6 can be rewritten as

$$T = \vec{\theta_i}^T W \tag{8}$$

According to Eq. 2, the estimated value of the received signal can be rewritten as

$$\hat{\theta} = \rho\theta + W \tag{9}$$

One can see from Eq. 8 that $\vec{\theta_i}^T W = T$ defines a hyperplane for W. For the sake of visualization, we use a three-dimensional diagram to demonstrate it, as shown in Fig. 1.

Fig. 1 Noise decomposition diagram

∂ is the plane defined by $\vec{\theta_i}^T W = T$, and $\hat{\theta}$ is the red straight line going through the plane ∂. $\hat{\theta}$ intersects the plane ∂ at the point $P_1$, whose coordinates can be represented as

$$P_1 = \hat{\theta} + \hat{\theta}\left(\frac{T - \vec{\theta_i}^T\hat{\theta}}{\vec{\theta_i}^T\hat{\theta}}\right) \tag{10}$$

The projection point of $\hat{\theta}$ on the plane ∂ is $P_2$, whose coordinates can be represented as

$$P_2 = \hat{\theta} + \vec{\theta_i}\,(T - \vec{\theta_i}^T\hat{\theta}) \tag{11}$$

We draw a cone with $\hat{\theta}$ as the apex and ρθ as the edge. Thus, the cross-section of the cone on the plane ∂ is a circle whose center is $P_2$ and whose radius is $R = \rho\sqrt{1-K^2}$. We assume that the intersection between the line $P_1P_2$ and the circle is O. From Fig. 1, one can see that the vector W is a point on the circle. For the purpose of eliminating noise completely, we further consider the optimal point on the circle that represents the vector W. Thus, ρθ can be further represented as

$$\rho\theta = \overrightarrow{O\hat{\theta}} = \overrightarrow{P_1\hat{\theta}} - \overrightarrow{P_1 O} = \hat{\theta} - P_1 - \frac{\overrightarrow{P_1P_2}}{|\overrightarrow{P_1P_2}|}\left(|\overrightarrow{P_1P_2}| - \rho\sqrt{1-K^2}\right) \tag{12}$$

where $\overrightarrow{P_1P_2}$ can be represented as

$$\overrightarrow{P_1P_2} = \vec{\theta_i}\,(T - \vec{\theta_i}^T\hat{\theta}) - \hat{\theta}\left(\frac{T - \vec{\theta_i}^T\hat{\theta}}{\vec{\theta_i}^T\hat{\theta}}\right) \tag{13}$$

3.2 Noise estimation

Next, we derive the estimate of the residual noise from the viewpoint of maximizing the posterior probability. We first plot the probability density function of W obtained from experiment, as shown in Fig. 2.

Fig. 2 The probability density function of residual noise
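The geometry of Section 3.1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: θ, the cloud vector, ρ and W are synthetic, both vectors are unit-norm, and m = 3 is used only to match the three-dimensional picture of Fig. 1. It builds P1 and P2 from Eqs. 10 and 11, confirms that both lie in the plane of Eq. 8, that W lies on the circle of radius ρ√(1 − K²) centred at P2, and then forms the point O and the estimate of Eq. 12.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 3                                      # three dimensions, as in Fig. 1

# Synthetic unit-norm signal and a correlated unit-norm cloud vector.
theta = rng.standard_normal(m)
theta /= np.linalg.norm(theta)
d = rng.standard_normal(m)
theta_i = theta + 0.5 * d / np.linalg.norm(d)
theta_i /= np.linalg.norm(theta_i)

rho = 0.8                                  # scaling factor (arbitrary value)
K = float(theta_i @ theta)                 # correlation coefficient
W = 0.1 * rng.standard_normal(m)           # synthetic residual noise
theta_hat = rho * theta + W                # Eq. 9
T = float(theta_i @ W)                     # Eq. 8

a = float(theta_i @ theta_hat)             # theta_i^T theta_hat
P1 = theta_hat + theta_hat * (T - a) / a   # Eq. 10
P2 = theta_hat + theta_i * (T - a)         # Eq. 11
R = rho * np.sqrt(1.0 - K**2)              # circle radius

# Point O on the line P1P2 at distance R from P2, and the estimate of Eq. 12.
P1P2 = P2 - P1
dist = np.linalg.norm(P1P2)
O = P1 + P1P2 / dist * (dist - R)
rho_theta_est = theta_hat - O

print("in-plane check :", theta_i @ P1 - T, theta_i @ P2 - T)
print("|W - P2| - R   :", np.linalg.norm(W - P2) - R)
print("|estimate|     :", np.linalg.norm(rho_theta_est), "vs rho =", rho)
```

Whatever the noise realization, the estimate built from O has norm ρ, because O is taken on the circle in the plane orthogonal to the cloud vector.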
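From Fig. 2, one can conclude that the residual noise W obeys a normal distribution. Thus, it is an additive white Gaussian noise in the decoded video/image.

According to Fig. 1, the equation of the circle can be defined by combining the equation of the plane ∂ with the spherical equation as

$$\begin{cases} \vec{\theta_i}^T W = T \\ \left[W - \left(\hat{\theta} + \vec{\theta_i}(T - \vec{\theta_i}^T\hat{\theta})\right)\right]^T \left[W - \left(\hat{\theta} + \vec{\theta_i}(T - \vec{\theta_i}^T\hat{\theta})\right)\right] = \rho^2(1-K^2) \end{cases} \tag{14}$$

From Eq. 14, we can get

$$W^T W = 2\hat{\theta}^T W + \rho^2 + T^2 - \rho^2K^2 + (\vec{\theta_i}^T\hat{\theta})^2 - \hat{\theta}^T\hat{\theta} - 2(\vec{\theta_i}^T\hat{\theta})T \tag{15}$$

With $\tau \triangleq \rho^2 + T^2 - \rho^2K^2 + (\vec{\theta_i}^T\hat{\theta})^2 - \hat{\theta}^T\hat{\theta} - 2(\vec{\theta_i}^T\hat{\theta})T$, Eq. 15 can be rewritten as

$$W^T W = 2\hat{\theta}^T W + \tau \tag{16}$$

Equation 15 is the relationship between W and $\hat{\theta}$. Furthermore, both θ and W obey Gaussian distributions:

$$\theta \sim N(0, \Sigma), \qquad W \sim N\!\left(0, \frac{Cr^2}{(Cr+1)^2} I\right) \tag{17}$$

Since θ and W are independent of each other, their joint probability density can be represented as

$$p(\theta, W) \propto e^{-\frac{W^T W}{Cr^2/(Cr+1)^2}}\, e^{-\theta^T\Sigma^{-1}\theta} = e^{-\left(\frac{W^T W}{Cr^2/(Cr+1)^2} + \theta^T\Sigma^{-1}\theta\right)} \tag{18}$$

If we substitute Eq. 16 into Eq. 18, together with $W = \hat{\theta} - \rho\theta$ from Eq. 9, Eq. 18 can be rewritten as

$$p(\theta, W) \propto e^{-\left(\theta^T\Sigma^{-1}\theta - \frac{2\rho\,\hat{\theta}^T\theta}{Cr^2/(Cr+1)^2} + \frac{2\hat{\theta}^T\hat{\theta} + \tau}{Cr^2/(Cr+1)^2}\right)} = e^{-\left(\theta^T\Sigma^{-1}\theta - \frac{2\rho(Cr+1)^2\,\hat{\theta}^T\theta}{Cr^2} + \frac{(Cr+1)^2(2\hat{\theta}^T\hat{\theta} + \tau)}{Cr^2}\right)} \tag{19}$$

From Eq. 19, we can get the mean of the Gaussian distribution,

$$\mu = \frac{\rho}{Cr^2/(Cr+1)^2}\,\Sigma\hat{\theta} = \frac{\rho(Cr+1)^2}{Cr^2}\,\Sigma\hat{\theta} \tag{20}$$

where μ is the mean value if we only consider θ as the random variable in the joint Gaussian distribution. p(θ, W) can be represented as

$$p(\theta, W) \propto e^{-\left[\left(\theta - \frac{\rho(Cr+1)^2}{Cr^2}\Sigma\hat{\theta}\right)^T \Sigma^{-1} \left(\theta - \frac{\rho(Cr+1)^2}{Cr^2}\Sigma\hat{\theta}\right) - \frac{\rho^2(Cr+1)^4}{C^2r^4}\hat{\theta}^T\Sigma\hat{\theta} + \frac{(Cr+1)^2(2\hat{\theta}^T\hat{\theta} + \tau)}{Cr^2}\right]} \tag{21}$$

According to [18], we have $\Sigma^{-1} = \frac{I}{r} - \frac{\vec{\theta_j}\vec{\theta_j}^T}{(r+1)r}$, and we substitute it into Eq. 21. p(θ, W) can be further represented as

$$p(\theta, W) \propto e^{-\left(\theta^T\frac{I}{r}\theta - \frac{2\rho(Cr+1)^2\,\hat{\theta}^T\theta}{Cr^2} + \tau_1\right)} \tag{22}$$

where $\tau_1$ is given by

$$\tau_1 = -\frac{2\rho K(Cr+1)^2\,\vec{\theta_i}^T\hat{\theta}}{Cr^3} - \frac{K^2}{r(r+1)} + \frac{2\rho K(Cr+1)^2\,\vec{\theta_i}^T(rI + \vec{\theta_i}\vec{\theta_i}^T)\hat{\theta}}{Cr^3(r+1)} + \frac{(Cr+1)^2(2\hat{\theta}^T\hat{\theta} + \tau)}{Cr^2} \tag{23}$$

Obviously, we need to find the θ that maximizes the posterior probability density p(θ, W). Thus, the objective function can be represented as

$$\text{Max}\left(\frac{2\rho(Cr+1)^2\,\hat{\theta}^T\theta}{Cr^2} - \tau_1\right) \tag{24}$$

If we substitute Eq. 9 into Eq. 24, the objective function can be rewritten as

$$\text{Min}\left(\hat{\theta}^T W\right) \tag{25}$$

subject to

$$\begin{cases} \vec{\theta_i}^T W = T \\ W^T W = 2\hat{\theta}^T W + \tau \end{cases} \tag{26}$$

Using the Lagrange multiplier method, we can obtain

$$F(\hat{\theta}, W) = \hat{\theta}^T W + \beta_1\left(\vec{\theta_i}^T W - T\right) + \beta_2\left(W^T W - 2\hat{\theta}^T W - \tau\right) \tag{27}$$

where $\beta_1$ and $\beta_2$ are the two Lagrange multipliers. Setting the gradient of F with respect to W to zero, W can be further simplified as

$$W = \hat{\theta} - \frac{\hat{\theta} + \beta_1\vec{\theta_i}}{2\beta_2} \tag{28}$$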
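Substituting Eq. 28 into the two constraints of Eq. 26 fixes the multipliers in closed form (the expressions the text states next as Eqs. 29 and 30), and the estimate of ρθ is then the second term of Eq. 28. A minimal NumPy sketch of the resulting decoder follows. It rests on assumptions that are not stated in this form in the paper: ρ and K are taken as known at the receiver, the cloud vector is unit-norm, |K| < 1, the reconstructed vector is not parallel to the cloud vector, and only the plus-sign branch of the square root is used. The function name `map_decode` and the synthetic test data are hypothetical.

```python
import numpy as np

def map_decode(theta_hat, theta_i, rho, K):
    """Estimate rho*theta from theta_hat (plus-sign branch of Eqs. 29-31)."""
    a = float(theta_i @ theta_hat)           # theta_i^T theta_hat
    h = float(theta_hat @ theta_hat)         # theta_hat^T theta_hat
    T = a - rho * K                          # from Eq. 9 and Eq. 8
    tau = rho**2 + T**2 - rho**2 * K**2 + a**2 - h - 2.0 * a * T
    s = np.sqrt((h - a**2) / (h + tau - (a - T)**2))
    beta2 = 0.5 * s                          # Eq. 30, plus sign
    beta1 = s * (a - T) - a                  # Eq. 29, plus sign
    return (theta_hat + beta1 * theta_i) / (2.0 * beta2)   # Eq. 31

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
m = 64
theta = rng.standard_normal(m)
theta /= np.linalg.norm(theta)
d = rng.standard_normal(m)
theta_i = theta + 0.3 * d / np.linalg.norm(d)
theta_i /= np.linalg.norm(theta_i)

rho = 0.9
K = float(theta_i @ theta)
W = 0.05 * rng.standard_normal(m)            # synthetic residual noise
theta_hat = rho * theta + W                  # Eq. 9

est = map_decode(theta_hat, theta_i, rho, K)
W_est = theta_hat - est                      # implied residual-noise estimate

print("error before:", np.linalg.norm(theta_hat - rho * theta))
print("error after :", np.linalg.norm(est - rho * theta))
```

On the plus-sign branch the expression reduces algebraically to rescaling the component of the reconstructed vector orthogonal to the cloud vector to length ρ√(1 − K²) and adding ρK times the cloud vector, so the estimate always has norm ρ and correlation ρK with the cloud vector. Whether the error actually decreases depends on the noise realization.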
$\beta_1$ and $\beta_2$ can be solved as

$$\beta_1 = \pm\sqrt{\frac{\hat{\theta}^T\hat{\theta} - (\vec{\theta_i}^T\hat{\theta})^2}{\hat{\theta}^T\hat{\theta} + \tau - (\vec{\theta_i}^T\hat{\theta} - T)^2}}\left(\vec{\theta_i}^T\hat{\theta} - T\right) - \vec{\theta_i}^T\hat{\theta} \tag{29}$$

$$\beta_2 = \pm\frac{1}{2}\sqrt{\frac{\hat{\theta}^T\hat{\theta} - (\vec{\theta_i}^T\hat{\theta})^2}{\hat{\theta}^T\hat{\theta} + \tau - (\vec{\theta_i}^T\hat{\theta} - T)^2}} \tag{30}$$

According to Eq. 28, Eq. 29 and Eq. 30, the final estimate $\theta_g$ of the transmitted signal can be represented as

$$\rho\theta_g = \frac{\hat{\theta} + \beta_1\vec{\theta_i}}{2\beta_2} = \frac{\hat{\theta} + \sqrt{\dfrac{\hat{\theta}^T\hat{\theta} - (\vec{\theta_i}^T\hat{\theta})^2}{\hat{\theta}^T\hat{\theta} + \tau - (\vec{\theta_i}^T\hat{\theta} - T)^2}}\left(\vec{\theta_i}\vec{\theta_i}^T\hat{\theta} - \vec{\theta_i}T\right) - \vec{\theta_i}\vec{\theta_i}^T\hat{\theta}}{\sqrt{\dfrac{\hat{\theta}^T\hat{\theta} - (\vec{\theta_i}^T\hat{\theta})^2}{\hat{\theta}^T\hat{\theta} + \tau - (\vec{\theta_i}^T\hat{\theta} - T)^2}}} \tag{31}$$

From Eq. 31, one can see that the final decoded video/image signal is different from the estimated signal in the original KMV-Cast scheme. In order to improve the quality of the reconstructed video/image, the proposed MAP decoding should be added on top of the KMV-Cast system at the receiver to remove the residual noise.

In order to verify the correctness of the proposed algorithm, simulation results are shown in the next Section.

4 Performance analysis

This Section analyzes the performance of different schemes using the standard test video sequence "Carphone". The transmitted video sequence has a resolution of 176 × 144 and 8-bit pixel depth (i.e., the pixel value ranges from 0 to 255). We assume that both the transmitter and receiver only have the 10th video frame of Carphone shown in Fig. 3, which is considered as the reference frame. The 180th frame of Carphone is considered as the original frame which will be transmitted, as shown in Fig. 4. Both the reference frame and the transmitted frame are evenly divided into 8 × 8 pixel blocks (22 × 18 = 396 in total), denoted as $\{\vec{\theta_j} \mid j = 1, 2, \cdots, 396\}$.

Fig. 3 The reference frame (# 10)

Fig. 4 The original frame (# 180)

The performance of three schemes is compared: the proposed decoding algorithm for KMV-Cast, the original KMV-Cast [18], and SoftCast [20]. We make comparisons in terms of visual effect and PSNR, respectively. Simulation results are shown in Figs. 5, 6, 7, and 8. From Figs. 5–8, one can see that the proposed decoding algorithm performs the best among these schemes.

In order to optimize the PSNR of the reconstructed video/image, SoftCast distributes the transmission power according to the power allocation scheme. Different from SoftCast, the KMV-Cast scheme uses correlated information in the cloud to assist the transmission and reconstruction of the video sequence. However, one can see from Eq. 2 that the residual noise is not completely eliminated in the original KMV-Cast. Therefore, this paper further eliminates the residual noise for the KMV-Cast system by removing the noise with MAP decoding. Next, the simulation results are analyzed under two conditions: only some pixel blocks are transmitted, and all pixel blocks are transmitted.

1) Only several original DCT coefficients are transmitted, which corresponds to strong correlation between the transmitted video/image and the information in the cloud. From Figs. 5–6, one can see that in the case of high channel SNR (10 dB), the proposed algorithm has about 0.5 dB and 6.5 dB gains in terms of PSNR, compared with the traditional KMV-Cast and SoftCast,
Fig. 5 Reconstructed video quality comparisons (some blocks transmitted), channel SNR: 10 dB: a The proposed algorithm (39.43 dB); b KMV-Cast (38.97 dB); c SoftCast (32.90 dB)

Fig. 6 Reconstructed video quality comparisons (some blocks transmitted), channel SNR: −10 dB: a The proposed algorithm (28.86 dB); b KMV-Cast (28.76 dB); c SoftCast (16.59 dB)

Fig. 7 Reconstructed video quality comparisons (all blocks transmitted), channel SNR: 10 dB: a The proposed algorithm (39.50 dB); b KMV-Cast (38.91 dB); c SoftCast (32.90 dB)

Fig. 8 Reconstructed video quality comparisons (all blocks transmitted), channel SNR: −10 dB: a The proposed algorithm (29.34 dB); b KMV-Cast (23.79 dB); c SoftCast (16.59 dB)
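The PSNR figures quoted in the captions of Figs. 5–8 are for 8-bit frames. The paper does not spell out its PSNR formula, so the sketch below assumes the usual definition, 10·log10(255² / MSE); the function name `psnr` and the synthetic frames are hypothetical.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two frames, assuming the usual MSE-based definition."""
    # Convert to float first so that uint8 subtraction cannot wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak**2 / mse)

# Synthetic 176 x 144 frame and a copy that is off by one grey level everywhere.
frame = np.full((144, 176), 100, dtype=np.uint8)
shifted = frame + 1
value = psnr(frame, shifted)   # MSE = 1, so PSNR = 20*log10(255), about 48.13 dB
```

A uniform error of one grey level already gives about 48 dB, which puts the 0.5 dB to 12.8 dB gains reported in Section 4 into perspective.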
respectively. Under the condition of low channel SNR (−10 dB), the proposed algorithm has almost the same performance as the traditional KMV-Cast scheme due to the strong correlation information.

2) All original DCT coefficients are transmitted, which corresponds to weak correlation between the transmitted video/image and the information in the cloud. From Figs. 7–8, one can see that in the case of high channel SNR (10 dB), the proposed algorithm has about 0.6 dB and 6.6 dB gains in terms of PSNR, compared with the traditional KMV-Cast and SoftCast, respectively. Under the condition of low channel SNR (−10 dB), the proposed algorithm has about 5.6 dB and 12.8 dB gains in terms of PSNR, compared with the traditional KMV-Cast and SoftCast, respectively.

From Figs. 5–8, one can conclude that the proposed algorithm performs the best among the three schemes. Figures 9–10 compare the detailed performance of the three schemes in the above two scenarios with different channel SNRs, and the results show that the proposed algorithm obtains the best performance.

Fig. 9 Reconstructed video quality comparisons under different channel SNRs (some blocks are transmitted)

Fig. 10 Reconstructed video quality comparisons under different channel SNRs (all blocks are transmitted)

5 Conclusions

In this paper, a MAP decoding algorithm for the KMV-Cast system has been proposed. We found that the residual noise is located on a fixed circle. Then, we decode the video/image signal and residual noise with a MAP method. The detailed derivation steps are given. In order to verify the correctness of the proposed algorithm, the standard test video sequence "Carphone" is used. The simulation results show that the proposed MAP decoding performs the best, compared with the two other schemes.

Acknowledgements This work is supported by the National Natural Science Foundation of China under Grant No. 61631017 and No. U1733114.

References

1. Cisco (2017) Cisco visual networking index: global mobile data traffic forecast update 2016–2021 white paper
2. Chen S, Zhao J (2014) The requirements, challenges, and technologies for 5G of terrestrial mobile telecommunication. IEEE Commun Mag 52(5):36–43
3. Maggioni M, Sánchez-Monge E, Foi A (2014) Joint removal of random and fixed-pattern noise through spatiotemporal video filtering. IEEE Trans Image Process 23(10):4282–4296
4. Malinski L, Smolka B (2016) Fast averaging peer group filter for the impulsive noise removal in color images. J Real-Time Image Proc 11(3):427–444
5. Wen B, Ravishankar S, Bresler Y (2015) Video denoising by online 3D sparsifying transform learning. In: Proceedings of ICIP, pp 118–122
6. Lee HY, Hoo WL, Chan CS (2015) Color video denoising using epitome and sparse coding. Expert Syst Appl 42(2):751–759
7. Llordén GR, Ferrero G, Martin M (2015) Anisotropic diffusion filter with memory based on speckle statistics for ultrasound images. IEEE Trans Image Process 24(1):345–358
8. Mittal A, Moorthy AK, Bovik AC (2012) No-reference image quality assessment in the spatial domain. IEEE Trans Image Process 21(12):4695–4708
9. Chen B, Xing L, Liang J, Zheng N, Principe JC (2014) Steady-state mean-square error analysis for adaptive filtering under the maximum correntropy criterion. IEEE Signal Process Lett 21(7):880–884
10. Kang X, Stamm MC, Peng A, Liu KR (2013) Robust median filtering forensics using an autoregressive model. IEEE Trans Inf Forensics Secur 8(9):1456–1468
11. Lu CT, Chou TC (2012) Denoising of salt-and-pepper noise corrupted image using modified directional-weighted-median filter. Pattern Recogn Lett 33(10):1287–1295
12. Zhang P, Li F (2014) A new adaptive weighted mean filter for removing salt-and-pepper noise. IEEE Signal Process Lett 21(10):1280–1283
13. Yang Q (2015) Stereo matching using tree filtering. IEEE Trans Pattern Anal Mach Intell 37(4):834–846
14. Horng SJ, Hsu LY, Li T, Qiao S, Gong X, Chou HH, Khan MK (2013) Using sorted switching median filter to remove high-density impulse noises. J Vis Commun Image Represent 24(7):956–967
15. Nasimudeen A, Nair MS, Tatavarti R (2012) Directional switching median filter using boundary discriminative noise detection by elimination. Signal, Image and Video Processing 6(4):613–624
16. Naghizadeh M (2012) Seismic data interpolation and denoising in the frequency-wavenumber domain. Geophysics 77(2):71–80
17. Parrilli S, Poderico M, Angelino CV, Verdoliva L (2012) A nonlocal SAR image denoising algorithm based on LLMMSE wavelet shrinkage. IEEE Trans Geosci Remote Sens 50(2):606–616
18. Huang X-L, Wu J, Hu F (2017) Knowledge enhanced mobile video broadcasting (KMV-cast) framework with cloud support. IEEE Trans Circuits Syst Video Technol 27(1):6–18
19. Huang X-L, Tang X-W, Huan X-N, Wang P, Wun J (2017) Improved KMV-cast with BM3D denoising. Mobile Network and Application 1–8
20. Jakubczak S, Katabi DA (2011) A cross-layer design for scalable mobile video. In: Proceedings of ACM mobicom, pp 289–300