
Image Interpolation by Super-Resolution

Alexey Lukin, Andrey S. Krylov, Andrey Nasonov
Moscow State University, Moscow, Russia
[email protected], [email protected], [email protected]

Abstract

The term "super-resolution" typically refers to a high-resolution image produced from several low-resolution noisy observations. In this paper, we consider the problem of high-quality interpolation of a single noise-free image. Several aspects of the corresponding super-resolution algorithm are investigated: the choice of the regularization term, the dependence of the result on the initial approximation, the convergence speed, and heuristics that facilitate convergence and improve the visual quality of the resulting image.

Keywords: image interpolation, super-resolution, regularization.

1. INTRODUCTION

Linear methods for image interpolation are usually constructed to deal with bandlimited signals. The interpolated one-dimensional signal is defined as

    F'(x) = \sum_{i=-\infty}^{+\infty} F(ih) K(ih - x),

where K(x) is the interpolation filter and h is the sampling step. In the two-dimensional case, the interpolation is typically performed separately for each axis. The most popular weight functions are the box filter (nearest neighbor), the tent function (bilinear), the ideal low-pass filter, the Lanczos filter, the Gaussian filter, and the bicubic kernel [1]. Every algorithm based on linear interpolation exhibits typical artifacts: blurriness, ringing, and jagged edges [6]; reducing one of these artifacts strengthens the others.

Usually, non-linear algorithms are used to scale two-dimensional images by a fixed ratio without constructing a continuous image. Interpolated pixel values are still calculated as a linear combination of the nearest sampled values, but the main difference from linear interpolation is the variability of the coefficients, which depend on the surrounding pixel intensities.

Gradient algorithms build on the observation that interpolation directed along edges gives better results than non-directed linear interpolation; the direction and intensity of an edge at a point are obtained from local gradient information. One of these algorithms is WADI [2], which is based on a modification of bilinear interpolation. It computes the derivative along the normal to every side of the square formed by four sampled pixels and modifies the coefficients of bilinear interpolation accordingly: a side with a greater derivative receives smaller coefficients for the points on this side. Gradient algorithms are fast within the class of non-linear algorithms and produce better results than linear interpolation, making edges less jagged and more realistic.

The NEDI (New Edge-Directed Interpolation) algorithm is a typical non-linear algorithm which doubles the resolution of an image [3]. It relies on the concept of self-similarity: the assumption is that the coefficients of the linear combination used to interpolate a destination pixel are the same as the coefficients used to interpolate source image pixels from the pixels of the decimated source image. This algorithm provides very good interpolation quality, but it is computationally expensive, so it is often executed only in small areas with strong edges while simpler algorithms process the rest of the image.

The rest of the paper is organized as follows. In Section 2, we introduce the super-resolution method for image interpolation and the process of regularization. In Section 3, we discuss several variants and modifications of the method and describe their influence on the resulting image with respect to visual artifacts and PSNR, giving the results of our experiments. Section 4 concludes the paper by summarizing the variants of the super-resolution method that provide the best image quality.

2. SUPER-RESOLUTION METHOD

The super-resolution method is typically used to restore a high-resolution image from several low-resolution noisy observations [4]. In this paper, we consider the interpolation of a single image, so we formulate the problem as

    Ax = y,    (1)

where x is the unknown high-resolution image (represented as a vector of pixel values), y is the known low-resolution image, and A is the downscaling operator, typically consisting of decimation D following a low-pass filtering H:

    A = DH.    (2)

The choice of the low-pass filtering operator depends on the point spread function of the imaging system that produced the low-resolution image. If the imaging system is unknown, we assume that the operator H is a simple box filter.
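As a concrete illustration, here is a minimal NumPy sketch of the operator A = DH for a k-by-k box filter with k-fold decimation, together with its transpose, which appears later in the iterative method (7). The function names and the block-averaging implementation are our own illustration, not code from the paper.

    import numpy as np

    def downscale(x, k=2):
        """A = DH: k-by-k box filtering H followed by k-fold decimation D.
        Implemented directly as block averaging; x is a 2-D grayscale image
        whose sides are divisible by k."""
        h, w = x.shape
        return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

    def upscale_transpose(d, k=2):
        """(DH)^T = H^T D^T: zero-insertion upscaling D^T followed by the
        symmetric box filter H^T = H.  For a box filter this amounts to
        spreading each low-resolution value uniformly over its k-by-k block."""
        return np.repeat(np.repeat(d, k, axis=0), k, axis=1) / (k * k)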
2.1 Regularization

Equation (1) is generally ill-posed: a small change of the input vector y can cause a huge change of the resulting vector x. For equation (1), the regularized solution is found as

    x = \arg\min_x \left( \|Ax - y\|_p^n + \alpha F(x) \right),    (3)

where the first term is called the "discrepancy", F(x) is the stabilizer, and \alpha > 0 is the regularization coefficient [5].

The most popular and universal stabilizer is the Tikhonov functional, calculated as a grid approximation of

    F(x) = \|\Delta x\|_2^2,    (4)

with n = 2 and p = 2. For each \alpha > 0 the solution x is well-posed: it is unique, defined for every y, and depends continuously on y. The Euler equation for this case is (A^T A + \alpha \Delta^2) x = A^T y. But then the algorithm becomes linear, because x is the solution of a system of linear equations; this method therefore inherits the drawbacks of linear interpolation algorithms, and a more adaptive stabilizer is needed for image resampling.

We consider the Total Variation (TV) and Bilateral TV (BTV) stabilizers [4], which work in the l1 norm (n = 1, p = 1):

    TV(x) = \|\nabla x\|_1,    (5)

where \nabla x is the modulus of the image gradient, and

    BTV(x) = \sum_{s,t=-p}^{p} \gamma^{|s|+|t|} \|x - S_x^s S_y^t x\|_1,    (6)

where S_x^s and S_y^t are shift operators along the x and y axes by s and t pixels respectively, and \gamma = 0.8.
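The BTV stabilizer (6) can be computed directly from its definition. The sketch below is our illustration; np.roll implements the shift operators with periodic boundaries, a simplification of whatever boundary handling a production implementation would use.

    import numpy as np

    def btv(x, p=2, gamma=0.8):
        """Bilateral TV stabilizer (6): sum over shifts (s, t) in [-p, p]^2
        of gamma^(|s|+|t|) * ||x - S_x^s S_y^t x||_1."""
        total = 0.0
        for s in range(-p, p + 1):
            for t in range(-p, p + 1):
                # S_x^s S_y^t x with periodic boundaries
                shifted = np.roll(np.roll(x, s, axis=1), t, axis=0)
                total += gamma ** (abs(s) + abs(t)) * np.abs(x - shifted).sum()
        return total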
2.2 Inverse iterations

To solve equation (3) with the stabilizer (6), the iterative steepest-descent method can be used:

    x^{n+1} = x^n - \beta \Big\{ H^T D^T \,\mathrm{sign}(DHx^n - y)
        + \alpha \sum_{s,t=-p}^{p} \gamma^{|s|+|t|} \big(I - S_y^{-t} S_x^{-s}\big)\, \mathrm{sign}\big(x^n - S_x^s S_y^t x^n\big) \Big\},    (7)

where z = sign(x) is a vector with the sign function applied per element, and D^T is an up-scaling operation: if D in (2) is the simplest decimation operator that takes every k-th pixel, then D^T is the up-scaling operator by zero insertion; if H in (2) is a symmetric filter, then H^T is equal to H. x^0 is the initial approximation of the high-resolution image.
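A single update (7) can be written compactly with the operators sketched above. This is an illustrative sketch, not the authors' code; it reuses downscale() and upscale_transpose() from Section 2 and the same periodic-boundary shift operators.

    import numpy as np

    def shift(x, s, t):
        """Shift operator S_x^s S_y^t (periodic boundaries for brevity)."""
        return np.roll(np.roll(x, s, axis=1), t, axis=0)

    def sr_step(x, y, beta, alpha, k=2, p=2, gamma=0.8):
        """One steepest-descent update (7)."""
        # Data term: H^T D^T sign(DHx - y)
        g = upscale_transpose(np.sign(downscale(x, k) - y), k)
        # Regularization term: subgradient of the BTV stabilizer (6)
        for s in range(-p, p + 1):
            for t in range(-p, p + 1):
                if s == 0 and t == 0:
                    continue  # zero shift contributes nothing
                d = np.sign(x - shift(x, s, t))
                # (I - S_y^{-t} S_x^{-s}) sign(x - S_x^s S_y^t x)
                g += alpha * gamma ** (abs(s) + abs(t)) * (d - shift(d, -s, -t))
        return x - beta * g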
3. STUDY OF THE METHOD

We have used the described method to perform image enlargement by a factor of 2. The operator H has been set to a box filter. The following paragraphs describe our experiments with various modifications of the super-resolution method and the obtained results. To evaluate the quality of the results, we selected several diverse test images and scaled them down using a box filter. We then applied different variants of the super-resolution algorithm and compared the enlarged images with the original high-resolution ("ground truth") copies. Thus, descriptions of the observed artifacts are supported by PSNR measurements.

3.1 Dependence on initial approximation

The simplest initial approximation of the high-resolution image would be a zero image. However, such an approximation takes long to converge to the solution. A better choice is the original image upscaled by some simple algorithm, such as bilinear interpolation or a gradient-directed interpolation. We have found that using gradient-directed interpolation as the initial approximation results in slightly better quality than bicubic interpolation and significantly better quality than bilinear interpolation. Bilinear interpolation used for the initial approximation leaves the jagged-edges artifact in the resulting image, and it cannot be completely eliminated by the subsequent super-resolution iterations. Even better results can be obtained using a NEDI enlargement of the original image as the initial approximation.

We have tested the method with different initial images: zero-filled, bilinear, bicubic, gradient-interpolated, and NEDI, aiming at the fastest convergence of the iterative method. We would like to get a good approximation after a small number of iterations; that is why using the best available initial approximation (NEDI) gives better results after a fixed number of iterations. Fig. 1 displays the improvement of SNR (ISNR) over bilinear interpolation for the super-resolution method with different initial approximations.

[Figure 1: Dependence of ISNR (dB, relative to bilinear interpolation) on the initial approximation after 16 iterations, for the test images Cactus, Lena, Lighthouse, Text, and their average. Curves: bilinear interpolation, SR (x0 = bilinear), SR (x0 = bicubic), SR (x0 = NEDI).]

3.2 Convergence speed

In the described iterative method, we cannot reach a precision better than β, because the sign function only takes the values -1, 0, and 1 and is multiplied by β to form the correction term at each iteration. So, if we want to get closer to the optimal image, we need to decrease the value of β and increase the number of iterations. It can also be noted that after a certain number of iterations with a constant β, the iterated images start fluctuating around the optimal image without approaching the target image closely.

To improve the convergence speed we can use a variable coefficient β. First, we chose the following scheme: a number of iterations are processed with a constant coefficient until the image starts fluctuating. When this happens, β is decreased by some fixed ratio, and the process continues until the next onset of fluctuations, and so on. The task is then to detect fluctuations and to choose the optimal ratio of β modification.

We use the discrepancy to detect the fluctuations. Before the fluctuations begin, the discrepancy tends to decrease; once they begin, it starts to oscillate randomly around some mean value. So, the following algorithm can be applied: we compute the discrepancy at each iteration, and when it becomes greater than at the previous iteration, the counter of oscillations is increased. When the counter exceeds a threshold, it is reset and the value of β is decreased. A greater threshold value gives a more precise detection of the onset of fluctuations, but lowers the convergence speed. In practice we set the threshold to 2 oscillations.

Finally, we need to define the divisor for β. In practice, small divisors lower the convergence speed, while big divisors result in worse quality; there is no single optimal intermediate value. If the divisor is big, the algorithm needs more iterations before the image starts to fluctuate than when the divisor is smaller. The practical divisor range is from 2 to 4: for any divisor in this range, the convergence speed and the quality are approximately the same.

We have noted that the number of iterations between updates of β for a fixed divisor depends very weakly on the source image and the iteration number. So, after the first fluctuation, we may use a geometric progression instead of the piecewise-constant behavior of β (fig. 2). We have therefore used the following strategy: a certain number of initial iterations are processed with a constant coefficient, and after the beginning of fluctuations, the coefficient is decreased exponentially (as a geometric progression). A good choice of the initial approximation image makes it possible to fix the number of iterations with some constant initial β, and the method becomes independent of the source image.
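This schedule can be sketched as a simple driver loop around sr_step(). The parameter values below (initial beta, alpha, iteration count) are illustrative placeholders, not values from the paper.

    import numpy as np

    def solve_sr(y, x0, beta=1.0, alpha=0.05, iters=20, divisor=2.0, max_osc=2):
        """Driver loop with the discrepancy-based beta schedule of Section 3.2."""
        x = x0.copy()
        prev_disc = np.inf
        osc = 0
        for _ in range(iters):
            x = sr_step(x, y, beta, alpha)
            disc = np.abs(downscale(x) - y).sum()  # l1 discrepancy ||DHx - y||_1
            if disc > prev_disc:   # discrepancy grew:
                osc += 1           # count one oscillation
                if osc > max_osc:  # fluctuations detected:
                    beta /= divisor
                    osc = 0
            prev_disc = disc
        return x

After the first decrease, the loop could instead switch to the geometric progression described above, multiplying beta by a fixed factor (such as 0.875) at every iteration.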
Fig. 2 shows graphs for different ways of changing β. The thick line is β, the thin line is the discrepancy, and the dotted line is the mean square error between the current solution x and the ground-truth result; the vertical scale is logarithmic.

[Figure 2: Variation of MSE and discrepancy depending on β, for the two schedules "Divisor = 2" and "Geometric progression with a factor of 0.875".]

3.3 Effects of regularization

3.3.1 Ringing suppression

Despite excellent interpolation of edges, the iterative method using TV and BTV has some artifacts: a big regularization coefficient results in a watercolor effect and loss of fine details, but it very strongly suppresses the ringing artifact (Gibbs phenomenon). The image becomes piecewise constant; in other words, it looks "painted" with strokes of a paintbrush. A small regularization amount α leaves much ringing around edges and does not make them sharper or less jagged. If the first approximation image was produced by a bad interpolation method, the super-resolution method will not improve it in the absence of regularization.

Strong regularization using the Tikhonov stabilizer results in strong blur, while weak regularization leaves jagged edges. l1-regularization (such as TV or BTV) behaves differently: strong regularization sharpens the edges, trying to make the image piecewise constant [6]. The Gibbs phenomenon is effectively suppressed.

3.3.2 Noise reduction

Many real images captured at low resolution contain noise, typically white noise. Different interpolation algorithms have different tolerance to such noise: some of them smooth the noise, others tend to amplify it. For example, bilinear interpolation usually smoothes noise (together with image contours), while bicubic or 12-tap NEDI [3] interpolations slightly amplify it (due to the overshoot/ringing property of their resampling filters). The super-resolution algorithm produces even sharper images, and the high-frequency component of noise is typically amplified. However, the regularization helps prevent noise amplification by minimizing the total variation of the resulting image. If the strength of regularization (the value of α) is increased, the noise is suppressed while the sharpness of image contours is preserved; see [6] for examples.

3.4 Effects of changing the internal upsampling algorithm

The whole iterative super-resolution method (7) can be viewed as several simple steps:

1. Downscale the current approximation image.
2. Compare it with the original low-resolution image and truncate the difference using the sign function.
3. Enlarge the difference.
4. Add the (amplified) enlarged difference to the current approximation.
5. Add the (amplified) regularization term to the current approximation.
6. Go to 1 until convergence.

The next improvement of the iterative method is modifying the D and H operators. Originally, the upsampling H^T D^T of the sign of the discrepancy, H^T D^T sign(DHx - y), is simple: the discrepancy is interpolated with zeros and the filter H^T is applied. This is a linear method and it does not preserve edge directions, so it results in some edge jaggedness. We propose to use an edge-directional interpolation (such as gradient interpolation) at this stage to reduce the jagged-edges effect. The coefficients of the gradient interpolation can be calculated only once, using the first approximation image x^0; the gradient interpolation is then applied at each iteration to enlarge the modified difference. This modification of the original method increases both PSNR and visual quality. Fig. 3 compares the PSNR for bilinear vs. gradient upsampling of the discrepancy.
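The following sketch shows the shape of this modification: an upsampler whose edge-direction weights are computed once from the initial approximation x0 and then reused at every iteration. The weighting scheme here (averaging preferentially along the axis with the weaker gradient) is a much-simplified stand-in for the gradient-directed interpolation the paper actually uses.

    import numpy as np

    def make_gradient_upsampler(x0, k=2, eps=1e-3):
        """Build an edge-aware replacement for upscale_transpose() whose
        mixing weights are computed once from x0 (Section 3.4)."""
        gy, gx = np.gradient(x0)       # gradients on the high-resolution grid
        wx = 1.0 / (np.abs(gx) + eps)  # weak horizontal gradient: average along x
        wy = 1.0 / (np.abs(gy) + eps)  # weak vertical gradient: average along y

        def upsample(d):
            # Baseline enlargement of the difference d, as in upscale_transpose()
            z = np.repeat(np.repeat(d, k, axis=0), k, axis=1) / (k * k)
            avg_x = 0.5 * (np.roll(z, 1, axis=1) + np.roll(z, -1, axis=1))
            avg_y = 0.5 * (np.roll(z, 1, axis=0) + np.roll(z, -1, axis=0))
            # Blend directional averages, favoring the direction along the edge
            return (wx * avg_x + wy * avg_y) / (wx + wy)

        return upsample

sr_step() would then accept such a callable in place of upscale_transpose() when enlarging the truncated difference.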
[Figure 3: Dependence of ISNR (dB) on the discrepancy upsampling algorithm, for the test images Cactus, Lena, Lighthouse, Text, and their average. Curves: bilinear, bicubic, gradient, and NEDI interpolation; SR (x0 = bilinear); SR (x0 = NEDI, gradient upsampling); SR (fractal substitution).]

3.5 Future work

We are currently investigating the following modifications of the super-resolution algorithm.

The strength of regularization can be made image-adaptive. The main purposes of regularization are the reduction of ringing and of the noise level. So, it is possible to apply more regularization in areas of high-contrast edges and in noisy areas, while at the same time preventing an excessive "watercolor" artifact in other areas.

In an effort to generate plausible high-frequency content, we have developed a patch-based algorithm for the substitution of high-frequency details into the current approximation x^n from the original low-resolution image y. We call this stage "fractal substitution" because the substituted details are selected using a self-similarity criterion across two scales.

4. CONCLUSION

We have analyzed the properties of the super-resolution method applied to the interpolation of a single image. It has been shown that modifications of the original super-resolution method (7) can improve both the visual quality and the PSNR of the resulting images. An example of the algorithm's output is given in fig. 4; more examples can be found in [6].

[Figure 4: Comparison of visual quality for bilinear interpolation (left) and the proposed super-resolution algorithm (right).]

A careful choice of the initial approximation prevents the jagged-edges artifact from appearing. We suggest using the NEDI algorithm to calculate the initial approximation image.

The gradient-directed interpolation used for upsampling of the difference image prevents jagged edges from appearing in the course of the iterations.

The choice of the Bilateral TV regularization (6) does not blur the edges while reducing the ringing artifact and limiting noise amplification.

The adaptive modification of the β parameter reduces the number of iterations required to produce results with the best PSNR to 16-20.

A simulation on a 1 GHz Pentium III computer has shown that it takes approximately 1 minute to produce a 1-megapixel enlarged image from a quarter-megapixel source image (for 16 iterations).

5. REFERENCES

[1] Ken Turkowski "Filters for Common Resampling Tasks" // "Graphics Gems", pp. 147-165, Academic Press Professional, Inc., 1990.

[2] Shuai Yuan, Masahide Abe, Akira Taguchi, Masayuki Kawamata "High Accuracy WADI Image Interpolation with Local Gradient Features" // Proceedings of the 2005 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 85-88, 2005.

[3] J.A. Leitao, M. Zhao, G. de Haan "Content-Adaptive Video Up-Scaling for High-Definition Displays" // Proceedings of IVCP 2003, vol. 5022, January 2003.

[4] Sina Farsiu, Dirk Robinson, Michael Elad, Peyman Milanfar "Fast and Robust Multi-Frame Super-Resolution" // IEEE Trans. on Image Processing, vol. 13, no. 10, pp. 1327-1344, October 2004.

[5] A. Tikhonov, V. Arsenin "Solutions of Ill-Posed Problems" // Washington, DC: W.H. Winston, 1977.

[6] Demo web page with image examples: http://audio.rightmark.org/lukin/graphics/superres.htm

About authors

Andrey S. Krylov is an associate professor at Moscow State University, Faculty of Computational Mathematics and Cybernetics. Email: [email protected]

Alexey Lukin is a member of the scientific staff at Moscow State University, Faculty of Computational Mathematics and Cybernetics, and a member of the IEEE and AES. Email: [email protected]

Andrey Nasonov is a student at Moscow State University, Faculty of Computational Mathematics and Cybernetics. Email: [email protected]

Acknowledgements. Alexey Lukin is grateful to Dr. Steven A. Ruzinsky for fruitful discussions on image interpolation algorithms. This work has been partially funded by the Russian Foundation for Basic Research, grant 06-01-00789.

International Conference Graphicon 2006, Novosibirsk Akademgorodok, Russia, http://www.graphicon.ru/