Software description and configuration

HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Christine Guillemot, Laurent Guillo, Marco Cagnazzo, Giuseppe Valenzise, Béatrice Pesquet-Popescu

To cite this version: Christine Guillemot, Laurent Guillo, Marco Cagnazzo, Giuseppe Valenzise, Béatrice Pesquet-Popescu. Software description and configuration. 2013, pp.11. hal-00935612

HAL Id: hal-00935612
Submitted on 23 Jan 2014

Projet PERSEE: Schémas Perceptuels et Codage vidéo 2D et 3D (ANR-09-BLAN-0170)
Deliverable D5.3, 26/09/2013

Christine GUILLEMOT (IRISA), Laurent GUILLO (IRISA), Marco CAGNAZZO (LTCI), Giuseppe VALENZISE (LTCI), Béatrice PESQUET-POPESCU (LTCI)

Contents

1 DCR
  1.1 Overview
  1.2 Matlab functions
  1.3 Modified encoder lencode.exe
2 WTM
  2.1 Quick overview
  2.2 Some implementation details
    2.2.1 WTM activation
    2.2.2 Template shape signalling
    2.2.3 Number and dimensions of search windows
  2.3 Configuration
  2.4 Example of use
References

Software final D5.3
Introduction

This document gathers configuration information and instructions for use for two pieces of software developed during the Persee project: DCR and WTM.

1 DCR

The Don't Care Region (DCR) approach is based on the idea that in multiple-view-plus-depth video, depth maps are not directly viewed, but are only used to provide geometric information for view synthesis at the decoder. Thus, as long as the resulting geometric error does not lead to unacceptable quality for the synthesized view, each depth pixel only needs to be reconstructed at the decoder coarsely, within a tolerable range. We first formalize the notion of tolerable range per depth pixel as a Don't Care Region (DCR), by studying the sensitivity of the synthesized view distortion to the pixel value: a sensitive depth pixel will have a narrow DCR, and vice versa.

1.1 Overview

We now define per-pixel DCRs for depth map $D_n$, assuming the target synthesized view is $n$. In the following, we will rather refer to the disparity field $d_n$, which can be obtained from the depth once the camera parameters are known. A pixel $v_n(i,j)$ in texture map $v_n$, with associated disparity value $d_n(i,j)$, can be mapped to a corresponding pixel in view $n+1$ through a view synthesis function $s(i,j;d_n(i,j))$. In the simplest case, where the views are captured by purely horizontally shifted cameras, $s(i,j;d_n(i,j))$ corresponds to a pixel in texture map $v_{n+1}$ of view $n+1$, displaced in the x-direction by an amount proportional to $d_n(i,j)$. The view synthesis error $\varepsilon(i,j;d)$ can thus be defined as the absolute error between the reconstructed and the original pixel value, given disparity $d$ for pixel $(i,j)$, i.e.,

$$\varepsilon(i,j;d) = |s(i,j;d) - v_n(i,j)|.$$
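As an illustration of the error definition above, the following Python sketch evaluates $\varepsilon(i,j;d)$ for one pixel in the purely horizontal camera setup. It is only a sketch under assumptions that are not taken from this report: textures are plain lists of rows of grey levels, the synthesis function reads view n+1 at column j - round(d) (the actual sign and scale of the shift depend on the camera arrangement), and out-of-range columns are clamped to the image border. All function and variable names are hypothetical.

```python
def synthesis_error(v_n, v_np1, i, j, d):
    """Absolute view synthesis error eps(i, j; d) for a single pixel.

    Assumption (not from the report): s(i, j; d) is the pixel of the
    neighbouring texture v_np1 at row i, column j - round(d), with the
    column clamped to the valid range.
    """
    width = len(v_np1[i])
    col = min(max(j - int(round(d)), 0), width - 1)
    s = v_np1[i][col]            # synthesized value s(i, j; d)
    return abs(s - v_n[i][j])    # |s(i, j; d) - v_n(i, j)|


# Toy one-row textures: view n+1 is view n shifted by one pixel,
# so the ground-truth disparity of pixel (0, 2) is 1 in this setup.
v_n = [[10, 20, 30, 40]]
v_np1 = [[20, 30, 40, 50]]

# Error at pixel (0, 2) for candidate disparities 0, 1, 2, 3.
errors = [synthesis_error(v_n, v_np1, 0, 2, d) for d in range(4)]
```

Sweeping the candidate disparity for a fixed pixel, as in the last line, shows how the synthesis error changes when the disparity deviates from its ground-truth value.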
If $d_n$ is compressed, the reconstructed disparity value $\tilde{d}_n(i,j)$ employed for view synthesis may differ from $d_n(i,j)$ by an amount $e(i,j) = \tilde{d}_n(i,j) - d_n(i,j)$, resulting in a (generally larger) view synthesis error $\varepsilon(i,j;d_n(i,j)+e(i,j)) > \varepsilon(i,j;d_n(i,j))$. We define the Don't Care Region $\mathrm{DCR}(i,j) = [\mathrm{DCR}_{low}(i,j), \mathrm{DCR}_{up}(i,j)]$ as the largest contiguous interval of disparity values containing the ground-truth disparity $d_n(i,j)$, such that the view synthesis error for any point of the interval is smaller than $\varepsilon(i,j;d_n(i,j)) + \tau$, for a given threshold $\tau > 0$. Note that DCR intervals are defined per pixel, thus giving precise information about how much error can be tolerated in the disparity maps.

The DCR information can then be used to perform a more effective motion estimation, to encode the prediction residual, and to enhance the use of the SKIP mode. We have implemented in Matlab a function that can generate the DCRs of an MVD sequence and save them into a binary file, which can in turn be used by a modified H.264/MPEG-4 AVC encoder (JM reference software v. 18.0).

1.2 Matlab functions

[DCR_low DCR_up] = generate_DCR(isLeft, thres, depth_scale)

Inputs
• isLeft: binary flag indicating whether the DCR is to be computed for the left or for the right view. E.g., if you have views 3 and 5 in Kendo and you have to synthesize view 4, you shall use isLeft = 1 to generate the DCR needed to encode the disparity of view 3, and isLeft = 0 to generate the DCR relative to the disparity of view 5. Note that the DCR is defined as the worst-case error when you have texture and depth from one same view (e.g. texture and depth from view 3) and you want to generate the other view (view 5).
• thres: the τ in the report (and in our PCS paper), i.e. the highest tolerated threshold used to define the DCR. We advise using τ = 5 or similar values.
• depth_scale: the scaling factor from depth to disparity. It must be known a priori.
It depends on the sequence, which should in any case be rectified. For Kendo, the value to use is 0.204.

Outputs
DCR_low and DCR_up, respectively the lower and upper bound of the DCR interval per pixel. In order to be used by the encoder, they must be converted into a binary file by using write_DCR_bin.m.

Note that this function generates a DCR for a single image, so you need to call it image by image for an entire sequence. The inputs of the script are the texture and the depth of the right and left views, which should be passed as png files with names left_depth.png, left_texture.png, etc.

write_DCR_bin.m

This script writes a matrix (M × N × F, with M × N the spatial resolution and F the number of frames) into a file that can be used by the encoder. The DCR must be stored into variables named DCR_low and DCR_up; they will be written into two files named DCR_low.bin and DCR_up.bin. Please note that generate_DCR only generates one 2D matrix. It is up to the user to generate as many matrices as needed and to stack them into a 3D matrix.

1.3 Modified encoder lencode.exe

This is the modified version of the H.264 reference software (JM v. 18.0), which uses the files DCR_low.bin and DCR_up.bin produced by the write_DCR_bin.m function in order to perform the DCR-based encoding described in deliverable D4.3 and in the PCS paper. It works in baseline mode, with some restrictions, as one can see from the encoder_baseline.cfg file. In particular, the RDO should be high complexity and the motion estimation should be full search.

2 WTM

WTM is an intra prediction method based on a linear combination of template matching predictors. The method was previously described in [1]. After a quick reminder, the following sections present how some details peculiar to this method were implemented and how to configure WTM. An example of use is then given.

2.1 Quick overview

WTM aims at providing an intra prediction for blocks of size 4x4, 8x8, 16x16 and 32x32.
This prediction is based on a linear combination of template matching predictors belonging to the causal neighbourhood.

Figure 1: Search regions from causal neighbourhood.

Then, N blocks $B_i$ surrounded by the best matching areas are used to compute predictors $P_i$, which are then averaged to get the prediction $P$ of the block $B$:

$$P = \frac{1}{N} \sum_{i=1}^{N} P_i \qquad (1)$$

WTM relies on this general approach but there are three main enhancements:
• it uses 4 different template shapes whatever the block size: the traditional L-shape, which is 1 pixel wide, and three other shapes in which the left part, the top part or both can be 4 pixels wide. However, only one template shape is used to determine all template predictors.
• the correlation factor is based on the dot product between the template and the template predictors.
• template predictors are not searched within the whole causal neighbourhood but within only two or three search windows. The number of search windows is related to the rank of the block to be predicted within the prediction unit (PU), and their size depends on the size of the block to be predicted.

For more details about these three characteristics see [1]. The following sections give information about how they have been used and implemented.

Figure 2: Shape of templates.

Figure 3: Search windows positions relative to block B.

2.2 Some implementation details

The distinctive features listed in the previous section lead to the following implementation choices.

2.2.1 WTM activation

WTM is not always activated for all PU sizes. Its activation depends on the class of videos belonging to the corpus provided by JCT-VC and on the PU size.
The cases for which WTM is activated are listed in Table 1.

Table 1: Activation of WTM according to video classes and PU sizes
Configurations and PU sizes: HE (4x4, 8x8, 16x16, 32x32, 64x64), LC (4x4, 8x8, 16x16, 32x32, 64x64)
Class A: X X X X X
Class B: X X X X X X X
Class C: X X X X X
Class D: X X X X X
Class E: X X X X X X X
Class F: X X X X X X X

2.2.2 Template shape signalling

Four template shapes are available. Consequently, two pieces of information must be signalled to the decoder: whether a block is predicted with WTM, and which template shape was used. To do so, four direction modes have been overloaded: modes 10 to 13. They are associated with a shape (cf. Fig. 2) as listed in Table 2. An extra bit is added for all of these four modes and set to true if WTM is used, as described in Fig. 4.

Table 2: Relation between intra mode and shape of template
INTRA mode   Shape
10           UDL
11           U
12           L
13           UL

Figure 4: Signalling taking WTM into account.

2.2.3 Number and dimensions of search windows

The number of search windows and their size depend on the size of the block to be predicted. The characteristics of the search areas are summarized in Table 3.

Table 3: Search areas characteristics
Block B size   Search windows number   Search windows width   Search windows height
4x4            3                       12                     4
8x8            2                       20                     8
16x16          2                       8                      16
32x32          2                       4                      32

2.3 Configuration

The WTM algorithm is written on top of the test model of HEVC, release 4.0. So, the configuration files dedicated to WTM are based on the HTM-4.0 all-intra encoding configuration file. A section is added to specify the parameters related to WTM. In particular, this section indicates whether:
• WTM is activated or not
• 4x4, 8x8, 16x16 and 32x32 WTM predictions are activated

An optional parameter, STMObserver, can be set to generate statistics or prediction maps. The following excerpt of a configuration file lists the parameters related to WTM, which carry the STM prefix.
    #============ WTM ================
    STM         : 1   # 0: unused, 1: activated
    STM4x4      : 1   # Prediction activated for 4x4 block size
    STM8x8      : 1   # Prediction activated for 8x8 block size
    STM16x16    : 1   # Prediction activated for 16x16 block size
    STM32x32    : 1   # Prediction activated for 32x32 block size
    STMObserver : 3   # 0: unused, 1: % of selection, 2: stats files, 3: output frames

The other sections of the configuration file are kept unchanged.

2.4 Example of use

To build the software, refer to the "how-to" provided in deliverable 3.4 and uploaded to the website of the Persee project ( yn.e

Once built, the encoding is launched with the following command:

    TAppEncoder -c tests.cfg

where "tests.cfg" is the configuration file. To decode the encoded video, just enter the following command:

    TAppDecoder -b str.bin -o dec.yuv

where "str.bin" is the encoded video (the name was specified in the encoding configuration file) and "dec.yuv" is the name of the decoded video.

References

[1] Persee, 2D coding tools final report, ANR-09-BLAN-0170, Deliverable D3.4, July 2013.