Spherical Harmonic Matrix

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

IET Research Journals

On the Conditioning of the Spherical ISSN 1751-8644


doi: 0000000000
www.ietdl.org
Harmonic Matrix for Spatial Audio
Applications
Sandeep Reddy C1∗ , Rajesh M Hegde2
1
Department of Electrical Engineering, Indian Institute of Technology, Kanpur 208016, India
2
Department of Electrical Engineering, Indian Institute of Technology, Kanpur 208016, India
* E-mail: [email protected], [email protected]

Abstract: In this paper, we attempt to study the conditioning of the Spherical Harmonic Matrix (SHM), which is widely used
in the discrete, limited order orthogonal representation of sound fields. SHM’s has been widely used in the audio applications
like spatial sound reproduction using loudspeakers, orthogonal representation of Head Related Transfer Functions (HRTFs) etc.
arXiv:1710.08633v2 [eess.AS] 5 Mar 2018

The conditioning behaviour of the SHM depends on the sampling positions chosen in the 3D space. Identification of the optimal
sampling points in the continuous 3D space that results in a well-conditioned SHM for any number of sampling points is a highly
challenging task. In this work, an attempt has been made to solve a discrete version of the above problem using optimization based
techniques. The discrete problem is, to identify the optimal sampling points from a discrete set of densely sampled positions of the
3D space, that minimizes the condition number of SHM. This method has been subsequently utilized for identifying the geometry of
loudspeakers in the spatial sound reproduction, and in the selection of spatial sampling configurations for HRTF measurement. The
application specific requirements have been formulated as additional constraints of the optimization problem. Recently developed
mixed-integer optimization solvers have been used in solving the formulated problem. The performance of the obtained sampling
position in each application is compared with the existing configurations. Objective measures like condition number, D-measure,
and spectral distortion are used to study the performance of the sampling configurations resulting from the proposed and the
existing methods. It is observed that the proposed solution is able to find the sampling points that results in a better conditioned
SHM and also maintains all the application specific requirements.

1 Introduction of a rectangular matrix, with a subset of similar columns, that mini-


mizes the condition number can be studied using optimization based
Signals measured in real environments that are spatially dependent techniques. Solving such an optimization problem has many appli-
usually undergo perturbation errors due to the inaccurate placement cations in the domain of signal processing. Consider the spherical
of measurement setups. The modern signal processing techniques harmonic matrix as given below.
extract useful information from the measured signals by trans-
forming in to various domains like Fourier, Wavelets, Spherical
Y00 (θ1 , φ1 ) Y00 (θ2 , φ2 ) Y00 (θQ , φQ )
 
Harmonics etc [1, 2]. Generally, this transforms involve signal repre- ..
Y −1 (θ , φ ) Y1−1 (θ2 , φ2 ) .. Y1−1 (θQ , φQ )
sentations in terms of continuous orthogonal basis functions. But in  1 1 1 
practice, measured signals are usually discretized and represented in Y=
 . . . . 
 (1)
limited orders to reduce the complexity. These representations have  . . . . 
adverse effects to the perturbation errors in measured signals, if the YNN (θ1 , φ1 ) YNN (θ2 , φ2 ) .. N
YN (θQ , φQ )
transformation matrices are ill-conditioned. For example, spherical
harmonic representation of three dimensional sound fields involves The behaviour of the matrix depends on the sampling positions cho-
representations in terms of spatially dependent continuous orthogo- sen in the 3D space. It has been widely used in the domain of
nal functions. A discrete, limited order matrix representation called spherical array signal processing [4, 5], spatial audio [6, 7] etc. Par-
as Spherical Harmonic Matrix (SHM) is used for practical compu- ticularly in the domain of spatial audio, sampling configurations
tations [2, 27]. This representation can have adverse effects to the are chosen for HRTF measurements [8] or loudspeaker position-
perturbation errors in measured signals, if the SHM is ill-conditioned ing [3, 9, 10]. Minimizing the condition number of SHM can aid
[2, 3]. So it is important to analyse the condition behaviour of SHM. in choosing better sampling configurations that would prevent from

being sensitive to the perturbation errors in the measurement pro-
Condition number of any matrix determines the sensitivity of cess. But there exists many sampling configurations in the literature
the output to a small change in the input under linear transforma- which are widely used in various domains. It is important to know
tion. Condition number of a matrix is a non-convex and non-smooth the properties of each configuration so that their relevance in the
function. General problem of minimizing the condition number of domain of spatial audio can be analysed.
a matrix is a complex problem. However choosing a sub-matrix out Sampling schemes around a sphere are widely studied in vari-
ous domains [11–14]. Some of the well known spherical sampling
schemes existing in the literature are T-designs of platonic solids
[15], Equiangle sampling, Gaussian sampling [2], Fibonacci lattice
∗ This paper is a preprint of a paper submitted to IET Signal Processing [12], Fliege nodes [16], Lebedev grids [17], Interaural sampling
Journal. If accepted, the copy of record will be available at the IET Digital
[18] etc. Each of these designs have different properties which
assumes high significance in particular application. For example, T-
Library. The first author of the paper is currently affiliated with center for designs and Fliege nodes exhibit nearly uniform placement of points
vision, speech and signal processing, University of Surrey. over a sphere and finds application in the loudspeaker positioning

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 1
for spatial sound reproduction [31] and design of spherical micro- of a rectangular matrix with a reduced condition number. Sub-
phone arrays [4, 20]. A fifty-node Lebedev grids as discussed in sequently, the proposed method would be used to obtain a well
[17], exhibits better orthogonality among spherical harmonic vec- conditioned SHM for spatial audio applications. Particularly spatial
tors as compared to T-designs and Fliege nodes, and found to be sampling configurations are identified for loudspeaker based spa-
important for recording and reproduction of sound fields. Equian- tial sound reproduction and HRTF measurement. The configurations
gular sampling, Gaussian sampling are regular configurations which obtained using the proposed method is evaluated for various spheri-
exhibit closed form expressions for angular directions and quadra- cal harmonic orders and compared with the existing configurations.
ture weights. They have a practical advantage of having equal spac- The rest of the paper is organized as follows. Section 2 describes
ing between consecutive sampling points in the azimuthal plane. In the proposed method for minimizing the condition number of a rect-
fact, Equiangular sampling exhibits equal spacing between the sam- angular sub-matrix using an optimization based approach. Section 3
pling points in both azimuthal and elevation planes. But they suffer presents the identification of loudspeaker geometry and HRTF sam-
from dense sampling points at the poles. Fibonacci sampling exhibits pling configuration using the proposed method. Section 4 discusses
nearly uniform sampling on a sphere with closed form expression the practical implementation, and the performance analysis of the
for angular directions for any number of sampling points. Interaural proposed method.
sampling is considered to be important in the domain of spatial hear-
ing where sampling points are chosen on interaural circles. In this
manner, each sampling scheme has advantages in different applica-
tions. This work mainly focus on the condition behaviour of SHM 2 Condition Number Minimization of a
for various sampling configurations that are relevant in the domain of Rectangular Sub-Matrix Using an Optimization
spatial audio. In this regard, we mainly focus on two applications of based Approach
spatial audio, namely, spatial sound reproduction using loudspeak-
ers and HRTF measurement where choice of sampling configuration Condition number of any matrix determines the sensitivity of the
based on the conditioning of SHM assumes importance. output to a small change in the input under linear transformation.
Sound field reproduction is the synthesis of a desired sound field In many applications, it is desirable to identify a sub matrix, by
over a particular region using multiple loudspeakers located in the choosing a selected columns or rows of a original matrix such that,
3D space. Many works have been earlier attempted to reproduce the sub-matrix would result in a lowest condition number. Such
desired sound field by optimal choice of sampling configuration for problems are difficult to handle by an exhaustive search as the com-
loudspeakers [3, 9, 21]. In [3], plane-wave reproduction using array putational complexity increases in a combinatorial way with the
of loudspeakers has been discussed where loudspeakers are treated increase in the order of the matrix. Hence it is important to reformu-
as point sources. It also discuss the importance of the condition- late the problem such that the optimal solution can be identified for
ing behaviour of SHM, and incorporates the T-design configurations larger matrices with the available resources. An optimization based
for loudspeaker positioning. Loudspeaker positioning for spatial approach for solving above problem is discussed as follows.
field reproduction has also been studied as a power constrained
error minimization problem [22]. In [6], sound field reconstruction
using spherical harmonic representation called as Ambisonics has 2.1 Problem Formulation
been discussed in a compressed sensing framework by assuming
Consider a rectangular matrix AP ×Q (Q >> P ). Condition num-
loudspeakers as plane wave sources. Among the above mentioned
ber of A, denoted as κ(A), is defined as [29],
approaches, this work mainly focuses on the Ambisonics way of
reproducing sound fields. In Ambisonics [23], spatial sound is gen-
erally encoded in to spherical harmonics. These encoded signals κ(A) = ||A|| ||A−1 || (2)
are subsequently used to drive loudspeakers for the synthesis of
spatial sound. The encoding and decoding process involves linear where || · || is the operator norm. If the norms are defined in an
transformation using SHM. Choosing a proper configuration for Euclidean way, then the condition number of a matrix A can be
loudspeakers is an important task to maintain a well conditioned computed as the ratio of maximum singular value to the minimum
SHM. T-designs were earlier used for positioning the loudspeakers singular value of matrix A. The expression for κ(A) is given as,
in the 3D space [31]. T-designs allow a uniform placement of loud-
speakers in the 3D space, and also results in orthogonal spherical
harmonic vectors with a well conditioned SHMs. But T-designs are σmax (A)
κ(A) = (3)
available only for a limited number of configurations, and for a max- σmin (A)
imum of 20 samples, supporting a maximum order of only N = 2
[2]. T-designs for number of samples greater than 20 are available where σmax (A), σmin (A) represent maximum and minimum sin-
only for few set of T-values and their corresponding spherical har- gular values of matrix A respectively. Any sub-matrix of A with
monic orders [15, 24]. Hence the focus of this work is to identify dimension P × Q0 can be constructed as
configurations for any number of loudspeakers that can reproduce
sound field efficiently and also result in well conditioned SHM.
In the measurement of HRTFs, spatial configurations are chosen A0 = AP ×Q ΓQ×Q0 (4)
based on the following factors. The resolution of HRTF measure-
ment is usually chosen based on the humans sensitivity to the where A0 is the sub-matrix with Q0 columns, and ΓQ×Q0 is a rect-
spatialised sound perception. The configuration is generally chosen angular binary matrix with Q rows and Q0 columns (Q > Q0 ). The

such that circular hoops can be utilised for simultaneous HRTF mea- objective of this work is to identify the optimal sub-matrix A0
surement. Earlier many HRTF sampling configurations like CIPIC which has least condition number among all the sub-matrices of A
[18], KEMAR [25] were proposed considering the above factors. with dimension P × Q0 . The objective function of the minimization
But it is also important to choose spatial configuration which results problem is given as,
in well conditioned SHM. It is so because, spherical harmonic repre-
sentation of HRTFs plays an important role in the interpolation and
min κ(A0 ) ≡ min κ(AΓ)
range extrapolation of HRTFs [26]. Additionally HRTFs of individ- Γ
ual frequencies have been earlier expressed using SHM [27, 28] . So
it is important to obtain a spatial configuration which results in well The sub-matrix A0 can contain a subset of columns of A, only if,
conditioned SHM and also considering the aforementioned factors. each column of Γ contains exactly one non-zero element and each
The primary contributions of this work are as follows. A novel row of Γ contains a maximum of one non-zero element. Imposing
optimization based solution is developed for obtaining a sub-matrix these constraints on Γ would result in a constrained optimization
problem as given below.

IET Research Journals, pp. 1–12


2 c The Institution of Engineering and Technology 2015
minimize
Γ
κ(AΓ) T r(AAH )
ηub = . (12)
P
subject to γij ∈ {0, 1}
The derivation for upper bound is described in Appendix 6.2. So the
Q final bounds for η ∗ is given as,
∀j ∈ {1 · · · Q0 }
X
γij = 1 (5)
i=1 T r(AAH )
0 < η∗ ≤ . (13)
Q0
X P
γij ≤ 1 ∀i ∈ {1 · · · Q} The computation of optimal lower bound (η ∗ ) from the above given
j=1 range is an important task of the proposed solution. A complete
methodology to identify the optimal lower bound is discussed as
where γij is an element of the ith row and j th column of matrix follows.
Γ. The problem posed in Equation 5 is a constrained integer pro-
gramming problem with binary variables. The objective function is
non-convex, and hence it is re-formulated by utilizing the following 2.2 Computation of Optimal Lower Bound of Minimum
properties of condition number. Eigenvalue
q Consider a variable η denoting a lower bound value in the range
κ(A) = κ(AAH ) (6) [ηlb , ηub ]. The objective of the work is to find an optimum value of η
s in its range, denoted as η ∗ , for which condition number of A∆AH
σmax (A) λmax (AAH ) is minimum. This is expressed as given below.
= (7)
σmin (A) λmin (AAH )
T r(AAH )
where H is a Hermitian operator, λmax (·) and λmin (·) are the η ∗ = min{κ(A∆AH )} 0 ≤ η ≤ . (14)
η P
maximum and minimum eigenvalues of a corresponding matrix
However the range of η values are continuous. But only a discrete
respectively. Using the property given in Equation 7, the objective
and finite number of η values has actual significance in the com-
function can be reformulated as,
putation of optimal lower bound η ∗ . Because, for over a range of
η values, the optimization problem given in Equation 9 will result
in a similar binary diagonal matrix ∆∗ , and a similar condition
s s
λmax (AΓΓH AH ) λmax (A∆AH )
κ(AΓ) = = (8) number. In order to better understand this peculiar behaviour, few
λmin (AΓΓH AH ) λmin (A∆AH ) additional terms are introduced and computation of lower bounds at
the transition points is also presented. The value of η where there is
where ∆ = ΓΓH , is a diagonal matrix with binary elements (0, 1). a transition in the value of condition number is computed as,
∆ comprises of exactly Q − Q0 zeros as its diagonal elements. The
matrix (A∆AH ) given in Equation 8 is a Hermitian positive defi- η i = λi−1 i = 1···R (15)
nite matrix, if the sub-matrix AΓ is full row rank matrix. Otherwise min
(A∆AH ) is a Hermitian positive semi-definite matrix whose mini- i
where η is the lower bound of minimum eigenvalue at the i tran- th
mum eigenvalue is zero and an infinite condition number. Therefore, sition point, and λi−1
min is the minimum eigenvalue of the optimum
it is important to maintain the matrix AΓ as full row rank. In prac- sub-matrix (A∆i−1 AH ) resulted in the (i − 1)th transition. The
tise, this is achieved by choosing (Q0 >> P ). It is also a known binary diagonal matrix resulted for the lower bound η i is computed
fact that, the maximum and minimum eigenvalues of any Hermitian as,
matrix are convex and concave functions respectively. Therefore the
non-convex problem in Equation 5 is reformulated as,
∆i = minimize λmax (A∆AH )
∆,diagonal
∆∗ = minimize λmax (A∆AH )
∆,diagonal subject to δii ∈ {0, 1} ∀i ∈ {1 · · · Q}
subject to δii ∈ {0, 1} ∀i ∈ {1 · · · Q} Q (16)
δii = Q0
X
Q (9)
δii = Q0
X
i=1
i=1 λmin (A∆AH ) > η i
λmin (A∆AH ) > η ∗
Equation 16 is solved using numerical techniques [34]. Implementa-
s tion issues are discussed in section 4.1. The sub-matrix (A∆i AH ),
∗ λmax (A∆∗ AH ) its minimum eigenvalue λimin , maximum eigenvalue λimax , and the
κ(A0 ) = (10)
λmin (A∆∗ AH ) condition number κi for R intervals is illustrated in Figure 1. The
reason for an optimization problem to result in a similar solution for
where δij is an element of the ith row and j th column of matrix
values of η lying between (η i , η i+1 ) is described as follows.
∆. ∆∗ is the optimal solution of the above constrained optimization
problem for an optimal lower bound η ∗ on the minimum eigenvalue Consider η = η̂, where η i < η̂ < η i+1 . The feasible region
of A∆AH . The proof for the equivalence of the optimization prob- for λmin (A∆AH ) > η̂ is a subset of the feasible region for
lems given in Equation 5 and Equation 9 is given in Appendix 6.1 . λmin (A∆AH ) > η i as illustrated in Figure 2. Additionally, the
The optimum lower bound η ∗ is an unknown quantity, but it lies in minimum eigenvalue of optimum sub-matrix at ith transition is
between an upper and lower bound as given below. greater than η̂ as given below.

ηlb <= η ∗ <= ηub . (11) λmin (A∆i AH ) = λimin > η̂ (17)
These bounds ηlb , and ηub are computed using the properties of Unless the feasible region for lower bound η̂ excludes the optimum
eigen values of a Hermitian matrices. Its a known fact that, the eigen- solution generated at the ith transition, i.e η̂ > λimin , the optimal
values of A∆AH are positive. Hence the lower bound ηlb can be solution for lower bound η̂ will always be same as lower bound η i .
considered as zero. The upper bound ηub is computed as, This can be mathematically given as follows.

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 3
R−2
η R−1 = λmin T r(AAH )
Lower Bound (η) : η0 = 0 η 1 = λ0min η 2 = λ1min P

Diagonal Matrix (∆) : ∆0 ∆1 ··· ∆R−1

Sub-matrix (A∆AH ) : A∆0 AH A∆1 AH ··· A∆R−1 AH

Maximum Eigenvalue : λ0max λ1max ··· λR−1


max
of Sub-matrix (λmax )

Minimum Eigenvalue : λ0min λ1min ··· λR−1


min
of Sub-matrix (λmin )

Condition Number (κ) : κ0 κ1 ··· κR−1

Fig. 1: Figure illustrating R transitions of lower bound η and the corresponding optimum solution (∆) resulted at each transition by the
optmization problem.

η i = λi−1 T r(AAH )
min
η̂ η i+1 = λimin ηR = P
ˆ = ∆i ,
∆ λimin = λ̂min , κi = κ̂ (18)

where λ̂min , κ̂ are the minimum eigenvalue and condition number → →


of sub-matrix A∆A ˆ H . Hence there can only be finite number (R) λmin (·) > η i λmin (·) > η̂
of distinct diagonal matrices ∆i and condition numbers κi . Theoret-
Q
ically R can take any value less than Q0 . So the practical feasibility Fig. 2: Figure illustrating the feasible regions of λmin (A∆AH )
of the proposed algorithm cannot be generalised for all the rect- function for lower bound η̂ and lower bound of ith transition η i .
angular matrices. However the current works restrict to Spherical The optimal solution at the ith transition (λimin ) is in the feasible
Harmonic Matrices (SHM) which is found be exhibiting fewer num- region of λmin (·) > η̂.
ber of transitions (R) which will be discussed in the section 3.1.3.
Total number of transitions (R) would be smaller if the step size
is larger which in turn depends on the minimum eigenvalue gener-
ated at that particular transition. The steps involved in identifying the Algorithm 1 Algorithm to identify R distinct lower bounds, and its
R lower bounds, and its corresponding R diagonal matrices and R corresponding R diagonal matrices and R condition numbers using
condition numbers is enumerated in Algorithm 1. Hence the optimal the proposed method
lower bound for minimum eigenvalue is computed as 1: Input: η = 0, i = 0, Q0 , A.
T r(AAH )
2: while η < P do
η ∗ = arg{min{κi (η i )}} i ∈ (0, R − 1) (19) 3: ηi = η
4: The optimal ∆i for lower bound η i is obtained by solv-
The optimal ∆∗ and its corresponding condition number can now ing Equation 16 using optimization toolboxes as discussed in
be computed as section 4.1.
5: λimin r= λmin (A∆i AH )
λmax (A∆i AH )
6: κi = λmin (A∆i AH )
∆∗ = minimize λmax (A∆AH ) 7: η = λimin
∆,diagonal 8: i ← i + 1,
subject to δii ∈ {0, 1} ∀i ∈ {1 · · · Q} 9: end while
10: Total number of subsets R = i.
Q (20) 11: Output: R distinct lower bounds η 0 · · · η R−1 , and its corre-
δii = Q0
X
sponding diagonal matrices (∆0 , ∆1 , · · · ∆R−1 ) and condi-
i=1 tion numbers (κ0 , κ1 , · · · κR−1 ).
λmin (A∆AH ) > η ∗

3 Applications
s

λmax (A∆ AH )
κ(A0∗ ) = (21) A method to minimize the condition number of a rectangular sub-
λmin (A∆∗ AH ) matrix was presented in Section 2. In the current section, the pro-
posed method together with application specific constraints is used
where ∆∗ is the optimal binary diagonal matrix resulted from in solving two different problems that arise in the domain of spa-
Equation 20 and κ(A0∗ ) is the minimum condition number obtained tial audio. These problems include the identification of loudspeaker
for the lower bound η ∗ . An example based illustration of the geometry for spatial sound reproduction, and the identification of
proposed method is discussed in Appendix 6.3. spatial sampling configurations for HRTF measurement. Both these

IET Research Journals, pp. 1–12


4 c The Institution of Engineering and Technology 2015
T=2, Q=4 T=3, Q=6 T=5, Q=12 T=10, Q=60 Fibonacci Q=100

(a) (b) (c) (d) (e)

Fig. 3: Figures (a-d) illustrates the voronoi polygonal diagrams of T-designs (T) for various number of sampling points (Q). Figure (e)
illustrates the voronoi polygonal diagram for Q = 100 following Fibonacci lattice.

Q
X ak ˆ2
ν = (d − d) (22)
4π k
k=1
p
ν (2π)
D = (23)

where ak is the area of the kth voronoi polygon. dk = a1k is the den-
Q
sity of the kth voronoi polygon. dˆ = 4π is the average density over
the sphere. The voronoi polygons formed for (T = 2, Q = 4), (T =
3, Q = 6), (T = 5, Q = 12), and (T = 10, Q = 60) are illustrated
in Figure 3(a-d). It can be observed that, all the polygons of any
particular T-design exhibit equal areas, and thus an equal voronoi
Fig. 4: Figure illustrating the variation of polygonal area for vari-
densities. Hence the D-measure for T-designs is zero. However T-
ous faces of the polygon centred by sampling points generated using
designs are available only for limited number of sampling points
Fibonacci lattice.
and spherical harmonic orders. But it is important to identify optimal
geometry for loudspeakers for any number of sampling points. So a
nearly uniform configuration of sampling points on sphere called as
Fibonacci lattice [12] is used as the reference for identifying the opti-
problems and the proposed solutions are discussed in a detailed mal loudspeaker geometry. The voronoi polygonal plot for Fibonacci
manner. lattice is given in Figure 3e of 100 sampling points on the sphere.
The variance of the polygonal area for various number of sampling
points and polygonal faces is given in Figure 4. It can be noted that,
except for few faces the polygonal area is nearly equal for almost all
3.1 Identification of Loudspeaker Geometry for Spatial polygonal faces across various number of sampling points. But the
Sound Reproduction condition number of SHM for Fibonacci lattice is high. For exam-
ple, condition number of SHM with Q = 32, N = 3 is 1670. So the
Loudspeaker geometry for spatial sound reproduction is chosen conditioning behaviour of SHM for Fibonacci lattice is minimized
mostly based on the practical feasibility of loudspeaker positioning by appropriate selection of sampling points. The ensuing section first
in the 3D space, ability to reproduce the desired sound field, and discusses the importance of SHM in loudspeaker positioning. Sub-
the conditioning behaviour of the SHM. Although T-designs of pla- sequently a method to identify the sampling points that results in a
tonic solids were quiet apt in satisfying these factors, but they are well conditioning behaviour of SHM and efficient reproduction of
limited to a maximum sampling positions of Q = 20. T-designs for spatial sound is discussed.
number of samples Q > 20 exhibiting nearly uniform configuration
are available for only few set of values as described in [15],[24]. 3.1.2 Importance of SHM for Identification of Loudspeaker
Lack of uniform configurations for any possible number of sampling Geometry: Spherical harmonic functions has been widely used
points and spherical harmonic orders motivates us to explore other in the domain of spatial audio for 3D sound field reproduction
configurations which maintain closer properties as T-designs. In this using loudspeakers. Particularly in the techniques like Higher Order
section, we firstly introduce a measure to quantify the uniformness of Ambisonics (HOA), spherical harmonic analysis is considered to be
sampling points on sphere. Subsequently the importance of SHM in a powerful tool for describing the spatial properties of sound fields.
loudspeaker geometry identification is discussed. Further minimiza- Sound field emitted by a set of ideal planewave sources can be fully
tion of condition number of SHM using the proposed method method described by a set of spherical harmonic components as [6],
would be discussed for identifying the best geometry of loudspeakers
for positioning in the 3D space.
b̃(t) = Ỹs(t) (24)

3.1.1 Measure of Uniformity for Spatial Distribution of Sam-


Y00 (θ̃1 , φ̃1 ) Y00 (θ̃2 , φ̃2 ) Y00 (θ̃P , φ̃P )
 
..
pling Points on a Sphere: The uniformity of sampling points Y −1 (θ̃ , φ̃ )
distributed on a sphere has been earlier measured using the voronoi  1 1 1 Y1−1 (θ̃2 , φ̃2 ) .. Y1−1 (θ̃P , φ̃P )
density, which is inverse of the voronoi polygonal area [30]. The Ỹ = 
 . . . .  (25)

variation of these densities across all the polygons can be utilised as
 . . . . 
a measure for uniformity of sampling points on the sphere. In [30], YNN (θ̃1 , φ̃1 ) YNN (θ̃2 , φ̃2 ) .. YNN (θ̃P , φ̃P )
this measure is called as D-measure which is given as,

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 5
where s(t) is the signal vector of P plane wave sources, Ỹ is the and 29. Therefore, a well conditioned SHM will avoid perturbation
SHM truncated to order N, and b̃(t) is the vector of spherical har- errors in both the encoding and decoding process. Further the coeffi-
monic coefficients. It has to be noted that, the angular convention cients of the reproduced sound field b(t), and the actual sound fields
for elevation is measured from the z-axis. The sound field produced b̃(t) exhibit an error. The normalized error exhibited between them
by P plane wave sources is reproduced by a set of Q loudspeakers can be calculated as follows.
(Q > P ) as given below:
b(t) = Yg(t), (26) ||b(t) − b̃(t)||2
ξ(t) = (32)
where Y is SHM of Q sources as given earlier in Equation 1, g(t) ||b̃(t)||2
indicates the set of loudspeaker signals which are assumed to be
emitting plane-waves herein [6]. b(t) is the reproduced sound field The plane wave reproduction error depends on the the dimension of
coefficients. It has to be noted that, Q directions of loudspeak- SHM as given in Equation 26. For Q >> (N + 1)2 , reproduction
ers are different as compared to the P directions of plane wave error ξ(t) will be nearly equal to zero. But as plane wave composes
sources. Equation 26 is usually termed as the encoding process. The of spherical harmonic terms of infinite order, it is indeed necessary to
equivalent decoding process is to generate the driving signals for check the error performance for higher orders. For Q < (N + 1)2 ,
loudspeakers using the spherical harmonic coefficients to synthesize the normalized Euclidean error as given in Equation 32 would be
3D sound fields. Decoding process is given as, minimum, if the SHM (Y) exhibits orthogonal spherical harmonic
vectors. A well conditioned SHM can in fact result in a better orthog-
g(t) = Db̃(t), (27) onality among spherical harmonic vectors and thereby minimize the
where D is the decoding matrix. Two widely used methods to com- plane wave reproduction error. Therefore in the following section,
pute the decoding matrix D are given as follows [7, 31]. obtaining the well conditioned SHM using the method proposed in
Section 2 would be discussed.
1. Sampling Decoder:
4π T
D= Y (28) 3.1.3 Minimization of SHM Condition Number for Identifying
Q
the Loudspeaker Geometry: Consider Q points distributed over
2. Modematching Decoder:
a sphere in a Fibonacci lattice. Azimuthal and elevation angles for
D = YT (YYT )−1 (29) this Q points can be obtained as [12],

It has to be noted that, in both the encoding and decoding process, 5−1
SHM is used as a transformation matrix. Particularly in the decoding c1 =
2
process SHM is used in its transpose and least square form. A well
φi = (2π · c1 · i) mod 2π i = 1···Q (33)
conditioned decoding matrix will prevent from being sensitive to the
perturbation errors. But it is known that, 2i
θi = sin−1 { − 1} i = 1 · · · Q
Q
κ(YT ) = κ(Y), (30)
where, mod is the modulo operation. Among this Q sampling
T
κ(Y (YY ) T −1
) ≤ κ(Y) 3
(31) points over a sphere, Q0 sampling points which result in a well condi-
tioned SHM has to be identified. These sampling points are obtained
So minimizing the condition number of SHM is equal to minimizing by solving the optimization problem given in Equation 20 with an
the condition number of the decoding matrix given in Equation 28 additional regularization constraint as given below.

Fig. 5: Figure illustrating minimum eigenvalue and condition number of the sub-matrix (Y∆YH ) for various lower bounds (η) for selecting
Q0 = 32 optimal sampling points from a total number of sampling points of Q = 100 .

Table 1 Table depicts the total number of transition points (R), optimal lower bound (η ∗ ), condition number (κ) for selecting Q0 = 32 optimal sampling points from
a set of (Q) spatial sampling points.

Q=50 Q=55 Q=60 Q=65 Q=70 Q=75 Q=80 Q=85 Q=90 Q=95 Q=100

η ∗ (10−4 ) 1.0851 1.1996 1.6513 1.3332 2.0353 1.4502 1.8704 2.2368 2.6939 2.2368 2.5471

κ 289.38 257.74 213.48 232.58 182.21 229.37 209.96 189.38 147.56 174.65 166.73

R 63 59 134 68 76 6 74 100 49 23 133

IET Research Journals, pp. 1–12


6 c The Institution of Engineering and Technology 2015
H(k) = YT Hnm (k) (36)
H(k) = [H(k, θ1 , φ1 ) H(k, θ2 , φ2 ) · · · H(k, θQ , φQ )]T
Q×1

Hnm (k) = [H00 (k) H1(−1) (k) · · · HN N (k)]T


(N +1)2 ×1
2πf
k= c , c = 340 m/s

H(k, θj , φj ) represents the head related transfer function of wave


number k and spatial direction (θj , φj ). Hnm (k) represents spheri-
cal harmonic coefficients of degree n and order m, and Y represent
spherical harmonic matrix as given in Equation 1. The spherical har-
monic coefficients Hnm (k) can be obtained using the least square
solution of Equation 36. HRTFs of any arbitrary direction can now
Fig. 6: Figure illustrating Q0 = 32 optimum spatial sampling points be obtained using the spherical harmonic expansion as follows:
(filled circle) obtained using the proposed method among Q = 100
N n
spatial sampling points (empty circles) of the Fibonacci Lattice.
Hnm Ynm (θ, φ).
X X
H̃(k, θ, φ) = (37)
n=0 m=−n

where, H̃(k, θ, φ) is the interpolated HRTF obtained for direction


∗ H (θ, φ). It can be observed from Equation 36 that, the accuracy of the
∆ = minimize λmax (Y∆Y ) interpolated HRTF depends up on the sampling points chosen for
∆,diagonal
SHM. It is important to choose a configuration for spatial sampling
subject to δii ∈ {0, 1} ∀i ∈ {1 · · · Q} points which has a better condition number for SHM (Y). Otherwise
Q (34) the errors incurred during measurement of HRTFs would be over
δii = Q0 sensitive during spherical harmonic transformation. But in order
X

i=1
to reduce the complexity of HRTF measurement process, circular
hoops were earlier utilized for simultaneous measurement of HRTFs
λmin (Y∆YH ) > η ∗ from various directions. So the optimal spatial sampling configura-
tion for HRTF measurement should not only reduce the condition
s number of the spherical harmonic matrix but also enable us to take
∗ λmax (Y∆∗ YH ) advantage of circular hoops. However traditional sampling schemes
κ = (35)
λmin (Y∆∗ YH ) like Interaural sampling scheme or Gaussian sampling scheme leads
to an ill-conditioned SHM. For example, condition number of CIPIC
where, Y is the SHM of P rows and Q columns, η ∗ in Equation 34 configuration is 338.0216. This value can be reduced by selecting a
can be obtained using Algorithm 1 and Equation 19. The problem subset of optimal spatial sampling points discussed as follows.
posed in Equation 34 is first solved here for various values of Q. Consider Q spatial sampling points of the 3D space. Q0 sam-
Figure 5 illustrates the minimum eigenvalue and condition number pling points (Q0 << Q), resulting in a well conditioned SHM can
of sub matrix (Y∆YH ) across various lower bounds for selecting be identified by solving the optimization problem given in Equation
Q0 = 32 optimal spatial points from a total of Q = 100 sampling 34. However, it is sometimes necessary to constrain the total number
points. Further it can be observed from Table 1 that, number of of sampling positions in a circular hoop. For example, CIPIC con-
intervals with constant condition number or the number of transi- figuration has dense sampling at the left and right poles. The number
tion points (R) is different for different values of Q. But as claimed of sampling positions can be restricted based on the relevance of
in Section 2.2, for any integer value of Q, the value of R is found HRTFs in spatial sound perception. Thus the newly constrained
to be finite and small for SHM. The length of each interval is also optimization problem is given as,
arbitrary and depends only on the minimum eigenvalue obtained by
solving the optimization problem at each transition. It can also be
observed that, condition number attains a minimum value for a par- ∆∗ = minimize λmax (Y∆YH )
∆,diagonal
ticular interval of η indicated by an arrow mark in Figure 5. The
condition number values obtained for optimum lower bound η ∗ for subject to δii ∈ {0, 1} ∀i ∈ {1 · · · Q}
an SHM of order N = 3 is tabulated in Table 1. Choosing any value Uδ ≤ v
for η in this interval result in a sub-matrix of SHM which results (38)
in minimum condition number. Q0 columns of these resultant sub- Q
0
X
matrix correspond to a Q0 spatial directions. Q0 optimum sampling δii = Q
points obtained are a subset of Q sampling points of Fibonacci lat- i=1
tice. These optimal sampling points obtained for Q = 100, Q0 = 32
λmin (Y∆YH ) > η ∗
are illustrated in Figure 6. It can be observed that optimum sampling
points are away from the plane θ = π/2 and spread uniformly along  
entire azimuth range. 1 1 ··· 1 0 0 ··· 0 0 ··· 0 0 ··· 0
··· ··· ··· ···
 
 0 0 0 1 1 1 0 0 0 0 
3.2 Spatial Sampling Configurations for HRTF U=

 · · ··· · · · ··· · · ··· · · ··· ·


Measurement using SHM Condition Number Minimization 
 · · ··· · · · ··· · · ··· · · ··· ·


0 0 ··· 0 0 0 ··· 0 0 ··· 1 1 ··· 1
Spherical harmonics are widely used for representation of acous-
tic fields over a sphere. HRTFs are considered to be samples of
a valid acoustic field. So they can be also be expressed in spheri- δ = [δ11 δ22 ··· δQQ ]T (39)
cal harmonics [32, 33]. These representations not only capture the
HRTFs compactly in few coefficients but also enable to interpolate where δ is a vector with diagonal elements of matrix ∆. U is a
and extrapolate HRTFs [26]. HRTFs of various directions can be binary matrix with J rows and Q columns, where Q = JK. Here
represented in spherical harmonic functions as [27, 28], J indicate the number of circular hoops and K indicate number of

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 7
Fig. 7: Distribution of sampling points on a sphere for various methods (a) original CIPIC data of 1250 sampling points. (b) 640 optimum
sampling points obtained using the proposed method. (c) The modified CIPIC configuration of 850 sampling points.

Fig. 8: Figure illustrating minimum eigenvalue and condition number of the sub-matrix (Y∆YH ) for various lower bound simulated for total
number of sampling points Q = 1250 and Q0 = 640 following interaural sampling configuration of CIPIC database.

sampling positions in each circular hoop. The ith element of vec-


tor v indicate the upper bound for number of sampling points in the
ith circular hoop. Solving this above optimization problem results in
Q0 optimal sampling points which reduces the condition number and
constrain the number of sampling points in each circular hoop. Simu-
lations are performed for sampling points of CIPIC data. Among the
total 1250 sampling points, with 25 circular hoops and 50 sampling
points per each hoop, 640 optimal sampling points are identified.
The number of sampling points in each hoop is constrained to be
less than an upper bound whose values are given below.

vi = 14 + 3(i − 1) i = 1 · · · 13 (40)
vi = 50 − 3(i − 13) i = 14 · · · 25 (41)

Figure 8, illustrates minimum eigenvalue and condition number


of sub matrix (Y∆YH ) for R = 31 distinct lower bounds. The Fig. 9: Figure illustrates condition number performance of the
condition numbers for the original sampling configuration of 1250 CIPIC original configuration and CIPIC Modified configuration for
sampling points is found to be 338.02, and the condition number of various spherical harmonic orders
optimum sampling configuration of 640 points is found to be 186.7.
It can be observed that, there is a decrease in condition number of
SHM for the latter configuration. Figure 7 shows the original and
optimal spatial sampling points for CIPIC data. It can be clearly
observed that, the optimum sampling points are distributed uni- the plane is considered. The new modified configuration is shown
formly along all the lateral angles and sparsely populated on and in Figure 7c and its corresponding condition number is found to be
around the the plane θ = 00 . It has to be noted that, elevation angle 269.10. Condition number performance of the CIPIC original con-
in the current application is measured above the xy plane. Since spa- figuration and CIPIC Modified configuration for various spherical
tial perception is an important criteria for choosing sampling points, harmonic orders is shown in Figure 9. Proposed method is found
a new configuration which distributes sampling points sparsely on to be reducing the condition number of SHM, particularly for the
and around the plane θ = 00 and distributes uniformly away from higher spherical harmonic orders.

IET Research Journals, pp. 1–12


8 c The Institution of Engineering and Technology 2015
4 Performance Evaluation 4.2 Computational Complexity Analysis

Performance of the proposed method in obtaining the spatial sam- The importance of the proposed method is mainly due its capabil-
pling configurations for spatial audio applications is discussed ity to reach an optimal solution in a practically acceptable amount
herein. Initially the practical implementation of the proposed method of time for Q < 2000, N < 5. The exhaustive search for finding
is presented. Further the SHM condition number performance and an optimal solution is practically impossible. For example, a matrix
the plane wave reproduction error is analysed for various spherical with Q = 100, N = 3 and Q0 = 32 would require 1020 hours for
harmonic orders. Finally HRTF interpolation accuracy is evalu- computing the optimum sub-matrix and minimum condition num-
ated for various interaural sampling configurations and spherical ber. Where as, the proposed method can reach to an optimal solution
harmonic orders. in less than 12 hours. In fact, the proposed method would consume
less time for SHM than any other matrix, because of the properties
of spherical harmonic functions as discussed below.
4.1 Implementation
Practical implementation of the proposed algorithm for condition Ynm (−θ, φ) = Ynm (θ, φ) (44)
number minimization is performed using numerical optimization. π   3π 
The fundamental problem formulated in Equation 9 falls in to the Ynm + θ, φ = Ynm − θ, φ (45)
2 2
category of mixed integer programming with binary variables. The
objective function and the constraints are all convex except that the These properties of spherical harmonics result in similar columns
variables are constrained to be binary. This problem was complex for SHM. In general concatenating any number of column vectors
to solve using a traditional linear or convex programming tools. which are identical to the already existing column vectors of a matrix
However the recent advancements in optimization, particularly in does not alter the condition number. So computational complex-
the domain of mixed integer programming resulted in a number of ity for identifying a sub-matrix which result in minimum condition
optimization toolboxes that are freely available in the public domain number can be reduced for SHM using the aforementioned proper-
[34]. YALMIP [34] is one such toolbox which can be interfaced ties. For example, consider CIPIC configuration of 1250 sampling
to MATLAB and also enable the user to choose appropriate solver points on a sphere. Number of sampling points which result in
for a particular purpose. Sedumi 1.3 [35] is a mixed integer pro- similar spherical harmonic coefficients is 909. So in this case, the
gramming solver that is freely available for academic purpose. This computational complexity is reduced by one-fourth of the actual
solver is used to solve all the mixed integer optimization problems time consumed.
that are formulated in this work. It is important to note that, the Computational complexity of the proposed method also depends
mixed integer toolboxes need not necessarily result in a global opti- on the upper bound of the minimum eigen value. A simple func-
mum solution. Nevertheless, this work attempts at achieve the best tion depending on the trace of the matrix YYH is considered as
possible reduction in condition number maintaining the application an upper bound for η in this work. The values of η are increased
T r(YY H )
specific requirements. It is also to be noted that, the fundamental stepwise from lower bound 0 to upper bound P as given
problem posed in Equation 9 contains strict inequalities (>) on the in Equation 15. However the algorithm proposed in this work can
lower bound of the minimum eigenvalue. But in general, the opti- also terminate without actually reaching the upper bound. This can
mization solvers can only solve non-strict inequalities (>=). Hence happen if the problem formulated in Equation 16 turns in to an infea-
a small value of  = 10−7 is used to convert strict inequalities in to sible optimization problem for a particular value of η. The number
non strict inequalities as shown below. of transition points R depends on the number of steps the pro-
posed algorithm has taken to either reach an infeasible state or the
T r(YY H )
upper bound P . Time complexity of the proposed algorithm
Strict Inequality: λmin (A∆AH ) > η (42) depends on the value of R and the time taken for the optimization
H
problem to reach an optimal solution. So tighter upper bound for η
Non-strict Inequality: λmin (A∆A ) >= η +  (43) would reduce the number of transition points R and thereby reduce
the time complexity of the proposed algorithm.
The simulations of the proposed algorithm is performed using MAT-
LAB 2015 on a 64 bit PC of 8GB RAM. Most of the performance 4.3 Condition Number Analysis of Various Sampling
analysis is performed for a maximum Q value of 2000. It has been Schemes for Spatial Sound Reproduction
found that, the optimization problem for Q > 2000 and N > 5
results in insufficient memory issues for the above configurations. The performance of various sampling configurations are evaluated
High end computers can be utilised to extract the results of the pro- in terms of conditioning behaviour of SHM and uniformity of spatial
posed algorithms for higher number of sampling points and spherical sampling points. D-measure as discussed in section 3.1.1 is used to
harmonic orders. quantify the uniformness of spatial sampling points. Table 2 depicts

Table 2 Table listing the condition number and D-measure values for proposed and conventional methods for various spherical harmonic orders N , and sampling
points Q0 selected from a total of Q = 100 sampling points.

Q0 = 6 Q0 = 8 Q0 = 12 Q0 = 18 Q0 = 20 Q0 = 24 Q0 = 30 Q0 = 32 Q0 = 36 Q0 = 50
N =1 N =1 N =2 N =2 N =3 N =3 N =4 N =4 N =4 N =4

log10 (κ) 0.422 0.444 1.478 1.269 2.560 2.331 4.033 3.647 3.455 3.427
Proposed
D 0.080 0.088 1.046 1.199 0.352 0.981 0.723 0.968 1.070 0.683

log10 (κ) 0 0 0 - - 0 - - 0 -
T-design
D 0 0 0 - - 0 - - 0.002 -

log10 (κ) 1.227 0.828 2.256 1.935 4.061 3.424 5.172 5.581 4.575 3.885
Fibonacci
D 0.095 0.077 0.048 0.032 0.029 0.024 0.019 0.018 0.016 0.012

log10 (κ) - 0.383 - 16.037 - - - 16.752 - 16.496


Gaussian
D - 0.036 - 0.065 - - - 0.176 - 0.292

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 9
Fig. 10: Figure illustrating normalised reproduction error (ξ) between sound field generated by a plane wave source located at (θ˜1 , φ˜1 ) =
(π/4, π/4) and sound field reproduced by loudspeakers for various spherical harmonic order (N ) and sampling points Q0 arranged in (a)
Proposed configuration, (b) Gaussian configuration (c) T-designs.

the condition number and D-measure values for various number of less for the proposed method. This can be noted from Figure 10a
sampling points (Q0 ) and spherical harmonic orders (N ) across for Q0 = 30, and Q0 = 36. Irrespective of the choice of sampling
three different configurations namely, T-design, Fibonacci, Gaussian scheme, plane wave reproduction using limited number of sam-
and the proposed configuration. It can be observed that T-designs pling points exhibit a maximum error for higher spherical harmonic
result in the lowest condition number and D-measure values if there orders. In order to study the dependence of plane wave source direc-
exists a configuration for the particular value of number of sam- tion on the reproduction error, simulations are performed for 648
pling points (Q0 ) and spherical harmonic orders (N ). For example, (36 azimuthal and 18 elevation) different source directions indepen-
it can be noted from the Table 2 that, for some of (Q0 , N ) values, dently. Number of directions (in percentage) proposed and T-designs
T-design configuration does not exist. Fibonacci lattice as discussed sampling scheme exhibiting lesser reproduction error for different
in Section 3.1.3 provides nearly uniform distribution for sampling regularization parameter values is given in Table 3. It can be noted
points. This can also be noted from D-measure values of Fibonacci that, across all possible values of (Q0 , N ) proposed method has
lattice in Table 2. But it can be observed that, the condition num- lesser error for more number of directions.
ber values given in Table 2 in logarithmic scale are found to be
higher as compared to the proposed method. Gaussian distribution
Table 3 Number of directions (in %) exhibiting lesser reproduction error by
of sampling points are equisampled in azimuthal angles, and exhibits
proposed and T-design and T-design sampling schemes
lesser condition number and D-measure values for Q0 = 8, N = 1
as compared to the proposed method. But for Q0 > 8, N > 1 both
the condition number and D-measure values of Gaussian configura- Q0 =6,N=2 Q0 =12, N=3 Q0 =24, N=4 Q0 =36, N=6
tion are found to be higher which can be noted from Table 2. The
sampling points obtained from the proposed configuration exhibits
T-design 46.44 27.62 36.57 47.84
lesser condition number than the Fibonacci and Gaussian distribu-
tions. This result comes at a cost of increase in non-uniformity as
Proposed 53.56 72.38 63.43 52.16
compared to Fibonacci configuration. It can be understood from the
above mentioned observations that the proposed method is able to
result in a well conditioned SHM as compared to Fibonacci and
Gaussian configurations. Unlike T-design, proposed method is able
to generate sampling points for any number of sampling positions in 4.5 Performance Analysis of HRTF Interpolation for Various
the 3D space. Interaural Sampling Configurations

In section 3.2, it has been noted that, choice of different sampling


4.4 Plane-wave Reproduction Error Analysis of Various configurations will alter the conditioning behaviour of SHM and
Loudspeaker Geometries in turn have an effect on the accuracy of the interpolated HRTFs.
Further, it has also been observed that, in the interaural sampling
In section 3.1.2, the error exhibited between the actual sound field configurations, choosing sparse sampling at the left and right poles
and the reproduced sound field using Q loudspeakers is given in results in a better condition number for SHM. Considering this
Equation 32. Sound field generated by a plane-wave source from observations, the error incurred in the interpolated HRTFs for two
the direction (θ˜1 , φ˜1 ) = (π/4, π/4) is synthesized using Equation different configurations will be discussed herein. The first configura-
24. This sound field is reproduced by the loudspeakers arranged in tion is the Equi-sampled CIPIC Configuration (ECC) with elevation
various geometries. The normalised reproduction error (ξ) obtained angles spaced by 11.250 as compared to the original CIPIC angular
for various loudspeaker geometries like T-designs, Gaussian and the spacing of 5.6250 [18]. This generates 625 samples with 25 samples
Proposed sampling schemes are considered for evaluations. Sam- for each interaural circle. The second configuration is the Modified
pling positions of T-designs obtained for Q0 = {6, 12, 24, 36} are CIPIC Configuration (MCC) generated by choosing sparse sampling
used for computing the plane wave reproduction error ξ. As total towards the left and right poles as discussed in Figure 7. In total 625
number of sampling points of a Gaussian grid on sphere is mul- number of samples are chosen with a variable number of samples
tiple of 2(N + 1)2 , error ξ is computed for Q0 = {8, 18, 32, 50}. for each of 25 interaural circles. The elevation angular separation
For the proposed method, error analysis is performed for sampling for each of the interaural circle in the range between −450 to 450
points obtained for Q0 = {6, 12, 20, 24, 30, 36}. For each of these and 1350 to 230.6250 is given in Table 4.
sampling schemes, error ξ is computed across various spherical har- Spherical harmonic coefficients Hnm (k) are obtained for both
monic orders (N ). The normalised error obtained for all the three the above configurations using the least square solution of Equation
configurations is illustrated in Figure 10. It can be observed that, for 36. Using these coefficients HRTFs can be either reconstructed or
Q0 ≥ (N + 1)2 , T-designs and the proposed method exhibit nearly interpolated for measured or non-measured directions respectively.
zero error. For Q0 < (N + 1)2 , the error increases with the spheri- Interpolated or reconstructed HRTFs are obtained using Equation 37.
cal harmonic order. But the rate of increase in error is found to be Spherical harmonic order of N = 10 is considered for computing

IET Research Journals, pp. 1–12


10 c The Institution of Engineering and Technology 2015
Table 4 Table enumerates elevation angular spacing followed in MCC Case 2.2: If λ̃max < λ∗max for atleast one λ̃min lying between
[0, λ∗min ), then minimization of both the problems may not result in
±450 same solution. Because there is a possibility that, λmax (A∆AH )
±800 ±250
Lateral ±400 ±100 λ (A∆AH )
±650 ±200 00 is minimum but not the ratio λmax(A∆AH ) .
angles ±350 ±50 min
±550 ±150 So a matrix can exhibit a property discussed in Case 2.1. If not,
±300 it will definitely exhibit the property discussed in Case 1 which
Elevation indicates, indeed there is an optimal lower bound η ∗ for which mini-
angular 33.750 28.1250 22.50 16.8750 11.250 mization of the λmax (A∆AH ) function under λmin (A∆AH ) >
spacing λ (A∆AH )
η ∗ will be equal to the minimization of the ratio λmax(A∆AH ) .
min

interpolated HRTFs. In order to analyse the performance of inter- 6.2 Derivation for Upper Bound (ηub )
polated HRTFs for all the angles, Log Spectral Distortion (LSD)
between the ground truth HRTFs and HRTFs obtained using each of Consider the sub-matrix A∆AH . The order of the matrix is P ×
the above two methods is performed. Figure 11 illustrates the LSD P . A inequality between minimum eigenvalue and trace of the sub-
computed for all directions described in the CIPIC database. It is to matrix is given as,
noted that the color bars of Figure 11 (a,b) are of different scale. It
can be clearly observed that, MCC has lesser error for elevations 500 T r(A∆AH ) ≥ λmin + λmin · · · λmin = P λmin
to 1300 . Further for all of the ipsilateral angles (−800 , 00 ), where λmin ≤ 1
T r(A∆AH )
P
the HRTFs magnitude is high, MCC performs better than ECC. This
can be noted from Figure 11c, where the regions in black indicate where T r(·) denotes the trace of the matrix. However the matrix ∆
MCC has lower error than ECC. So using this analysis it can be con- is unknown, therefore an upper bound which can be computed by
cluded that, a condition number improvement in SHM can result in the elements of the matrix A is determined. Consider aij , âij , and
better interpolated HRTFs. ȧij are the elements of the ith row and j th column of matrices A,
A∆AH , and AAH respectively. We have the following relations
5 Conclusions and Future Work between these elements.
P 0 2
âii = Q
PQ 2
A novel method for reducing the condition number of spherical i=1 aii < i=1 aii = ȧii
harmonic matrix is presented in this work. Two applications of T r(A∆AH ) PP 1 PP T r(AAH )
his method in the domain of spatial audio are also presented. The P = P1 i=1 âii ≤ P i=1 ȧii = P
proposed method is able to identify the loudspeaker geometry by
maintaining a trade off between reproduction accuracy and well- Thus the new upper bound on the minimum eigenvalue is,
conditioning behaviour of SHM. The proposed method is also found
1 1
to be useful in the optimal selection of spatial points for better λmin (A∆AH ) < T r(AAH ) =⇒ ηub = T r(AAH ).
spherical harmonic representation of HRTFs. The primary solu- P P
tion proposed in this work can be further improved by identifying
tighter upper bounds for minimum eigenvalue. The conditioning 6.3 Example based illustration of Proposed Method
behaviour of SHM in the applications of spherical array processing
like Beamforming and Localization will be studied as future work. Consider a rectangular grid as shown below. Each column of this
grid constitute possible values for λmax (A∆AH ) for a particu-
lar value of λmin (A∆AH ). Below the grid is a table listing the
6 Appendices values of lower bound η i , minimum and maximum eigenvalues of
6.1 proof for equivalence of optimization problems sub-matrix (A∆i AH ), and the corresponding condition number κi
for all four transitions. Starting from a least value of lower bound,
The optimization problems formulated in Equation 5 and Equation say in this example η 1 = 0.1, identifying the minimum value for
9 results in same solution. If ith non-zero element lies at the q th λmax results in λ1max = 1. The corresponding λ1min value is 0.4.
diagonal position of ∆∗ , then the q th row and ith column of The crucial step of the proposed method is as follows. It is unneces-
Γ∗ will contain a one non-zero element of value 1. In this man- sary to search a minimum value for λmax for the η lying in between
ner, the elements of ∆∗ and Γ∗ has a one-to-one correspondence. (0.1, 0.4). It is so because, until η > λ1min the minimum value for
λ (A∆AH ) λmax is same. For instance, consider η̂ = 0.2, minimum value of
In order to prove minimization of λmax(A∆AH ) and minimiza- λmax for λmin (·) > 0.1 and λmin (·) > 0.2 is same, which is equal
min
tion of λmax (A∆AH ) under λmin (A∆AH ) > η ∗ result in same to λmax = 1. Therefore at 2nd transition, until η assumes a value
solution, consider the following line diagram. greater than λ1min = 0.4, the sub-matrix and the corresponding con-
Case 1: Consider η ∗ = λ∗min . We have, λ̂min ≥ λ∗min . If dition number resulted from the optimization problem for η̂ = 0.2
λ∗ would be same as that resulted for η 1 = 0.1. Hence the proposed
λ̂max < λ∗max implies λ̂max < λmax ∗ which is an invalid state-
λ̂min min algorithm will search only in the regions where there is a transition
λ∗
max
ment as λ∗ should result in minimum value. Hence in the range in the value of condition number. Once η reaches its upper bound
min
λmin (A∆AH ) ≥ λ∗min , minimum value of λmax (A∆AH ) is which is ηub = P1 T r(AAH ), the problem become infeasible. So in
λ (A∆AH ) λ∗ the proposed algorithm the number of times the optimization prob-
λ∗max , and also minimum value of the λmax(A∆AH ) is λmax ∗ .
min
∗ min lem to be solved depends on the step size for η at each transition.
Both these are obtained at the same value of ∆ = ∆ . Hence opti- As the step size increases, η reaches its upper bound quickly which
mization problems of Equation 5 and Equation 9 result in same reduces the computation time. In the current example, after all the
solution. four transitions (λmax , λmin ) = (2, 0.9) results in the minimum
Consider λ̃min = λmin (A∆A ˜ H ) lying in the interval (η ∗ , λ∗ ) condition number of κ = 2.2.
min
as shown in the above line diagram. Let the corresponding maximum

eigenvalue be λ̃max . We have, λ̃min < λmin . Now consider two 7 Acknowledgments
sub-cases as follows.
Case 2.1: If λ̃max > λ∗max for every value of λ̃min lying This work was funded in part by TCS Research Scholarship Pro-
between [0, λ∗min ), then minimization of λmax (A∆AH ) function gram under project number TCS/CS/2011191E and in part by
under λmin (A∆AH ) > η ∗ will result in λ∗max as the minimum SERB, Dept. of Science and Technology, GoI under project number
value. SERB/EE/2017242.

IET Research Journals, pp. 1–12


c The Institution of Engineering and Technology 2015 11
(a) (b) (c)
Fig. 11: Figure illustrates LSD between the ground truth HRTFs and the interpolated HRTFs obtained for (a) Modified CIPIC Configuration
(MCC), (b) Equi-sampled CIPIC Configuration (ECC) for Left ear of Subject 003 of CIPIC database. Figure (c) illustrates the regions (Black)
where MCC has smaller error as compared to ECC.

∗ T r(AAH ) 12 Gonza?lez, A.: ’Measurement of areas on a sphere using fibonacci and lati-
0 η∗ λ̃min λmin λ̂min P tude?longitude lattices’,Mathematical Geosciences, 42(1):49, 2009.
13 Saff, E.B., and Kuijlaars, A.B.J.: ’Distributing many points on a sphere’, The
→ mathematical intelligencer, 19(1):5?11, 1997.
λmin (·) 14 Brauchart, J. S., and Grabner, P.J.: ’Distributing many points on spheres: minimal
energy and designs’, Journal of Complexity, 31(3):293?326, 2015.
15 Hardin, R. H., and Sloane, N.J.A.: ’Mclaren?s improved snub cube and other new
spherical designs in three dimensions’, Discrete and Computational Geometry,
8 9 8 11 6 2 6 3 2 15(4):429?441, 1996. http://dx.doi.org/10.1007/BF02711518.
16 Fliege, J., and Maier, U.: ’The distribution of points on the sphere and corre-
H sponding cubature formulae’, IMA Journal of Numerical Analysis, 19(2):317?334,
λmax (A∆A ) 11 6 6 5 4 9 10 4 6
1999.
7 17 Lecomte, P., Gauthier, P-A., Langrenne, C., Berry, A., and Garcia, A.: ’A fifty-node
2 4 3 1 3 11 4 2 lebedev grid and its applications to ambisonics’, Journal of the Audio Engineering
Society, 64(11):868?881, 2016.
18 Algazi, V., R., Duda, R.O., Thompson, D.M., and Avendano, C.: ’The cipic hrtf
λmin (A∆AH ): 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 database’, In IEEE Workshop on the Applications of Signal Processing to Audio
and Acoustics, 2001, pages 99?102, 2001.
19 Zotter, F., Frank, M., and Sontacchi, A.: ’The virtual t-design ambisonics-rig using
λimax
i λmin (·) > η i min {λmax (·)} (λimax , λimin ) κi = λimin
vbap’, Proceedings of the 1st EAA, 2010.
20 Meyer, J., and Elko, G.: ’A highly scalable spherical microphone array based on an
orthonormal decomposition of the soundfield’, In Acoustics, Speech, and Signal
1 λmin (·) > 0.1 1 (1, 0.4) 2.5 Processing (ICASSP), 2002 IEEE International Conference on, volume 2, pages
II?1781. IEEE, 2002.
2 λmin (·) > 0.4 2 (2, 0.6) 3.3 21 Asano, F., Suzuki, Y., and Swanson, D. C.: ’Optimization of control source con-
figuration in active control systems using gram-schmidt orthogonalization’, IEEE
transactions on speech and audio processing, 7(2):213?220, 1999.
3 λmin (·) > 0.6 2 (2, 0.8) 2.5 22 Khalilian, H., Bajic?, I.V., and Vaughan, R.G.: ’Comparison of loudspeaker place-
ment methods for sound field reproduction’, IEEE/ACM Transactions on Audio,
4 λmin (·) > 0.8 2 (2, 0.9) 2.2 Speech, and Language Processing, 24(8):1364?1379, 2016.
23 Daniel, J., Rault, J.B., and Polack, J.D.: ’Ambisonics encoding of other audio for-
mats for multiple listening conditions’, In Audio Engineering Society Convention
105, Sep 1998.
24 Sloane, N., ’http://neilsloane.com/sphdesigns/index.html’.
25 Gardner, B., Martin, K., et al: ’HRTF measurements of a kemar dummy-head
8 References microphone’, 1994.
1 Oppenheim, A. V., and Schafer, R. W.: ’Discrete-time signal processing’, Pearson 26 Duraiswaini, R., Zotkin, D. N., and Gumerov, N.A.: ’Interpolation and range
new International Edition. Pearson Higher Ed, 2013. extrapolation of hrtfs [head related transfer functions]’, In Acoustics, Speech,
2 Rafaely, B.: ’Fundamentals of spherical array processing’, volume 8. Springer, and Signal Processing, 2004. Proceedings. (ICASSP ?04), IEEE International
2015. Conference on, volume 4, pages iv?45?iv?48 vol.4, May 2004.
3 Ward, D. B., Abhayapala T.D.: ’Reproduction of a plane-wave sound field using 27 Avni, A., Ahrens, J., Geier, M., Spors, S., Wierstorf, H., and Rafaely, B.: ’Spatial
an array of loudspeakers’, IEEE Transactions on Speech and Audio Processing, perception of sound fields recorded by spherical microphone arrays with varying
9(6):697?707, Sep 2001. spatial resolution’, The Journal of the Acoustical Society of America, 133(5), 2013.
4 Li, Z., Duraiswami, R.: ’Flexible and optimal design of spherical microphone 28 Zhang, W., Abhayapala, T.,D., Kennedy, R., A., and Duraiswami, R.: ’Insights into
arrays for beamforming’, IEEE Transactions on Audio, Speech, and Language head-related transfer function: Spatial dimensionality and continuous representa-
Processing, 15(2):702?714, 2007. tion’, The Journal of the Acoustical Society of America, 127(4), 2010.
5 Rafaely, B.: ’Analysis and design of spherical microphone arrays’ ,IEEE Transac- 29 Strang, G.: Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley,
tions on speech and audio processing,13(1):135?143, 2005. MA, fourth edition, 2009.
6 Wabnitz, A., Epain, N., Schaik, A. V., and Jin, C.: ’Time domain reconstruction of 30 Bauer, R.: ’Distribution of points on a sphere with application to star catalogs’,
spatial sound fields using compressed sensing’, In 2011 IEEE International Con- Journal of Guidance, Control, and Dynamics, 23(1):130?137, 2000.
ference on Acoustics, Speech and Signal Processing (ICASSP), pages 465?468, 31 Zotter, F., and Frank, M.: ’All-round ambisonic panning and decoding’ Journal of
May 2011. the Audio Engineering Society, 60(10):807?820, 2012.
7 Poletti, M. A.: ’Three-dimensional surround sound systems based on spherical 32 Evans, M.,J., Angus, J.A.S., and Tew, A., I: ’Analyzing head-related transfer
harmonics’, J. Audio Eng. Soc, 53(11):1004?1025, 2005. function measurements using surface spherical harmonics’ The Journal of the
8 Bates, A. P., Khalid, Z., and Kennedy, R. A.: ’Novel sampling scheme on the Acoustical Society of America, 104(4), 1998.
sphere for head-related transfer function measurements’, IEEE/ACM Transactions 33 Zotkin, D. N., Duraiswami, R., and Gumerov, N.A.: ’Regularized hrtf fitting
on Audio, Speech, and Language Processing, 23(6):1068?1081, June 2015. using spherical harmonics’, In 2009 IEEE Workshop on Applications of Signal
9 Radmanesh N., and Burnett, I.S.: ’Generation of isolated wideband sound fields Processing to Audio and Acoustics, pages 257?260. IEEE, 2009.
using a combined two-stage lasso-ls algorithm’ IEEE Transactions on Audio, 34 Lofberg, J.: ’Yalmip: A toolbox for modeling and optimization in matlab’, In IEEE
Speech, and Language Processing, 21(2):378?387, 2013. International Symposium on Computer Aided Control Systems Design, 2004,
10 Pulkki, V., and Karjalainen, M.: ’Multichannel audio rendering using amplitude pages 284?289. IEEE, 2004. https://yalmip.github.io.
panning [dsp applications]’, IEEE Signal Processing Magazine, 25(3):118?122, 35 Sturm, J., Romanko, O., and Pólik. I.: ’SeDuMi version 1.3, 2010’, MATLAB
May 2008. toolbox, available from sedumi.ie.lehigh.edu.
11 McEwen, J.D., and Wiaux, Y.: ’A novel sampling theorem on the sphere’, IEEE
Transactions on Signal Processing, 59(12):5876?5887, 2011.

IET Research Journals, pp. 1–12


12 c The Institution of Engineering and Technology 2015

You might also like