Applsci 10 05064 v2


Applied Sciences
Article
Real-Time Visual Tracking of Moving Targets Using
a Low-Cost Unmanned Aerial Vehicle with a 3-Axis
Stabilized Gimbal System
Xuancen Liu 1 , Yueneng Yang 1,2 , Chenxiang Ma 3 , Jie Li 4 and Shifeng Zhang 1, *
1 College of Aerospace Science and Engineering, National University of Defense Technology,
Changsha 410073, China; [email protected] (X.L.); [email protected] (Y.Y.)
2 College of Intelligence Science and Engineering, National University of Defense Technology,
Changsha 410073, China
3 School of Aeronautics and Astronautics, University of Electronic Science and Technology of China,
Chengdu 611731, China; [email protected]
4 College of Systems Engineering, National University of Defense Technology, Changsha 410073, China;
[email protected]
* Correspondence: [email protected]

Received: 4 July 2020; Accepted: 20 July 2020; Published: 23 July 2020

Featured Application: A complete solution for visual detection and autonomous tracking of
a moving target is presented, which is applied to low-cost aerial vehicles in reconnaissance,
surveillance, and target acquisition (RSTA) tasks.

Abstract: Unmanned Aerial Vehicles (UAVs) have recently shown great performance collecting
visual data through autonomous exploration and mapping, which are widely used in reconnaissance,
surveillance, and target acquisition (RSTA) applications. In this paper, we present an onboard
vision-based system for low-cost UAVs to autonomously track a moving target. Real-time visual
tracking is achieved by using an object detection algorithm based on the Kernelized Correlation
Filter (KCF) tracker. A 3-axis gimbaled camera with separate Inertial Measurement Unit (IMU) is
used to aim at the selected target during flights. The flight control algorithm for tracking tasks is
implemented on a customized quadrotor equipped with an onboard computer and a microcontroller.
The proposed system is experimentally validated by successfully chasing a ground and aerial target
in an outdoor environment, which has proven its reliability and efficiency.

Keywords: target tracking; UAV; onboard vision; gimbal system; low-cost sensors

1. Introduction
The past decade has witnessed an explosive growth in the utilization of unmanned aerial vehicles
(UAVs), attracting more and more attention from research institutions around the world [1,2]. With a
series of significant advances in technology domains like micro-electro-mechanical system (MEMS),
many UAV platforms and mission-oriented sensors limited to military affairs in the past are now
widely applied to industrial and commercial sectors [3–7]. UAV-based target tracking is one of the most
challenging tasks, which is closely related to applications such as traffic monitoring, reconnaissance,
surveillance, and target acquisition (RSTA), search and rescue (SAR), inspection of power cables,
etc. [8–13]. The tracking system is an important part of UAVs which detects a target of interest
rapidly in a large area, and then performs continuous surveillance of the selected target in the tracking
phase [14]. In order to achieve target acquisition and localization, military UAVs are usually equipped
with airborne radars [15] or guided seekers [16]; however, they are too heavy and unaffordable for

Appl. Sci. 2020, 10, 5064; doi:10.3390/app10155064 www.mdpi.com/journal/applsci



most civilian UAVs. As one of the most popular UAV platforms, quadrotors are more stable and
have a lower manufacturing cost than helicopters. For portability and flexibility, the takeoff weight of
most quadrotors is less than 15 kg, so payload and battery endurance for onboard equipment are very
limited. In this situation, precisely and robustly tracking a moving target by using a small-scale UAV
platform is still a challenging task because the onboard computational capability is poor and sensors
are low-cost [17].
Compared with many other detection sensors, cameras seem to have an inherent potential for
UAV-based target tracking tasks, for they passively receive the environmental information and visual
features, while still being low-cost and lightweight [18]. In recent years, various camera-based methods
for target tracking have been studied. Wenzel et al. [19] presented a
visual tracking system for autonomous landing of a Hummingbird quadrotor by using an infrared (IR)
camera that was extremely cheap. In [20], a vision-based landing control algorithm for an autonomous
helicopter was designed and implemented by using a downward-pointing charge coupled device
(CCD) camera. The onboard system was integrated with an algorithm for visual acquisition of the
target (a moving helipad) and state estimation, which calculated the six degrees of freedom (DOF)
pose with respect to the landing pad. The helipad which consists of a letter “H” surrounded by a
circle is a typical structured pattern widely used in vision-based tracking and landing tasks because it
can be easily detected and recognized in cluttered environments [21]. The research on square-based
fiducial markers, such as AprilTag [22] and ArUco [23], has also aroused increasing interest in
marker-based visual tracking. Ref. [24] presented a fully autonomous flight control system for tasks of
target recognition, geo-location, following and finally landing on an AprilTag marker which is attached
to a high-speed moving car. Ref. [25] proposed a vision-based swarming approach for employing
three or more Parrot AR.Drone 2.0 quadrotors without any extra positioning sensors, while the ArUco
markers fixed on the obstacles were used for localization and mapping. However, artificial markers or
cooperative targets can only be applied to some specific scenarios such as autonomous landing and
indoor navigation, in which the actual size and main features of the pattern are already known [26].
Since the vision system may be required to track an arbitrary target with unknown size, shape,
and motion, prior knowledge of the target's appearance cannot be assumed.
Object tracking is one of the most fundamental fields in computer vision and has a wide
range of applications in many areas. For this task, many algorithms for visual tracking have been
proposed, which can be generally categorized into generative and discriminative methods according
to their appearance models. Generative methods typically search for the image region which is
most similar to the target, such as incremental subspace tracker (IVT [27]), L1 tracker [28], real-time
compressive sensing tracking (RTCST [29]), superpixel tracking [30], mean shift algorithm [31–33],
and the continuously adaptive mean shift (CAMSHIFT) algorithm [34]. Discriminative methods or
tracking-by-detection methods deal with the tracking problem as a binary classification task and
separate the target object from the background, such as multiple instance learning (MIL) tracker [35],
Struck [36], on-line AdaBoost (OAB) tracker [37], and Tracking-Learning-Detecting (TLD) [38]. The
potential of correlation filters for visual tracking has aroused tremendous research interest because it
reduces the overhead time through fast Fourier transformation (FFT) [39]. Bolme et al. [40] presented
a minimum output sum of squared error (MOSSE) filter, which produced stable correlation filters
when initialized using a single frame. It is able to quickly adapt to variations in scale, rotation, and
lighting while operating at 669 frames per second. Henriques et al. [41] proposed the correlation filter
of the circulant structure with kernel (CSK), and used the kernel trick to learn the appearance model.
After that, the Kernelized Correlation Filter (KCF) tracker that Henriques proposed in [42] further
improved the CSK method by using the histogram of oriented gradients (HOG) feature instead of
gray feature to represent the object, which showed an amazing speed on the OTB2013 dataset [43].
However, the original KCF tracker uses a fixed size template and is not able to handle scale changes
and occlusions, leading to poor performance in some scenarios. To alleviate these drawbacks, Montero
et al. [44] proposed a fast, scalable solution based on the KCF framework which used an adjustable Gaussian
window function and a keypoint-based model for scale estimation to deal with the fixed size limitation
in the Kernelized Correlation Filter. Zhang et al. [45] presented an improved KCF tracker by adopting
a cascade classifier composed of a multi-scale correlation filter and a nearest neighbor (NN) classifier,
which showed favorable performance in accuracy and robustness.
To actually track a moving target in real time with a vision-based system onboard a multi-rotor
UAV platform, three steps are required. The first step is to detect the target and localize it in each image
frame as described above. The second step involves getting the line-of-sight (LOS) angle or the position
of the target and maintaining the target in the center of the camera’s field of view (FOV). The third step
is to use this information obtained by the vision system to define the control task of an autonomous
UAV when flying around the target. Generally, a downward-facing strap-down camera has advantages
with its small size, light weight, and simplicity [46]. However, the output of the strap-down camera
cannot be directly used in the flight control system, for it couples with the UAV body angular motion.
In addition, the restricted FOV of the camera makes it a challenging task to keep the fast-moving target
in the image, which also means the FOV constraints must be fully considered in the guidance law [47].
To solve these problems, a gimbal system is widely used to provide inertial stability to the camera
by isolating it from the UAV motion and vibration. A gimbaled camera now available on the market
provides a decoupling along the roll, pitch, and yaw attitudes, which has become an important unit of
the UAV system. In [48], a 3-axis gimbaled camera (DJI Zenmuse X3) is used to solve the problem,
which enabled autonomous landing of a quadrotor on a high-speed ground vehicle. The system is
experimentally implemented and validated by successfully landing a commercial quadrotor on the roof
of a car moving at speeds of up to 50 km/h. Ref. [49] presented a field-tested mini gimbal mechanism to
produce an estimation of the target position, which allowed a flying-wing UAV to fly around the target.
Jakobsen et al. [50] presented the architecture of a pan/tilt/roll camera system implemented on
Georgia Tech’s UAV research helicopter. Each axis is driven by a servo and optical encoders are utilized
to measure the gimbal orientation. Whitacre et al. [51] performed flight tests using a SeaScan UAV with
a gimbaled camera to track ground targets and studied the effect of altitude, camera FOV, and orbit
center offsets on the geo-location tracking performance for both stationary and moving targets.
In this paper, motivated by the existing challenges, an onboard vision-based system for tracking
arbitrary 3D objects moving at unknown velocities is proposed by utilizing a 3-axis gimbal system.
A KCF tracker is used to detect and localize the target of interest from images acquired by the gimbaled
camera. The 3-axis stabilized gimbal system is driven by brushless direct current motors (BLDCMs)
and a magnetic rotary encoder is attached to each axis with a high-resolution output of the angular
position. Then, the results of the KCF tracker are put into a proportional-derivative (PD) controller,
which aligns the optical axis of the gimbaled camera with the LOS joining the camera and the target.
A tracking strategy for multi-rotor UAVs is proposed based on the proportional navigation (PN)
method, which makes it possible to keep following the target, although the target position is not
known. A low-cost experimental quadrotor is customized for real-time flight tests, which is equipped
with a microcontroller using consumer sensors and an onboard computer for image processing. The
presented system is experimentally implemented and validated by successfully tracking a commercial
quadrotor flying along an unknown path. By analyzing the flight data of both the target and the
interceptor, the system is demonstrated to be reliable and cost-effective.
The rest of this paper is organized as follows. The vision system and control of the gimbaled
camera are described in Section 2. The visual tracking algorithm used to detect and localize the target
in the image stream is presented in Section 3. Section 4 establishes the dynamic model of the quadrotor
and presents the target tracking strategy and the control law. Section 5 describes the preparation
for the flight experiments and reports the results of the flight tests. Finally, Section 6
concludes this paper and presents the future work.
2. Gimbal System

2.1. Problem Formulation and System Architecture
Although there are many algorithms capable of tracking targets from video streams, techniques
reported in the field of computer vision cannot be easily extended to airborne applications because
of the highly dynamic UAV-target relative motion. In this section, a small commercial drone (41 cm in
diameter) is considered as the target with unknown speed. If this unwanted aerial visitor flies into
places such as airports, prisons, and military bases where consumer drones are not allowed, it can
cause a big problem. Therefore, a lot of anti-UAV defense systems are being introduced to combat the
growing threats of malicious UAVs. One way is to use a rifle-like device which sends a high-power
electromagnetic wave to jam the UAV control systems and force them to land immediately. Another
option is to capture the target in mid-air, using a UAV platform that carries a net gun.
Consider Figure 1, which depicts the UAV-Target relative kinematics and defines the coordinate
systems. Let B denote the body frame that moves with the UAV, C the camera frame that is attached to
the UAV but rotates with respect to frame B, N a North-East-Down (NED) coordinate system taken
as an inertial reference frame, and I the image frame. The origin of the camera frame C is the optical
center and $Z_c$ coincides with the optical axis of the camera. Following the notation introduced in [52],
let $p_C = [x_c, y_c, z_c]^T$ denote the position of the target in frame C. The rotation transformation
${}^{n}_{c}R$ from frame C to frame N is:

$$ {}^{n}_{c}R = {}^{n}_{b}R \, {}^{b}_{c}R \tag{1} $$

where the transformation ${}^{n}_{b}R$ is calculated by the roll, pitch, and yaw angles of the UAV given by
the flight controller. ${}^{b}_{c}R$ can be computed by the gimbal system using the relative angular position
measured by the encoders. Let $p$ be the position of the target with respect to the optical center of
the camera resolved in N, which is given as:

$$ p = {}^{n}_{b}R \, {}^{b}_{c}R \cdot p_C \tag{2} $$

Figure 1. UAV-Target relative kinematics.

Thus, the target position $p_T$ in the NED coordinate system can be estimated, which is given as:

$$ p_T = p_B + {}^{n}_{b}R \, p_{BC} + p \tag{3} $$

where $p_{BC}$ is the position of the gimbaled camera relative to the UAV body in frame B, and $p_B$ is the UAV
position in N.
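
Equations (1)–(3) can be checked with a few lines of code. The following is a minimal sketch in plain Python, not the authors' implementation. The function names, the sample numbers, and the ZYX (yaw-pitch-roll) Euler convention used to build the body-to-NED rotation are assumptions made for illustration; the text only states that this rotation is calculated from the roll, pitch, and yaw angles given by the flight controller.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def body_to_ned(roll, pitch, yaw):
    # ASSUMPTION: ZYX Euler convention; the paper does not state one.
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))

def target_position_ned(p_B, R_nb, R_bc, p_BC, p_C):
    """Estimate the target position in NED from Equations (1)-(3)."""
    p = matvec(matmul(R_nb, R_bc), p_C)    # Eq. (1) and Eq. (2)
    lever_arm = matvec(R_nb, p_BC)         # camera offset rotated into N
    return [p_B[i] + lever_arm[i] + p[i] for i in range(3)]  # Eq. (3)

# Hypothetical numbers: UAV 20 m above the NED origin plane, yawed 90 degrees,
# camera axes aligned with the body axes, target 15 m along the optical axis.
R_nb = body_to_ned(0.0, 0.0, math.radians(90.0))
R_bc = rot_x(0.0)  # identity rotation
p_T = target_position_ned([10.0, 5.0, -20.0], R_nb, R_bc,
                          [0.1, 0.0, 0.05], [0.0, 0.0, 15.0])
print([round(x, 3) for x in p_T])  # prints [10.0, 5.1, -4.95]
```

In this example the 0.1 m forward camera offset ends up along the East axis because the body is yawed by 90 degrees, which is the effect of the middle term in Equation (3).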
To deal with the task of moving target tracking, an autonomous quadrotor UAV system equipped
with a 3-axis gimbaled camera is constructed to detect and follow the flying object in the
pursuit-evasion scenario. The visual tracking system processes the images and drives the gimbal to
search the target areas. Once an intruding drone is detected, the location of the target in each image
frame is acquired, and this is utilized by the automated targeting module for aiming control of the
gimbal. While the target is locked down, the camera pose can be used as input information to control
the UAV flight. Figure 2 presents the proposed vision-based system.

Figure 2. Architecture of the autonomous visual tracking system.

For most civil UAV applications, the gimbal system and the UAV flight control system are
independent of each other. However, to achieve autonomous tracking of a moving target, the two
systems are coordinated by the proposed vision algorithm which is implemented in the onboard
computer running a Linux based system.

2.2. Kinematics of the Gimbal System
While the FOV of a single camera is limited, gimbal systems are able to rotate the camera to a
desired direction and are widely applied in fields such as filming and monitoring. When these
systems are mounted onboard a UAV, the torque motors are activated by the IMUs and other angular
sensors to compensate for all the rotations resulting from the UAV flight, which returns the stable
member to its original attitude. As shown in Figures 3 and 4, the gimbal system used in this
paper consists of direct current (DC) motors that balance the platform, magnetic rotary encoders that
sense the relative rotation, an embedded stabilization controller that processes all the sensor information
and outputs the control signals, a vibration damper that connects the outer gimbal to the UAV body,
and the camera that captures the images.


Figure 3. 3-axis gimbaled camera. (1) the output port of the video stream; (2) the embedded
stabilization controller; (3) the brushless DC motor with magnetic rotary encoder; (4) the vibration
damper; (5) the camera.

Figure 4. Details of the gimbal control system. (a) 3-axis stabilization controller (BaseCam SimpleBGC
32-bit Tiny); (b) HT3505 brushless DC motor with AS5048A magnetic rotary encoder.
encoder.

The3-axis
The 3-axisgimbaled
gimbaledcamera
camerasupporting
supportingstructure
structureconsists
consistsofofthethecase,
case,outer
outerframe,
frame,middle
middleframe
frame
andinner
and innerframe
frameasasdepicted
depictedininFigure
Figure5.5.The
Thekinematic
kinematicrelations
relationsare aresetsetasasa ayaw,
yaw,roll,
roll,pitch
pitchsequence
sequence
andfour
and fourreference
reference frames
frames are introduced:
introduced: the the body-fixed
body-fixedframeframeF, Fthe outer
, the outerframe
frameO, the O ,middle frame
the middle
M, theM
frame inner inner G
, theframe connected
frame by three revolute
G connected by three joints.
revoluteConsidering the common
joints. Considering structure structure
the common of gimbals
ofingimbals
[53–55],in relative
[53–55],angles are angles
relative definedare
as defined
yaw (θY ), asroll
yaw (θ(R)Yand pitch
), roll ( R )(θand
P ). The
pitchframe
( P
). F is
The carried
frame into
F
isframe O by
carried frame θOY around
intorotation by rotation Y zaround
the axis F . Frame O axis
the is carried O isMcarried
into frame
zF . Frame by rotation θR around
into frame M by the
axis xO . Finally, frame M is carried into frame G by rotation θP around the axis yM . The coordinate
rotation R around the axis xO . Finally, frame M is carried into frame G by rotation P around
systems of the gimbal (Figure 5) are placed parallel to each other as the initial state in the configuration
the
(θYaxis
,θR ,θyPM) .=The coordinate
(0, 0, 0). systems of the gimbal (Figure 5) are placed parallel to each other as the
initial state in the configuration ( Y , R , P ) = (0, 0, 0).
Y R P

is carried into frame O by rotation Y around the axis zF . Frame O is carried into frame M by
rotation R around the axis xO . Finally, frame M is carried into frame G by rotation P around
the axis yM . The coordinate systems of the gimbal (Figure 5) are placed parallel to each other as the
Appl. Sci.
initial 10,the
2020,in
state 5064configuration (  ,  ,  ) = (0, 0, 0). 7 of 27
Y R P

Figure 5. The gimbal in configuration $(\theta_Y, \theta_R, \theta_P) = (0, 0, 0)$ viewed from the side and the front with
reference frames and relations between them. The direction of each joint is indicated by the dotted line.

The coordinate of an arbitrary point P in frame G denoted as the vector ${}^{G}r_P$ can be described in a
different coordinate frame F using the rotation matrix ${}^{F}R_G$ and the translation vector ${}^{F}d_G$ between the
frames according to the above relationship:

$$ {}^{B}r_P = {}^{F}R_G \, {}^{G}r_P + {}^{F}d_G \tag{4} $$

where, in a 3D environment,

$$ {}^{G}r_P = \begin{bmatrix} {}^{G}x_P \\ {}^{G}y_P \\ {}^{G}z_P \end{bmatrix} \tag{5} $$

A more convenient way to describe such transformation is to use homogeneous transformation
matrices ${}^{F}T_G$ given as:

$$ {}^{F}T_G = \begin{bmatrix} {}^{F}R_G & {}^{F}d_G \\ 0_{1\times 3} & 1 \end{bmatrix} \tag{6} $$

Several intermediate transformations are required to get the final transformation in Equation (4)
given as

$$ \begin{bmatrix} {}^{B}r_P \\ 1 \end{bmatrix} = {}^{F}T_G \begin{bmatrix} {}^{G}r_P \\ 1 \end{bmatrix} \tag{7} $$

where

$$ {}^{F}T_G = {}^{F}T_O \, {}^{O}T_M \, {}^{M}T_G \tag{8} $$

With the parameters $l_1$, $l_2$, $h_1$, $h_3$ and $b_2$ in Figure 5, the transformations between the frames are
as follows.
The transformation between the frame F and frame O:

$$ {}^{F}R_O = \begin{bmatrix} \cos\theta_Y & -\sin\theta_Y & 0 \\ \sin\theta_Y & \cos\theta_Y & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{9} $$

$$ {}^{F}d_O = \begin{bmatrix} -l_1\cos\theta_Y \\ -l_1\sin\theta_Y \\ h_1 \end{bmatrix} \tag{10} $$

or

$$ {}^{F}T_O = \begin{bmatrix} \cos\theta_Y & -\sin\theta_Y & 0 & -l_1\cos\theta_Y \\ \sin\theta_Y & \cos\theta_Y & 0 & -l_1\sin\theta_Y \\ 0 & 0 & 1 & h_1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{11} $$

The transformation between the frame O and frame M:

$$ {}^{O}R_M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_R & -\sin\theta_R \\ 0 & \sin\theta_R & \cos\theta_R \end{bmatrix} \tag{12} $$

$$ {}^{O}d_M = \begin{bmatrix} l_2 \\ -b_2\cos\theta_R \\ -b_2\sin\theta_R \end{bmatrix} \tag{13} $$

or

$$ {}^{O}T_M = \begin{bmatrix} 1 & 0 & 0 & l_2 \\ 0 & \cos\theta_R & -\sin\theta_R & -b_2\cos\theta_R \\ 0 & \sin\theta_R & \cos\theta_R & -b_2\sin\theta_R \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{14} $$

The transformation between the frame M and frame G:

$$ {}^{M}R_G = \begin{bmatrix} \cos\theta_P & 0 & \sin\theta_P \\ 0 & 1 & 0 \\ -\sin\theta_P & 0 & \cos\theta_P \end{bmatrix} \tag{15} $$

$$ {}^{M}d_G = \begin{bmatrix} h_3\sin\theta_P \\ b_2 \\ h_3\cos\theta_P \end{bmatrix} \tag{16} $$

or

$$ {}^{M}T_G = \begin{bmatrix} \cos\theta_P & 0 & \sin\theta_P & h_3\sin\theta_P \\ 0 & 1 & 0 & b_2 \\ -\sin\theta_P & 0 & \cos\theta_P & h_3\cos\theta_P \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{17} $$

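Thus, the total rotation matrix ${}^{F}R_G$ and translation vector ${}^{F}d_G$ between frame F and frame G are:

$$ {}^{F}R_G = \begin{bmatrix}
\cos\theta_Y\cos\theta_P - \sin\theta_Y\sin\theta_R\sin\theta_P & -\cos\theta_R\sin\theta_Y & \cos\theta_Y\sin\theta_P + \cos\theta_P\sin\theta_Y\sin\theta_R \\
\cos\theta_P\sin\theta_Y + \cos\theta_Y\sin\theta_R\sin\theta_P & \cos\theta_Y\cos\theta_R & \sin\theta_Y\sin\theta_P - \cos\theta_Y\cos\theta_P\sin\theta_R \\
-\cos\theta_R\sin\theta_P & \sin\theta_R & \cos\theta_R\cos\theta_P
\end{bmatrix} \tag{18} $$

$$ \begin{aligned}
{}^{F}d_G &= {}^{F}R_O \, {}^{O}R_M \, {}^{M}d_G + {}^{F}R_O \, {}^{O}d_M + {}^{F}d_O \\
&= \begin{bmatrix}
l_2\cos\theta_Y - l_1\cos\theta_Y + h_3\cos\theta_Y\sin\theta_P + h_3\cos\theta_P\sin\theta_Y\sin\theta_R \\
l_2\sin\theta_Y - l_1\sin\theta_Y + h_3\sin\theta_Y\sin\theta_P - h_3\cos\theta_Y\cos\theta_P\sin\theta_R \\
h_1 + h_3\cos\theta_R\cos\theta_P
\end{bmatrix}
\end{aligned} \tag{19} $$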
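
The closed forms in Equations (18) and (19) can be cross-checked numerically by multiplying the three homogeneous transformations of Equations (11), (14), and (17) as prescribed by Equation (8). The sketch below is an illustrative check in plain Python rather than code from the paper. The function names, the test angles, and the link dimensions are hypothetical values chosen only to exercise the formulas.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def T_FO(th_y, l1, h1):  # Eq. (11)
    c, s = math.cos(th_y), math.sin(th_y)
    return [[c, -s, 0, -l1 * c], [s, c, 0, -l1 * s], [0, 0, 1, h1], [0, 0, 0, 1]]

def T_OM(th_r, l2, b2):  # Eq. (14)
    c, s = math.cos(th_r), math.sin(th_r)
    return [[1, 0, 0, l2], [0, c, -s, -b2 * c], [0, s, c, -b2 * s], [0, 0, 0, 1]]

def T_MG(th_p, h3, b2):  # Eq. (17)
    c, s = math.cos(th_p), math.sin(th_p)
    return [[c, 0, s, h3 * s], [0, 1, 0, b2], [-s, 0, c, h3 * c], [0, 0, 0, 1]]

def T_FG_chain(th_y, th_r, th_p, l1, l2, h1, h3, b2):  # Eq. (8)
    return matmul(T_FO(th_y, l1, h1),
                  matmul(T_OM(th_r, l2, b2), T_MG(th_p, h3, b2)))

def closed_form(th_y, th_r, th_p, l1, l2, h1, h3):
    """Rotation and translation written out as in Eqs. (18) and (19)."""
    cy, sy = math.cos(th_y), math.sin(th_y)
    cr, sr = math.cos(th_r), math.sin(th_r)
    cp, sp = math.cos(th_p), math.sin(th_p)
    R = [[cy * cp - sy * sr * sp, -cr * sy, cy * sp + cp * sy * sr],
         [cp * sy + cy * sr * sp, cy * cr, sy * sp - cy * cp * sr],
         [-cr * sp, sr, cr * cp]]
    d = [l2 * cy - l1 * cy + h3 * cy * sp + h3 * cp * sy * sr,
         l2 * sy - l1 * sy + h3 * sy * sp - h3 * cy * cp * sr,
         h1 + h3 * cr * cp]
    return R, d

def max_abs_diff(T, R, d):
    """Largest element-wise gap between a 4x4 transform and (R, d)."""
    gaps = [abs(T[i][j] - R[i][j]) for i in range(3) for j in range(3)]
    gaps += [abs(T[i][3] - d[i]) for i in range(3)]
    return max(gaps)

# Hypothetical joint angles (rad) and link dimensions (m).
angles = (0.4, -0.3, 0.2)
l1, l2, h1, h3, b2 = 0.05, 0.04, 0.03, 0.02, 0.025
T = T_FG_chain(*angles, l1, l2, h1, h3, b2)
R, d = closed_form(*angles, l1, l2, h1, h3)
err = max_abs_diff(T, R, d)
print(err)  # expected to be at floating-point round-off level
```

The parameter $b_2$ enters two of the intermediate transformations but does not appear in Equation (19). The check shows why: its contributions from ${}^{O}d_M$ and ${}^{M}d_G$ cancel in the product.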

The matrix ${}^{F}R_G$ is called a pitch-roll-yaw rotation matrix according to the order in which the rotation
matrices are successively multiplied. In a similar way, the rotation between frame G and the inertial
reference frame N can be described as:

$$ {}^{N}R_G = \begin{bmatrix}
\cos\alpha_Y\cos\alpha_P - \sin\alpha_Y\sin\alpha_R\sin\alpha_P & -\cos\alpha_R\sin\alpha_Y & \cos\alpha_Y\sin\alpha_P + \cos\alpha_P\sin\alpha_Y\sin\alpha_R \\
\cos\alpha_P\sin\alpha_Y + \cos\alpha_Y\sin\alpha_R\sin\alpha_P & \cos\alpha_Y\cos\alpha_R & \sin\alpha_Y\sin\alpha_P - \cos\alpha_Y\cos\alpha_P\sin\alpha_R \\
-\cos\alpha_R\sin\alpha_P & \sin\alpha_R & \cos\alpha_R\cos\alpha_P
\end{bmatrix} \tag{20} $$

where $\alpha_Y$, $\alpha_R$, and $\alpha_P$ are derived using the information from the gyros and accelerometers on the IMU
attached to the camera. The matrix ${}^{N}R_G$ can also be derived using the rotation matrices ${}^{F}R_G$ and ${}^{N}R_F$, as below:

$$ {}^{N}R_G = {}^{N}R_F \, {}^{F}R_G \tag{21} $$

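The angular velocity of frame F with respect to frame N, expressed in frame F and denoted ${}^{F}\omega_{NF}$, is
measured and given as:

$$ {}^{F}\omega_{NF} = \begin{bmatrix} p & q & r \end{bmatrix}^{T} \tag{22} $$

The angular velocities of frames O, M, and G with respect to frame N, each expressed in its own frame,
are as follows:

$$ {}^{O}\omega_{NO} = {}^{O}\omega_{NF} + {}^{O}\omega_{FO} = {}^{F}R_O^{T} \, {}^{F}\omega_{NF} + \begin{bmatrix} 0 \\ 0 \\ \dot{\theta}_Y \end{bmatrix}
= \begin{bmatrix} p\cos\theta_Y + q\sin\theta_Y \\ -p\sin\theta_Y + q\cos\theta_Y \\ r + \dot{\theta}_Y \end{bmatrix} \tag{23} $$

$$ {}^{M}\omega_{NM} = {}^{M}\omega_{NO} + {}^{M}\omega_{OM} = {}^{O}R_M^{T} \, {}^{O}\omega_{NO} + \begin{bmatrix} \dot{\theta}_R \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} p\cos\theta_Y + q\sin\theta_Y + \dot{\theta}_R \\ \cos\theta_R(-p\sin\theta_Y + q\cos\theta_Y) + \sin\theta_R(r + \dot{\theta}_Y) \\ \sin\theta_R(p\sin\theta_Y - q\cos\theta_Y) + \cos\theta_R(r + \dot{\theta}_Y) \end{bmatrix} \tag{24} $$

$$ \begin{aligned}
{}^{G}\omega_{NG} &= {}^{G}\omega_{NM} + {}^{G}\omega_{MG} = {}^{M}R_G^{T} \, {}^{M}\omega_{NM} + \begin{bmatrix} 0 \\ \dot{\theta}_P \\ 0 \end{bmatrix} \\
&= \begin{bmatrix}
\cos\theta_P(p\cos\theta_Y + q\sin\theta_Y + \dot{\theta}_R) - \sin\theta_P\left(\sin\theta_R(p\sin\theta_Y - q\cos\theta_Y) + \cos\theta_R(r + \dot{\theta}_Y)\right) \\
\cos\theta_R(-p\sin\theta_Y + q\cos\theta_Y) + \sin\theta_R(r + \dot{\theta}_Y) + \dot{\theta}_P \\
\sin\theta_P(p\cos\theta_Y + q\sin\theta_Y + \dot{\theta}_R) + \cos\theta_P\left(\sin\theta_R(p\sin\theta_Y - q\cos\theta_Y) + \cos\theta_R(r + \dot{\theta}_Y)\right)
\end{bmatrix}
\end{aligned} \tag{25} $$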

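The inertia matrices of the outer gimbal, middle gimbal, and inner gimbal are:

$$ J_O = \mathrm{diag}\{ J_{Ox}, \; J_{Oy}, \; J_{Oz} \} \tag{26} $$

$$ J_M = \mathrm{diag}\{ J_{Mx}, \; J_{My}, \; J_{Mz} \} \tag{27} $$

$$ J_G = \mathrm{diag}\{ J_{Gx}, \; J_{Gy}, \; J_{Gz} \} \tag{28} $$

where $J_{kx}$, $J_{ky}$, $J_{kz}$ ($k = O, M, G$) refer to the diagonal elements. For simplicity, it is assumed that
the off-diagonal elements of the inertia matrices can be neglected and only the moments of inertia are
considered. The angular momentum of the pitch gimbal is:

$$ H_G = J_G \, {}^{G}\omega_{NG} \tag{29} $$

the angular momentum of the roll gimbal is:

$$ H_M = J_M \, {}^{M}\omega_{NM} + {}^{M}R_G \, J_G \, {}^{G}\omega_{NG} \tag{30} $$

and the angular momentum of the yaw gimbal is:

$$ H_O = J_O \, {}^{O}\omega_{NO} + {}^{O}R_M \left( J_M \, {}^{M}\omega_{NM} + {}^{M}R_G \, J_G \, {}^{G}\omega_{NG} \right) \tag{31} $$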

Each member of the gimbal system is treated as a rigid body, and the moment equation can be
written as:

$$ \tau_k = \frac{dH_k}{dt} + {}^{k}\omega_{Nk} \times H_k \quad (k = O, M, G) \tag{32} $$

where the external torques $\tau_O$, $\tau_M$ and $\tau_G$, about $z_F$, $x_O$ and $y_M$, respectively, are applied to the gimbals
by the motors and by other external disturbance torques.
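
To make the chain from Equation (22) to Equation (32) concrete, the following plain-Python sketch propagates a body rate through the three joints using the transposed joint rotations and compares the result with the closed form of Equation (25). It then evaluates Equation (32) for the inner gimbal alone, where $H_G = J_G \, {}^{G}\omega_{NG}$ with a constant diagonal $J_G$ gives $\tau_G = J_G \dot{\omega} + \omega \times (J_G \omega)$. This is an illustrative sketch, not the authors' code. All function names and numerical values are hypothetical, and the torque function deliberately ignores the coupling terms of the middle and outer gimbals in Equations (30) and (31).

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transposed_matvec(A, v):
    """Return A^T v for a 3x3 matrix A and a 3-vector v."""
    return [sum(A[k][i] * v[k] for k in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def gimbal_rates(pqr, angles, angle_rates):
    """Angular velocities of frames O, M, G, each in its own frame."""
    th_y, th_r, th_p = angles
    d_y, d_r, d_p = angle_rates
    w_no = transposed_matvec(rot_z(th_y), pqr)
    w_no[2] += d_y                       # Eq. (23): yaw joint rate about z
    w_nm = transposed_matvec(rot_x(th_r), w_no)
    w_nm[0] += d_r                       # Eq. (24): roll joint rate about x
    w_ng = transposed_matvec(rot_y(th_p), w_nm)
    w_ng[1] += d_p                       # Eq. (25): pitch joint rate about y
    return w_no, w_nm, w_ng

def w_ng_closed_form(pqr, angles, angle_rates):
    """Right-hand side of Eq. (25) written out term by term."""
    p, q, r = pqr
    th_y, th_r, th_p = angles
    d_y, d_r, d_p = angle_rates
    a = p * math.cos(th_y) + q * math.sin(th_y) + d_r
    b = (math.cos(th_r) * (-p * math.sin(th_y) + q * math.cos(th_y))
         + math.sin(th_r) * (r + d_y))
    c = (math.sin(th_r) * (p * math.sin(th_y) - q * math.cos(th_y))
         + math.cos(th_r) * (r + d_y))
    return [math.cos(th_p) * a - math.sin(th_p) * c,
            b + d_p,
            math.sin(th_p) * a + math.cos(th_p) * c]

def inner_gimbal_torque(J_diag, w, w_dot):
    """Eq. (32) for k = G with H_G from Eq. (29) and constant diagonal J_G."""
    H = [J_diag[i] * w[i] for i in range(3)]
    w_x_H = cross(w, H)
    return [J_diag[i] * w_dot[i] + w_x_H[i] for i in range(3)]

# Hypothetical body rates (rad/s), joint angles (rad), and joint rates (rad/s).
pqr = [0.1, -0.2, 0.3]
angles = (0.4, -0.3, 0.2)
angle_rates = (0.05, -0.02, 0.01)
w_ng = gimbal_rates(pqr, angles, angle_rates)[2]
w_ref = w_ng_closed_form(pqr, angles, angle_rates)
print(max(abs(x - y) for x, y in zip(w_ng, w_ref)))  # round-off level gap
```

With zero joint angles and zero joint rates, `gimbal_rates` returns the body rates unchanged for all three frames, which is a quick sanity check on the sign conventions.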

2.3. Stabilization and Aiming
The camera fitted on the innermost frame is inertially stabilized and controlled by the gimbal
system. Furthermore, the system is required to align its optical axis in elevation and azimuth with a
LOS joining the camera and target. Figure 6 describes the angular geometry of how the gimbaled
camera aims at the target, where $\theta$ is the pitch angle of the UAV body, $\delta$ is the boresight angle, $\lambda$ is the
LOS angle, $\theta_P$ is the pitch angle of the gimbal frame, and $\varepsilon$ is the boresight error angle.

Figure 6. Angular geometry of the gimbal system.

There are four operation modes, namely, the preset angle mode, search mode, stabilize mode,
and tracking mode as shown in Figure 7. When the gimbal system is powered on and initialized, its
direction is kept at $\delta = 0$ in inertial space. In preset angle mode, the optical axis of the camera will
be set to a given angle and the control system will maintain the desired direction despite disturbances.
Then, the system may switch to search mode, in which the gimbal will rotate circularly between its
minimum and maximum angle to search a larger range of area.
When a target is confirmed, the control system will switch to the tracking mode and keep the
target in the center of the camera view. In addition, if the target is lost and cannot be recaptured in a
few seconds, the system will return to the search mode and try to find it again. Figure 8 shows how the
control system works which contains two loops: tracking loop and stabilizing loop.
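The mode switching described above can be sketched as a small state machine. This is a minimal illustration rather than the authors' implementation: the names `Mode`, `next_mode`, `lost_time`, `timeout`, and `start_search` are hypothetical, and routing a lost target through stabilize mode before falling back to search mode is an assumption read from the operation-mode diagram.

```python
from enum import Enum, auto


class Mode(Enum):
    PRESET = auto()
    SEARCH = auto()
    TRACKING = auto()
    STABILIZE = auto()


def next_mode(mode, target_visible, lost_time=0.0, timeout=3.0, start_search=False):
    """Return the next gimbal mode (hypothetical helper, assumed transition logic)."""
    if mode is Mode.PRESET:
        # Hold the preset angle until a search is requested.
        return Mode.SEARCH if start_search else Mode.PRESET
    if mode is Mode.SEARCH:
        # Keep scanning until a target is confirmed.
        return Mode.TRACKING if target_visible else Mode.SEARCH
    if mode is Mode.TRACKING:
        # On target loss, hold the last direction (assumed: stabilize mode).
        return Mode.TRACKING if target_visible else Mode.STABILIZE
    # Mode.STABILIZE: wait for recapture, then give up after `timeout` seconds.
    if target_visible:
        return Mode.TRACKING
    return Mode.SEARCH if lost_time >= timeout else Mode.STABILIZE
```

Calling `next_mode` once per control cycle with the tracker's visibility flag and the time since the target was last seen would reproduce the fallback to search mode after a few seconds without recapture.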

Figure 7. Diagram of the system operation modes.


Figure 8. Architecture of the gimbal control system in elevation channel. 1 for the preset angle mode, 2 for the stabilize mode, 3 for the search mode, and 4 for the tracking mode.

Based on the image information and measurement data received from angular sensors, the tracking loop generates a rate command to direct the boresight towards the target LOS so that the pointing error can be kept near zero. On the other hand, the stabilizing loop isolates the camera from UAV motion and external disturbances, which would perturb the aim-point. The control loops in roll, elevation, and azimuth channels are related by the cross coupling unit based on the gimbal system dynamics, which may be defined as the impact on one axis with the rotation of another [56].

3. Target Tracking

3.1. KCF Tracker

The KCF tracker [42,57] that is used in this paper treats sample training as a Ridge Regression problem, which is a regularized minimization problem with a closed-form solution.

Consider an $n \times 1$ vector $x = [x_1 \; x_2 \; \cdots \; x_n]^{T}$ as the base sample, which represents a patch with the target of interest. A small translation of this vector is given as:

$$Px = [x_n \; x_1 \; \cdots \; x_{n-1}]^{T} \tag{33}$$

where $P$ is the permutation matrix:

$$P = \begin{bmatrix} 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \tag{34}$$

and $u$ shifts can be made to achieve a larger translation by using the matrix power $P^{u}x$. By cyclic shifting operations, we can use these vectors to constitute a circulant matrix as:

$$X = C(x) = \begin{bmatrix} (P^{0}x)^{T} \\ (P^{1}x)^{T} \\ (P^{2}x)^{T} \\ \vdots \\ (P^{n-1}x)^{T} \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ x_n & x_1 & x_2 & \cdots & x_{n-1} \\ x_{n-1} & x_n & x_1 & \cdots & x_{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_2 & x_3 & x_4 & \cdots & x_1 \end{bmatrix} \tag{35}$$

and it is useful that all circulant matrices are diagonalized by the Discrete Fourier Transform (DFT),
regardless of the base sample x, which can be expressed as:

$$X = F\,\mathrm{diag}(\hat{x})\,F^{H} \tag{36}$$

where x̂ is the DFT of base sample, x̂ = F(x), and F is a constant matrix known as the DFT matrix that
does not depend on x.
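The diagonalization in Equation (36) can be checked numerically on a small example. The sketch below assumes NumPy's FFT sign convention, a unitary DFT matrix (scaled by $1/\sqrt{n}$), and an unnormalized DFT for $\hat{x}$; the vector length and random values are arbitrary choices for illustration.

```python
import numpy as np

n = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(n)              # base sample

# Row i of X is (P^i x)^T; np.roll(x, 1) gives [x_n, x_1, ..., x_{n-1}] = P x.
X = np.stack([np.roll(x, i) for i in range(n)])

# Unitary DFT matrix F (assumed normalization: 1/sqrt(n)).
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
x_hat = np.fft.fft(x)                   # unnormalized DFT of the base sample

# Rebuild the circulant matrix from its Fourier-domain diagonal form.
X_rebuilt = F @ np.diag(x_hat) @ F.conj().T
max_error = np.abs(X - X_rebuilt).max()
```

Here `max_error` stays at floating-point round-off level, which illustrates why operations on $X$ can be replaced by element-wise operations on $\hat{x}$.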
Based on Ridge Regression, the goal of training is to find a function $f(z)$ that minimizes the squared error over samples $x_i$ and their regression targets $y_i$, as shown below:

$$E = \min_{w} \sum_{i} \left( f(x_i) - y_i \right)^2 + \lambda \lVert w \rVert^2 \tag{37}$$

where the regularization parameter $\lambda$ is used to control overfitting, as in the Support Vector Machines (SVM) [57], and $w$ represents the filter coefficients.
For a linear regression function $f(z) = w^{T}z$, the minimizer has a closed-form solution given as:

$$w = (X^{T}X + \lambda I)^{-1} X^{T} y \tag{38}$$

where $X$ is the circulant matrix with one sample $x_i$ per row, each element of $y$ is a regression target $y_i$, and $I$ is an identity matrix. By utilizing the diagonalization of the circulant matrices, Equation (38) can be expressed in Fourier domain as:

$$\hat{w} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda} \tag{39}$$

where ŵ, x̂, ŷ are the DFT of w, x, y, respectively, and x̂∗ is the complex-conjugate. In addition, w can be
easily recovered in the spatial domain with Inverse Discrete Fourier Transform (IDFT).

When the regression function $f(z)$ is nonlinear, the kernel trick is used to map the inputs of a linear problem to a nonlinear and high-dimensional feature space $\varphi(x)$:

$$w = \sum_{i} \alpha_i \varphi(x_i) \tag{40}$$

Then, the variables under optimization are $\alpha$ instead of $w$. The kernel function $k$ is used to compute the algorithm in terms of dot-products, as shown below:

$$\varphi^{T}(x)\,\varphi(x') = k(x, x') \tag{41}$$

where all the dot-products between samples are stored in an $n \times n$ kernel matrix $K$, with elements:

$$K_{ij} = k(x_i, x_j) \tag{42}$$

and the regression function $f(z)$ can be expressed as:

$$f(z) = w^{T}z = \sum_{i=1}^{n} \alpha_i k(z, x_i) \tag{43}$$

The solution of this regression function can be given as:

$$\alpha = (K + \lambda I)^{-1} y \tag{44}$$

where $K$ is the kernel matrix and $\alpha$ is the vector of coefficients $\alpha_i$, which express the solution in the dual space. By making $K$ circulant, Equation (44) can be diagonalized as in the linear case, obtaining:

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda} \tag{45}$$

where $\hat{k}^{xx}$ is the correlation kernel of $x$ with itself in Fourier domain and $\hat{\alpha}$, $\hat{y}$ are the DFT of the vectors $\alpha$, $y$.
For the kernel matrix $K^{z}$ between all training samples (cyclic shifts of $x$) and candidate image patches (cyclic shifts of base patch $z$), each element of $K^{z}$ is given by $k(P^{i-1}z, P^{j-1}x)$. From Equation (43), the regression function can be computed for all candidate patches with:

$$f(z) = (K^{z})^{T} \alpha \tag{46}$$

where $f(z)$ is a vector containing the output for all cyclic shifts of $z$, which is the full detection response. The position where the output response takes the maximum value is the position of the target in a new frame. To compute Equation (46) efficiently, it is diagonalized as shown below:

$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha} \tag{47}$$

where a hat $\wedge$ denotes the DFT of a vector. In this paper, given the nonlinear Gaussian kernel $k(x, x') = \exp\left(-\frac{1}{\sigma^2}\lVert x - x' \rVert^2\right)$, we get:

$$k^{xx'} = \exp\left(-\frac{1}{\sigma^2}\left(\lVert x \rVert^2 + \lVert x' \rVert^2 - 2\,\mathcal{F}^{-1}\left(\hat{x}^{*} \odot \hat{x}'\right)\right)\right) \tag{48}$$

where the kernel correlation can be computed by using a few DFT/IDFT and element-wise operations in $O(n \log n)$ time. Henriques et al. [42] showed that replacing the matrix inversion in the spatial domain with element-wise operations in the Fourier domain greatly reduces the computational complexity and shortens the computation time.
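Equations (45), (47), and (48) can be exercised together on a one-dimensional toy signal. The following is a minimal sketch under stated assumptions, not the tracker used in the paper: it works on a 1D vector instead of 2D image features, uses a unit-norm random template, a Gaussian label centered at zero shift, and illustrative values for `sigma` and `lam`; the function names are hypothetical.

```python
import numpy as np


def gaussian_correlation(x, z, sigma):
    """Gaussian kernel between x and all cyclic shifts of z, following Eq. (48)."""
    cross = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)).real
    dist2 = np.maximum(x @ x + z @ z - 2.0 * cross, 0.0)  # clamp round-off
    return np.exp(-dist2 / sigma**2)


def train(x, y, sigma, lam):
    """Dual coefficients in the Fourier domain, Eq. (45)."""
    k_xx = gaussian_correlation(x, x, sigma)
    return np.fft.fft(y) / (np.fft.fft(k_xx) + lam)


def detect(alpha_hat, x, z, sigma):
    """Detection response over all cyclic shifts, Eq. (47)."""
    k_xz = gaussian_correlation(x, z, sigma)
    return np.fft.ifft(np.fft.fft(k_xz) * alpha_hat).real


n, sigma, lam = 64, 0.5, 1e-4                 # illustrative values (assumptions)
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                        # unit-norm template (assumption)
shift = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (shift / 2.0) ** 2)         # Gaussian label peaked at zero shift

alpha_hat = train(x, y, sigma, lam)
z = np.roll(x, 5)                             # "new frame": target moved by 5 samples
response = detect(alpha_hat, x, z, sigma)
estimated_shift = int(np.argmax(response))    # index of the response peak
```

With this setup the response peak lands at the cyclic shift applied to the template, which is how the maximum of the response map localizes the target in a new frame. A practical tracker would add feature extraction, a cosine window, and two-dimensional transforms.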

In the tracking process, considering the target variations, such as illumination, scale, occlusion, and deformation factors, the target appearance model and coefficient vector are updated after each frame [58], as shown below:

$$x_t = (1 - \eta_t)\,x_{t-1} + \eta_t\,x \tag{49}$$

$$\alpha_t = (1 - \eta_t)\,\alpha_{t-1} + \eta_t\,\alpha \tag{50}$$

where $x_{t-1}$, $x_t$ are the target model updated after frames $t-1$ and $t$, $\alpha_{t-1}$, $\alpha_t$ are the coefficient vector updated after frames $t-1$ and $t$, $x$ and $\alpha$ are the values computed from the current frame, and $\eta_t$ is the learning rate.

3.2. Target Localization

Consider a monocular camera model as shown in Figure 9.

Figure 9. Coordinate system of a monocular camera.

When an arbitrary point $p(x_w, y_w, z_w)$ in world frame $W$ is detected by the camera, its 2D position $p_i(x_i, y_i)$ on the image plane can be given as:

$$x_i = \frac{f x_c}{z_c} = (u - u_0)\,dx \tag{51}$$

$$y_i = \frac{f y_c}{z_c} = (v - v_0)\,dy \tag{52}$$

where $(x_c, y_c, z_c)$ is the position of $p$ in the camera frame, $f$ is the focal length, $(u, v)$ is the target position in pixel values, $(u_0, v_0)$ is the intersection of the optical axis and the image plane, and $dx$ and $dy$ are the physical length per pixel in the $x_i$ and $y_i$ axis directions. Equations (51) and (52) can be integrated as:

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M_{in} \begin{bmatrix} x_c / z_c \\ y_c / z_c \\ 1 \end{bmatrix} = M_{in} \begin{bmatrix} x_{1c} \\ y_{1c} \\ 1 \end{bmatrix} \tag{53}$$

where $[x_{1c}, y_{1c}, 1]^{T}$ is the point on the normalized plane and $M_{in}$ is the intrinsic parameter matrix of the camera, which is given as:

$$M_{in} = \begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{54}$$

With a calibrated camera, the pointing error angle $\varepsilon$ in elevation and azimuth can be computed by using the above relations, as shown below:

$$\varepsilon_{\chi} = \arctan(x_c / z_c) \tag{55}$$

$$\varepsilon_{\gamma} = \arctan(y_c / z_c) \tag{56}$$

where $\varepsilon_{\chi}$ and $\varepsilon_{\gamma}$ are the pointing error or boresight error in azimuth and elevation, respectively, which can be put into the target aiming system as input information to control the gimbal.
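Equations (53) to (56) can be combined into a direct mapping from the tracked pixel position to the two boresight error angles. The sketch below is illustrative only: `boresight_error` is a hypothetical helper, `fx` and `fy` stand for $f/dx$ and $f/dy$, and the intrinsic values are made up rather than taken from the authors' calibration.

```python
import math


def boresight_error(u, v, fx, fy, u0, v0):
    """Return (eps_chi, eps_gamma) in radians from a pixel position (u, v)."""
    x1c = (u - u0) / fx   # x_c / z_c on the normalized plane, from Eq. (53)
    y1c = (v - v0) / fy   # y_c / z_c on the normalized plane, from Eq. (53)
    return math.atan(x1c), math.atan(y1c)   # Eqs. (55) and (56)


# Hypothetical intrinsics for a 1920 x 1080 image (not the authors' calibration).
fx = fy = 1000.0
u0, v0 = 960.0, 540.0

# A target 1000 pixels to the right of the principal point, vertically centered.
eps_chi, eps_gamma = boresight_error(1960.0, 540.0, fx, fy, u0, v0)
```

In this example the azimuth error is $\arctan(1) = \pi/4$ and the elevation error is zero, since the pixel offset equals the assumed focal length in pixels.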
A scalable KCF tracker is used in this paper, with the scale changes of the target taken into consideration. The tracker not only updates the centroid position of the target in the image frame, but also outputs the target size in pixel values. This can be used to control the distance between the interceptor and the target while tracking, though the physical size of the target is unknown. The running results of tracking a pedestrian with the proposed tracker are shown in Figure 10. The bounding box adapts to the target variations in the video streams.

Figure 10. Tracking of a human target moving at pedestrian speed. (a) video streams with no target selected; (b) tracking of the selected target with its error pixels displayed on screen; (c) autonomously lock on the target in close range using the gimbal system; (d) real-time adjustment to the scale changes of the target.

4. Flight Control Algorithm

4.1. UAV Dynamic Model

The 6 DOF motion of a rigid quadrotor is described in Figure 11.

Figure 11. The quadrotor with corresponding frames.

Let $m$ denote the mass of the quadrotor and $J$ the moment of inertia. The external forces and torques which act on the quadrotor platform are primarily caused by propellers and gravity. A local NED frame $N$ and body-fixed frame $B$ are introduced to describe the motion of the quadrotor. ${}^{n}p = [{}^{n}p_x \; {}^{n}p_y \; {}^{n}p_z]^{T}$ and ${}^{n}v = [{}^{n}v_x \; {}^{n}v_y \; {}^{n}v_z]^{T}$ are the position and linear velocity of the quadrotor's mass center relative to $N$. $\Theta = [\phi \; \theta \; \psi]^{T}$ is the roll/pitch/yaw angles, which represents the orientation of the quadrotor in $N$. The rotation matrix $R_b^n$ from $B$ to $N$ is expressed as:

$$R_b^n = \begin{bmatrix} \cos\theta\cos\psi & \sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi & \cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi \\ \cos\theta\sin\psi & \sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi & \cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi \\ -\sin\theta & \sin\phi\cos\theta & \cos\phi\cos\theta \end{bmatrix} \tag{57}$$

The equations of motion can be described as:

$${}^{n}\dot{p} = {}^{n}v \tag{58}$$

$${}^{n}\dot{v} = g\,n_3 - \frac{f_b}{m} R_b^n\, n_3 \tag{59}$$

$$J \cdot {}^{b}\dot{\omega} = -{}^{b}\omega \times J \cdot {}^{b}\omega + G_a + \tau \tag{60}$$

where $f_b$ is the force applied to the quadrotor given in $B$, $\tau$ is the torque, $g$ is the gravitational acceleration, $n_3 = [0 \; 0 \; 1]^{T}$ is the unit vector along the $z$ axis, and ${}^{b}\omega$ is the angular velocity of the quadrotor in $B$. The gyroscopic moment $G_a$ is mainly produced by propellers, which can be neglected. In addition, the translational dynamics shown in Equations (58) and (59) can be simplified as:

$${}^{n}\ddot{p}_x = -\frac{f_b}{m}\left(\sin\phi\sin\psi + \cos\phi\sin\theta\cos\psi\right) \tag{61}$$

$${}^{n}\ddot{p}_y = -\frac{f_b}{m}\left(-\sin\phi\cos\psi + \cos\phi\sin\theta\sin\psi\right) \tag{62}$$

$${}^{n}\ddot{p}_z = g - \frac{f_b}{m}\cos\phi\cos\theta \tag{63}$$

Furthermore, it can be assumed that $\sin\phi \approx \phi$, $\cos\phi \approx 1$, $\sin\theta \approx \theta$, and $\cos\theta \approx 1$ for small angle approximation, which leads to a simplified dynamic model as described in [59].
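The step from Equation (59) to Equations (61)–(63) amounts to taking the third column of the rotation matrix. The short check below illustrates this numerically; the function names are hypothetical, the attitude angles and the value of $g$ are arbitrary assumptions, and the mass is the platform weight quoted in Section 5.1.

```python
import numpy as np


def rotation_body_to_ned(phi, theta, psi):
    """Rotation matrix of Eq. (57) for roll phi, pitch theta, yaw psi (radians)."""
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cpsi, spsi = np.cos(psi), np.sin(psi)
    return np.array([
        [cth * cpsi, sphi * sth * cpsi - cphi * spsi, cphi * sth * cpsi + sphi * spsi],
        [cth * spsi, sphi * sth * spsi + cphi * cpsi, cphi * sth * spsi - sphi * cpsi],
        [-sth, sphi * cth, cphi * cth],
    ])


def acceleration_ned(f_b, m, phi, theta, psi, g=9.81):
    """Translational acceleration from Eq. (59): g*n3 - (f_b/m) * R * n3."""
    n3 = np.array([0.0, 0.0, 1.0])
    return g * n3 - (f_b / m) * rotation_body_to_ned(phi, theta, psi) @ n3


m, g = 4.2, 9.81                      # mass from Section 5.1; g is assumed
phi, theta, psi = 0.05, -0.10, 0.70   # illustrative attitude (radians)
f_b = m * g                           # thrust magnitude equal to the weight
a = acceleration_ned(f_b, m, phi, theta, psi, g)

# Component form of Eqs. (61)-(63) for comparison.
a_expected = np.array([
    -(f_b / m) * (np.sin(phi) * np.sin(psi) + np.cos(phi) * np.sin(theta) * np.cos(psi)),
    -(f_b / m) * (-np.sin(phi) * np.cos(psi) + np.cos(phi) * np.sin(theta) * np.sin(psi)),
    g - (f_b / m) * np.cos(phi) * np.cos(theta),
])
```

The two vectors `a` and `a_expected` agree to round-off, and with level attitude and thrust equal to the weight the acceleration vanishes, which corresponds to hover.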

4.2. Tracking Strategy


The tracking strategy used in this paper is based on proportional navigation (PN), which is a well-known guidance law and has been widely used to enable a missile to catch its target in optimal time. The constant bearing approach considers that the missile will finally collide with the detected target if the LOS angle is kept constant. The PN method improves the constant bearing approach to accommodate target maneuvers by accelerating the missile in a direction lateral to the LOS with magnitude proportional to the rate of change of the LOS angle. There are different types of PN methods according to their mathematical formulations, and their performances when applied to guidance of a quadrotor have been analyzed in [60].
The desired acceleration $\ddot{u}^{des}$ obtained by the PN method can be expressed as:

$$\ddot{u}^{des} = N L \times \dot{\lambda} \tag{64}$$

where $\dot{\lambda}$ is the rate of change of the LOS angle, $N$ is the navigation gain, and $L$ is the normal direction of the acceleration command, which is calculated in different ways as follows:

$$L_{RTPN} = u_T - u \tag{65}$$

$$L_{IPN} = \dot{u}_T - \dot{u} \tag{66}$$

$$L_{PPN} = -\dot{u} \tag{67}$$
$$L_{NGL} = \frac{\dot{u}\,u \sin\beta}{|u_T - u|\,\dot{\lambda}} \tag{68}$$

where $L_{RTPN}$, $L_{IPN}$, $L_{PPN}$, and $L_{NGL}$ stand for Realistic True Proportional Navigation (RTPN) [61], Ideal Proportional Navigation (IPN) [62], Pure Proportional Navigation (PPN) [63], and Nonlinear Guidance Law (NGL) [64], respectively. $u_T, \dot{u}_T, \ddot{u}_T \in \mathbb{R}^2$ are the position, velocity, and acceleration of the target, respectively, and $u, \dot{u}, \ddot{u} \in \mathbb{R}^2$ are the position, velocity, and acceleration of the interceptor, respectively. $|\cdot|$ represents the magnitude of the vector and $\beta$ is the angle between the interception velocity and the LOS of the target.
In addition, $\dot{\lambda}$ can be described as:

$$\dot{\lambda} = \frac{(u_T - u) \times (\dot{u}_T - \dot{u})}{|u_T - u|^2} \tag{69}$$
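Equations (64), (67), and (69) can be illustrated in the plane with a few lines of code. This is a sketch under stated assumptions: the planar cross product is taken as its scalar $z$-component, $\dot{\lambda}$ is treated as a vector along the $z$ axis when forming $L \times \dot{\lambda}$, and the function names, positions, velocities, and navigation gain are made up for illustration.

```python
import numpy as np


def cross2(a, b):
    """Scalar (z-component) cross product of two planar vectors."""
    return a[0] * b[1] - a[1] * b[0]


def los_rate(u_t, u, v_t, v):
    """LOS angular rate from Eq. (69), in rad/s."""
    r = u_t - u
    return cross2(r, v_t - v) / (r @ r)


def ppn_acceleration(v, lam_dot, gain):
    """PPN command N * L x lambda_dot with L = -v (Eq. 67), lambda_dot along z."""
    L = -v
    # (L_x, L_y, 0) x (0, 0, lam_dot) = lam_dot * (L_y, -L_x, 0)
    return gain * lam_dot * np.array([L[1], -L[0]])


# Illustrative geometry: target 10 m ahead, crossing at 1 m/s.
u_t, v_t = np.array([10.0, 0.0]), np.array([0.0, 1.0])
u, v = np.array([0.0, 0.0]), np.array([2.0, 0.0])

lam_dot = los_rate(u_t, u, v_t, v)           # LOS rate in rad/s
accel = ppn_acceleration(v, lam_dot, 3.0)    # navigation gain N = 3 (assumed)
```

For this geometry the LOS rate is 0.1 rad/s, and the resulting command is perpendicular to the interceptor velocity and points toward the side to which the target is moving, which is the behaviour expected of pure proportional navigation.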

The aim of the research in this paper is to track a moving target with a quadrotor platform, and
coordinated control of the UAV and gimbaled camera is considered. As shown in Figure 12a,b, it can
be done in two directions: the longitudinal direction and the lateral direction [65].
The $\lambda_{\chi}$, $\lambda_{\gamma}$ are the lateral and longitudinal LOS angle of the target, respectively. The $\varepsilon_{\chi}$, $\varepsilon_{\gamma}$ are the boresight error angle in lateral and longitudinal direction, respectively, which are controlled to be zero ($\varepsilon_{\chi}, \varepsilon_{\gamma} \to 0$). The $\dot{u}_T^{\chi}$, $\dot{u}_T^{\gamma}$ are the velocity of the target in lateral and longitudinal directions, respectively. The $\dot{u}^{\chi}$ is the velocity of the UAV in lateral direction that is aligned with the axis $x_b$ of the body frame $B$ and the desired yaw angle of the UAV is $\psi_{des} = \lambda_{\chi}$. A pure pursuit guidance law is used for tracking in the lateral direction, which works on the principle that, if the interceptor persistently points towards the target ($\psi \to \lambda_{\chi}$, $\dot{\psi} \to \dot{\lambda}_{\chi}$), then it will ultimately intercept it.
When the tracking is initiated, the UAV follows the target by tracking its lateral LOS angle with a forward speed $\dot{u}^{\chi}$, which is the horizontal component of the approaching velocity $\dot{u}_{App}$. To keep following the target moving at unknown speed, the approaching velocity $\dot{u}_{App}$ is decided by the scale changes of the target in the image frame. Then, a PPN guidance law will activate and the acceleration command $\ddot{u}_{PPN}^{des}$ will be applied to the UAV according to the rate changes of $\lambda_{\gamma}$, as shown in Figure 12b.
Figure 12. Schematic diagram showing the tracking strategy. (a) description of tracking the lateral
LOS angle based on a pure pursuit method; (b) description of tracking the longitudinal LOS angle
based on PN guidance law.

4.3. Flight Control System


The quadrotor is a typical underactuated system, with only four independent inputs for six degrees of freedom, so only the desired position and desired yaw angle can be directly tracked. The desired roll angle and the desired pitch angle are determined from these tracked quantities. The flight control system of a quadrotor is described in Figure 13.

Figure 13. Hierarchical control architecture of the quadrotor for target tracking.

The pixel values $u_T$, $v_T$ are the centroid of the target position in each image frame, and $S_T$ is the scale change of the target, which is given by the proposed KCF tracker. The state of the quadrotor $(\Theta, \omega)$ expressed in global NED frame is given by the autopilot using an Extended Kalman Filter (EKF). The desired position $[x_d \; y_d \; z_d]^{T}$ and the desired yaw angle $\psi_d$ are given by the gimbal system in the global NED frame using the above-mentioned tracking strategy. A cascade proportional-integral-derivative (PID) controller is designed to individually control the 6 DOF motion of the quadrotor. The attitude control loop is implemented on the microcontroller, while the outer loop for position control is implemented on the onboard computer. All PID gains have been preliminarily tuned in hovering flight tests. The outputs of the cascade PID controller are the desired force $f_d$ and the desired torque $\tau_d$, which are applied to the UAV body. The mixer gives the desired angular velocity of each motor to the electronic speed controller (ESC), which is expressed as a Pulse Width Modulation (PWM) signal.

It is worth mentioning that the thrust value is not only determined by the desired position, but also by the takeoff weight of the quadrotor platform. Thus, the height control can be considered as two parts: a slightly changed base value for hovering control and a fast controller for position control. When the target is selected, the quadrotor will keep the distance relative to the target based on the estimation of its scale changes in the image frame, which is shown in Supplementary Materials.
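The two-part height control described above, a base thrust for hovering plus a fast position controller, can be sketched on the vertical channel of Equation (63) with level attitude. Everything in this example is an illustrative assumption rather than the authors' controller: the function name, the PID gains, the deliberately mismatched mass estimate, and the simple Euler integration.

```python
def simulate_altitude(z_d, m_true=4.2, m_assumed=4.0, g=9.81,
                      kp=4.0, ki=1.0, kd=3.0, dt=0.01, t_end=40.0):
    """Simulate z'' = g - f/m (NED, level attitude) with hover thrust plus PID."""
    z, vz, integral = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        error = z_d - z
        integral += error * dt
        # Desired vertical acceleration from the PID terms (derivative on velocity).
        a_cmd = kp * error + ki * integral - kd * vz
        # Base hover value (m_assumed * g) plus the fast position correction.
        thrust = m_assumed * (g - a_cmd)
        # Vertical dynamics, Eq. (63) with phi = theta = 0.
        az = g - thrust / m_true
        vz += az * dt
        z += vz * dt
    return z


# Climb to 5 m above the start point (negative z in the NED frame).
z_final = simulate_altitude(-5.0)
```

Because the assumed mass differs from the true takeoff weight, the hover base value alone would leave a residual acceleration; in this sketch the integral term absorbs that mismatch so the altitude still settles at the setpoint.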

5. Experiments and Results

5.1. Experimental Setup

Most experimental UAVs are equipped with expensive sensors and devices, such as high-precision IMUs, 3D light detection and ranging (Lidar) sensors, and differential global positioning system (DGPS), which will definitely improve the control accuracy but are unaffordable in many practical applications. To test the proposed tracking system in this paper, a customized quadrotor platform (65 cm in diameter) is used to perform all the flight experiments, which weighs 4.2 kg including all the payloads, as shown in Figure 14. The cost is much lower than the other platforms (e.g., DJI Matrice 200). The 3-axis gimbal, at a dimension of 108 × 86.2 × 137.3 mm³, weighs only 409 g, as shown in Figure 15a. The IMUs and encoders are consumer sensors at very low prices. The camera with focal lengths ranging from 4.9–49 mm has a maximum resolution of 1920 × 1080 at 60 frames per second (fps). The cost of the gimbaled camera is less than $400, which makes it very attractive considering its great performance. The AS5048A magnetic rotary encoder used in the gimbal system measures the angular position of each axis, which has a 14-bit high resolution output (0.0219 deg/LSB).
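The quoted encoder resolution follows directly from the 14-bit output, as the short calculation below shows. The variable names are illustrative.

```python
bits = 14
counts_per_rev = 2 ** bits                 # 16384 counts per revolution
resolution_deg = 360.0 / counts_per_rev    # degrees per least significant bit
```

This gives 360/16384 ≈ 0.02197 degrees per count, consistent with the 0.0219 deg/LSB figure stated above.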



Figure 14. The customized quadrotor platform.

Figure 15. The gimbaled camera (a) and onboard equipment (b).


As shown in Figure 15b, the quadrotor is equipped with an embedded microcontroller developed by the Pixhawk team at ETH Zürich [66]. The selected firmware version is 1.9.2. To achieve real-time image processing, an NVIDIA Jetson TX2 module is used as an onboard computer to implement the tracking algorithm, which is among the fastest and most power-efficient embedded AI computing devices. The output data of the vision system can be transferred from the onboard computer to the Pixhawk flight controller using serial communication, which is based on a MAVLink [67] extendable communication node for the Robot Operating System (ROS) [68]. The control rate is 30 Hz, limited by the onboard processing speed. By using a 2.4 GHz remote controller (RC), the tracking process can be initiated by switching to the offboard mode when a target is selected.

5.2. Experimental Results and Analysis


To evaluate the performance of the proposed tracking and targeting system, we test it in different
situations. After a target is selected in the video stream, the gimbal system is activated and rotates
the camera to point at the selected target, which can be regarded as a step response. The boresight
errors, in pixels, are plotted separately for azimuth and elevation and are also printed at the top left of
the screen.
As shown in Figure 16, the system responds rapidly and the steady-state error is about ±3 pixels,
while the initial errors are hundreds of pixels. This test is usually used to tune all the control parameters
of the gimbal system, which can be completed on the ground.

Figure 16. Boresight error in pixels during ground step tests. (a) the response of the gimbal system in
azimuth; (b) the response of the gimbal system in elevation.

Then, the gimbal system and the onboard computer are fixed on the experimental quadrotor
platform for further tests. To aim at a target from the UAV, the motion and vibration of the platform
should be isolated. Once the target is locked, the UAV is in a fully autonomous mode controlled by the
integrated system. Figure 17 shows the results of the boresight error while the UAV is following a
pedestrian moving at 0.9–2 m/s. Over about 250 s of flight, the RMSE of the boresight error was
9.34 pixels in azimuth and 5.07 pixels in elevation (Table 1).

Figure 17. Boresight error in pixels while the UAV is following a human target moving at pedestrian
speed. (a) the response in azimuth; (b) the response in elevation.

While tracking a pedestrian moving on the ground, the altitude change of the target can be ignored in
most situations, which makes the task less difficult. However, tracking a flying drone is much more
complicated. A drone is able to change its position and velocity in a very short time, which may
cause tracking errors or failures. During the flight experiment, autonomous tracking of an intruding
drone has been achieved. Figure 18 shows the results of the boresight error while the UAV is tracking
a flying drone.

Figure 18. Boresight error in pixels while the UAV is tracking a flying drone. (a) the response in
azimuth; (b) the response in elevation.

Some oscillations still remain in the current configuration; they occurred when the flight path
of the target suddenly changed. It is a great challenge for the system to catch up with the target in
such a short time. The deviations caused by the image transmission delay, which is about
220 milliseconds for the current system, cannot be ignored. Other factors such as image noise and
illumination changes may also have an impact on the tracking accuracy to some extent. The root mean
square errors (RMSEs) of boresight errors in the drone tracking experiment are listed in Table 1,
compared with the other two tests.
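
To put the 220 ms image transmission delay in perspective, it can be compared with the 30 Hz control rate reported for the onboard computer. The sketch below is only a back-of-the-envelope illustration under the assumption that the delay is constant; the function names are hypothetical, and the 2 m/s figure is the upper pedestrian speed quoted earlier, used here purely as an example target speed.

```python
CONTROL_RATE_HZ = 30.0   # control rate stated in the text
IMAGE_DELAY_S = 0.220    # image transmission delay stated in the text


def delay_in_control_periods(delay_s, rate_hz):
    """Number of control periods that elapse during the delay."""
    return delay_s * rate_hz


def displacement_during_delay(target_speed, delay_s):
    """Distance the target covers before a delayed frame is acted on.

    Assumes constant target speed over the delay interval; the result is
    in the length unit of target_speed (metres for a speed in m/s).
    """
    return target_speed * delay_s


if __name__ == "__main__":
    periods = delay_in_control_periods(IMAGE_DELAY_S, CONTROL_RATE_HZ)
    print(f"delay spans about {periods:.1f} control periods")
    moved = displacement_during_delay(2.0, IMAGE_DELAY_S)
    print(f"a target at 2 m/s moves about {moved:.2f} m during the delay")
```

Under these assumptions each image is several control periods old when it is used, which is consistent with the oscillations seen when the target changes course abruptly.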

Table 1. Root mean square errors (RMSEs) of boresight errors in different cases.

RMSE (Pixel)            Azimuth    Elevation    2D
Step test               31.57      4.83         31.94
Pedestrian following    9.34       5.07         10.63
Drone tracking          21.50      19.28        28.88
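
The 2D column of Table 1 is numerically consistent with combining the two per-axis RMSEs in quadrature, i.e., 2D = sqrt(azimuth² + elevation²), which is what one obtains when both axes are evaluated over the same samples. The text does not state the formula explicitly, so the following is a minimal sketch under that assumption; the function names are illustrative only.

```python
import math


def rmse(errors):
    """Root mean square of a sequence of per-frame boresight errors (pixels)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def combined_2d_rmse(rmse_azimuth, rmse_elevation):
    """Combine per-axis RMSEs in quadrature (assumed definition of the 2D column)."""
    return math.hypot(rmse_azimuth, rmse_elevation)


# Values copied from Table 1: (azimuth, elevation, reported 2D), in pixels.
TABLE_1 = {
    "Step test": (31.57, 4.83, 31.94),
    "Pedestrian following": (9.34, 5.07, 10.63),
    "Drone tracking": (21.50, 19.28, 28.88),
}

if __name__ == "__main__":
    for case, (az, el, reported) in TABLE_1.items():
        computed = combined_2d_rmse(az, el)
        print(f"{case}: computed {computed:.2f}, reported {reported:.2f}")
```

The large azimuth value for the step test reflects the initial error of hundreds of pixels being included in the average, rather than poor steady-state accuracy.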
Figure 19 shows the process of a successful drone tracking experiment. During the flight, the
roll/pitch/yaw angles of the UAV, camera, and gimbal frames are plotted, respectively, in Figure 20a–c.
The actual approaching speed of the UAV has tracked the setpoint changes accurately as shown in
Figure 20d. The trajectories of the intruding drone and the interceptor are plotted in a local NED frame
as shown in Figure 21.

Figure 19. Autonomously tracking a flying drone with the experimental platform. (a) the target drone
and the interceptor (the quadrotor platform); (b) the searching phase; (c) start the tracking phase when
the target is recognized; (d) keep following the target drone.

Figure 20. The attitude and approaching speed of the UAV while autonomously tracking a flying
drone. (a) roll angle measured by the camera IMU, encoder, and UAV navigation system; (b) pitch
angle measured by the camera IMU, encoder, and UAV navigation system; (c) yaw angle measured by
the camera IMU, encoder, and UAV navigation system; and (d) the desired and actual approaching
speed of the UAV.

Figure 21. The actual position (a–c) and 3D trajectory (d) of the UAV and the target drone in the local
NED frame.

A higher control rate and a shorter image transmission delay would significantly improve the response
speed and the accuracy if a better hardware configuration were used. However, considering that all
the sensors and onboard devices are low-cost, the performance of the proposed tracking system is
very attractive in practical applications. A video of the experiments is available in the Supplementary
Materials section (Video S1).

6. Conclusions
In the presented work, we proposed an onboard visual tracking system which consists of a
gimbaled camera, an onboard computer for image processing and a microcontroller to control the
UAV to approach the moving target. Our system used a KCF-based algorithm to detect and track an
arbitrary object in real time, which has proved its efficiency and reliability in experiments. With the
visual information, the 3-axis gimbal system autonomously aims at the selected target, which has
achieved good performance during real flights. The proposed system has been demonstrated through
real-time target tracking experiments, which enabled a low-cost quadrotor to chase a flying drone as
shown in the video.
Future work may include using a laser ranging module attached to the camera, which is able to
provide an accurate distance of the UAV with respect to the target. Even though this will increase
the cost of the system, we look forward to its potential applications such as target geo-location and
autonomous landing. Performance improvements could also be achieved by using deep learning-based
detection algorithms combined with a large number of sample images. The CMOS sensors used in
this paper are low-cost, which could lead to the effect of rolling shutter [69]. If the error is dramatic,
compensation should be made to handle this issue.

Supplementary Materials: The following are available online at http://www.mdpi.com/2076-3417/10/15/5064/s1,
Video S1: Gimbal-based Visual Tracking System for UAV.
Author Contributions: Conceptualization, X.L. and S.Z.; methodology, X.L. and Y.Y.; software, X.L. and C.M.;
validation, X.L. and Y.Y.; formal analysis, J.L.; investigation, C.M.; data curation, J.L.; funding acquisition, Y.Y. and
S.Z. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the Science and Technology Committee of China (02-ZT-005-021-01),
National Science Key Lab Project of the Technology of Space Intelligent Control (No. HTKJ2019KL5022016),
the Research Projects of Equipment (No. 2018-824), and the Support Program of Young Talents of Huxiang
(No. 2019RS2029), Chinese Postdoctoral Science Foundation (No. 47661), The sixth Youth Fund Project of High
Resolution Earth Observation System.
Acknowledgments: The authors would like to thank Rongwei Li, Qi Xiao, Xin Yi, and Ren Jin for their contribution
to the experiments.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Cai, G.; Dias, J.; Seneviratne, L. A Survey of Small-Scale Unmanned Aerial Vehicles: Recent Advances and
Future Development Trends. Unmanned Syst. 2014, 2, 175–199. [CrossRef]
2. Tomic, T.; Schmid, K.; Lutz, P.; Domel, A.; Kassecker, M.; Mair, E.; Grixa, I.L.; Ruess, F.; Suppa, M.; Burschka, D.
Toward a Fully Autonomous UAV: Research Platform for Indoor and Outdoor Urban Search and Rescue.
Robot. Autom. Mag. 2012, 19, 46–56. [CrossRef]
3. Judy, J.W. Microelectromechanical systems (MEMS): Fabrication, design and applications. Smart Mater.
Struct. 2001, 10, 1115–1134. [CrossRef]
4. Zhao, Z.; Yang, J.; Niu, Y.; Zhang, Y.; Shen, L. A Hierarchical Cooperative Mission Planning Mechanism for
Multiple Unmanned Aerial Vehicles. Electronics 2019, 8, 443. [CrossRef]
5. Laliberte, A.S.; Jeffrey, E.H.; Rango, A.; Winters, C. Acquisition, Orthorectification, and Object-based
Classification of Unmanned Aerial Vehicle (UAV) Imagery for Rangeland Monitoring. Photogramm. Eng.
Remote Sens. 2015, 76, 661–672. [CrossRef]
6. Hausamann, D.; Zirnig, W.; Schreier, G.; Strobl, P. Monitoring of gas pipelines - a civil UAV application.
Aircr. Eng. Aerosp. Technol. 2005, 77, 352–360. [CrossRef]
7. Bristeau, P.J.; Callou, F.; Vissière, D.; Petit, N. The Navigation and Control technology inside the AR.Drone
micro UAV. In Proceedings of the 18th World Congress of the International Federation of Automatic Control
(IFAC), Milano, Italy, 28 August–2 September 2011; pp. 1477–1484.
8. Tai, J.C.; Tseng, S.T.; Lin, C.P.; Song, K.T. Real-time image tracking for automatic traffic monitoring and
enforcement applications. Image Vis. Comput. 2004, 22, 485–501. [CrossRef]
9. Chow, J.Y.J. Dynamic UAV-based traffic monitoring under uncertainty as a stochastic arc-inventory routing
policy. Int. J. Transp. Sci. Technol. 2016, 5, 167–185. [CrossRef]
10. Silvagni, M.; Tonoli, A.; Zenerino, E.; Chiaberge, M. Multipurpose UAV for search and rescue operations in
mountain avalanche events. Geomat. Nat. Hazards Risk 2017, 8, 18–33. [CrossRef]
11. Bejiga, M.B.; Zeggada, A.; Nouffidj, A.; Melgani, F. A Convolutional Neural Network Approach for Assisting
Avalanche Search and Rescue Operations with UAV Imagery. Remote Sens. 2017, 9, 100. [CrossRef]
12. Li, Z.; Liu, Y.; Walker, R.; Hayward, R.; Zhang, J. Towards automatic power line detection for a UAV
surveillance system using pulse coupled neural filter and an improved Hough transform. Mach. Vis. Appl.
2010, 21, 677–686. [CrossRef]
13. Sa, I.; Hrabar, S.; Corke, P. Inspection of Pole-Like Structures Using a Visual-Inertial Aided VTOL Platform
with Shared Autonomy. Sensors 2015, 15, 22003–22048. [CrossRef] [PubMed]
14. Chuang, H.-M.; He, D.; Namiki, A. Autonomous Target Tracking of UAV Using High-Speed Visual Feedback.
Appl. Sci. 2019, 9, 4552. [CrossRef]
15. Wang, F.; Cong, X.-B.; Shi, C.-G.; Sellathurai, M. Target Tracking While Jamming by Airborne Radar for Low
Probability of Detection. Sensors 2018, 18, 2903. [CrossRef] [PubMed]
16. Wang, Y.; Lei, H.; Ye, J.; Bu, X. Backstepping Sliding Mode Control for Radar Seeker Servo System Considering
Guidance and Control System. Sensors 2018, 18, 2927. [CrossRef]
17. Liu, X.; Zhang, S.; Tian, J.; Liu, L. An Onboard Vision-Based System for Autonomous Landing of a Low-Cost
Quadrotor on a Novel Landing Pad. Sensors 2019, 19, 4703. [CrossRef]
18. Yang, S.; Scherer, S.A.; Schauwecker, K.; Zell, A. Autonomous Landing of MAVs on an Arbitrarily Textured
Landing Site Using Onboard Monocular Vision. J. Intell. Robot. Syst. 2014, 74, 27–43. [CrossRef]
19. Wenzel, K.E.; Rosset, P.; Zell, A. Low-cost visual tracking of a landing place and hovering flight control with
a microcontroller. J. Intell. Robot. Syst. 2010, 57, 297–311. [CrossRef]
20. Saripalli, S.; Montgomery, J.F.; Sukhatme, G.S. Visually guided landing of an unmanned aerial vehicle.
IEEE Trans. Robot. Autom. 2003, 19, 371–380. [CrossRef]
21. Yang, S.; Scherer, S.A.; Zell, A. An onboard monocular vision system for autonomous takeoff, hovering and
landing of a micro aerial vehicle. J. Intell. Robot. Syst. 2013, 69, 499–515. [CrossRef]
22. Olson, E. AprilTag: A robust and flexible visual fiducial system. In Proceedings of the IEEE International
Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011; pp. 3400–3407.
23. Garrido-Jurado, S.; Muñoz-Salinas, R.; Madrid-Cuevas, F.J.; Marín-Jiménez, M.J. Automatic generation
and detection of highly reliable fiducial markers under occlusion. Pattern Recognit. 2014, 47, 2280–2292.
[CrossRef]
24. Kyristsis, S.; Antonopoulos, A.; Chanialakis, T.; Stefanakis, E.; Linardos, C.; Tripolitsiotis, A.; Partsinevelos, P.
Towards Autonomous Modular UAV Missions: The Detection, Geo-Location and Landing Paradigm. Sensors
2016, 16, 1844. [CrossRef] [PubMed]
25. Pestana, J.; Sanchez-Lopez, J.L.; de la Puente, P.; Carrio, A.; Campoy, P. A Vision-based Quadrotor Swarm
for the participation in the 2013 International Micro Air Vehicle Competition. In Proceedings of the 2014
International Conference on Unmanned Aircraft Systems (ICUAS), Orlando, FL, USA, 27–30 May 2014;
pp. 617–622.
26. Ma, L.; Hovakimyan, N. Vision-based cyclic pursuit for cooperative target tracking. In Proceedings of the
2011 American Control Conference, San Francisco, CA, USA, 29 June–1 July 2011; pp. 4616–4621.
27. Ross, D.A.; Lim, J.; Lin, R.S.; Yang, M.H. Incremental Learning for Robust Visual Tracking. Int. J. Comput. Vis.
2008, 77, 125–141. [CrossRef]
28. Mei, X.; Ling, H. Robust visual tracking using ℓ1 minimization. In Proceedings of the IEEE International
Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 1436–1443.
29. Li, H.; Shen, C.; Shi, Q. Real-time visual tracking using compressive sensing. In Proceedings of the 24th IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June
2011; pp. 1305–1312.
30. Wang, S.; Lu, H.; Yang, F.; Yang, M.H. Superpixel tracking. In Proceedings of the IEEE International
Conference on Computer Vision (ICCV), Barcelona, Spain, 6–13 November 2011; pp. 1323–1330.
31. Cheng, Y. Mean Shift, Mode Seeking, and Clustering. IEEE Trans. Pattern Anal. Mach. Intell. 1995, 17,
790–799. [CrossRef]
32. Comaniciu, D.; Meer, P. Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Trans. Pattern
Anal. Mach. Intell. 2002, 24, 603–619. [CrossRef]
33. Collins, R.T. Mean-shift blob tracking through scale space. In Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (CVPR), Madison, WI, USA, 16–22 June 2003;
pp. 234–240.
34. Bradski, G.R. Computer Vision Face Tracking for Use in a Perceptual User Interface. Intel Technol. J. 1998, 2,
12–21.
35. Babenko, B.; Yang, M.H.; Belongie, S. Robust Object Tracking with Online Multiple Instance Learning. IEEE
Trans. Pattern Anal. Mach. Intell. 2011, 33, 1619–1632. [CrossRef]
36. Hare, S.; Golodetz, S.; Saffari, A.; Vineet, V.; Cheng, M.M.; Hicks, S.L.; Torr, P.H. Struck: Structured Output
Tracking with Kernels. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 2096–2109. [CrossRef]
37. Grabner, H.; Bischof, H. Online Boosting and Vision. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (CVPR), New York, NY, USA, 17–22 June 2006; pp. 260–267.
38. Kalal, Z.; Mikolajczyk, K.; Matas, J. Tracking-Learning-Detection. IEEE Trans. Softw. Eng. 2011, 34, 1409–1422.
[CrossRef]
39. Wei, J.; Liu, F. Coupled-Region Visual Tracking Formulation Based on a Discriminative Correlation Filter
Bank. Electronics 2018, 7, 244. [CrossRef]
40. Bolme, D.S.; Beveridge, J.R.; Draper, B.A.; Lui, Y.M. Visual object tracking using adaptive correlation filters.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
(CVPR), San Francisco, CA, USA, 13–18 June 2010; pp. 2544–2550.
41. Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. Exploiting the Circulant Structure of Tracking-by-Detection
with Kernels. In Proceedings of the 12th European conference on Computer Vision, Florence, Italy, 7–13
October 2012; Springer: Berlin, Germany, 2012; pp. 702–715.
42. Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. High-Speed Tracking with Kernelized Correlation Filters.
IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 583–596. [CrossRef] [PubMed]
43. Wu, Y.; Lim, J.; Yang, M.H. Online Object Tracking: A Benchmark. In Proceedings of the Computer Vision
and Pattern Recognition (CVPR), Portland, OR, USA, 23–28 June 2013; pp. 2411–2418.
44. Montero, A.S.; Lang, J.; Laganière, R. Scalable Kernel Correlation Filter with Sparse Feature Integration.
In Proceedings of the IEEE International Conference on Computer Vision Workshop, Santiago, Chile, 7–13
December 2015; pp. 587–594.
45. Zhang, Y.H.; Zeng, C.N.; Liang, H.; Luo, J.; Xu, F. A visual target tracking algorithm based on improved
Kernelized Correlation Filters. In Proceedings of the 2016 IEEE International Conference on Mechatronics and
Automation, Harbin, China, 7–10 August 2016; pp. 199–204.
46. Santos, D.A.; Gonçalves, P.F.S.M. Attitude Determination of Multirotor Aerial Vehicles Using Camera Vector
Measurements. J. Intell. Robot. Syst. 2016, 86, 1–11. [CrossRef]
47. Falanga, D.; Zanchettin, A.; Simovic, A.; Delmerico, J.; Scaramuzza, D. Vision-based Autonomous Quadrotor
Landing on a Moving Platform. In Proceedings of the 15th IEEE International Symposium on Safety, Security,
and Rescue Robotics (SSRR), Shanghai, China, 11–13 October 2017.
48. Borowczyk, A.; Nguyen, D.T.; Phu-Van Nguyen, A.; Nguyen, D.Q.; Saussié, D.; Le Ny, J. Autonomous
Landing of a Multirotor Micro Air Vehicle on a High Velocity Ground Vehicle. arXiv 2016, arXiv:1611.07329.
Available online: http://arxiv.org/abs/1611.07329 (accessed on 26 May 2020).
49. Quigley, M.; Goodrich, M.A.; Griffiths, S.; Eldredge, A.; Beard, R.W. Target Acquisition, Localization, and
Surveillance Using a Fixed-Wing Mini-UAV and Gimbaled Camera. In Proceedings of the 2005 IEEE
International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005; pp. 2600–2605.
50. Jakobsen, O.C.; Johnson, E. Control Architecture for a UAV-Mounted Pan/Tilt/Roll Camera Gimbal.
In Proceedings of the Infotech@Aerospace, AIAA 2005, Arlington, VA, USA, 26–29 September 2005;
p. 7145.
51. Whitacre, W.; Campbell, M.; Wheeler, M.; Stevenson, D. Flight Results from Tracking Ground Targets
Using SeaScan UAVs with Gimballing Cameras. In Proceedings of the 2007 American Control Conference,
New York, NY, USA, 11–13 July 2007; pp. 377–383.
52. Dobrokhodov, V.N.; Kaminer, I.I.; Jones, K.D.; Ghabcheloo, R. Vision-based tracking and motion estimation
for moving targets using small UAVs. In Proceedings of the 2006 American Control Conference, Minneapolis,
MN, USA, 14–16 June 2006.
53. Johansson, J. Modelling and Control of an Advanced Camera Gimbal. Teknik Och Teknologier, 2012. Available
online: http://www.diva-portal.org/smash/get/diva2:575341/FULLTEXT01.pdf (accessed on 26 May 2020).
54. Kazemy, A.; Siahi, M. Equations of Motion Extraction for a Three Axes Gimbal System. Modares J. Electr. Eng.
2014, 13, 37–43.
55. Barnes, F.N. Stable Member Equations of Motion for a Three-Axis Gyro Stabilized Platform. IEEE Trans.
Aerosp. Electron. Syst. 1971, 7, 830–842. [CrossRef]
56. Otlowski, D.R.; Wiener, K.; Rathbun, B.A. Mass properties factors in achieving stable imagery from a gimbal
mounted camera. In Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications 2008;
International Society for Optics and Photonics: Orlando, FL, USA, 2008; Volume 6946.
57. Rifkin, R.; Yeo, G.; Poggio, T. Regularized Least-Squares Classification. In Nato Science Series Sub Series III
Computer and Systems Sciences; IOS Press: Amsterdam, The Netherlands, 2003; Volume 190, pp. 131–154.
58. Islam, M.M.; Hu, G.; Liu, Q. Online Model Updating and Dynamic Learning Rate-Based Robust Object
Tracking. Sensors 2018, 18, 2046. [CrossRef]
59. Pestana, J.; Mellado-Bataller, I.; Sanchez-Lopez, J.L.; Fu, C.; Mondragon, I.F.; Campoy, P. A General Purpose
Configurable Controller for Indoors and Outdoors GPS-Denied Navigation for Multirotor Unmanned Aerial
Vehicles. J. Intell. Robot. Syst. 2014, 73, 387–400. [CrossRef]
60. Tan, R.; Kumar, M. Tracking of Ground Mobile Targets by Quadrotor Unmanned Aerial Vehicles.
Unmanned Syst. 2014, 2, 157–173. [CrossRef]
61. Guelman, M. The Closed-Form Solution of True Proportional Navigation. IEEE Trans. Aerosp. Electron. Syst.
1976, 12, 472–482. [CrossRef]
62. Yuan, P.J.; Chern, J.S. Ideal proportional navigation. Adv. Astronaut. Sci. 1992, 95, 81374. [CrossRef]
63. Mahapatra, P.R.; Shukla, U.S. Accurate solution of proportional navigation for maneuvering targets.
IEEE Trans. Aerosp. Electron. Syst. 1989, 25, 81–89. [CrossRef]
64. Park, S.; Deyst, J.; How, J. A new nonlinear guidance logic for trajectory tracking. In Proceedings of the
AIAA Guidance, Navigation, and Control Conference and Exhibit, Providence, RI, USA, 16–19 August 2004.
65. Mathisen, S.H.; Gryte, K.; Johansen, T.; Fossen, T.I. Non-linear Model Predictive Control for Longitudinal
and Lateral Guidance of a Small Fixed-Wing UAV in Precision Deep Stall Landing. In Proceedings of the
AIAA SciTech, San Diego, CA, USA, 4–8 January 2016.
66. Pixhawk. Available online: https://pixhawk.org/ (accessed on 20 May 2020).
67. Mavros. Available online: https://github.com/mavlink/mavros/ (accessed on 20 May 2020).
68. Ros. Available online: http://www.ros.org/ (accessed on 20 May 2020).
69. Mathisen, S.H.; Gryte, K.; Johansen, T.; Fossen, T.I. Cross-Correlation-Based Structural System Identification
Using Unmanned Aerial Vehicles. Sensors 2017, 17, 2075.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
