Semantic Video Mining For Accident Detection
ISSN No:-2456-2165
Abstract:- This paper describes the efficient use of CCTV for traffic monitoring and accident detection. The proposed system can classify accidents and give alerts when necessary. Nowadays we have CCTVs on most of the roads, but their capabilities are underused, and no efficient system exists to detect and classify accidents in real time; many deaths occur because of undetected accidents, which are especially hard to detect in remote places and at night. The proposed system can identify accidents, classify them as major or minor, and automatically alert the authorities when it detects a major accident. Using this system, the response time to an accident can be decreased by processing the CCTV visuals.

In this system different image processing and machine learning techniques are used. The training dataset is extracted from footage of accidents that have already occurred. Accidents mainly occur because of careless driving, alcohol consumption and over-speeding, and another major cause of death is the delay in reporting accidents, since no automated reporting system exists: accidents are mainly reported by the public or by the traffic authorities. Many lives can be saved by detecting and reporting accidents quickly. In this system, live video captured from the CCTVs is processed to detect accidents, with the YOLOv3 algorithm used for object detection. Traffic monitoring now has great significance: CCTVs are present on most roads and can be used to detect accidents, yet today they are used only for traffic monitoring. Accidents can normally be classified into two classes, major and minor, and not every accident needs emergency support; only major accidents must be handled quickly. The proposed system captures the video and applies object detection algorithms to identify the different objects, such as vehicles and people. After the detection phase the system extracts the features of the vehicles. Features such as length, width and centroid are extracted to classify each vehicle accordingly. The vehicle count is also detected, which can be used for traffic congestion control.

Keywords:- YOLOv3, SSD, Faster R-CNN, R-CNN.

I. INTRODUCTION

The population is increasing day by day, and along with it the number of vehicles is also increasing. It is known that the present traffic management system is not efficient: millions of people die in road accidents every year. This is not only because of the increase in the number of vehicles; there is also no proper system to detect accidents and alert the authorities. The long response time before emergency services arrive costs many precious lives. Normally road accidents are reported by the people near the accident, but in many cases those who witness an accident are not willing to alert the authorities and are instead busy taking selfies. This kind of negligence costs precious lives. We also have CCTVs installed on most of the roads, but they are not used efficiently. In a modern era of fast-growing technology, we still depend on human effort for traffic monitoring. Since the number of traffic authorities is low and the number of vehicle users is high, it is difficult to monitor them all, and many people lose their lives because of undetected accidents. Monitoring vehicles at all times is difficult for humans, but it is easy and feasible using CCTVs. The proposed system uses CCTV for traffic monitoring and accident detection with little human intervention.

The system captures live video from CCTV and processes it to detect accidents in real time. Surveillance cameras are installed on most of the roads, mounted on poles that give a clear view of the vehicles on the road. The present system uses these visuals to monitor and control the traffic manually.
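The detection-then-feature-extraction step described above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the `Detection` record and the function names are assumptions, and detections are taken to be axis-aligned pixel boxes produced by the object detector.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str   # detector class, e.g. "car", "truck", "person"
    x: float     # top-left corner of the bounding box, pixels
    y: float
    w: float     # box width (a proxy for vehicle length in a side view)
    h: float     # box height

def extract_features(det: Detection) -> dict:
    """Geometric features used to classify the vehicle: length, width, centroid."""
    return {
        "length": det.w,
        "width": det.h,
        "centroid": (det.x + det.w / 2.0, det.y + det.h / 2.0),
    }

def vehicle_count(detections, vehicle_labels=("car", "bus", "truck", "bike")):
    """Count vehicle-class detections in a frame, for congestion monitoring."""
    return sum(1 for d in detections if d.label in vehicle_labels)
```

Each frame's detections would come from the YOLOv3 pass; the extracted centroids are what later stages track across frames.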
C. Feature Extraction Module
In the feature extraction module the required features are extracted from the full feature set. Information such as the overlap between vehicles, stopping, velocity, the differential motion vector of the vehicles, and the direction of each vehicle is mainly considered. If more than one vehicle is detected in a frame, single-object features are not enough; in that case each pair of objects is considered, the factors mentioned above are examined for the pair, and the probability of an accident is calculated.

Table 1:- Comparison of Yolov3 With Other State-Of-The-Art-Models [35]

YOLOv3 predicts objects faster than SSD, but SSD stands a bit higher in terms of accuracy. Compared with the other algorithms, YOLOv3 ranks higher in terms of speed. A comparison of the performance of the different algorithms on the COCO dataset is shown in Table 1.
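The pairwise factors used in the feature extraction module (overlap and the differential motion vector) reduce to simple geometry on the tracked boxes and centroids. A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) tuples and centroids are tracked per frame; the function names are illustrative, and the paper does not specify its exact formulas:

```python
def iou(a, b):
    """Overlap between two axis-aligned boxes (x1, y1, x2, y2),
    as intersection-over-union in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def differential_motion(prev_a, cur_a, prev_b, cur_b):
    """Difference of the frame-to-frame motion vectors of two tracked
    centroids; an abrupt change can indicate a collision."""
    va = (cur_a[0] - prev_a[0], cur_a[1] - prev_a[1])
    vb = (cur_b[0] - prev_b[0], cur_b[1] - prev_b[1])
    return (va[0] - vb[0], va[1] - vb[1])
```

In the module these quantities would be computed for every pair of vehicles in a frame and combined into an accident probability.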
Faster R-CNN is a network in which object detection is faster; it provides a solution to the region proposal problem. Compared to the earlier models its computation time is very low, and the image is processed at a resolution lower than that of the original input image.

YOLO (You Only Look Once) is another object detection approach. What objects are in an image, and where they are present, can be detected with only one look at the image. Instead of classification YOLO uses regression, and it predicts the bounding boxes and class probabilities for every part of the image within a single analysis; only a single network is used for this process. A single CNN predicts multiple bounding boxes, and confidence weights are given for these bounding boxes.

YOLOv3 is different in that it uses logistic regression to score the object within each bounding box. If a bounding box overlaps the ground truth more than any other bounding box, its object score must be one. If a bounding box overlaps the ground truth by more than a threshold but is not the best match, that prediction is disregarded.

If the user does not cancel the notification, the user's location coordinates and the timestamp of the crash are sent to the server. The server accepts these coordinates, assuming the user is unconscious. The server can be in any remote location and provide the appropriate service, but it must be available at all times. Since the amount of data sent and received at any given time is very small, no additional cost should be required to ensure this availability.

The server contains a database of the IP addresses of all hospitals, together with a mechanism to identify the nearest hospital based on the coordinates obtained. After identifying the nearest hospital, the server notifies that hospital of the user's coordinates (location).

The hospital receives the notification from the server about the user's location and uses a graphical user interface to display the location coordinates; hospital operators can easily plot the coordinates on a map. This way the victim can be provided with emergency medical services in a short time, reducing both the response time and the mortality rate.

IV. EXPERIMENTAL EVALUATION
We propose a model that detects accidents from video footage and informs the authorities about the accident. Here we extract the accident images from the CCTV footage. The extraction of images falls under the field of computer vision, along with image processing. Object detection is mainly used to locate an object based on its size and coordinates and to categorize it; through object detection we obtain two-dimensional images that provide more details about space, size, orientation, etc.

A. CNN vs YOLO
Both Faster R-CNN and YOLO have a convolutional neural network at their core. YOLO partitions the image into grids before using the CNN to process it, whereas Faster R-CNN keeps the whole image as such, and the division into proposals takes place later.
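The grid partitioning that separates YOLO from the R-CNN family can be made concrete. As a hedged sketch (the 7x7 grid of the original YOLO is assumed; `grid_cell` is not a function from the paper), each object's centre is mapped to the one grid cell responsible for predicting it:

```python
def grid_cell(cx, cy, img_w, img_h, s=7):
    """Map an object centre (cx, cy) in an img_w x img_h image to the
    (row, col) of the s x s grid cell that predicts it, YOLO-style."""
    col = min(int(cx / img_w * s), s - 1)  # clamp centres on the far edge
    row = min(int(cy / img_h * s), s - 1)
    return row, col
```

Faster R-CNN, by contrast, runs the CNN over the whole image first and defers the spatial division to its region proposals.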