Generic Visual Perception


Generic visual perception processor

INTRODUCTION
While computing technology is growing in leaps and bounds, the human brain continues to be the world's fastest computer. Combine brain power with seeing power, and you have the fastest, cheapest, most extraordinary processor ever: the human eye. Little wonder that research labs the world over are striving to produce a near-perfect electronic eye. The "generic visual perception processor" (GVPP) has been developed after 10 long years of scientific effort. The GVPP can automatically detect objects and track their movement in real time. The GVPP, which crunches 20 billion instructions per second (BIPS), models the human perceptual process at the hardware level by mimicking the separate temporal and spatial functions of the eye-to-brain system. The processor sees its environment as a stream of histograms regarding the location and velocity of objects. GVPP has been demonstrated as capable of learning-in-place to solve a variety of pattern recognition problems. It boasts automatic normalization for varying object size, orientation and lighting conditions, and can function in daylight or darkness.

This electronic "eye" on a chip can now handle most tasks that a normal human eye can. That includes driving safely, selecting ripe fruits, reading and recognizing things. Sadly, though modeled on the visual perception capabilities of the human brain, the chip is not really a medical marvel, poised to cure the blind.

College of applied sciences, Mavelikkara


BACKGROUND OF THE INVENTION


The invention relates generally to methods and devices for automatic visual perception, and more particularly to methods and devices for processing image signals using two or more histogram calculation units to localize one or more objects in an image signal using one or more characteristics of an object, such as the shape, size and orientation of the object. Such devices can be termed electronic spatio-temporal neurons, and are particularly useful for image processing, but may also be used for other signals, such as audio signals. The techniques of the present invention are also particularly useful for tracking one or more objects in real time.

It is desirable to provide devices including combined data processing units of a similar nature, each addressing a particular parameter extracted from the video signal. In particular, it is desirable to provide devices including multiple units for calculating histograms, or electronic spatio-temporal neurons (STN), each processing a parameter DATA(A) by a function in order to generate an individual output value. The present invention also provides a method for perception of an object using characteristics such as its shape, its size or its orientation, using a device composed of a set of histogram calculation units.

Using the techniques of the present invention, a general outline of a moving object is determined with respect to a relatively stable background; then, inside this outline, elements that are characterized by their tone, color, relative position, etc. are determined.
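The idea of outlining a moving object against a stable background with histograms can be illustrated in a few lines. The following is only a minimal sketch, not the patented method: it assumes grayscale frames stored as nested lists, a fixed difference threshold, and a hypothetical function name `locate_moving_object`. It counts the changed pixels per column and per row and reads the bounding outline from the non-empty histogram bins.

```python
def locate_moving_object(background, frame, threshold=30):
    """Return (x_min, x_max, y_min, y_max) of pixels that differ from the
    background by more than the threshold, or None if nothing changed."""
    height = len(frame)
    width = len(frame[0])
    x_hist = [0] * width   # number of changed pixels in each column
    y_hist = [0] * height  # number of changed pixels in each row
    for y in range(height):
        for x in range(width):
            if abs(frame[y][x] - background[y][x]) > threshold:
                x_hist[x] += 1
                y_hist[y] += 1
    xs = [x for x, count in enumerate(x_hist) if count > 0]
    ys = [y for y, count in enumerate(y_hist) if count > 0]
    if not xs:
        return None
    return xs[0], xs[-1], ys[0], ys[-1]


# A flat background with a bright 3x2 object placed on top of it.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (2, 3, 4):
        frame[y][x] = 200

print(locate_moving_object(background, frame))  # (2, 4, 1, 2)
```

Once the outline is known, further tests on tone or color need only look at the pixels inside it, which is the two-stage approach the paragraph above describes.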

POTENTIAL SIGHTED
The GVPP was invented in 1992 by BEV founder Patrick Pirim. It would be relatively simple for a CMOS chip to implement in hardware the separate contributions of temporal and spatial processing in the brain. The brain-eye system uses layers of parallel-processing neurons that pass the signal through a series of preprocessing steps, resulting in real-time tracking of multiple moving objects within a visual scene. Pirim created a chip architecture that mimicked the work of the neurons, with the help of multiplexing and memory. The result is an inexpensive device that can autonomously "perceive" and then track up to eight user-specified objects in a video stream based on hue, luminance, saturation, spatial orientation, speed and direction of motion.

The GVPP tracks an "object," defined as a certain set of hue, luminance and saturation values in a specific shape, from frame to frame in a video stream by anticipating where its leading and trailing edges make "differences" with the background. That means it can track an object through varying light sources or changes in size, as when an object gets closer to the viewer or moves farther away.

The GVPP's major performance strength over current-day vision systems is its adaptation to varying light conditions. Today's vision systems dictate uniform, shadowless illumination, and even next-generation prototype systems, designed to work under "normal" lighting conditions, can be used only from dawn to dusk. The GVPP, on the other hand, adapts to real-time changes in lighting without recalibration, day or night.


For many decades the field of computing has been trapped by the limitations of traditional processors. Many futuristic technologies have been bound by these limitations, which stem from the basic architecture of the processors. Traditional processors work by slicing each complex program into simple tasks that a processor can execute. This requires that an algorithm exist for the solution of the particular problem. But there are many situations where no algorithm exists, or where a human is unable to understand the algorithm. Even in these extreme cases GVPP performs well. It can solve a problem with its neural learning function.

Neural networks are extremely fault tolerant. By their design, even if a group of neurons is damaged, the neural network only suffers a smooth degradation of performance. It won't abruptly fail to work. This is a crucial difference from traditional processors, which fail to work even if a few components are damaged. GVPP recognizes, stores, matches and processes patterns. Even if a pattern in the input is not recognizable to a human programmer, the neural network will dig it out from the input. Thus GVPP becomes an efficient tool for applications like pattern matching and recognition.

HOW IT WORKS


Basically the chip is made of a neural network modeled on the structure of the human brain. The basic element here is a neuron. A neuron has a large number of input lines and one output line. Each neuron is capable of implementing a simple function. It takes the weighted sum of its inputs and produces an output that is fed into the next layer. The weights assigned to each input are a variable quantity. A large number of such neurons interconnected form a neural network. Every input that is given to the neural network gets transmitted over the entire network via direct connections called synaptic connections and via feedback paths. Thus the signal ripples through the neural network, every time changing the weighted values associated with each input of every neuron. These ripples naturally direct the weights toward values that become stable, that is, values that no longer change. At this point the information about the signal is stored as the weighted values of inputs in the neural network.

A neural network geometrizes computation. When we draw the state diagram of a neural network, the network activity burrows a trajectory in this state space. The trajectory begins with a computation problem. The problem specifies initial conditions which define the beginning of the trajectory in the state space. In pattern learning, the pattern to be learned defines the initial conditions, whereas in pattern recognition, the pattern to be recognized defines the initial conditions. Most of the trajectory consists of transient behavior or computations. The weights associated with inputs gradually change to learn new pattern information. The trajectory ends when the system reaches equilibrium. This is the final state of the neural network. If the pattern was meant to be matched, the final neuronal state represents the pattern that is the closest match to the input pattern.

Input to the electronic eye can come from video, infrared or radar signals. A video signal is composed of a succession of frames, wherein each frame includes a succession of pixels whose assembly forms a space, for example an image for a two-dimensional space. Real-time outputs perceive, recognize and analyze both static images and time-varying patterns for specific objects, their heading, speed, shading and color differences. By mimicking the eye and the visual regions of the brain, the GVPP puts together the salient features necessary for recognition. So instead of capturing frames of pixels, the chip identifies objects of interest, determines each object's speed and direction, then follows them by tracking their color through the scene.

The chip emulates the eye, which has 5 million cones sensitive to color, only 15 per cent of which see "blue" (the rest are red and green), and 140 million monochromatic rods that are 35 times more sensitive than the cones. The chip mimics the human eye's two processing steps, tonic and phasic. Tonic processing auto-scales according to ambient light conditions, enabling it to adapt to a range of luminosity. Phasic processing determines movement by using local variables in feedback loops. As light's edges pass over the rods and cones of the eye, these local feedback loops detect contrast changes caused by objects moving through the scene. For detecting smooth contours, rather than sharp contrast changes, the eye adds ocular movement. The eye typically sweeps a scene about two to three times a second as well as making vibratory movements at about 100 Hz. The faster jitter accounts for the eye's sensitivity to the smallest detectable feature, which is an edge moving between adjacent rods or cones.

After all this processing the visual signal is sent to the brain for higher-level observation and recognition tasks. Because only detected movements and color, along with the shape and contour of objects, are sent to the brain, rather than raw pixels, the average compression ratio of the information is about 145. For the future, Pirim is working on a "visual" mouse for a hand-gesture interface to computers that takes advantage of that high compression ratio.
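The division of labour between tonic and phasic processing can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the chip's circuitry: tonic processing is approximated here by a simple min-max stretch of each frame, phasic processing by a thresholded frame-to-frame difference, and the function names `tonic_scale` and `phasic_changes` are hypothetical.

```python
def tonic_scale(frame, out_max=255):
    """Stretch a frame's luminance to the full output range (auto-scaling)."""
    flat = [p for row in frame for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in frame]
    return [[(p - lo) * out_max // (hi - lo) for p in row] for row in frame]


def phasic_changes(prev_frame, frame, threshold=32):
    """Mark pixels whose scaled luminance changed by more than the threshold."""
    return [[1 if abs(b - a) > threshold else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev_frame, frame)]


bright = [[100, 100, 200], [100, 100, 200]]
dim = [[p // 4 for p in row] for row in bright]  # same scene, a quarter of the light
moved = [[25, 50, 25], [25, 50, 25]]             # dim scene, edge moved one pixel left

# After tonic scaling the bright and dim versions of the scene are identical,
# so a lighting change alone produces no phasic response.
print(tonic_scale(bright) == tonic_scale(dim))                     # True
print(phasic_changes(tonic_scale(dim), tonic_scale(moved)))        # [[0, 1, 1], [0, 1, 1]]
```

Only the two columns the edge left and entered are flagged, which is the kind of sparse output that makes the high compression ratio quoted above plausible.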

THE PROCESSING DETAILS


The electronic eye follows exactly the same theoretical processing steps as the real eye, with hard-wired silicon circuitry around each pixel in its sensor array. A sensor array is a set of several sensors that an information-gathering device uses to gather information (usually directional in nature) that cannot be gathered from a single source for a central processing unit. Each pixel is read by the vision chip with hardware that determines and scales luminance, tracks color, remembers movement in the previous moment, recalls the direction of previous movement, and then deduces the speed of the various detected objects from parallel phasic and tonic neural circuitry. Basically, each parameter has an associated neuron that handles its processing tasks in parallel. In addition, each pixel has two auxiliary neurons that define the zone in which an object is located: from the direction in which an object is moving, these neurons deduce the leading and trailing edges of the object and mark them with registers associated with the first (leading-edge) and last (trailing-edge) pixel belonging to the object. Each of these silicon neurons is built with RAM, a few registers, an adder and a comparator.

Supplied as a 100-pin module, the chip accommodates analogue-input line levels for video input, with a programmable-gain input amplifier auto-scaling the signal. The modules measure 40 square mm and can handle 20-MHz video signals. The chip is priced at $960. A card with a socketed GVPP and 64 Kbytes of Flash RAM comes at $1,500.

DIVIDE AND CONQUER


Since processing in each module on the GVPP runs in parallel out of its own memory space, multiple GVPP chips can be cascaded to expand the number of objects that can be recognized and tracked. When set in master-slave mode, any number of GVPP chips can divide and conquer, for instance, complex stereoscopic vision applications.

SOFTWARE ASPECTS
On the software side, a host operating system running on an external PC communicates with the GVPP's evaluation board via an OS kernel within the on-chip microprocessor. BEV dubs the neural-learning capability of its development environment "programming by seeing and doing," because of its ease of use. The engineer needs no knowledge of the internal workings of the GVPP, the company said, only application-specific domain knowledge. "Programming the GVPP is as simple as setting a few registers, and then testing the results to gauge the application's success," said Steve Rowe, BEV's director of research and development. "Once debugged, these tiny application programs are loaded directly into the GVPP's internal ROM." Application programs themselves can use C++, which makes calls to a library of assembly language algorithms for visual perception and tracking of objects. The system's modular approach permits the developer to create a hierarchy of application building blocks that simplify problems with inheritable software characteristics.

Based on the neural network learning functions, the chip allows application software to be developed quickly through a combination of simple operational commands and immediate feedback. Simple applications such as detecting and tracking a single moving object can be programmed in less than a day. Even complex applications such as detecting a driver falling asleep can be programmed in a month.

The ability of a neural network to learn from data is perhaps its greatest ability. This learning can be described as configuring the network such that the application of a set of inputs produces the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the synaptic connections explicitly, using a priori knowledge. Another way is to "train" the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule.
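Training by teaching patterns can be sketched with a single weighted-sum neuron. This is a generic textbook illustration, not the GVPP's own learning function, which the source does not specify: the learning rule used here is the classic perceptron rule, and the names `neuron_output` and `train` are hypothetical. Weights change only while the neuron still makes mistakes, so training stops once the weights are stable.

```python
def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs followed by a hard threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(patterns, learning_rate=1, max_epochs=100):
    """Adjust weights with the perceptron rule until no pattern is misclassified."""
    weights = [0] * len(patterns[0][0])
    bias = 0
    for _ in range(max_epochs):
        changed = False
        for inputs, target in patterns:
            error = target - neuron_output(weights, bias, inputs)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
                changed = True
        if not changed:  # equilibrium: a full pass with no weight change
            break
    return weights, bias


# Teaching patterns for a logical AND of two inputs.
patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(patterns)
print([neuron_output(weights, bias, inputs) for inputs, _ in patterns])  # [0, 0, 0, 1]
```

Setting `weights` and `bias` by hand instead of calling `train` would correspond to the first method mentioned above, fixing the connections from a priori knowledge.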

RECOGNITION TASK
In applications, each pixel may be described with respect to any of the six domains of information available to it: hue, luminance, saturation, speed, direction of motion and spatial orientation. The GVPP further subcategorizes pixels by ranges, for instance luminance between 10 percent and 65 percent, hue of blue, saturation between 20 and 25 percent, and moving upward in the scene. A set of second-level pattern recognition commands permits the GVPP to search for different objects in different parts of the scene, for instance, to look for a closed eyelid only within the rectangle bordered by the corners of the eye. Since some applications may also require multiple levels of recognition, the GVPP has software hooks to pass along the recognition task from level to level.

For instance, to detect when a driver is falling asleep (a capability that could find use in California, which is about to mandate that cars sound an "alarm" when drowsy drivers begin to nod off), the GVPP is first programmed to detect the driver's head, for which it creates histograms of head movement. The microprocessor reads these histograms to identify the area for the eye. Then the recognition task passes to the next level, which searches only within the eye-area rectangles. High-speed movement there, normally indicative of blinking, is discounted, but when blinks become slower than a predetermined level, they are interpreted as the driver nodding off, and trigger an alarm.
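The range-based subcategorization and the restricted second-level search can be sketched as follows. This is a hedged illustration only: the `Pixel` record, the coarse string labels for hue and direction, and the function names are assumptions made for the example, and the ranges are the ones quoted in the text above.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    x: int
    y: int
    hue: str           # coarse hue label, e.g. "blue" (assumed encoding)
    luminance: float   # percent
    saturation: float  # percent
    direction: str     # coarse direction label, e.g. "up" (assumed encoding)


def matches(pixel):
    """Ranges from the text: luminance 10-65 %, blue hue, saturation 20-25 %, moving up."""
    return (10 <= pixel.luminance <= 65 and pixel.hue == "blue"
            and 20 <= pixel.saturation <= 25 and pixel.direction == "up")


def histogram_in_region(pixels, region):
    """Count matching pixels per column, looking only inside region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    counts = {}
    for p in pixels:
        if x0 <= p.x <= x1 and y0 <= p.y <= y1 and matches(p):
            counts[p.x] = counts.get(p.x, 0) + 1
    return counts


pixels = [
    Pixel(3, 4, "blue", 40, 22, "up"),
    Pixel(3, 5, "blue", 40, 22, "up"),
    Pixel(4, 4, "blue", 80, 22, "up"),  # too bright, rejected
    Pixel(9, 9, "blue", 40, 22, "up"),  # outside the search rectangle
]
print(histogram_in_region(pixels, (0, 0, 5, 5)))  # {3: 2}
```

Passing a smaller rectangle derived from a first-level result to `histogram_in_region` mirrors the hand-off from head detection to the eye-area search.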

GVPP ARCHITECTURE
The chip houses 23 neural blocks, both temporal and spatial, each consisting of 20 hardware input and output "synaptic" connections. The GVPP multiplexes this neural hardware with off-chip scratchpad memory to simulate as many as 100,000 synaptic connections per neuron. Each of these synapses can be changed through the on-chip microprocessor for a combined processing total of over 6.2 billion synaptic connections/second. In executing up to 20 BIPS to analyze successive frames of a video stream, the temporal neurons identify pixels that have changed over time and generate a 3-bit value indicative of the magnitude of that change. The spatial-processing system analyzes the resulting "difference" histogram to calculate the speed and direction of the motion.

A histogram is a bar chart of the count of pixels of every tone of gray that occurs in the image. It helps us analyze and, more importantly, correct the contrast of the image. Technically, the histogram maps luminance, which is defined from the way the human eye perceives the brightness of different colors. For example, our eyes are most sensitive to green; we see green as being brighter than we see blue. Luminance weighs the effect of this to indicate the actual perceived brightness of the image pixels due to the color components.
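A luminance histogram of the kind described above can be computed in a few lines. The sketch assumes 8-bit RGB pixels and the Rec. 601 luma weights (0.299, 0.587, 0.114); the source does not say which weighting the GVPP uses, so the coefficients and the function names are assumptions for illustration.

```python
def luminance(r, g, b):
    """Perceived brightness of an 8-bit RGB pixel using Rec. 601 weights (assumed)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def luminance_histogram(pixels):
    """Count how many pixels fall on each of the 256 gray levels."""
    hist = [0] * 256
    for r, g, b in pixels:
        hist[luminance(r, g, b)] += 1
    return hist


pure_green = (0, 255, 0)
pure_blue = (0, 0, 255)
print(luminance(*pure_green), luminance(*pure_blue))  # 150 29

hist = luminance_histogram([pure_green, pure_green, pure_blue])
print(hist[150], hist[29])  # 2 1
```

The large gap between the green and blue values reflects the point made in the text: equal signal levels of green and blue are not perceived as equally bright.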

Fig 9.1 Representation of GVPP

MULTIPLE PERCEPTIONS
The vision perception processor chip processes motion images in real time, and is able to perceive and track objects based on combinations of hue, luminance, speed, direction of motion and spatial orientation. The chip has three functions: temporal processing, spatial processing and histogram processing. In applications, the vision processor, called the generic visual perception processor (GVPP), is used with a microprocessor, flash memory and peripherals.

In the case of detecting a driver falling asleep, the processor core detects movement of the eyelids. First, the driver is identified, and then the microprocessor directs the vision processor to search within the corner points of a rectangular area in which the nose of the driver would be expected to be located, based on a model, and to select pixels having characteristics of the shadows of nostrils. The results are calculated to the end of the image frame being analyzed, and the microprocessor analyzes the resulting histograms to determine characteristics indicative of nostrils, such as the spacing and shape of the nostrils. Then, the microprocessor directs the vision processor to isolate a rectangular area in which the eyes of the driver are expected to be located, based on the model, and to select pixels having characteristics of high-speed movement normally indicative of blinking. The microprocessor analyzes over time the histograms taken of the eye area to determine the duration of each blink and the interval between blinks. From the blink duration and interval information, the microprocessor is able to determine when the driver is falling asleep and triggers an alarm accordingly.

The highly parallel internal design of the vision processor allows it to perform over 20 billion instructions per second (BIPS), and the three internal functions play the following roles. Temporal processing involves the processing of successive frames of an image to smooth the pixels of the image over time to prevent interference affecting object detection. Spatial processing involves the processing of pixels within a localized area of the images to determine the speed and direction of movement of each pixel. Receiving the information about speed and direction of each pixel from the spatial and temporal processors, the histogram processor perceives objects in the images that have undergone significant change over time and the magnitude of the changes.

The chip is able to detect and track eight objects per frame, meaning 240 objects per second in 30 frames/s systems such as TV or high-definition TV (HDTV). The vision processor includes a total of 23 spatio-temporal neural blocks, each having approximately 20 different input and output connections. Each connection can be controlled through the on-chip microprocessor. The vision processor provides a mechanism to reconfigure the spatio-temporal neural blocks to perform a wide variety of vision processing, such as moving object detection.
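The microprocessor's final decision step, turning blink durations into an alarm, can be sketched as below. The sketch assumes the vision processor has already reduced each frame to a closed/open flag for the eye area; the 0.5 second limit, the 30 frames/s rate and the function names are hypothetical values chosen for illustration, not figures from the source.

```python
def blink_durations(eye_closed, frame_rate=30):
    """Convert a per-frame closed (1) / open (0) sequence into blink durations in seconds."""
    durations = []
    run = 0  # length of the current run of closed frames
    for closed in eye_closed:
        if closed:
            run += 1
        elif run:
            durations.append(run / frame_rate)
            run = 0
    if run:  # sequence ended while the eye was still closed
        durations.append(run / frame_rate)
    return durations


def is_drowsy(eye_closed, frame_rate=30, max_blink_seconds=0.5):
    """Flag drowsiness when any blink lasts longer than the assumed limit."""
    return any(d > max_blink_seconds
               for d in blink_durations(eye_closed, frame_rate))


alert = [0] * 10 + [1] * 3 + [0] * 10   # one quick 3-frame blink
sleepy = [0] * 5 + [1] * 30 + [0] * 5   # eyes closed for a full second

print(blink_durations(alert), is_drowsy(alert))    # [0.1] False
print(blink_durations(sleepy), is_drowsy(sleepy))  # [1.0] True
```

A fuller version would also use the interval between blinks, which the text lists as the second input to the decision.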

Fig 10.1 Representation of histogram calculation unit


ADVANTAGES AND DISADVANTAGES


ADVANTAGES
1. Imitating the human eye's neural networks and the brain, the GVPP can handle some 20 billion instructions per second, compared to the mere millions handled by Pentium-class processors of today.
2. GVPP has been demonstrated as capable of learning-in-place to solve a variety of pattern recognition problems.
3. It has automatic normalization for varying object size, orientation and lighting conditions, and can function in daylight or darkness.
4. It is an inexpensive device that can autonomously "perceive" and then track up to eight user-specified objects in a video stream.
5. Since processing in each module on the GVPP runs in parallel out of its own memory space, multiple GVPP chips can be cascaded to expand the number of objects that can be recognized and tracked.
6. The engineer needs no knowledge of the internal workings of the GVPP, the company said, only application-specific domain knowledge.
7. Simple applications can be quickly prototyped in a few days, with medium-size applications taking a few weeks and even big applications only a couple of months.
8. The chip could be useful across a wide variety of industries where visual tracking is important.

The GVPP's major performance strength over current-day vision systems is its automatic adaptation to varying lighting conditions. Today's vision systems dictate uniform, shadowless illumination, and even next-generation prototype systems, designed to work under "normal" lighting conditions, can be used only from dawn to dusk. The GVPP, on the other hand, adapts to real-time changes in lighting without recalibration, day or night.

DISADVANTAGES
Sadly, though modeled on the visual perception capabilities of the human brain, the chip is not really a medical marvel, poised to cure the blind. The chip is an industrial, rather than medical, invention and is not aimed at conquering human blindness. But with dozens of potential applications in a myriad of industries, its inventors are confident that the GVPP will quickly blossom into a multibillion-dollar business.


APPLICATIONS
BEV lists possible applications for the GVPP in process monitoring, quality control and assembly; automotive systems such as intelligent air bags that monitor passenger size and traffic congestion monitors; pedestrian detection, license plate recognition, electronic toll collection, automatic parking management, automatic inspection; and medical uses including disease identification. The chip could also prove useful in unmanned air vehicles, miniature smart weapons, ground reconnaissance and other military applications, as well as in security access using facial, iris, fingerprint, or height and gait identification.

1. Automotive industry
Here, GVPP-type devices could keep "watch" and trigger an alarm if the driver were to nod off to sleep at the wheel, or act as a monitor for the car's progress to make sure it does not get too close to the curb. Also in transportation, GVPP could be used in developing systems for collision avoidance, automatic cruise control, smart air bag systems, license plate recognition, measurement of traffic flow, electronic toll collection, automatic cargo tracking, parking management and the inspection of cracks in rails and tunnels. Another automotive application is warning erratic drivers. GVPP does so by monitoring the left and right lanes of the road, signaling the driver when the vehicle deviates from the prescribed lane.

2. Robotics
In manufacturing, GVPP has applications in robotics, particularly for dirty and dangerous jobs such as feeding hot parts to forging presses, cleaning up hazardous waste, and spraying toxic coatings on aircraft parts.

3. Agriculture and fisheries
In agriculture and fisheries engineering, GVPP could help with tasks that traditionally have been labor intensive, such as disease and parasite identification, harvest control, ripeness detection and yield identification.

4. Military applications
The chip can be used in other important pattern-recognition applications, such as military target acquisition and fire control. Military applications include unmanned air vehicles, automatic target detection, trajectory correction, ground reconnaissance and surveillance. Vision systems are useful in home security as well, and GVPP-based systems could be inexpensively developed to detect intruders and fires. In addition to the above applications, GVPP should be able to work in medical scanners, blood analyzers, cardiac monitoring, bank check processing, bar code reading, seal and signature verification, trademark database indexing, construction of virtual reality environment models, human motion analysis, expression understanding, cloud identification, and many other fields.


PRELIMINARY INFORMATION OF GVPP-7B Generic Visual Perception Processor

Array Format (max): 800H x 600V
Frame rate: 0-100 VGA frames per second, progressive-scan
Interface Mode: Master/Slave
Data Rate (max): 40 Megapixel per second
Dynamic Range: 10 bits, 3 channels
Parameters: Luminance, Hue, Saturation, Motion (orientation, velocity), Oriented edges, lines, curves, corners
Computation: 64 STN blocks
Multi-scales possibilities (optional)
Multi-chips connections capabilities
Semi-Graphic interface visualization
GUI with mouse
PAL, NTSC, VGA visualization
Internal OS
C language programming
PCI, PS2, I2C, RS232, PWM
0.3 Watt, 3.3 Volts for PAL/NTSC format
-40, +85 C
Complete application in one SiP 1"x1": C-MOS Imager, GVPP, 10 Mbits VRAM, 1 Mbits Serial Flash Memory, 27 MHz Crystal.
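The listed figures can be cross-checked with simple arithmetic. The sketch below is a consistency check, not part of the datasheet: it assumes a maximum data rate of 40 megapixels per second, a maximum array of 800 x 600 pixels, and the standard 640 x 480 size for a VGA frame; the function name `max_frame_rate` is hypothetical.

```python
MAX_PIXEL_RATE = 40_000_000  # assumed maximum data rate, pixels per second


def max_frame_rate(width, height, pixel_rate=MAX_PIXEL_RATE):
    """Highest frame rate a given resolution allows at the maximum pixel rate."""
    return pixel_rate / (width * height)


# Full 800 x 600 array: about 83 frames per second at most.
print(round(max_frame_rate(800, 600), 1))  # 83.3

# 100 VGA (640 x 480) frames per second needs about 30.7 Mpixel/s,
# which fits within the 40 Mpixel/s limit.
print(640 * 480 * 100 <= MAX_PIXEL_RATE)   # True
```

Under these assumptions the 100 frames per second figure is reachable at VGA size but not at the full array size, which may be why the frame rate is quoted in VGA frames.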


FUTURE SCOPE
For the future, scientists are working on a "visual mouse" for a hand-gesture interface to computers that takes advantage of the chip's high compression ratio. Future studies also involve using this processor as the eye of robots, which opens up a wide range of applications.

CONCLUSION

Imitating the human eye's neural networks and the brain, the generic visual perception processor can handle about 20 billion instructions per second, and can manage most tasks performed by the eye. Modeled on the visual perception capabilities of the human brain, the GVPP is a single chip that can detect objects in a motion video signal and then locate and track them in real time far more dependably than competing systems, which cost far more, according to company scientists. The company describes it as a generic chip and says it has already identified more than 100 applications in ten or more industries. The chip could be useful across a wide swathe of industries where visual tracking is important.


