Conventional face hallucination methods heavily rely on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts for large upscaling factors. However, due to the diverse range of poses and different facial expressions, aligning an LR input image, in particular when it is tiny, is severely difficult. In addition, when the resolutions of LR input images vary, previous deep neural network based face hallucination methods require the interocular distances of input face images to be similar to the ones in the training datasets. Downsampling LR input faces to a required resolution will lose high-frequency information of the original input images. This may lead to suboptimal super-resolution performance for state-of-the-art face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network (MTDN) devised for super-resolving unaligned and very small face images of different resolutions, ranging from 16×16 to 32×32 pixels, in a unified framework. Our proposed network embeds spatial transformation layers to allow local receptive fields to line up with similar spatial supports, thus obtaining a better mapping between LR and HR facial patterns. Furthermore, we incorporate a class-specific loss designed to classify upright realistic faces into our objective through a successive discriminative network to improve the alignment and upsampling performance.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019
Given a tiny face image, existing face hallucination methods aim at super-resolving its high-resolution (HR) counterpart by learning a mapping from an exemplary dataset. Since a low-resolution (LR) input patch may correspond to many HR candidate patches, this ambiguity may lead to distorted HR facial details and wrong attributes such as gender reversal and rejuvenation. An LR input contains low-frequency facial components of its HR version while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details. We demonstrate that supplementing residual images or feature maps with additional facial attribute information can significantly reduce the ambiguity in face super-resolution. To explore this idea, we develop an attribute-embedded upsampling network, which consists of an upsampling network and a discriminative network. The upsampling network is composed of an autoencoder with skip-connections, which incorporates facial attribute vectors into the residual features of LR inputs at the bottleneck of the autoencoder, and deconvolutional layers used for upsampling. The discriminative network is designed to examine whether super-resolved faces contain the desired attributes or not and then its loss is used for updating the upsampling network. In this manner, we can super-resolve tiny (16×16 pixels) unaligned face images with a large upscaling factor of 8× while reducing the uncertainty of one-to-many mappings remarkably. By conducting extensive evaluations on a large-scale dataset, we demonstrate that our method achieves superior face hallucination results and outperforms the state-of-the-art.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013
We introduce a novel implicit representation for 2D and 3D shapes based on Support Vector Machine (SVM) theory. Each shape is represented by an analytic decision function obtained by training an SVM with a Radial Basis Function (RBF) kernel so that the interior shape points are given higher values. This empowers support vector shape (SVS) with multifold advantages. First, the representation uses a sparse subset of feature points determined by the support vectors, which significantly improves the discriminative power against noise, fragmentation, and other artifacts that often come with the data. Second, the use of the RBF kernel provides scale, rotation, and translation invariant features, and allows any shape to be represented accurately regardless of its complexity. Finally, the decision function can be used to select reliable feature points. These features are described using gradients computed from highly consistent decision functions instead of from conventional edges. Our experiments demonstrate promising results.
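The decision-function idea above can be made concrete with a minimal sketch: evaluating f(x) = Σᵢ αᵢ K(x, xᵢ) + b over 2D points, where positive values indicate the shape interior. The support vectors, coefficients, and bias below are toy values for illustration, not trained parameters.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel between two 2D points."""
    d2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    return math.exp(-gamma * d2)

def decision_function(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b.
    Positive values indicate points inside the shape."""
    return sum(a * rbf_kernel(x, sv, gamma)
               for a, sv in zip(dual_coefs, support_vectors)) + bias

# Toy example: two support vectors marking the shape interior.
svs = [(0.0, 0.0), (1.0, 0.0)]
alphas = [1.0, 1.0]
b = -0.2
inside = decision_function((0.5, 0.0), svs, alphas, b)   # near the shape: positive
outside = decision_function((5.0, 5.0), svs, alphas, b)  # far away: negative
```

In practice the coefficients and bias would come from SVM training; the sketch only shows why the analytic function yields a smooth, sparse implicit representation.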
2016 23rd International Conference on Pattern Recognition (ICPR), 2016
A comprehensive framework for detection and characterization of overlapping intrinsic symmetry over 3D shapes is proposed. To identify prominent symmetric regions which overlap in space and vary in form, the proposed framework is decoupled into a Correspondence Space Voting procedure followed by a Transformation Space Mapping procedure. In the correspondence space voting procedure, significant symmetries are first detected by identifying surface point pairs on the input shape that exhibit local similarity in terms of their intrinsic geometry while simultaneously maintaining an intrinsic distance structure at a global level. Since different point pairs can share a common point, the detected symmetric shape regions can potentially overlap. To this end, a global intrinsic distance-based voting technique is employed to ensure the inclusion of only those point pairs that exhibit significant symmetry. In the transformation space mapping procedure, the Functional Map framework is employed to generate the final map of symmetries between point pairs. The transformation space mapping procedure ensures the retrieval of the underlying dense correspondence map throughout the 3D shape that follows a particular symmetry. Additionally, the formulation of a novel cost matrix enables the inner product to successfully indicate the complexity of the underlying symmetry transformation. The proposed transformation space mapping procedure is shown to result in the formulation of a semi-metric symmetry space where each point in the space represents a specific symmetry transformation and the distance between points represents the complexity between the corresponding transformations. Experimental results show that the proposed framework can successfully process complex 3D shapes that possess rich symmetries.
2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2009
In this paper, we introduce a novel technique called Geometric Sequence (GS) imaging, specifically for the purpose of low-power and lightweight tracking in human computer interface design. The imaging sensor is programmed to capture the scene with a train of packets, where each packet constitutes a few images. The delay or the baseline associated with consecutive image pairs in a packet follows a fixed ratio, as in a geometric sequence. The image pair with shorter baseline or delay captures fast motion, while the image pair with larger baseline or delay captures slow motion. Given an image packet, the motion confidence maps computed from the slow and the fast image pairs are fused into a single map. Next, we use a Bayesian update scheme to compute the motion hypotheses probability map, given the information of prior packets. We estimate the motion from this probability map. The GS imaging system reliably tracks slow movements as well as fast movements, a feature that is important in realizing applications such as a touchpad type system. Compared to continuous imaging with a short delay between consecutive pairs, the GS imaging technique enjoys several advantages: the overall power consumption and the CPU load are significantly lower. We present results in the domain of optical camera based human computer interface (HCI) applications, as well as for capacitive fingerprint imaging sensor based touchpad systems.
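The fixed-ratio delay schedule within a packet can be sketched in a few lines; the base delay, ratio, and packet size below are illustrative placeholders, not values from the paper.

```python
def gs_delays(base_delay, ratio, packet_size):
    """Per-pair capture delays within one packet: base_delay * ratio**k,
    a geometric sequence as in GS imaging. Short-delay pairs capture fast
    motion; long-delay pairs capture slow motion."""
    return [base_delay * ratio ** k for k in range(packet_size - 1)]

# e.g. a 4-image packet with a 2 ms base delay and ratio 4:
delays = gs_delays(2.0, 4, 4)  # [2.0, 8.0, 32.0]
```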
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005
We present a novel method, which we refer to as an integral histogram, to compute the histograms of all possible target regions in a Cartesian data space. Our method has three distinct advantages: 1) It is computationally superior to the conventional approach. The integral histogram method makes it possible to employ even an exhaustive search process in real-time, which was impractical before. 2) It can be extended to higher data dimensions, uniform and nonuniform bin formations, and multiple target scales without sacrificing its computational advantages. 3) It enables the description of higher level histogram features. We exploit the spatial arrangement of data points, and recursively propagate an aggregated histogram by starting from the origin and traversing through the remaining points along either a scan-line or a wave-front. At each step, we update a single bin using the values of the integral histogram at the previously visited neighboring data points. After the integral histogram is propagated, the histogram of any target region can be computed easily by using simple arithmetic operations.
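The propagation and the region lookup described above can be sketched as follows, assuming a grayscale image stored as a list of rows; function names and the binning scheme are illustrative.

```python
def integral_histogram(image, nbins, vmax=256):
    """Scan-line propagation of the integral histogram over a 2D array.
    H[y][x][k] counts values in bin k within the rectangle (0,0)-(y,x) inclusive."""
    h, w = len(image), len(image[0])
    H = [[[0] * nbins for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            b = image[y][x] * nbins // vmax  # bin of the current pixel
            for k in range(nbins):
                v = 1 if k == b else 0
                if y > 0:            v += H[y - 1][x][k]
                if x > 0:            v += H[y][x - 1][k]
                if y > 0 and x > 0:  v -= H[y - 1][x - 1][k]  # subtract double-counted corner
                H[y][x][k] = v
    return H

def region_histogram(H, y0, x0, y1, x1):
    """Histogram of the rectangle (y0,x0)-(y1,x1) via four table lookups per bin."""
    nbins = len(H[0][0])
    out = []
    for k in range(nbins):
        v = H[y1][x1][k]
        if y0 > 0:            v -= H[y0 - 1][x1][k]
        if x0 > 0:            v -= H[y1][x0 - 1][k]
        if y0 > 0 and x0 > 0: v += H[y0 - 1][x0 - 1][k]
        out.append(v)
    return out

img = [[0, 128], [255, 64]]
H = integral_histogram(img, nbins=4)
```

Once `H` is built, every region query costs only a constant number of additions and subtractions per bin, which is the source of the exhaustive-search speedup the abstract claims.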
2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763)
We develop a trajectory pattern learning method that has two significant advantages over the past work. First, we represent trajectories in the HMM parameter space, thus we overcome the normalization problems of the existing methods. Second, we determine common trajectory paths by analyzing the optimal cluster number rather than using a predefined number of clusters. We compute affinity matrices and apply eigenvector decomposition to find clusters. We prove that the number of clusters governs the number of eigenvectors used to span the feature affinity space. We are thus able to automatically determine the optimal number of patterns. We show that the proposed algorithm accurately detects common paths for various camera setups.
2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05) - Volume 1, 2005
Instead of the conventional background and foreground definition, we propose a novel method that decomposes a scene into time-varying background and foreground intrinsic images. The multiplication of these images reconstructs the scene. First, we form a set of previous images into a temporal scale and compute their spatial gradients. By taking advantage of the sparseness of the filter outputs, we estimate the background by median filtering the gradients, and compute the corresponding foreground using the background. We also propose a robust method to threshold foregrounds to obtain a change detection mask of the moving pixels. We show that a different set of filters can detect the static and moving lines. Computationally, the proposed method is comparable with the state of the art, and our simulations prove the effectiveness of the intrinsic background-foreground decomposition even under sudden and severe illumination changes.
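The core of the background estimate, the per-pixel temporal median of spatial gradients, can be sketched as below. Frames are lists of rows; only horizontal gradients are shown, and all names are illustrative.

```python
def horizontal_gradients(frame):
    """Simple horizontal spatial gradients of one grayscale frame (list of rows)."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in frame]

def median(vals):
    s = sorted(vals)
    return s[len(s) // 2]

def background_gradient(frames):
    """Estimate the background gradient field as the per-pixel temporal median
    of the frames' spatial gradients. A moving object appears only briefly at
    any given pixel, so the median suppresses its (sparse) gradient response."""
    grads = [horizontal_gradients(f) for f in frames]
    h, w = len(grads[0]), len(grads[0][0])
    return [[median([g[y][x] for g in grads]) for x in range(w)] for y in range(h)]

# Two static frames plus one frame with a transient bright object:
frames = [[[10, 20, 10]], [[10, 20, 10]], [[10, 90, 10]]]
bg = background_gradient(frames)  # the transient gradient is filtered out
```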
Modern visual tracking systems implement a computational process that is often divided into several modules such as localization, tracking, recognition, behavior analysis and classification of events. This book will focus on recent advances in computational approaches for detection and tracking of human body, road boundaries and lane markers as well as on recognition of human activities, drowsiness and distraction state. The book is composed of seven distinct parts. Part I covers people localization algorithms in video sequences. Part II describes successful approaches for tracking people and body parts. The third part focuses on tracking of pedestrian and vehicles in outdoor images. Part IV describes recent methods to track lane markers and road boundaries. In part V, methods to track head, hand and facial features are reviewed. The last two parts cover the topics of automatic recognition and classification of activity, gesture, behavior, drowsiness and visual distraction state of humans.
2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008
Online boosting methods have recently been used successfully for tracking, background subtraction, etc. Conventional online boosting algorithms emphasize interchanging new weak classifiers/features to adapt to change over time. We propose a new online boosting algorithm where the form of the weak classifiers themselves is modified to cope with scene changes. Instead of replacement, the parameters of the weak classifiers are altered in accordance with the new data subset presented to the online boosting process at each time step. Thus we may avoid altogether the issue of how many weak classifiers should be replaced to capture the change in the data, or which efficient search algorithm to use for a fast retrieval of weak classifiers. A computationally efficient method is used in this paper for the adaptation of linear weak classifiers. The proposed algorithm has been implemented to be used both as an online learning method and a tracking method. We show quantitative and qualitative results on both UCI datasets and several video sequences to demonstrate the improved performance of our algorithm.
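The parameter-adaptation idea, nudging an existing linear weak classifier toward new boosting-weighted samples instead of replacing it, can be illustrated with a simple perceptron-style update. This is a generic stand-in for the paper's adaptation rule, not the actual method; all names are illustrative.

```python
def adapt_linear_weak_classifier(w, b, samples, labels, weights, lr=0.1):
    """One adaptation step: for each newly arrived sample (with boosting
    weight d), move the hyperplane (w, b) toward samples it misclassifies.
    A simplified, perceptron-style stand-in for parameter adaptation."""
    for x, y, d in zip(samples, labels, weights):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin <= 0:  # misclassified: shift parameters, weighted by d
            w = [wi + lr * d * y * xi for wi, xi in zip(w, x)]
            b = b + lr * d * y
    return w, b

# The classifier starts indifferent and adapts to a new positive sample:
w, b = adapt_linear_weak_classifier([0.0, 0.0], 0.0, [(1.0, 0.0)], [1], [1.0])
score = w[0] * 1.0 + w[1] * 0.0 + b  # now positive for that sample
```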
The need for empirical evaluation metrics and algorithms is well acknowledged in the field of computer vision. The process leads to precise insights into current technological capabilities and also helps in measuring progress. Hence, designing good and meaningful performance measures is critical. In this paper, we propose two comprehensive measures, one each for detection and tracking, for video domains where an object bounding approach to ground truthing can be followed. A thorough analysis explaining the behavior of the measures for different types of detection and tracking errors is presented. Face detection and tracking is chosen as a prototype task where such an evaluation is relevant. Results on real data comparing existing algorithms are presented, and the measures are shown to be effective in capturing the accuracy of the detection/tracking systems.
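A typical building block for bounding-box-based detection measures is the area-overlap ratio between a detected box and its ground-truth box; the sketch below shows that generic ingredient, not the paper's specific measures.

```python
def box_overlap(a, b):
    """Area overlap (intersection over union) between two axis-aligned boxes
    given as (x0, y0, x1, y1). Returns 1.0 for identical boxes, 0.0 for
    disjoint ones; an illustrative ingredient of spatial accuracy measures."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Averaging such per-object overlaps over frames (with penalties for missed and spurious detections) is how frame-level scores are usually aggregated into sequence-level detection and tracking measures.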
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012
Dictionary learning has emerged as a powerful tool for low level image processing tasks such as denoising and in-painting, as well as sparse coding and representation of images. While there has been extensive work on the development of online and offline dictionary learning algorithms to perform the aforementioned tasks, the problem of choosing an appropriate dictionary size is not as widely addressed. In this paper, we introduce a new scheme to reduce and optimize dictionary size in an online setting by synthesizing new atoms from multiple previous ones. We show that this method performs as well as existing offline and online dictionary learning algorithms in terms of representation accuracy while achieving significant speedup in dictionary reconstruction and image encoding times. Our method not only helps in choosing smaller and more representative dictionaries, but also enables learning of more incoherent ones.
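One simple way to synthesize a new atom from previous ones, sketched below, is to merge the most coherent pair of unit-norm atoms into their normalized mean; this is an illustrative stand-in for the paper's synthesis scheme, not the actual algorithm.

```python
import math

def merge_closest_atoms(atoms):
    """Shrink a dictionary by one atom: find the most coherent pair of
    unit-norm atoms (highest absolute inner product) and replace both with
    their normalized mean. Reduces size while lowering mutual coherence."""
    best, bi, bj = -1.0, 0, 1
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            c = abs(sum(u * v for u, v in zip(atoms[i], atoms[j])))
            if c > best:
                best, bi, bj = c, i, j
    merged = [(u + v) / 2 for u, v in zip(atoms[bi], atoms[bj])]
    norm = math.sqrt(sum(v * v for v in merged)) or 1.0
    merged = [v / norm for v in merged]  # renormalize the synthesized atom
    return [a for k, a in enumerate(atoms) if k not in (bi, bj)] + [merged]

# The two nearly parallel atoms are merged; the orthogonal one survives:
D = merge_closest_atoms([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
```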
In this contribution, we discuss a robust estimator based on an image reconstruction technique for image filtering and simplification purposes. Instead of using least-squares estimators, which assume the measurement error is independently random and normally distributed, a Lorentzian distribution based estimator is employed to fit a model function to the input image within local windows. The estimator weights the outliers of the measurement inversely with respect to their deviations, unlike least squares, which magnifies them. We adapt the robust estimator to simplify images by treating image noise and texture as measurement deviations. The results show that the simplification filters have the potential to become popular image processing tools.
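The contrast between the Lorentzian error norm and least squares can be shown directly: the Lorentzian penalty grows only logarithmically and its influence weight shrinks with the residual, while the least-squares penalty grows quadratically. A minimal sketch (the weight is the influence function divided by the residual, up to a constant factor):

```python
import math

def lorentzian_rho(r, sigma=1.0):
    """Lorentzian error norm: grows logarithmically, so large residuals
    (outliers) are penalized far less than under least squares (r**2)."""
    return math.log(1.0 + 0.5 * (r / sigma) ** 2)

def lorentzian_weight(r, sigma=1.0):
    """Influence weight (psi(r)/r up to a constant): down-weights outliers
    inversely with their deviation, unlike least squares, which gives every
    residual the same unit weight."""
    return 1.0 / (1.0 + 0.5 * (r / sigma) ** 2)
```

In a robust fit, each pixel's contribution inside the local window would be multiplied by `lorentzian_weight`, so noise and texture (treated as deviations) barely influence the fitted model.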
Previous attempts to perform figure-ground segmentation have universally made the assumption that observations of the scene are independent in time. In the vocabulary of the stochastic systems literature: the individual pixels are taken to be samples from a stationary, white random process with independent increments. Many scenes that could loosely be referred to as static often contain cyclostationary processes, meaning that there is significant structure in the correlations between observations across time. A tree swaying in the wind or a wave lapping on a beach is not just a collection of randomly shuffled appearances, but a physical system that has characteristic frequency responses associated with its dynamics. Our novel method leverages this fact to perform object detection based solely on the dynamics, rather than the appearance, of the pixels in a scene. Results are presented for a challenging scene containing wave activity in the background that visually masks a low-contrast foreground target.
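The temporal structure that distinguishes a cyclostationary pixel from white noise can be probed with the sample autocorrelation of its intensity history; the sketch below is a generic illustration of that test, not the paper's detector, and the threshold is an assumed placeholder.

```python
def autocorrelation(signal, lag):
    """Sample autocorrelation of a pixel's intensity series at a given lag.
    Near zero for a white process; large at the period of a cyclic one."""
    n = len(signal)
    mean = sum(signal) / n
    num = sum((signal[t] - mean) * (signal[t - lag] - mean) for t in range(lag, n))
    den = sum((s - mean) ** 2 for s in signal) or 1.0
    return num / den

def is_cyclostationary(signal, lag, threshold=0.5):
    """Flag a pixel whose history shows strong correlation at the candidate
    period `lag` -- temporal structure that a white process lacks."""
    return autocorrelation(signal, lag) > threshold

# A pixel flickering with period 2 correlates strongly at lag 2, not lag 1:
periodic = [0, 1] * 8
```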
Conventional face hallucination methods heavily rely on accurate alignment of low-resolution (LR)... more Conventional face hallucination methods heavily rely on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts for large upscaling factors. However, due to the diverse range of poses and different facial expressions, aligning an LR input image, in particular when it is tiny, is severely difficult. In addition, when the resolutions of LR input images vary, previous deep neural network based face hallucination methods require the interocular distances of input face images to be similar to the ones in the training datasets. Downsampling LR input faces to a required resolution will lose high-frequency information of the original input images. This may lead to suboptimal super-resolution performance for the state-of-theart face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network (MTDN) devised for super-resolving unaligned and very small face images of different resolutions ranging from 16×16 to 32×32 pixels in a unified framework. Our proposed network embeds spatial transformation layers to allow local receptive fields to line-up with similar spatial supports, thus obtaining a better mapping between LR and HR facial patterns. Furthermore, we incorporate a class-specific loss designed to classify upright realistic faces in our objective through a successive discriminative network to improve the alignment and upsampling performance
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019
Given a tiny face image, existing face hallucination methods aim at super-resolving its high-reso... more Given a tiny face image, existing face hallucination methods aim at super-resolving its high-resolution (HR) counterpart by learning a mapping from an exemplary dataset. Since a low-resolution (LR) input patch may correspond to many HR candidate patches, this ambiguity may lead to distorted HR facial details and wrong attributes such as gender reversal and rejuvenation. An LR input contains low-frequency facial components of its HR version while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details. We demonstrate that supplementing residual images or feature maps with additional facial attribute information can significantly reduce the ambiguity in face super-resolution. To explore this idea, we develop an attribute-embedded upsampling network, which consists of an upsampling network and a discriminative network. The upsampling network is composed of an autoencoder with skip-connections, which incorporates facial attribute vectors into the residual features of LR inputs at the bottleneck of the autoencoder, and deconvolutional layers used for upsampling. The discriminative network is designed to examine whether super-resolved faces contain the desired attributes or not and then its loss is used for updating the upsampling network. In this manner, we can super-resolve tiny (16×16 pixels) unaligned face images with a large upscaling factor of 8× while reducing the uncertainty of one-to-many mappings remarkably. By conducting extensive evaluations on a large-scale dataset, we demonstrate that our method achieves superior face hallucination results and outperforms the state-of-the-art.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013
We introduce a novel implicit representation for 2D and 3D shapes based on Support Vector Machine... more We introduce a novel implicit representation for 2D and 3D shapes based on Support Vector Machine (SVM) theory. Each shape is represented by an analytic decision function obtained by training SVM, with a Radial Basis Function (RBF) kernel so that the interior shape points are given higher values. This empowers support vector shape (SVS) with multifold advantages. First, the representation uses a sparse subset of feature points determined by the support vectors, which significantly improves the discriminative power against noise, fragmentation, and other artifacts that often come with the data. Second, the use of the RBF kernel provides scale, rotation, and translation invariant features, and allows any shape to be represented accurately regardless of its complexity. Finally, the decision function can be used to select reliable feature points. These features are described using gradients computed from highly consistent decision functions instead from conventional edges. Our experiments demonstrate promising results.
2016 23rd International Conference on Pattern Recognition (ICPR), 2016
A comprehensive framework for detection and characterization of overlapping intrinsic symmetry ov... more A comprehensive framework for detection and characterization of overlapping intrinsic symmetry over 3D shapes is proposed. To identify prominent symmetric regions which overlap in space and vary in form, the proposed framework is decoupled into a Correspondence Space Voting procedure followed by a Transformation Space Mapping procedure. In the correspondence space voting procedure, significant symmetries are first detected by identifying surface point pairs on the input shape that exhibit local similarity in terms of their intrinsic geometry while simultaneously maintaining an intrinsic distance structure at a global level. Since different point pairs can share a common point, the detected symmetric shape regions can potentially overlap. To this end, a global intrinsic distance-based voting technique is employed to ensure the inclusion of only those point pairs that exhibit significant symmetry. In the transformation space mapping procedure, the Functional Map framework is employed to generate the final map of symmetries between point pairs. The transformation space mapping procedure ensures the retrieval of the underlying dense correspondence map throughout the 3D shape that follows a particular symmetry. Additionally, the formulation of a novel cost matrix enables the inner product to succesfully indicate the complexity of the underlying symmetry transformation. The proposed transformation space mapping procedure is shown to result in the formulation of a semi-metric symmetry space where each point in the space represents a specific symmetry transformation and the distance between points represents the complexity between the corresponding transformations. Experimental results show that the proposed framework can successfully process complex 3D shapes that possess rich symmetries.
2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2009
In this paper, we introduce a novel technique called Geometric Sequence (GS) imaging, specificall... more In this paper, we introduce a novel technique called Geometric Sequence (GS) imaging, specifically for the purpose of low power and light weight tracking in human computer interface design. The imaging sensor is programmed to capture the scene with a train of packets, where each packet constitutes a few images. The delay or the baseline associated with consecutive image pairs in a packet follows a fixed ratio, as in a geometric sequence. The image pair with shorter baseline or delay captures fast motion, while the image pair with larger baseline or delay captures slow motion. Given an image packet, the motion confidence maps computed from the slow and the fast image pairs are fused into a single map. Next, we use a Bayesian update scheme to compute the motion hypotheses probability map, given the information of prior packets. We estimate the motion from this probability map. The GS imaging system reliably tracks slow movements as well as fast movements, a feature that is important in realizing applications such as a touchpad type system. Compared to continuous imaging with short delay between consecutive pairs, the GS imaging technique enjoys several advantages. The overall power consumption and the CPU load are significantly low. We present results in the domain of optical camera based human computer interface (HCI) applications, as well as for capacitive fingerprint imaging sensor based touch pad systems.
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005
We present a novel method, which we refer as an integral histogram, to compute the histograms of ... more We present a novel method, which we refer as an integral histogram, to compute the histograms of all possible target regions in a Cartesian data space. Our method has three distinct advantages: 1-It is computationally superior to the conventional approach. The integral histogram method makes it possible to employ even an exhaustive search process in real-time, which was impractical before. 2-It can be extended to higher data dimensions, uniform and nonuniform bin formations, and multiple target scales without sacrificing its computational advantages. 3-It enables the description of higher level histogram features. We exploit the spatial arrangement of data points, and recursively propagate an aggregated histogram by starting from the origin and traversing through the remaining points along either a scan-line or a wave-front. At each step, we update a single bin using the values of integral histogram at the previously visited neighboring data points. After the integral histogram is propagated, histogram of any target region can be computed easily by using simple arithmetic operations.
2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763)
We develop a trajectory pattern learning method that has two significant advantages over the past... more We develop a trajectory pattern learning method that has two significant advantages over the past work. First, we represent trajectories in the HMM parameter space, thus we overcome the normalization problems of the existing methods. Second, we determine common trajectory paths by analyzing the optimal cluster number rather than using a predefined number of clusters. We compute affinity matrices and apply eigenvector decomposition to find clusters. We prove that the number of clusters governs the number of eigenvectors used to span the feature affinity space. We are thus able to automatically determine the optimal number of patterns. We show that the proposed algorithm accurately detects common paths for various camera setups.
2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05) - Volume 1, 2005
Instead fo the conventional background and foreground definition, we propose a novel method that ... more Instead fo the conventional background and foreground definition, we propose a novel method that decomposes a scene into time-varying background and foreground intrinsic images. The multiplication of these images reconstructs the scene. First, we form a set of previous images into a temporal scale and compute their spatial gradients. By taking advantage of the sparseness of the filter outputs, we estimate the background by median filtering the gradients, and compute the corresponding foreground using the background. We also propose a robust method to threshold foregrounds to obtain a change detection mask of the moving pixels. We show that a different set of filters can detect the static and moving lines. Computationally, the proposed method is comparable with the state of the art, and our simulations prove the effectiveness of the intrinsic background foreground decomposition even under sudden and severe illumination changes.
Modern visual tracking systems implement a computational process that is often divided into sever... more Modern visual tracking systems implement a computational process that is often divided into several modules such as localization, tracking, recognition, behavior analysis and classification of events. This book will focus on recent advances in computational approaches for detection and tracking of human body, road boundaries and lane markers as well as on recognition of human activities, drowsiness and distraction state. The book is composed of seven distinct parts. Part I covers people localization algorithms in video sequences. Part II describes successful approaches for tracking people and body parts. The third part focuses on tracking of pedestrian and vehicles in outdoor images. Part IV describes recent methods to track lane markers and road boundaries. In part V, methods to track head, hand and facial features are reviewed. The last two parts cover the topics of automatic recognition and classification of activity, gesture, behavior, drowsiness and visual distraction state of humans.
2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008
Online boosting methods have recently been used successfully for tracking, background subtraction... more Online boosting methods have recently been used successfully for tracking, background subtraction etc. Conventional online boosting algorithms emphasize on interchanging new weak classifiers/features to adapt with the change over time. We are proposing a new online boosting algorithm where the form of the weak classifiers themselves are modified to cope with scene changes. Instead of replacement, the parameters of the weak classifiers are altered in accordance with the new data subset presented to the online boosting process at each time step. Thus we may avoid altogether the issue of how many weak classifiers to be replaced to capture the change in the data or which efficient search algorithm to use for a fast retrieval of weak classifiers. A computationally efficient method has been used in this paper for the adaptation of linear weak classifiers. The proposed algorithm has been implemented to be used both as an online learning and a tracking method. We show quantitative and qualitative results on both UCI datasets and several video sequences to demonstrate improved performance of our algorithm.
The need for empirical evaluation metrics and algorithms is well acknowledged in the field of computer vision. The process yields precise insights into current technological capabilities and also helps in measuring progress. Hence, designing good and meaningful performance measures is critical. In this paper, we propose two comprehensive measures, one each for detection and tracking, for video domains where an object-bounding approach to ground truthing can be followed. A thorough analysis explaining the behavior of the measures for different types of detection and tracking errors is presented. Face detection and tracking is chosen as a prototype task where such an evaluation is relevant. Results on real data comparing existing algorithms are presented, and the measures are shown to be effective in capturing the accuracy of the detection/tracking systems.
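A frame-level, overlap-based detection score of the general kind described above can be sketched as follows. This is an illustrative measure in the same spirit — matched ground-truth/detection pairs scored by spatial overlap and normalized by object counts — and is not the paper's exact definition; the function names and the greedy matching are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def frame_detection_accuracy(gt_boxes, det_boxes):
    """Sum of IoUs of greedily matched ground-truth/detection pairs,
    normalized by the average of the two object counts, so both missed
    detections and false alarms pull the score down (illustrative only)."""
    if not gt_boxes and not det_boxes:
        return 1.0
    matched, used = 0.0, set()
    for g in gt_boxes:
        best, best_j = 0.0, None
        for j, d in enumerate(det_boxes):
            if j not in used and iou(g, d) > best:
                best, best_j = iou(g, d), j
        if best_j is not None:
            used.add(best_j)
            matched += best
    return matched / ((len(gt_boxes) + len(det_boxes)) / 2)
```

Averaging such a per-frame score over a sequence gives a single number that penalizes localization error, misses, and false alarms together.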
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012
Dictionary learning has emerged as a powerful tool for low-level image processing tasks such as denoising and in-painting, as well as sparse coding and representation of images. While there has been extensive work on the development of online and offline dictionary learning algorithms to perform the aforementioned tasks, the problem of choosing an appropriate dictionary size is not as widely addressed. In this paper, we introduce a new scheme to reduce and optimize dictionary size in an online setting by synthesizing new atoms from multiple previous ones. We show that this method performs as well as existing offline and online dictionary learning algorithms in terms of representation accuracy while achieving a significant speedup in dictionary reconstruction and image encoding times. Our method not only helps in choosing smaller and more representative dictionaries, but also enables learning of more incoherent ones.
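The idea of shrinking a dictionary by synthesizing new atoms from previous ones can be illustrated with a simple greedy variant: repeatedly merge the two most coherent (most similar) atoms into one normalized atom. This sketch only shows the size-reduction principle; the paper's online update rule is more involved, and the function name and merge rule here are assumptions.

```python
import numpy as np

def shrink_dictionary(D, target_size):
    """Shrink a dictionary (atoms as columns) by merging its two most
    coherent atoms into one synthesized atom until target_size is reached.
    Merging similar atoms also tends to lower mutual coherence."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm atoms
    while D.shape[1] > target_size:
        G = np.abs(D.T @ D)                            # coherence (Gram) matrix
        np.fill_diagonal(G, 0.0)
        i, j = np.unravel_index(np.argmax(G), G.shape)
        # sign-align before averaging so anti-parallel atoms do not cancel
        s = 1.0 if float(D[:, i] @ D[:, j]) >= 0 else -1.0
        merged = D[:, i] + s * D[:, j]
        merged /= np.linalg.norm(merged)
        D = np.delete(D, max(i, j), axis=1)            # drop one of the pair
        D[:, min(i, j)] = merged                       # replace the other
    return D
```

Each merge removes one atom while preserving the subspace direction the redundant pair spanned, which is why representation accuracy degrades gracefully as the dictionary shrinks.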
In this contribution, we discuss a robust estimator based on an image reconstruction technique for image filtering and simplification. Instead of least-squares estimators, which assume the measurement error is independently random and normally distributed, a Lorentzian-distribution-based estimator is employed to fit a model function to the input image within local windows. The estimator weights outliers of the measurement inversely with respect to their deviations, unlike least squares, which magnifies them quadratically. We adapt the robust estimator to simplify images by treating image noise and texture as measurement deviations. The results show that the simplification filters have the potential to become popular image processing tools.
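The contrast between the two estimators can be made concrete with the standard Lorentzian (Cauchy) error norm, solved by iteratively reweighted least squares. Here a constant model is fit per window; the paper fits a more general model function, and the scale parameter and iteration count below are assumptions for the sketch.

```python
import numpy as np

def lorentzian_weight(r, sigma):
    """IRLS weight derived from the Lorentzian norm
    rho(r) = log(1 + (r/sigma)^2 / 2): large residuals get small
    weights, whereas least squares gives every residual weight 1."""
    return 2.0 / (2.0 * sigma ** 2 + r ** 2)

def robust_local_fit(window, sigma=10.0, iters=10):
    """Fit a constant model to a pixel window with the Lorentzian
    estimator via iteratively reweighted least squares, so outliers
    (noise, texture) barely influence the fitted value."""
    x = np.asarray(window, dtype=float).ravel()
    mu = np.median(x)                      # robust initial estimate
    for _ in range(iters):
        w = lorentzian_weight(x - mu, sigma)
        mu = (w * x).sum() / w.sum()       # weighted least-squares step
    return mu
```

Replacing each pixel by such a windowed fit smooths noise and fine texture while the down-weighting of outliers keeps strong edges from being averaged away.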
Previous attempts to perform figure-ground segmentation have universally made the assumption that observations of the scene are independent in time. In the vocabulary of the stochastic systems literature, the individual pixels are taken to be samples from a stationary, white random process with independent increments. Many scenes that could loosely be referred to as static often contain cyclostationary processes, meaning that there is significant structure in the correlations between observations across time. A tree swaying in the wind or a wave lapping on a beach is not just a collection of randomly shuffled appearances, but a physical system that has characteristic frequency responses associated with its dynamics. Our novel method leverages this fact to perform object detection based solely on the dynamics, rather than the appearance, of the pixels in a scene. Results are presented for a challenging scene containing wave activity in the background that visually masks a low-contrast foreground target.
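One simple way to exploit such temporal structure is to examine each pixel's temporal spectrum: a cyclostationary background pixel (a wave, a swaying branch) concentrates its energy at characteristic frequencies, while a foreground pixel does not. The sketch below, assuming a peakiness threshold and per-pixel FFTs, illustrates this appearance-free detection idea; it is not the paper's detector.

```python
import numpy as np

def spectral_peakiness(signal):
    """Fraction of a pixel's temporal AC energy held by its single
    strongest frequency bin; near 1 for periodic dynamics, small for
    broadband/aperiodic dynamics."""
    spec = np.abs(np.fft.rfft(signal - np.mean(signal))) ** 2
    spec[0] = 0.0                          # ignore any residual DC term
    total = spec.sum()
    return spec.max() / total if total > 0 else 0.0

def detect_by_dynamics(stack, threshold=0.5):
    """Label pixels using dynamics alone: peaky spectra are treated as
    cyclostationary background, flat spectra as foreground candidates.
    stack: (T, H, W) array of frames; returns a boolean foreground mask."""
    _, H, W = stack.shape
    mask = np.zeros((H, W), dtype=bool)
    for i in range(H):
        for j in range(W):
            mask[i, j] = spectral_peakiness(stack[:, i, j]) < threshold
    return mask
```

Because the decision depends only on each pixel's frequency content, a low-contrast target that barely differs in appearance from the water can still be separated from the periodic wave activity behind it.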
Papers by F. Porikli