2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), 2018
Off-line Handwritten Character Recognition (HCR) is the automatic conversion of text in an image into letter codes which are usable within computer and text-processing applications. In order to have a complete HCR system for a script, it is essential to consider all the characters present in the character set of the concerned script. This is especially important for Indian scripts, which have a rich set of characters including modifiers and compound characters. Although it is important to consider all the characters, most works have focused mainly on the consonants and numerals of the character sets. This paper reports recognition of the complete character set of the Meitei Mayek script and the issues and challenges faced. A Convolutional Neural Network (CNN) model is proposed to recognize characters of a fairly large dataset consisting of 38,500 samples. The proposed model is also trained and tested against five mini subsets of the dataset to highlight the complexity in terms of recognition accuracy when the complete character set is taken into account. It achieves accuracies of 93.64% and 92.29% on the datasets of 54 and 55 classes, respectively.
Proceedings of the Twelfth Indian Conference on Computer Vision, Graphics and Image Processing, 2021
Contemporary (parametric) scene parsing methods are learning-based and mostly operate in a closed-universe scenario. We introduce a non-parametric scene parsing framework that is model-free, data-driven, and scales naturally to growing data. The scene parsing performance in the non-parametric approach depends on reliable dense correspondence or alignment across scenes for label transfer. Incorrect correspondence is known to affect the scene parsing results adversely. We propose a label transfer approach that relies on the dense correspondence of super-pixel pairs (in a query and candidate image) matched by a homogeneous kernel map to guide semantic label transfer. The aggregation (fusing) of multiple labels is done through a simple heuristic aggregation scheme (simple majority voting). The Markov Random Field (MRF) provides a principled probabilistic framework for combining the disparate information in the smoothing stage and ensures plausible labeling results. Evaluation results show that our non-parametric system obtains competitive scene parsing performance on the standard SIFT Flow and MSRC-21 datasets.
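The majority-voting aggregation mentioned in this abstract can be illustrated with a minimal sketch. It assumes each query super-pixel has already been matched to several candidate super-pixels whose labels are known; the function names `majority_vote` and `transfer_labels` and the dictionary layout are hypothetical, and the kernel-map matching and MRF smoothing stages are not shown.

```python
from collections import Counter


def majority_vote(candidate_labels):
    """Return the most frequent label among the matched candidates."""
    if not candidate_labels:
        raise ValueError("need at least one candidate label")
    # most_common(1) yields a single (label, count) pair.
    return Counter(candidate_labels).most_common(1)[0][0]


def transfer_labels(matches):
    """Map each query super-pixel id to the majority label of its matches.

    `matches` is an assumed layout: {superpixel_id: [label, label, ...]}.
    """
    return {sp: majority_vote(labels) for sp, labels in matches.items()}


# Hypothetical example: two query super-pixels with matched candidate labels.
labels = transfer_labels({0: ["road", "road", "car"], 1: ["sky"]})
```

In a full pipeline the voted labels would only be an initial estimate that a smoothing stage then refines.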
Datasets are important for validation of any method or technique. The effectiveness of a method or technique can be well judged using an unbiased, complete and correct dataset. This paper presents a novel dataset to support validation of any computer vision method for recognition of hand gestures of Sattriya dance, a fifteenth-century major Indian classical dance of the state of Assam. The dataset fulfils all the major requirements and has been established using five well-known classifiers. A sample of the dataset is available at http://agnigarh.tezu.ernet.in/~dkb/resources.html.
We revisit the implicit design choices in the popular vector of locally aggregated descriptors (VLAD), which aggregates the residuals of local image descriptors. Since the original VLAD ignores high-order statistics, the resultant vector is not discriminative enough. We address this issue by exploiting high-order statistics to gain complementary information. Our contributions are two-fold: First, we present a novel high-order VLAD (HO-VLAD) with increased discriminative power. Next, we propose a light-weight retrieval framework to demonstrate HO-VLAD’s effectiveness for scalable image retrieval. Systematic experiments on two challenging public databases (INRIA Holidays, UKBench) exhibit a consistent improvement in performance with limited computational cost.
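For orientation, the baseline residual aggregation that this abstract builds on can be sketched as follows. This is only the standard first-order VLAD with hard assignment and global L2 normalization, written under the assumption that descriptors and codewords are plain lists of floats; it is not the HO-VLAD method, whose high-order terms are not specified in the abstract. The function name `vlad` is illustrative.

```python
import math


def vlad(descriptors, codebook):
    """First-order VLAD: sum residuals to the nearest codeword, then L2-normalize."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest codeword (Euclidean distance).
        k = min(range(len(codebook)), key=lambda i: math.dist(d, codebook[i]))
        # Accumulate the residual (descriptor minus codeword) for that codeword.
        for j in range(dim):
            acc[k][j] += d[j] - codebook[k][j]
    # Concatenate the per-codeword residual sums into one vector.
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy example: two 2-D codewords and three descriptors (made-up values).
encoding = vlad([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]], [[0.0, 0.0], [10.0, 10.0]])
```

The output has length (number of codewords) × (descriptor dimension), which is why compact codebooks matter for scalable retrieval.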
The single-hand gestures of Indian classical dance are termed ‘Asamyukta Hastas,’ which is a combination of two Sanskrit words, asamyukta meaning ‘single’ and hastas meaning ‘hand gestures’. There are eight officially recognized classical dance forms in India. This paper focuses on the 29 single-hand gestures of Sattriya dance, which is one of the Indian classical dance forms. It presents an analysis of the recognition of single-hand gestures in Sattriya dance images using different classifiers such as k-nearest neighbor (k-NN), naive Bayes, Bayesian network, decision tree, and Support Vector Machine (SVM). In this work, we have used Hu’s seven invariant moments, Zernike moments, and Legendre moments up to tenth order each. The analysis indicates that Legendre moments perform better than the other moments for all variations of the dataset, achieving an accuracy of 96.03%.
Selecting and extracting a robust feature set is very important in automatic identification or classification of parasitic eggs in microscopic images. In this paper, we have explored different types of features and used them to identify segmented microscopic images of three different types of parasitic eggs, namely Ascaris lumbricoides, Necator americanus and Trichuris trichiura. We have extracted four types of image moments as four different feature sets from the segmented gray-scale images of parasite eggs. We have also extracted several features from gray-level pixel values and some texture features, along with a few shape-based features, to automatically identify the above-mentioned types of parasite eggs. The performance of the different feature sets is examined using three different classifiers, viz. SVM, ANN and kNN. The highest classification accuracy of 96.5% is obtained by SVM using texture and shape-based features.
Human gait reveals feelings, intentions and identity. Gait is used as a biometric feature to identify a walking individual at a distance. This paper presents an approach to identify human gait patterns using features extracted from statistical moments. After background subtraction, silhouette frames of walking subjects were segmented into nine segments representing different human body parts. Statistical moments, viz. geometric moments, Legendre moments and Krawtchouk moments, were used individually to extract some distinguishable gait features, namely centroid, aspect ratio and orientation, from each segment of the silhouettes. In addition to these features, the height and width of the person were also included. Each walking person was represented by a gait pattern or a feature vector, generated using 38 features extracted from the silhouette. A minimum distance classifier based on Euclidean distance was used to recognize the input image sequence in the testing phase. All the experiments were conducted on...
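A minimum distance classifier of the kind named in this abstract can be sketched in a few lines. The sketch assumes each enrolled person is stored as a single feature vector (the abstract mentions 38 features; the toy vectors below are shortened, made-up values), and the names `nearest_class` and `templates` are hypothetical.

```python
import math


def nearest_class(query, templates):
    """Return the label whose stored feature vector is closest to `query`.

    `templates` is an assumed layout: {label: feature_vector}.
    Distance is plain Euclidean distance via math.dist (Python 3.8+).
    """
    return min(templates, key=lambda label: math.dist(query, templates[label]))


# Toy gallery with two subjects and 3-D feature vectors (illustrative only).
gallery = {"subject_a": [0.0, 0.0, 0.0], "subject_b": [5.0, 5.0, 5.0]}
predicted = nearest_class([4.0, 6.0, 5.0], gallery)
```

In practice the features would usually be scaled to comparable ranges first, since Euclidean distance is dominated by the largest-valued dimensions.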
Off-line Handwritten Character Recognition (HCR) is the process of automatically converting images of handwritten text into a form that computers can understand and process. Several research works on HCR of different scripts are found in the literature. They make use of one or more feature sets and classification tools for recognition of characters. Recently, Convolutional Neural Network (CNN) based recognition has been found to show significantly better results. However, only a handful of studies exist on the Meitei Mayek script, and none is based on CNN. Also, no dataset is publicly available for the said script. In order to study the recognition of characters for a particular script, a significantly large dataset is needed. In this paper, for the first time, a dataset consisting of 60285 handwritten characters of the Meitei Mayek script is introduced, which will be made publicly available to researchers at http://agnigarh.tezu.ernet.in/~sarat/resources.html. A CNN architecture is also proposed for the recognition of characters in the dataset. An accuracy of 96.24% is achieved, which is promising compared to state-of-the-art works for the concerned script.
We are interested in the encoding of local descriptors of an image (e.g. SIFT) to design a compact representation vector and thereby address scalable image retrieval. We revisit the implicit design choices in the popular vector of locally aggregated descriptors (VLAD), which aggregates the residuals of descriptors to the codewords. VLAD’s use of a coarse codebook and first-order descriptor statistics in residual computation results in less discriminative residuals. To address this problem, we propose a division of codebook feature space using a novel fine-grained quantization strategy. After quantization, we embed the resulting residuals with high-order statistics of descriptor distribution. Experiments on three challenging image retrieval datasets (INRIA Holidays, UKBench, Oxford 5k) confirm the improved discriminative power of our novel encoding method called FhVLAD. We observe superior accuracy to baseline and competitive performance to state-of-the-art techniques with a limited increase in dimension.
This paper introduces a large-scale Meitei Mayek handwritten character database. It consists of the complete character set of the script. There are a total of 85,124 character images of 55 character classes, with 72,330 and 12,794 images in the training and test sets, respectively. The present work focuses on collecting the natural handwriting of individuals by carrying out sample collection in two phases: (a) unconstrained handwriting in the form of answer sheets and classroom notes and (b) tabular forms. Nearly 500 individuals have contributed to the development of the database. Recognition of the character images in the database is carried out using different feature descriptors with four popular classifiers, namely KNN, Linear Support Vector Classifier, Random Forest and Support Vector Machine. The paper also proposes a convolutional neural network (CNN) model that enhances a base CNN architecture by optimally tuning its hyperparameters. Experimental results show that the CNN model can be benchmarked against the concerned database with a test accuracy of 95.56%.
This paper presents an effective method for fingerprint classification using a data mining approach. Initially, it generates a numeric code sequence for each fingerprint image based on the ridge flow patterns. Then, for each class, a seed is selected using a frequent itemset generation technique. These seeds are subsequently used for clustering the fingerprint images. The proposed method was tested and evaluated on several real-life datasets, and a significant reduction in misclassification errors was observed in comparison to its counterparts.
International Journal of Computer Applications, 2016
Gesture is the most primitive way of communication among human beings. Today, in the era of modern technology, gesture recognition influences the world in diverse ways, from assisting physically challenged people to robot control to virtual reality environments. Compared to systems which use extra devices (gloves, sensors), vision-based systems are more user-friendly and simple. Vision-based systems are easy to use, but more difficult to implement. This paper presents a comprehensive survey of vision-based dynamic gesture recognition approaches and a comparative study of those methods, and identifies the issues and challenges in this area.
Papers by Sarat Saharia