Pattern recognition involves identifying patterns in data through machine learning techniques. It includes sensing input data, segmenting patterns, extracting features, classifying patterns, and taking post-processing actions. There are four main approaches: statistical, syntactic, template matching, and neural networks. Feature selection techniques like wrapper and filter methods help reduce dimensionality and improve pattern recognition performance.
Dept. of AIML, Netaji Subhas Engineering College

Terminology

Pattern:
• A pattern is an entity that occurs according to some sort of order.
• A pattern is a regularity in the world or in an abstract notion.

Recognition:
• Identification
• Matching
• Classification

Definitions of PR
• PR is the ability of a machine to identify patterns in data in order to make smart decisions.
• PR is the process of recognizing patterns by using machine learning techniques.
• PR is the automated recognition of patterns and regularities in data.
• PR involves finding similarities or patterns among small, decomposed problems, which can help us solve more complex problems.

Scholars' views on PR
• Duda & Hart: The assignment of a physical object or event to one of several pre-specified categories.
• Schalkoff: The science that concerns the description or classification of measurements.
• Morse: PR is concerned with answering the question "What is this?".

Traditional PR System

Advanced PR System

Details of PR System

Sensing:
• Sensing refers to the input to a PR system.
• Examples: camera, transducer, microphone, etc.

Segmentation:
• Segmentation is one of the deepest problems in pattern recognition; it deals with recognizing or grouping together the various parts of an object.
• The segmentation process is a crucial step in any computer-based vision system or application, due to its inherent difficulty and the importance of its results, which are decisive for the global efficiency of the vision system.

Feature extraction:
• Feature extraction is the process of determining the features to be used for learning, once the description and properties of the patterns are known.
• Feature extraction transforms raw data into numerical features that can be processed while preserving the information in the original data set.
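The feature-extraction step can be sketched in a few lines: a segmented pattern (here assumed to arrive as a plain list of raw sensor readings) is mapped to a fixed-length numerical feature vector. The feature choices and sample values below are hypothetical, for illustration only:

```python
import statistics

def extract_features(segment):
    """Map a raw segmented pattern (a list of sensor readings)
    to a fixed-length numerical feature vector."""
    return [
        statistics.mean(segment),    # average level
        statistics.pstdev(segment),  # spread
        min(segment),                # smallest reading
        max(segment),                # largest reading
    ]

# Two segmented patterns from a hypothetical sensor
quiet = [0.1, 0.2, 0.1, 0.15]
loud = [0.9, 1.1, 1.0, 0.95]

print(extract_features(quiet))
print(extract_features(loud))
```

Whatever features are chosen, every pattern ends up as a vector of the same length, which is what the later classification stage operates on.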
• The traditional goal of feature extraction is to characterize an object to be recognized by measurements whose values are very similar for objects in the same category and very different for objects in different categories.
• Both the segmentation and feature-extraction phases require domain knowledge, which is a problem in PR systems.

Classification:
• Classification is the task of assigning a class label to an input pattern; the class label indicates one of a given set of classes. Classification is carried out with the help of a model obtained using a learning procedure.
• Pattern recognition involves the classification and clustering of patterns. In classification, an appropriate class label is assigned to a pattern based on an abstraction that is generated using a set of training patterns or domain knowledge. Classification is used in supervised learning.

Post-processing:
• Post-processing deals with deciding on an action using the output of the classifier. An action such as minimum-error-rate classification will minimize the total expected cost.

PR Methodologies
• There are four methodologies (approaches) in PR: the statistical approach, the syntactic approach, template matching, and neural networks.
• Statistical approach: Based on statistics and probabilities; it is the simplest method. Features are converted to numbers, which are placed into a vector that represents the pattern. Each pattern is thus a point in a multidimensional feature space, and recognition uses distance measures between points.
• Syntactic approach: Also called the structural method; based on relations between features. Patterns are described by a hierarchical structure composed of sub-structures. The system parses the set of extracted features using a predefined grammar. If all of the features extracted from a pattern can be parsed by the grammar, the system has recognized the pattern.

Template Matching
• A template refers to a model.
• The template matching approach is widely used in image processing to localize and identify shapes in an image.
• In this approach, the user looks for parts of an image that match a template.
• Matching is done according to correlation or distance measurements.

Neural Network
• A neural network is a self-adaptive, trainable process that is able to learn to resolve complex problems based on available knowledge. A set of available data is supplied to the system so that it finds the most adaptive function among an allowed class.

Feature Selection Techniques
• Wrapper methods: forward selection, backward elimination, exhaustive selection, recursive elimination
• Filter methods: ANOVA, Pearson correlation, variance thresholding
• Embedded methods: Lasso, Ridge, Decision Tree

Wrapper Method
• A wrapper method trains the algorithm using a subset of features in an iterative manner.
• The main advantage of wrapper methods over filter methods is that they provide an optimal set of features for training the model, thus resulting in better accuracy than filter methods.
• They are computationally more expensive.

Forward selection:
• An iterative approach: start with an empty set of features and, at each iteration, add the feature that best improves the model.
• The stopping criterion is reached when adding a new variable no longer improves the performance of the model.

Backward elimination:
• Also an iterative approach: start with all features and, at each iteration, remove the least significant feature.
• The stopping criterion is reached when no improvement in the performance of the model is observed after a feature is removed.

Exhaustive selection:
• Considered the brute-force approach.
• Creates all possible subsets and builds a learning algorithm for each subset.
• Selects the subset whose model performance is best.
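The forward-selection loop described above can be sketched in plain Python. The `score` function below is a stand-in for "train the model on this subset and evaluate it"; the feature names and the scoring rule are entirely hypothetical:

```python
def forward_selection(features, score):
    """Greedy wrapper: start from an empty set and repeatedly add the
    feature that most improves the score; stop when nothing improves it."""
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate the model once per candidate feature (the expensive part)
        gains = [(score(selected + [f]), f) for f in remaining]
        top_score, top_f = max(gains)
        if top_score <= best:  # stopping criterion: no improvement
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected

# Hypothetical scoring function: rewards f1 and f2, penalizes extra features
useful = {"f1": 0.4, "f2": 0.3}
score = lambda subset: sum(useful.get(f, -0.05) for f in subset)

print(forward_selection(["f1", "f2", "f3", "f4"], score))  # ['f1', 'f2']
```

Backward elimination is the mirror image: start from the full feature list and repeatedly drop the feature whose removal hurts the score least, stopping when any removal degrades the model.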
Recursive elimination:
• This greedy optimization method selects features by recursively considering smaller and smaller sets of features.
• The estimator is trained on an initial set of features and the importance of each feature is measured.
• The least important features are then removed from the current set of features until the required number of features is obtained.

Filter Method
• Filter methods measure the relevance of features by their correlation with the dependent variable.
• Filter methods are much faster than wrapper methods, as they do not involve training models.
• A filter method ranks each feature based on some univariate metric and then selects the highest-ranking features. Some univariate metrics are:
  - Variance: removes constant and quasi-constant features.
  - Chi-square: used for classification; a statistical test of independence that determines the dependency of two variables.
  - Correlation coefficients: remove duplicate features.
  - Information gain (mutual information): assesses the dependency of an independent variable in predicting the target variable, i.e., the ability of the independent feature to predict the target.

Pearson's Correlation
• Pearson's correlation is used as a measure for quantifying the linear dependence between two continuous variables X and Y. Its value varies from -1 to +1. Pearson's correlation is given as:

  r = cov(X, Y) / (σ_X · σ_Y)

ANOVA & LDA
• ANOVA (Analysis of Variance) is similar to LDA, except that it operates on one or more categorical independent features and one continuous dependent feature. It provides a statistical test of whether the means of several groups are equal.
• LDA (Linear Discriminant Analysis) is used to find a linear combination of features that characterizes or separates two or more classes (or levels) of a categorical variable.
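The Pearson-based filter ranking can be computed directly from the definition above: score each feature by |r| against the target and keep the top-ranked ones, with no model training involved. The feature columns and target below are made-up illustration data:

```python
import math

def pearson(x, y):
    """Pearson's r between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Rank features by |r| with the target (the filter step); keep the top k
target = [1, 2, 3, 4, 5]
features = {
    "linear":  [2, 4, 6, 8, 10],  # perfectly correlated, r = +1
    "inverse": [5, 4, 3, 2, 1],   # perfectly anti-correlated, r = -1
    "noisy":   [1, 3, 2, 5, 4],   # weaker relationship
}
ranked = sorted(features,
                key=lambda f: abs(pearson(features[f], target)),
                reverse=True)
print(ranked)  # ['linear', 'inverse', 'noisy']
```

Note that the absolute value is what matters for selection: a strong negative correlation is just as predictive as a strong positive one.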
Advantages of Filter methods
• Filter methods are model-agnostic.
• They rely entirely on the features in the data set.
• They are computationally very fast.
• They are based on different statistical methods.

Disadvantages of Filter methods
• A filter method looks at individual features to identify their relative importance. A feature may not be useful on its own but may be an important influencer when combined with other features; filter methods may miss such features.

Embedded Method
• Embedded methods combine the qualities of filter and wrapper methods. They are implemented by algorithms that have their own built-in feature selection methods.
• Some of the most popular examples of these methods are LASSO and RIDGE regression, which have built-in penalization functions to reduce overfitting.
• Lasso regression performs L1 regularization, which adds a penalty equivalent to the absolute value of the magnitude of the coefficients.
• Ridge regression performs L2 regularization, which adds a penalty equivalent to the square of the magnitude of the coefficients.
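A minimal sketch of why the L1 penalty selects features while the L2 penalty does not: in the special case of orthonormal features, the penalized least-squares coefficient has a simple closed form per feature, shown below. The starting coefficients and penalty strength are hypothetical:

```python
def soft_threshold(b, lam):
    """L1 (lasso) shrinkage: coefficients smaller in magnitude than lam
    are driven exactly to zero, which is what makes lasso act as an
    embedded feature selector."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

def ridge_shrink(b, lam):
    """L2 (ridge) shrinkage: coefficients shrink toward zero
    but never reach it exactly."""
    return b / (1 + lam)

ols = {"f1": 2.0, "f2": 0.3, "f3": -0.1}  # hypothetical unpenalized coefficients
lam = 0.5
lasso = {f: soft_threshold(b, lam) for f, b in ols.items()}
ridge = {f: ridge_shrink(b, lam) for f, b in ols.items()}
print(lasso)  # f2 and f3 are zeroed out -> those features are dropped
print(ridge)  # all coefficients survive, just smaller
```

In the general (non-orthonormal) case the lasso solution requires an iterative solver, but the selection behaviour is the same: features whose contribution is too small to justify the L1 penalty get an exactly-zero coefficient.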