In this paper we discuss some problematic aspects of Newman’s modularity function QN. Given a graph G, the modularity of G can be written as QN = Qf − Q0, where Qf is the intracluster edge fraction of G and Q0 is the expected intracluster edge fraction of the null model, i.e., a randomly connected graph with the same expected degree distribution as G. It follows that the maximization of QN must accommodate two factors pulling in opposite directions: Qf favors a small number of clusters, while Q0 favors many balanced (i.e., with approximately equal degrees) clusters. In certain cases the Q0 term can cause overestimation of the true cluster number; this is the opposite of the well-known underestimation effect caused by the “resolution limit” of modularity. We illustrate the overestimation effect by constructing families of graphs with a “natural” community structure which, however, does not maximize modularity. In fact, we prove that we can always find a graph G with a “natural clustering”...
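The decomposition QN = Qf − Q0 described above is straightforward to evaluate for a given partition. The following sketch (function names and the example graph are my own, not from the paper) computes both terms for an undirected graph given as an edge list:

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity QN = Qf - Q0 for an undirected graph.
    edges: list of (u, v) pairs; community: dict mapping node -> cluster id.
    Qf is the intracluster edge fraction; Q0 is the null-model expectation
    based on cluster degree sums."""
    m = len(edges)
    deg = defaultdict(int)
    intra = 0
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        if community[u] == community[v]:
            intra += 1
    qf = intra / m
    # total degree per cluster; Q0 = sum_c (d_c / 2m)^2
    cdeg = defaultdict(int)
    for node, d in deg.items():
        cdeg[community[node]] += d
    q0 = sum(d * d for d in cdeg.values()) / (4 * m * m)
    return qf - q0

# two triangles joined by a bridge, with the natural two-cluster partition
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q = modularity(edges, community)  # Qf = 6/7, Q0 = 1/2
```

The example makes the two opposing terms concrete: merging both triangles into one cluster would raise Qf to 1 but raise Q0 even more, so the two-cluster partition wins here.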
We consider the problem of identifying multiple outliers in linear regression models. In robust regression the unusual observations should be removed from the sample in order to obtain a better fit for the rest of the observations. Based on the LTS estimate, we propose a penalized trimmed squares (PTS) estimator, where penalty costs for discarding outliers are inserted into the loss function. We search for suitable penalty costs for multiple high-leverage outliers, based on robust leverage and scale. Thus, the best fit for the majority of the data is obtained after eliminating only outliers from the data set. The robust estimate is obtained by minimizing the loss function with a mathematical programming technique, computationally suitable for small sample data. The computational load and the effectiveness of the new procedure are improved by using the idea of the ε-insensitive tube from support vector machine regression. The PTS loss function is transformed to an ε-insensiti...
Networks are frequently studied algebraically through matrices. In this work, we show that networks may be studied at a more abstract level using results from the theory of matroids, by connecting networks to decomposition results of matroids. First, we present the implications of the decomposition of regular matroids for networks and related classes of matrices; second, we show that strongly unimodular matrices are closed under k-sums for k = 1, 2, implying a decomposition into highly connected network-representing blocks, which are also shown to have a special structure.
We consider the problem of identifying multiple outliers in linear regression models. We propose a penalized trimmed squares (PTS) estimator, where penalty costs for discarding outliers are inserted into the loss function. We propose suitable penalties for unmasking the multiple high-leverage outliers. The robust procedure is formulated as a Quadratic Mixed Integer Programming (QMIP) problem, computationally suitable for small sample data. The computational load and the effectiveness of the new procedure are improved by using the idea of the ε-insensitive loss function from support vector machine regression. The small errors are ignored, and the mathematical formulation gains the sparseness property. The good performance of the PTS estimator allows identification of multiple outliers while avoiding masking effects.
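The two loss ingredients of the abstract can be sketched directly. This is an illustrative evaluation of the losses only (the penalty construction from robust leverage and scale, and the QMIP solution, follow the paper and are not reproduced here); function names and sample values are my own:

```python
import numpy as np

def pts_loss(residuals, penalties):
    """Penalized trimmed squares objective (sketch): each observation
    contributes min(r_i^2, p_i), so any point whose squared residual
    exceeds its penalty cost p_i is effectively trimmed as an outlier."""
    r2 = np.square(np.asarray(residuals, dtype=float))
    p = np.asarray(penalties, dtype=float)
    return float(np.minimum(r2, p).sum())

def eps_insensitive(residuals, eps):
    """SVM-style epsilon-insensitive loss: residuals inside the eps-tube
    contribute zero, which is the source of the sparseness property."""
    r = np.abs(np.asarray(residuals, dtype=float))
    return np.maximum(r - eps, 0.0)

# an observation with squared residual 100 pays only its penalty cost 4
total = pts_loss([1.0, 10.0], [4.0, 4.0])   # 1 + 4 = 5
tube = eps_insensitive([0.5, -2.0, 1.0], 1.0)  # [0, 1, 0]
```

The min(r_i^2, p_i) form shows why the penalties unmask high-leverage outliers: a badly fitting point cannot inflate the objective past its own penalty, so the fit is not dragged toward it.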
The IMA Volumes in Mathematics and its Applications, 1999
Data association multidimensional assignment problems appear in many applications such as MultiTarget MultiSensor Tracking and particle tracking. The problem is characterized by large input data and is very difficult to solve exactly. A Greedy Randomized Adaptive Search Procedure (GRASP) has been developed, and computational results show that good quality solutions can be obtained. Furthermore, the efficiency of the GRASP can be easily improved by parallelization of the code in the MPI environment.
The weighted maximum satisfiability (MAX-SAT) problem is central in mathematical logic, computing theory, and many industrial applications. In this paper, we present a parallel greedy randomized adaptive search procedure (GRASP) for solving MAX-SAT problems. Experimental results indicate that almost linear speedup is achieved.
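A GRASP iteration pairs a greedy randomized construction with a local search. The sketch below is a minimal serial illustration of that template for weighted MAX-SAT (it is not the paper's parallel code; the RCL-style randomization via `alpha`, the flip neighborhood, and all names are my own assumptions):

```python
import random

def grasp_maxsat(clauses, weights, n_vars, iters=50, alpha=0.3, seed=0):
    """GRASP sketch for weighted MAX-SAT.
    clauses: tuples of nonzero ints; a positive literal v means variable v
    is true. Each iteration builds a randomized greedy assignment, then
    applies a flip local search, keeping the best assignment found."""
    rng = random.Random(seed)

    def sat_weight(assign):
        # total weight of clauses satisfied by the assignment
        return sum(w for cl, w in zip(clauses, weights)
                   if any((lit > 0) == assign[abs(lit)] for lit in cl))

    best, best_w = None, -1
    for _ in range(iters):
        # construction: pick each variable's value by the clause weight it
        # can satisfy; with probability alpha, choose at random instead
        assign = {}
        for v in range(1, n_vars + 1):
            if rng.random() < alpha:
                assign[v] = rng.random() < 0.5
            else:
                gain_true = sum(w for cl, w in zip(clauses, weights) if v in cl)
                gain_false = sum(w for cl, w in zip(clauses, weights) if -v in cl)
                assign[v] = gain_true >= gain_false
        # local search: flip any variable while that improves the weight
        cur = sat_weight(assign)
        improved = True
        while improved:
            improved = False
            for v in range(1, n_vars + 1):
                assign[v] = not assign[v]
                w_new = sat_weight(assign)
                if w_new > cur:
                    cur, improved = w_new, True
                else:
                    assign[v] = not assign[v]  # revert non-improving flip
        if cur > best_w:
            best, best_w = dict(assign), cur
    return best, best_w
```

The iterations are independent, which is what makes the near-linear MPI speedup reported above plausible: each process runs its own restarts and the best solution is reduced at the end.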
2006 2nd International Conference on Information & Communication Technologies, 2006
In this paper a combinatorial algorithm is presented and implemented for the identification and measurement of apertures in a noisy image. These apertures appear in the form of highly irregular shapes and are approximated by circular disks. Three circular disks are defined for each aperture: the indisk, which is the largest disk contained in the aperture, and the outdisk, which is the...
In this paper we provide two recognition algorithms for the class of signed-graphic matroids along with necessary and sufficient conditions for a matroid to be signed-graphic. Specifically, we provide a polynomial-time algorithm which determines whether a given binary matroid is signed-graphic and an algorithm which determines whether a general matroid given by an independence oracle is binary signed-graphic.
Computation of typical statistical sample estimates such as the median or least squares fit usually requires the solution of an unconstrained optimization problem with a convex objective function, which can be solved efficiently by various methods. The presence of outliers in the data dictates the computation of a robust estimate, which can be defined as the optimum statistical estimate for a subset that contains at least half of the observations. The resulting problem is now a combinatorial optimization problem which is often computationally intractable. Classical statistical methods for multivariate location \(\varvec{\mu}\) and scatter matrix \(\varvec{\varSigma}\) estimation are based on the sample mean vector and covariance matrix, which are very sensitive to the presence of outlier observations. We propose a new method for robust location and scatter estimation which is composed of two stages. In the first stage an unbiased multivariate \(L_{1}\)-median center for all the observations is attained by a novel procedure called the least trimmed Euclidean deviations estimator. This robust median defines a coverage set of observations which is used in the second stage to iteratively compute the set of outliers which violate the correlational structure of the data set. Extensive computational experiments indicate that the proposed method outperforms existing methods in accuracy, robustness and computational time.
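The first stage is built around a multivariate L1-median. The standard way to compute the plain (untrimmed) geometric median is the Weiszfeld iteration, sketched below; note this is only an analogue of the first stage, since the paper's least trimmed Euclidean deviations estimator additionally trims observations, which this sketch omits:

```python
import numpy as np

def l1_median(X, tol=1e-8, max_iter=500):
    """Weiszfeld iteration for the multivariate L1 (geometric) median:
    the point minimizing the sum of Euclidean distances to the rows of X.
    Each step is a distance-weighted average of the observations."""
    X = np.asarray(X, dtype=float)
    m = X.mean(axis=0)  # start from the (non-robust) sample mean
    for _ in range(max_iter):
        d = np.linalg.norm(X - m, axis=1)
        d = np.where(d < 1e-12, 1e-12, d)  # guard against zero distances
        w = 1.0 / d
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            return m_new
        m = m_new
    return m
```

In the two-stage scheme described above, such a robust center defines the coverage set from which the correlational structure, and hence the outlier set, is computed iteratively.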
The material in this chapter constitutes a brief look at what can be considered as the fundamental core of matroid theory.
In this chapter we will present a set of propositions that characterize common properties of graphs, vector spaces, and transversals.
In this study a combinatorial algorithm is developed for the detection of regions that can be approximated by circular disks. Such a region has a boundary that is highly asymmetrical and jagged. Three circular disks are computed for each region: the indisk, which is the largest disk contained in the region; the outdisk, which is the smallest disk that contains the region and has the same center as the indisk; and the approximation disk, which has area equal to the area of the region and maximum intersection with the region. Two application areas motivate the proposed algorithm: meteorology, where hail pads have to be analyzed in order to determine the number and characteristics of the hailstones that collided with the pad, and biology, especially cytology, where cells have to be identified because they contain useful information that should be extracted. Computational results on a set of benchmark images from actual data are presented.
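The three disks defined above can be illustrated by a brute-force sketch on a pixelated region. This is not the paper's algorithm (which targets noisy, irregular images efficiently); it works on pixel centers, and the outdisk shown shares the indisk's center as specified above:

```python
import math

def three_disks(region):
    """Three characteristic disks of a pixel region (brute-force sketch).
    region: set of (x, y) integer pixel coordinates."""
    xs = [x for x, _ in region]
    ys = [y for _, y in region]

    def clearance(p):
        # distance from p to the nearest pixel center outside the region
        px, py = p
        return min(math.hypot(x - px, y - py)
                   for x in range(min(xs) - 1, max(xs) + 2)
                   for y in range(min(ys) - 1, max(ys) + 2)
                   if (x, y) not in region)

    center = max(region, key=clearance)           # indisk center
    r_in = clearance(center)                      # indisk radius
    # outdisk: smallest disk with the indisk's center covering the region
    r_out = max(math.hypot(x - center[0], y - center[1]) for x, y in region)
    # approximation disk: same area as the region (one pixel = unit area)
    r_approx = math.sqrt(len(region) / math.pi)
    return center, r_in, r_out, r_approx
```

For a 3x3 block of pixels the indisk centers on the middle pixel, and the approximation disk has radius sqrt(9/pi), matching the equal-area definition.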
Papers by L. Pitsoulis