We investigate the diameter problem in the streaming and sliding-window models. We show that, for a stream of n points or a sliding window of size n, any exact algorithm for diameter requires Ω(n) bits of space. We present a simple ε-approximation algorithm for computing the diameter in the streaming model. Our main result is an ε-approximation algorithm that maintains the diameter in two dimensions in the sliding-window model using $O\left(\frac{1}{\varepsilon^{3/2}} \log^3 n \left(\log R + \log\log n + \log\frac{1}{\varepsilon}\right)\right)$ bits of space, where R is the maximum, over all windows, of the ratio of the diameter to the minimum non-zero distance between any two points in the window.
We consider the problem of selecting the rth-smallest element from a list of n elements under a model where the comparisons may have different costs depending on the elements being compared. This model was introduced by [3] and is realistic in the context of comparisons between complex objects. An important special case of this general cost model is one where the comparison costs are monotone in the sizes of the elements being compared. This monotone cost model covers most "natural" cost models that arise, and the selection problem turns out to be the most challenging one among the usual problems for comparison-based algorithms. We present an O(log^2 n)-competitive algorithm for selection under the monotone cost model. This is in contrast to an Ω(n) lower bound that is known for arbitrary comparison costs. We also consider selection under a special case of monotone costs, the min model, where the cost of comparing two elements is the minimum of the sizes. We give a randomi...
A discrete-time Markov chain consists of a set of states and a transition matrix that specifies the probability of going to some next state y, given only the current state x, and thus independent of the history of the Markov chain (Black; Khamsi; Weisstein). Batu, Guha and Kannan defined the problem of "inferring a 'mixture of Markov chains' based on observing a stream of interleaved outputs from these chains" (186), and gave algorithms for solving various versions of the problem (191-199). In this project, I have created and implemented a versatile "stream generator" that generates streams of output that are used as test data by implementing Markov chains in a given state space under the various models set forth by Batu et al. (186-199). I have implemented the algorithms for both the simple mixture model and chain-dependent mixture model (under the first condition given) on disjoint state spaces as set forth by Batu et al. (191-197), and I have expe...
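As a rough illustration of the kind of interleaved stream described above, the following Python sketch advances one of several Markov chains at each step and emits its new state. It is not the stream generator from the project: the function names, the dictionary encoding of transition matrices, the fixed mixing weights `mix`, and the two toy chains are all assumptions made for this example.

```python
import random


def step(chain, state, rng):
    """Sample the next state given only the current state (Markov property)."""
    next_states = list(chain[state].keys())
    weights = list(chain[state].values())
    return rng.choices(next_states, weights=weights, k=1)[0]


def interleaved_stream(chains, starts, mix, length, seed=0):
    """Emit `length` symbols; each step advances one chain picked with weights `mix`.

    Assumption for this sketch: the chain to advance is chosen independently
    at every step, and the emitted symbol is the state that chain moves to.
    """
    rng = random.Random(seed)
    current = list(starts)
    out = []
    for _ in range(length):
        i = rng.choices(range(len(chains)), weights=mix, k=1)[0]
        current[i] = step(chains[i], current[i], rng)
        out.append(current[i])
    return out


# Two hypothetical chains on disjoint state spaces {a1, a2} and {b1, b2}.
chain_a = {"a1": {"a1": 0.5, "a2": 0.5}, "a2": {"a1": 1.0}}
chain_b = {"b1": {"b2": 1.0}, "b2": {"b1": 0.3, "b2": 0.7}}
stream = interleaved_stream([chain_a, chain_b], ["a1", "b1"], [0.6, 0.4], 20)
```

Because the state spaces are disjoint, each symbol in `stream` can be attributed to its chain by inspection, which makes this the easiest setting in which to check an inference algorithm against known transition probabilities.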
Computer systems are often monitored for performance evaluation and enhancement, debugging and testing, control, or to check for the correctness of the system. Recently, the problem of designing monitors to check for the correctness of system implementation has received increased attention from the research community. Traditionally, verification has been used to increase the confidence that a system will be correct by making sure that a design specification is correct. However, even if a design has been formally verified, this does not ensure the correctness of an implementation of the design. This is because the implementation is often much more detailed, and may not strictly follow the formal design. So, there is a possibility of introducing errors into an implementation of a design that has been verified. One way that people have traditionally tried to overcome this gap between the design and the implementation has been to test the implementation's behavior on a pre-det...
We present symmetric and asymmetric similarity measures for labeled directed rooted graphs that are inspired by the simulation and bisimulation relations on labeled transition systems. Computation of the similarity measures has close connections to discounted Markov decision processes in the asymmetric case and to perfect-information stochastic games in the symmetric case. For the symmetric case, we also give a polynomial-time algorithm that approximates the similarity to any desired precision. Comments: Postprint version. Published in Lecture Notes in Computer Science, Volume 3920, Tools and Algorithms for the Construction and Analysis of Systems: Proceedings of 12th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2006), pages 426-440. Publisher URL: http://dx.doi.org/10.1007/11691372_28 This conference paper is available at ScholarlyCommons: http://repository.upenn.edu/cis_papers/237 Simulation-Based Graph Similarity. Oleg Sokolsk...
We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following: (1) Spanner construction: There exists a single-pass, $\tilde{O}(t n^{1+1/t})$-space, $\tilde{O}(t^2 n^{1/t})$-time-per-edge algorithm that constructs a (2t + 1)-spanner. For $t = \Omega(\log n / \log\log n)$, the alg...
We introduce the \emph{pipeline intervention} problem, defined by a layered directed acyclic graph and a set of stochastic matrices governing transitions between successive layers. The graph is a stylized model for how people from different populations are presented opportunities, eventually leading to some reward. In our model, individuals are born into an initial position (i.e. some node in the first layer of the graph) according to a fixed probability distribution, and then stochastically progress through the graph according to the transition matrices, until they reach a node in the final layer of the graph; each node in the final layer has a \emph{reward} associated with it. The pipeline intervention problem asks how to best make costly changes to the transition matrices governing people's stochastic transitions through the graph, subject to a budget constraint. We consider two objectives: social welfare maximization, and a fairness-motivated maximin objective that seeks to ...
We initiate a systematic study of linear sketching over $\mathbb F_2$. For a given Boolean function $f \colon \{0,1\}^n \to \{0,1\}$ a randomized $\mathbb F_2$-sketch is a distribution $\mathcal M$ over $d \times n$ matrices with elements over $\mathbb F_2$ such that $\mathcal Mx$ suffices for computing $f(x)$ with high probability. We study a connection between $\mathbb F_2$-sketching and a two-player one-way communication game for the corresponding XOR-function. Our results show that this communication game characterizes $\mathbb F_2$-sketching under the uniform distribution (up to dependence on error). Implications of this result include: 1) a composition theorem for $\mathbb F_2$-sketching complexity of a recursive majority function, 2) a tight relationship between $\mathbb F_2$-sketching complexity and Fourier sparsity, 3) lower bounds for a certain subclass of symmetric functions. We also fully resolve a conjecture of Montanaro and Osborne regarding one-way communication compl...
The classical model of Markov decision processes with costs or rewards, while widely used to formalize optimal decision making, cannot capture scenarios where there are multiple objectives for the agent during the system evolution, but only one of these objectives gets actualized upon termination. We introduce the model of Markov decision processes with alternative objectives (MDPAO) for formalizing optimization in such scenarios. To compute the strategy to optimize the expected cost/reward upon termination, we need to figure out how to balance the values of the alternative objectives. This requires analysis of the underlying infinite-state process that tracks the accumulated values of all the objectives. While the decidability of the problem of computing the exact optimal strategy for the general model remains open, we present the following results. First, for a Markov chain with alternative objectives, the optimal expected cost/reward can be computed in polynomial time. Second, fo...
We initiate a systematic study of linear sketching over $\mathbb F_2$. For a given Boolean function treated as $f \colon \mathbb F_2^n \to \mathbb F_2$ a randomized $\mathbb F_2$-sketch is a distribution $\mathcal M$ over $d \times n$ matrices with elements over $\mathbb F_2$ such that $\mathcal Mx$ suffices for computing $f(x)$ with high probability. Such sketches for $d \ll n$ can be used to design small-space distributed and streaming algorithms. Motivated by these applications we study a connection between $\mathbb F_2$-sketching and a two-player one-way communication game for the corresponding XOR-function. We conjecture that $\mathbb F_2$-sketching is optimal for this communication game. Our results confirm this conjecture for multiple important classes of functions: 1) low-degree $\mathbb F_2$-polynomials, 2) functions with sparse Fourier spectrum, 3) most symmetric functions, 4) recursive majority function. These results rely on a new structural theorem that shows that $\mathbb F_2$-sketching is optimal (up to constant factors) for uniformly distributed inputs. Furthermore, we show that (non-uniform) streaming al...
In this paper, we address the problem of finding the minimal number of viewpoints outside a polyhedron in two or three dimensions such that every point on the exterior of the polyhedron is visible from at least one of the chosen viewpoints. This problem, which we call the minimum fortress guard problem (MFGP), is the optimization version of a variant of the art-gallery problem (sometimes called the fortress problem with point guards) and has practical importance in surveillance and image-based rendering. Solutions in the vision and graphics literature are based on image quality constraints and are not concerned with the number of viewpoints needed. The corresponding question for art galleries (minimum number of viewpoints in the interior of a polygon to see the interior of the polygon), which we call the minimum art-gallery guard problem (MAGP), has been shown to be NP-complete. A simple reduction from this problem shows the NP-completeness of MFGP. Instead of relying on heuristic searc...
We investigate the security properties of isotropic channels, broadcast media in which a receiver cannot reliably determine whether a message originated from any particular sender and a sender cannot reliably direct a message away from any particular receiver. We show that perfect isotropism implies perfect (information-theoretic) secrecy, and that asymptotically close to perfect secrecy can be achieved on any channel that provides some (bounded) uncertainty as to sender identity. We give isotropic security protocols under both passive and active adversary models, and discuss the practicality of realizing isotropic channels over various media.
We consider the problem of designing sublinear time algorithms for estimating the cost of a minimum metric traveling salesman (TSP) tour. Specifically, given access to an $n \times n$ distance matrix $D$ that specifies pairwise distances between $n$ points, the goal is to estimate the TSP cost by performing only sublinear (in the size of $D$) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any $\varepsilon > 0$, there exists an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm that returns a $(1 + \varepsilon)$-approximate estimate of the MST cost. This result immediately implies an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm to estimate the TSP cost to within a $(2 + \varepsilon)$ factor for any $\varepsilon > 0$. However, no $o(n^2)$ time algorithms are known to approximate metric TSP to a factor that is strictly better than $2$. On the other hand, there were also no known barriers that rule o...
We describe the Monitoring and Checking (MaC) framework, which provides assurance on the correctness of program execution at run-time. Our approach complements the two traditional approaches for ensuring that a system is correct, namely static analysis and testing. Unlike these approaches, which try to ensure that all possible executions of the system are correct, our approach concentrates on the correctness of the current execution of the system. The MaC architecture consists of three components: a filter, an event recognizer, and a run-time checker. The filter extracts low-level information, e.g., values of program variables and function calls, from the system code, and sends it to the event recognizer. From this low-level information, the event recognizer detects the occurrence of "abstract" requirements-level events, and informs the run-time checker about them. The run-time checker uses these events to check that the current system execution conforms to the formal requirements s...
We are given a collection of m random subsequences (traces) of a string t of length n where each trace is obtained by deleting each bit in the string with probability q. Our goal is to exactly reconstruct the string t from these observed traces. We initiate here a study of deletion rates for which we can successfully reconstruct the original string using a small number of samples. We investigate a simple reconstruction algorithm called Bitwise Majority Alignment that uses majority voting (with suitable shifts) to determine each bit of the original string. We show that for random strings t, we can reconstruct the original string (w.h.p.) for q = O(1/log n) using only O(log n) samples. For arbitrary strings t, we show that a simple modification of Bitwise Majority Alignment reconstructs a string that has identical structure to the original string (w.h.p.) for q = O(1/n^{1/2+ε}) using O(1) samples. In this case, using O(n log n) samples, we can reconstruct the original string exactly. Ou...
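To make the input model above concrete, here is a small Python sketch of how traces arise from a deletion channel: each bit of t is dropped independently with probability q. It only generates traces and does not implement Bitwise Majority Alignment; the helper names `make_trace` and `is_subsequence` and the parameter values are assumptions chosen for illustration.

```python
import random


def make_trace(t, q, rng):
    """Delete each bit of t independently with probability q; keep the rest in order."""
    return "".join(bit for bit in t if rng.random() >= q)


def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


rng = random.Random(1)
# Hypothetical input: a random binary string of length n = 32.
t = "".join(rng.choice("01") for _ in range(32))
# m = 5 traces at deletion probability q = 0.1.
traces = [make_trace(t, 0.1, rng) for _ in range(5)]
```

Every trace is a subsequence of t, but the positions of the deleted bits are not observed, which is why reconstruction needs some form of alignment across traces.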
When selecting locations for a set of centers, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given k centers to locate and a population of size n, we define the “neighborhood radius” of an individual i as the minimum radius of a ball centered at i that contains at least n/k individuals. Our objective is to ensure that each individual has a center that is within at most a small constant factor of her neighborhood radius. We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it is quite different from standard clustering algorithms and outperforms them on our objective function an...
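The neighborhood-radius definition above can be computed directly for a small point set. The Python sketch below does so under several assumptions that are not stated in the abstract: points live in the Euclidean plane, individual i counts toward her own ball, and n/k is rounded up. It evaluates the objective for a given set of centers; it is not the approximation algorithm from the paper, and the function names are hypothetical.

```python
import math


def neighborhood_radius(points, i, k):
    """Smallest radius of a ball centered at points[i] holding at least n/k points.

    Assumptions: Euclidean distance, the point i itself is counted,
    and n/k is rounded up to an integer.
    """
    need = math.ceil(len(points) / k)
    dists = sorted(math.dist(points[i], p) for p in points)
    return dists[need - 1]


def fairness_factor(points, centers, k):
    """Largest ratio (distance to nearest center) / (neighborhood radius)."""
    worst = 0.0
    for i, p in enumerate(points):
        r = neighborhood_radius(points, i, k)
        d = min(math.dist(p, c) for c in centers)
        if r == 0:
            # Degenerate case (ball of radius 0 suffices): ratio is 0 or unbounded.
            if d > 0:
                return math.inf
            continue
        worst = max(worst, d / r)
    return worst


# Hypothetical data: two groups of three collinear points, k = 2 centers.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
factor_mid = fairness_factor(points, [(1, 0), (11, 0)], k=2)   # centers at group midpoints
factor_end = fairness_factor(points, [(0, 0), (12, 0)], k=2)   # centers at the extremes
```

With these toy points, placing centers at the middle of each group gives a smaller factor than placing them at the two extremes, which matches the intuition that every individual should have a center within a small multiple of her own neighborhood radius.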
Papers by Sampath Kannan