2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), Nov 1, 2017
Fig. 1: In a sweeping task, the position of the trash (colored circles) can be considered the task parameter governing variations in the demonstrations.
Teleoperating a robot for complex and intricate tasks imposes a high mental workload on the human operator. Deploying multiple operators can mitigate this problem, but it can also be costly. Learning from Demonstrations can reduce the operator's burden by learning repetitive teleoperation tasks. Yet demonstrations given via teleoperation tend to be less consistent than other modalities of human demonstration. To handle inconsistent and asynchronous demonstrations effectively, this paper proposes a learning scheme based on Dynamic Movement Primitives. In particular, it proposes a new Expectation Maximization algorithm that synchronizes and encodes demonstrations with high temporal and spatial variance. Furthermore, we discuss two shared teleoperation architectures in which, instead of multiple human operators, a learned artificial agent and a human operator share authority over a task while teleoperating cooperatively. The agent controls the more mundane and repetitive motions in the task, whereas the human takes charge of the more critical and uncertain motions. The proposed algorithm, together with the two shared teleoperation architectures (human-synchronized and agent-synchronized shared teleoperation), has been tested and validated through simulation and experiments on 3-Degrees-of-Freedom Phantom-to-Phantom teleoperation. Both proposed shared teleoperation architectures outperformed human-only teleoperation in a peg-in-hole task.
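For readers unfamiliar with Dynamic Movement Primitives, the following is a minimal sketch of a textbook one-dimensional discrete DMP fitted to a single demonstration by locally weighted regression. It is not the EM-based synchronizing algorithm proposed in the paper; the class name `DMP1D`, the gains, and the number of basis functions are illustrative assumptions.

```python
import numpy as np

class DMP1D:
    """Textbook 1-D discrete DMP (hypothetical helper, not the paper's method)."""

    def __init__(self, n_basis=20, alpha_z=25.0, alpha_x=3.0):
        # Critically damped spring: beta_z = alpha_z / 4 (assumed gains).
        self.alpha_z, self.beta_z, self.alpha_x = alpha_z, alpha_z / 4.0, alpha_x
        # Basis centers equally spaced in time, mapped to the phase variable x.
        self.centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
        gaps = np.diff(self.centers)
        self.h = 1.0 / np.append(gaps, gaps[-1]) ** 2
        self.w = np.zeros(n_basis)

    def _psi(self, x):
        # Gaussian basis activations, shape (len(x), n_basis).
        return np.exp(-self.h * (np.atleast_1d(x)[:, None] - self.centers) ** 2)

    def fit(self, y, dt):
        y = np.asarray(y, dtype=float)
        self.y0, self.g, self.tau = y[0], y[-1], dt * (len(y) - 1)
        yd = np.gradient(y, dt)
        ydd = np.gradient(yd, dt)
        t = np.arange(len(y)) * dt
        x = np.exp(-self.alpha_x * t / self.tau)
        # Forcing term that would reproduce the demonstration exactly.
        f_target = self.tau ** 2 * ydd - self.alpha_z * (
            self.beta_z * (self.g - y) - self.tau * yd)
        s = x * (self.g - self.y0)
        psi = self._psi(x)
        # Locally weighted regression, one weight per basis function.
        num = (psi * (s * f_target)[:, None]).sum(axis=0)
        den = (psi * (s ** 2)[:, None]).sum(axis=0) + 1e-10
        self.w = num / den
        return self

    def rollout(self, dt, goal=None, duration=None):
        g = self.g if goal is None else goal
        tau = self.tau if duration is None else duration
        n = int(round(tau / dt)) + 1
        y, z, x = self.y0, 0.0, 1.0
        out = np.empty(n)
        for i in range(n):
            out[i] = y
            psi = self._psi(x)[0]
            f = psi @ self.w / (psi.sum() + 1e-10) * x * (g - self.y0)
            # Explicit Euler step of the transformation and canonical systems.
            zd = (self.alpha_z * (self.beta_z * (g - y) - z) + f) / tau
            z += zd * dt
            y += z / tau * dt
            x += -self.alpha_x * x / tau * dt
        return out
```

Calling `DMP1D().fit(demo, dt).rollout(dt, goal=new_goal)` replays the learned shape toward a different goal, which is the property that makes DMPs attractive for encoding repetitive teleoperation motions.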
Task-parameterized skill learning aims to adapt an encoded motion to new situations. While existing approaches to task-parameterized skill learning have demonstrated good adaptation within the demonstrated region, the extrapolation problem for task-parameterized skills has not been investigated sufficiently. In this work, aiming for good adaptation both within and outside the demonstrated region, we propose to combine a generative model with a Dynamic Movement Primitive (DMP) by formulating learning as a density estimation problem. Moreover, for efficient learning from relatively few demonstrations, we propose to augment the training data with additional incomplete data. The proposed method is tested and compared with existing work in simulations and real robot experiments. Experimental results verify its generalization in the extrapolation region.
This paper addresses the problem of fitting a finite Gaussian Mixture Model (GMM) with an unknown number of components to univariate and multivariate data. The typical method for fitting a GMM is Expectation Maximization (EM), which involves several challenges: how to initialize the GMM, how to keep the covariance matrix of a component from becoming singular, and how to set the number of components in advance. This paper presents a simulated annealing EM algorithm along with a systematic initialization procedure based on the principles of stochastic exploration. The experiments demonstrate the robustness of our approach on different datasets.
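To make two of the challenges named above concrete (initialization and singular covariances), here is a minimal sketch of plain EM for a univariate GMM. It is not the simulated annealing variant the paper presents: the quantile-based initialization and the variance floor `var_floor` are assumptions chosen for illustration, and the number of components `k` is still fixed in advance.

```python
import numpy as np

def fit_gmm_1d(x, k, n_iter=100, var_floor=1e-6):
    """Plain EM for a 1-D GMM with a variance floor (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Deterministic initialization (assumption): means at evenly spaced quantiles.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + var_floor)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = (np.log(weights)
                 - 0.5 * np.log(2.0 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        # Floor the variances so no component collapses onto a single point.
        variances = np.maximum(variances, var_floor)
    return weights, means, variances
```

The floor plays the same role as the covariance regularization found in common GMM libraries: without it, a component that captures a single point drives its variance to zero and the likelihood to infinity.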
2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2012
In learning-by-exploration problems such as reinforcement learning (RL), direct policy search, stochastic optimization or evolutionary computation, the goal of an agent is to maximize some form of reward function (or minimize a cost function). Often, these algorithms are designed to find a single policy solution. We address the problem of representing the space of control policy solutions by treating exploration as a density estimation problem. Such a representation provides additional information, such as the shape and curvature of local peaks, that can be exploited to analyze the discovered solutions and guide the exploration. We show that the search process can easily be generalized to multi-peaked distributions by employing a Gaussian mixture model (GMM) with an adaptive number of components. The GMM has a dual role: representing the space of possible control policies, and guiding the exploration of new policies. A variation of expectation-maximization (EM) applied to reward-weighted policy parameters is presented to model the space of possible solutions as if this space were a probability distribution. The approach is tested in a dart game experiment formulated as a black-box optimization problem, where the agent's throwing capability increases while it searches for the best strategy to play the game. This experiment is used to study how the proposed approach can exploit promising new solution alternatives in the search process when the optimality criterion slowly drifts over time. The results show that the proposed multi-optima search approach can anticipate such changes by exploiting promising candidates to adapt smoothly to the change of global optimum.
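The idea of treating exploration as density estimation can be sketched with a single Gaussian instead of the adaptive GMM used in the paper. In this simplified illustration, policy parameters are sampled from the current Gaussian, each sample is weighted by its reward, and the Gaussian is re-estimated from the weighted samples. The function name, the sample counts, and the regularization term `reg` are assumptions, and rewards are assumed to be non-negative.

```python
import numpy as np

def reward_weighted_search(reward_fn, mean, cov, n_samples=100,
                           n_iter=30, reg=1e-6, seed=0):
    """Single-Gaussian reward-weighted search (simplified sketch)."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    for _ in range(n_iter):
        # Exploration: sample candidate policy parameters.
        theta = rng.multivariate_normal(mean, cov, size=n_samples)
        r = np.array([reward_fn(t) for t in theta])
        total = r.sum()
        if total <= 0.0:
            continue  # no informative reward in this batch; resample
        w = r / total
        # Density estimation: reward-weighted mean and covariance.
        mean = w @ theta
        diff = theta - mean
        cov = (w[:, None] * diff).T @ diff + reg * np.eye(len(mean))
    return mean, cov
```

With a reward such as `lambda th: np.exp(-np.sum((th - target) ** 2))`, the mean drifts toward `target` while the covariance shrinks around it. A mixture with several components would be needed to keep track of multiple optima, as the abstract describes.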
This paper presents a complete Simultaneous Localization and Mapping (SLAM) solution for indoor mobile robots, addressing feature extraction, autonomous exploration and navigation using the continuously updated map. The platform used is a Pioneer PeopleBot equipped with a SICK Laser Measurement System (LMS) and odometry. Our algorithm uses the Hough Transform to extract the major representative features of indoor environments such as lines and
This thesis focuses on having a robot learn to play the game of darts. Playing darts involves multiple tasks, for example how to throw a dart and where to aim on the board. It is shown that, with a few demonstrations by the user, the robot can learn to produce trajectories for hitting a given point on the board, with accuracy improving with experience. In addition, we show how a robot can discover regions to aim for on the board so that it maximizes its expected score.
Papers by Affan Pervez