Deep Reinforcement Learning (DRL) has made tremendous advances in both simulated and real-world robot control tasks in recent years. Nevertheless, applying DRL to novel robot control tasks is still challenging, especially when researchers have to design the action and observation space and the reward function. In this paper, we investigate partial observability as a potential failure source of applying DRL to robot control tasks, which can occur when researchers are not confident whether the observation space fully represents the underlying state. We compare the performance of three common DRL algorithms, TD3, SAC and PPO, under various partial observability conditions. We find that TD3 and SAC easily become stuck in local optima and underperform PPO. We propose multi-step versions of the vanilla TD3 and SAC to improve robustness to partial observability based on one-step bootstrapping.
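The abstract above does not spell out the multi-step target, so the following is only a minimal sketch of the textbook n-step return that multi-step variants of off-policy methods commonly build on; it is not taken from the paper. The function name `n_step_target` and its parameters are hypothetical, and the sketch deliberately ignores episode termination, the clipped double-Q minimum used by TD3, and the entropy term used by SAC.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float],
                  bootstrap_value: float,
                  gamma: float = 0.99) -> float:
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    `rewards` holds the n rewards observed after the current step, and
    `bootstrap_value` is an assumed critic estimate at the state reached
    after those n steps (illustrative only, not the paper's exact target).
    """
    target = bootstrap_value
    # Fold the rewards in from last to first so each step applies one discount.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# With a single reward this reduces to the usual one-step bootstrapped target.
one_step = n_step_target([1.0], bootstrap_value=10.0, gamma=0.5)
two_step = n_step_target([1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

Passing a longer reward sequence makes the target depend more on observed rewards and less on the critic estimate, which is the usual motivation for multi-step targets.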
The heating of shape memory alloy (SMA) materials leads to a thermally driven phase change which can be used to do work. An SMA wire can be thermally cycled by controlling electric current through the wire, creating an electro-mechanical actuator. Such actuators are typically heated electrically and cooled through convection. The thermal time constants and lack of active cooling limit
The mechanisms of reactive sintering in a Ni + Ti powder compact were investigated using differential scanning calorimetry and microstructural analysis. Heating these mixtures up to 900°C involves the slow growth of the three intermetallic compounds and the transformation of α- to β-Ti followed by its rapid saturation with Ni. When samples were heated above 942°C, a thermal explosion (TE) mode of self-propagating high-temperature synthesis (SHS) was ignited by the melting of the (β-Ti) solid solution at 944°C. Increasing hold time at 900°C prior to SHS decreased the volume fraction of (β-Ti) in the powder compact and reduced the magnitude of combustion. The amount of (β-Ti) was quantitatively determined using DSC and found to decay according to a two-stage parabolic law. In addition, the magnitude of the exothermic reaction occurring during SHS was found to decrease linearly with a decrease in the volume fraction of (β-Ti) developed at 900°C.
2007 IEEE/ASME international conference on advanced intelligent mechatronics, 2007
The phase change in shape memory alloys (SMA) is highly nonlinear, and the development of advanced positioning applications for SMA actuators benefits from the availability of good models of this behaviour. One phenomenological model for SMA transformation kinetics is Madill’s model, which has recently been extended to include the effect of time-varying stress. This extension allows for the
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
A promising characteristic of Deep Reinforcement Learning (DRL) is its capability to learn an optimal policy in an end-to-end manner without relying on feature engineering. However, most approaches assume a fully observable state space, i.e. fully observable Markov Decision Processes (MDPs). In real-world robotics, this assumption is impractical because of issues such as sensor sensitivity limitations, sensor noise, and the lack of knowledge about whether the observation design is complete. These scenarios lead to Partially Observable MDPs (POMDPs). In this paper, we propose Long-Short-Term-Memory-based Twin Delayed Deep Deterministic Policy Gradient (LSTM-TD3) by introducing a memory component to TD3, and compare its performance with other DRL algorithms in both MDPs and POMDPs. Our results demonstrate the significant advantages of the memory component in addressing POMDPs, including the ability to handle missing and noisy observation data.
Canada’s Living Architecture Systems Group (LASG) combines scientists, engineers, architects and artists working together to create large-scale prototypes of immersive architectural spaces with qualities that come strikingly close to those of living systems. Working in interdisciplinary groups combining architects, engineers and scientists, LASG is building environments that can move, respond, and learn; environments that renew themselves with chemical exchanges and that are adaptive and empathic toward their inhabitants. This paper provides a detailed review of the history and current research of the group, including a detailed case study of Epiphyte Chamber, a complex immersive environment presented at the new Museum of Modern and Contemporary Art in Seoul, 2014. We also include a context of precedents in the rapidly evolving field of responsive architecture, and a description of the current implementation of a new proprioceptive distributed control system employing a curiosity-...
Physical agents that can autonomously generate engaging, life-like behaviour will lead to more responsive and interesting robots and other autonomous systems. Although many advances have been made for one-to-one interactions in well-controlled settings, future physical agents should be capable of interacting with humans in natural settings, including group interaction. In order to generate engaging behaviours, the autonomous system must first be able to estimate its human partners' engagement level. In this paper, we propose an approach for estimating engagement during group interaction by simultaneously taking into account active and passive interaction, i.e. occupancy, and use the measure as the reward signal within a reinforcement learning framework to learn engaging interactive behaviours. The proposed approach is implemented in an interactive sculptural system in a museum setting. We compare the learning system to a baseline using pre-scripted interactive behaviours. Analys...
Body movements are an important communication medium through which affective states can be discerned. Movements that convey affect can also give machines life-like attributes and help to create a more engaging human-machine interaction. This paper presents an approach for automatic affective movement generation that makes use of two movement abstractions: 1) Laban movement analysis (LMA), and 2) hidden Markov modeling. The LMA provides a systematic tool for an abstract representation of the kinematic and expressive characteristics of movements. Given a desired motion path on which a target emotion is to be overlaid, the proposed approach searches a labeled dataset in the LMA Effort and Shape space for similar movements to the desired motion path that convey the target emotion. An HMM abstraction of the identified movements is obtained and used with the desired motion path to generate a novel movement that is a modulated version of the desired motion path that conveys the target emot...
One of the questions frequently asked by artists and non-artists alike is whether it is possible to make ‘good’ technology art. Like any artistic medium, one can find examples of varying quality, and since evaluation is often subjective, the easy answer is “yes.” It can be difficult, however, to apply the traditional tests of quality (particularly recognition in gallery exhibitions and museum collections) to technology-mediated works, because rapid obsolescence and difficult documentation pathways make them expensive or impossible to maintain in exhibition environments. Revealing the Hylozoic Ground Interaction Layer
Shape memory alloys (SMA), materials which convert heat energy (usually through Joule-heating) to mechanical energy, provide compact and effective actuation for a variety of mechanical systems. They are attractive options to be used in an automotive context as lightweight, scalable actuators that have a very high power/weight ratio. However, the materials must be protected from overheating during actuation. This leads to the need for direct temperature measurement methods for use either in direct temperature feedback controllers or indirectly for validating models of the material’s thermo-electric behaviour. Developing a proven experimental method for measurement of wire temperature is the goal of this work. Various methods were applied and tested, including contact methods using thermocouples and thermistors, as well as non-contact infrared thermal imaging. The latter two are briefly described, while the paper focuses on our results achieved using thermocouples. Several different m...
Physical agents that can autonomously generate engaging, life-like behavior will lead to more responsive and user-friendly robots and other autonomous systems. Although many advances have been made for one-to-one interactions in well-controlled settings, physical agents should be capable of interacting with humans in natural settings, including group interaction. To generate engaging behaviors, the autonomous system must first be able to estimate its human partners’ engagement level. In this article, we propose an approach for estimating engagement during group interaction by simultaneously taking into account active and passive interaction, and use the measure as the reward signal within a reinforcement learning framework to learn engaging interactive behaviors. The proposed approach is implemented in an interactive sculptural system in a museum setting. We compare the learning system to a baseline using pre-scripted interactive behaviors. Analysis based on sensory data and survey ...
Shape Memory Alloys (SMAs) have been implemented as actuators in a wide range of applications spanning fields such as robotics, aeronautics, automotive and medicine. However, controlling SMA actuators is no simple task as they are highly nonlinear due to the inherent hysteresis. In particular, the thermal nature of the SMA phase transformation means that the surrounding ambient conditions, such as temperature and air flow, have a direct effect on the time needed for the SMA wire to actuate. For example, if the surrounding temperature is high, the wire will contract in a shorter period of time at fixed current, compared to when the surrounding temperature is low. In some applications, such as automotive, this is a very important factor to consider as one key objective in such applications is attaining consistent actuation times across a broad range of ambient conditions. Thus, the focus of this work is devising a method to actuate an SMA wire in a more consistent time regardless of t...
2018 IEEE International Conference on Robotics and Automation (ICRA), 2018
One of the main challenges of bipedal gait is to avoid falling due to unknown disturbances. Compensating for these disturbances in bipeds is often achieved by leaning or stepping. In this work, the Spherical Foot Placement Estimator (SFPE) is introduced, which uses the biped's current kinematics and dynamics to predict if a step is needed, and if so where to step, to restore balance in 3D. An example of a controller using the SFPE is shown, which augments an existing optimal controller with both leaning and stepping: SFPE-based feedback is used to generate a desired momentum for momentum-based leaning while the SFPE point is used as a control reference for stepping. The new estimator outperforms existing balance criteria by providing both recovery step location prediction and momentum objectives with smooth dynamics.
International Journal of Technology and Design Education, 2019
Many design frameworks introduced to novices are not compatible with the behaviours and habits of mind of expert designers. This creates a barrier to effective practice, especially when novice designers tackle ill-defined, wicked problems. The W-model is a pedagogical framework that provides a prescriptive design model for novices, enabling them to effectively engage with ill-defined problems. Co-evolution of the problem and solution is mandated through rapid iterations of five design phases: define, ideate, synthesize, assess and reflect. The W-model was tested on pre-college novice designers who used this framework to solve a wicked problem. Results demonstrate that students using the W-model engaged in behaviours associated with informed designers, and were able to effectively and confidently tackle a real-world problem.
Transactions of the Canadian Society for Mechanical Engineering, 2007
Macro-micro systems allow high-resolution positioning over greater ranges of operation than would be achievable with precision positioning systems. Piezoceramic actuators have established themselves as the principal technology for commercial micro-positioning applications, and the trend in research is to push the limits of resolution down to the nanometer and sub-nanometer scales. Other smart materials offer the potential for lightweight, continuous actuation over small ranges, and hence may be useful in micro-positioning applications. This work focuses on the potential for SMA actuators to enable low-cost micro-positioning. Compared to piezos, SMA actuators offer longer range and lower actuation voltages, enabling lower-cost drive electronics and removing the need for costly precision mechanical amplification stages. A prototype single-axis macro-micro positioning system is described, with a macro range of 200 mm and relative positioning precision of better than 5 μm. The micro stage is dri...
To enable long-term, engaging social human-machine interaction, robots and other autonomous systems must be able to move beyond purely reactive interaction control strategies, and engage in shared-initiative interaction. In this paper, we describe an implementation of an interactive art sculpture which generates interactive behaviors using curiosity-based learning. Using its own internal motivation formulated as a curiosity drive, the system initiates interaction with and responds to human visitors, generating continuously evolving interactive behaviors. The proposed system was tested in a user study with a prototype interactive sculpture installation.
Papers by Rob Gorbet