Adaptive Agents and Multi-Agents Systems, May 6, 2013
Virtual agent research on gesture is increasingly relying on data-driven algorithms, which require large corpora to be effectively trained. This work presents a method for automatically segmenting human motion into gesture phases based on input motion capture data. By reducing the need for manual annotation, the method allows gesture researchers to more easily build large corpora for gesture analysis and animation modeling. An effective rule set has been developed for identifying gesture phase boundaries using both joint angle and positional data of the fingers and hands. A set of Support Vector Machines, trained on a database of annotated clips, is used to classify each detected phase boundary as a stroke, preparation or retraction. The approach has been tested on motion capture data obtained from different people with varied gesturing styles and in different moods, and the results give us an indication of the extent to which variation in gesturing style affects the accuracy of segmentation.
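The two-stage pipeline this abstract describes, rule-based boundary detection followed by boundary-type classification, can be sketched at a very high level. This is an illustrative toy, not the paper's rule set: the single speed threshold, the hand-speed feature and the function names are all assumptions.

```python
def detect_phase_boundaries(hand_speeds, threshold=0.1):
    """Flag frames where hand speed crosses a threshold — a toy stand-in
    for the paper's rules over joint angle and positional data."""
    boundaries = []
    for i in range(1, len(hand_speeds)):
        below_prev = hand_speeds[i - 1] < threshold
        below_now = hand_speeds[i] < threshold
        if below_prev != below_now:  # movement started or stopped here
            boundaries.append(i)
    return boundaries

# Frames 2, 4, 6 and 7 are where the speed crosses the 0.1 threshold.
print(detect_phase_boundaries([0.0, 0.05, 0.3, 0.4, 0.08, 0.02, 0.35, 0.01]))
```

In the paper, each detected boundary would then be handed, with its surrounding motion features, to the trained SVMs for labeling as stroke, preparation or retraction.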
This paper presents a pedagogical agent designed to support students in an embodied, discovery-based learning environment. Discovery-based learning guides students through a set of activities designed to foster particular insights. In this case, the animated agent explains how to use the Mathematical Imagery Trainer for Proportionality, provides performance feedback, leads students to have different experiences and provides remedial instruction when required. It is a challenging task for agent technology as the amount of concrete feedback from the learner is very limited, here restricted to the location of two markers on the screen. A Dynamic Decision Network is used to automatically determine agent behavior, based on a deep understanding of the tutorial protocol. A pilot evaluation showed that all participants developed movement schemes supporting proto-proportional reasoning. They were able to provide verbal proto-proportional expressions for one of the taught strategies, but not the other.
The expense, danger, planning and precision required to create explosions suggest that the computational visual modelling of explosions is worthwhile. However, the short time scale at which explosions occur, and their sheer complexity, pose a difficult modelling challenge. After describing the basic phenomenology of explosion events, we present an efficient computational model of isotropic blast wave transport and an algorithm for fracturing objects in their wake. Our model is based on the notion of a blast curve that gives the force-loading profile of an explosive material on an object as a function of distance from the explosion's centre. We also describe a technique for fracturing materials loaded by a blast. Our approach is based on the notion of rapid fracture: that microfractures in a material together with loading forces seed a fracturing process that quickly spreads across the material and causes it to fragment.
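The blast-curve idea, force loading as a function of distance from the explosion's centre, can be illustrated with a toy falloff function. The inverse-square form and the constants below are assumptions for illustration only; the paper's actual blast curves are derived from explosive material properties.

```python
def blast_force(distance, charge_strength=1.0):
    """Toy blast curve: peak force loading on an object decays with
    distance from the charge centre (illustrative falloff, not fitted)."""
    d = max(distance, 1e-6)  # avoid division by zero at the centre
    return charge_strength / (d * d)

# Objects farther from the centre are loaded with less force.
for d in (0.5, 1.0, 2.0):
    print(d, blast_force(d))
```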
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019
There is evidence that adding motion-tracked avatars to virtual environments increases users' sense of presence. High-quality motion capture systems are cost-prohibitive for the average user, and low-cost, resource-constrained systems introduce various forms of error to the tracking. Much research has looked at the impact of particular kinds of error, primarily latency, on factors such as body ownership, but it is still not known what level of tracking error is permissible in these systems to afford compelling social interaction. This paper presents a series of experiments employing a sizable subject pool (n=96) that study the impact of motion tracking errors on user experience for activities including social interaction and virtual object manipulation. Diverse forms of tracking error are examined, including latency, popping (jumps in position), stuttering (positions held in time) and constant noise. The focus is on error on a person's own avatar, but some conditions also include error on an interlocutor, which appears underexplored. The picture that emerges is complex. Certain forms of error impact performance, a person's sense of embodiment, enjoyment and perceived usability, while others do not. Notably, no evidence was found that tracking errors impact social presence, even when those errors are severe.
Animation tools have benefited greatly from advances in skinning and surface deformation techniques, yet it remains difficult to author articulated character animations that display the free and highly expressive shape change that characterizes hand-drawn animation. We present a new skinning representation that allows skeletal deformation and more flexible shape control to be combined in a single framework, along with an intuitive, sketch-based interface. Our approach offers the convenience of skeletal control and smooth skinning with the functionality to embed surface deformation and animation as a core component of the skinning technique. The approach binds vertices to attachment points on the skeleton, defining a vector from bone to surface. Three types of springs are defined: intervertex springs help maintain surface relationships, springs from vertices to the attachment point help maintain appropriate bone offsets, and torsion springs around these attachment vectors help w...
We compared nonverbal expressive behavior across matched and mismatched extravert/introvert pairs. We found that participants’ gestures changed over time, adapting to the gesture style of their partner. Results will be used as the basis for the implementation of adaptable personality expression in interactive virtual agents.
The utility of an interactive tool can be measured by how pervasively it is embedded into a user's workflow. Tools for artists additionally must provide an appropriate level of control over expressive aspects of their work while suppressing unwanted intrusions due to details that are, for the moment, unnecessary. Our focus is on tools that target editing the expressive aspects of character motion. These tools allow animators to work in a way that is more expedient than modifying low-level details, and offers finer control than high level, directorial approaches. To illustrate this approach, we present three such tools, one for varying timing (succession), and two for varying motion shape (amplitude and extent). Succession editing allows the animator to vary the activation times of the joints in the motion. Amplitude editing allows the animator to vary the joint ranges covered during a motion. Extent editing allows an animator to vary how fully a character occupies space during a...
We present a corpus of 44 human-agent verbal and gestural story retellings designed to explore whether humans would gesturally entrain to an embodied intelligent virtual agent. We used a novel data collection method where an agent presented story components in installments, which the human would then retell to the agent. At the end of the installments, the human would then retell the embodied animated agent the story as a whole. This method was designed to allow us to observe whether changes in the agent’s gestural behavior would result in human gestural changes. The agent modified its gestures over the course of the story, by starting out the first installment with gestural behaviors designed to manifest extraversion, and slowly modifying gestures to express introversion over time, or the reverse. The corpus contains the verbal and gestural transcripts of the human story retellings. The gestures were coded for type, handedness, temporal structure, spatial extent, and the degree to ...
This paper presents a new corpus, the Personality Dyads Corpus, consisting of multimodal data for three conversations between three personality-matched, two-person dyads (a total of 9 separate dialogues). Participants were selected from a larger sample to be 0.8 of a standard deviation above or below the mean on the Big-Five Personality extraversion scale, to produce an Extravert-Extravert dyad, an Introvert-Introvert dyad, and an Extravert-Introvert dyad. Each pair carried out conversations for three different tasks. The conversations were recorded using optical motion capture for the body and data gloves for the hands. Dyads’ speech was transcribed and the gestural and postural behavior was annotated with ANVIL. The released corpus includes personality profiles, ANVIL files containing speech transcriptions and the gestural annotations, and BVH files containing body and hand motion in 3D.
Story-telling is a fundamental and prevalent aspect of human social behavior. In the wild, stories are told conversationally in social settings, often as a dialogue and with accompanying gestures and other nonverbal behavior. This paper presents a new corpus, the Story Dialogue with Gestures (SDG) corpus, consisting of 50 personal narratives regenerated as dialogues, complete with annotations of gesture placement and accompanying gesture forms. The corpus includes dialogues generated by human annotators, gesture annotations on the human generated dialogues, videos of story dialogues generated from this representation, video clips of each gesture used in the gesture annotations, and annotations of the original personal narratives with a deep representation of story called a Story Intention Graph. Our long term goal is the automatic generation of story co-tellings as animated dialogues from the Story Intention Graph. We expect this corpus to be a useful resource for researchers intere...
Evaluating Personality Trait Attribution Based on Gestures by Virtual Agents. Kris Liu, Jackson Tolins, Jean Fox Tree and Marilyn Walker (University of California, Santa Cruz, Santa Cruz, CA, United States); Michael Neff (University of California, Davis). Abstract: Can human-normed personality scales alone give us a full picture of the attribution of personality to virtual agents? In humans, personality traits are often immediately ascribed: first impressions, influenced by posture and gesture, can be strong, lasting and accurate. Is there a similar immediate attribution of personality for virtual agents that can be similarly influenced by posture and gesture? Our study uses an open-ended question alongside a traditional Big Five personality inventory to probe the perceived personality of agents...
Proceedings of the 8th ACM SIGGRAPH Conference on Motion in Games, 2015
Data-driven motion research requires effective tools to compress, index, retrieve and reconstruct captured motion data. In this paper, we present a novel method to perform these tasks using a deep learning architecture. Our deep autoencoder, a form of artificial neural network, encodes motion segments into "deep signatures". This signature is formed by concatenating signatures for functionally different parts of the body. The deep signature is a highly condensed representation of a motion segment, requiring only 20 bytes, yet still encoding high level motion features. It can be used to produce a very compact representation of a motion database that can be effectively used for motion indexing and retrieval, with a very small memory footprint. Database searches are reduced to low cost binary comparisons of signatures. Motion reconstruction is achieved by using Gibbs sampling to fill in a "deep signature" that is missing a section. We tested both manually and automatically segmented motion databases, and our experiments show that extracting the deep signature is fast and scales well to large databases. Given a query motion, similar motion segments can be retrieved at interactive speed with excellent match quality.
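The retrieval step, "low cost binary comparisons of signatures", can be sketched as a Hamming-distance nearest-neighbour search over 20-byte signatures. The signatures below are made up by hand; in the paper they would come from the trained autoencoder.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Bit-level distance between two fixed-length binary signatures."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def nearest_motion(query: bytes, database: dict) -> str:
    """Return the id of the stored 20-byte signature closest to the query."""
    return min(database, key=lambda name: hamming(query, database[name]))

# Toy database: two hand-made 20-byte signatures.
db = {"walk": bytes(20), "jump": bytes([0xFF] * 20)}
query = bytes([0x01] + [0x00] * 19)  # one bit away from "walk"
print(nearest_motion(query, db))  # prints "walk"
```

Because each comparison is a cheap XOR-and-popcount over 20 bytes, a linear scan of even a large database stays fast, which is the property the abstract exploits.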
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018
Embodied virtual reality faithfully renders users' movements onto an avatar in a virtual 3D environment, supporting nuanced nonverbal behavior alongside verbal communication. To investigate communication behavior within this medium, we had 30 dyads complete two tasks using a shared visual workspace: negotiating an apartment layout and placing model furniture on an apartment floor plan. Dyads completed both tasks under three different conditions: face-to-face, embodied VR with visible full-body avatars, and no embodiment VR, where the participants shared a virtual space, but had no visible avatars. Both subjective measures of users' experiences and detailed annotations of verbal and nonverbal behavior are used to understand how the media impact communication behavior. Embodied VR provides a high level of social presence with conversation patterns that are very similar to face-to-face interaction. In contrast, providing only the shared environment was generally found to be lonely and appears to lead to degraded communication.
Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization, 2018
Sensor stream data, particularly data collected at millisecond granularity, are notoriously difficult to extract classifiable signal from. Adding to the challenge is the limited domain knowledge that exists at these biological sensor levels of interaction, which prohibits a comprehensive manual feature engineering approach to classifying those streams. In this paper, we attempt to enhance the assessment capability of a touchscreen-based ratio tutoring system by using Recurrent Neural Networks (RNNs) to predict the strategy being demonstrated by students from their 60 Hz data streams. We hypothesize that the ability of neural networks to learn representations automatically, instead of relying on human feature engineering, may benefit this classification task. Our RNN and baseline models were trained and cross-validated at several levels on historical data that had been human-coded with the task strategy believed to be exhibited by the learner. Our RNN approach to this historically difficult high-frequency data classification task moderately advances performance above the baselines, and we discuss the implications this level of assessment performance has for enabling greater adaptive supports in the tutoring system.
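The core idea, letting a recurrence summarise a high-frequency stream instead of hand-engineering features, can be illustrated with a single-unit Elman-style step. The weights and threshold below are fixed by hand purely for illustration; the paper's RNNs are trained on human-coded historical data.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a toy one-unit recurrence: the hidden state h
    summarises the stream seen so far (weights are illustrative)."""
    return math.tanh(w_h * h + w_x * x)

def classify_stream(stream, threshold=0.5):
    """Run the recurrence over a sampled stream and threshold the final
    hidden state — a stand-in for a trained strategy classifier."""
    h = 0.0
    for x in stream:
        h = rnn_step(h, x)
    return "strategy_A" if h > threshold else "strategy_B"

print(classify_stream([1.0] * 10))  # sustained activity -> strategy_A
print(classify_stream([0.0] * 10))  # no activity -> strategy_B
```

The point of the sketch is that no features are extracted from the 60 Hz samples; the recurrence itself accumulates whatever evidence the classifier needs.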
Proceedings of the ACM on Computer Graphics and Interactive Techniques, 2019
Style is an intrinsic, inescapable part of human motion. It complements the content of motion to convey meaning, mood, and personality. Existing state-of-the-art motion style methods require large quantities of example data and intensive computational resources at runtime. To ensure output quality, such style transfer applications are often run on desktop machines with GPUs and significant memory. In this paper, we present a fast and expressive neural-network-based motion style transfer method that generates stylized motion with quality comparable to state-of-the-art methods, but uses much less computational power and a much smaller memory footprint. Our method also allows the output to be adjusted in a latent style space, something not offered in previous approaches. Our style transfer model is implemented using three multi-layered networks: a pose network, a timing network and a foot-contact network. A one-hot style vector serves as an input control knob and determines the styli...
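The control-knob idea, a style vector steering output in a latent style space, can be illustrated with a purely linear mixture. This is not the paper's three-network architecture; the basis vectors, weights and function name are all made up for illustration.

```python
def stylize_pose(content_pose, style_weights, style_basis):
    """Add a weighted mixture of per-style offsets to a content pose.
    A soft (non-one-hot) weight vector interpolates between styles."""
    n = len(content_pose)
    offset = [sum(w * basis[i] for w, basis in zip(style_weights, style_basis))
              for i in range(n)]
    return [c + o for c, o in zip(content_pose, offset)]

basis = [[1.0, 0.0], [0.0, 1.0]]  # toy offsets for two styles
print(stylize_pose([1.0, 2.0], [1.0, 0.0], basis))  # pure style 0
print(stylize_pose([1.0, 2.0], [0.5, 0.5], basis))  # blended styles
```

Sliding the weights between one-hot corners is what "adjusting the output in a latent style space" amounts to in this simplified picture.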
Gestures can take on complex forms that convey both pragmatic and expressive information. When creating virtual agents, it is necessary to make fine-grained manipulations of these forms to precisely adjust the gesture’s meaning to reflect the communicative content an agent is trying to deliver, the character's mood and the spatial arrangement of the characters and objects. This paper describes a gesture schema that affords the required, rich description of gesture form. Novel features include the representation of multiphase gestures consisting of several segments, repetitions of gesture form, a map of referential locations and a rich set of spatial and orientation constraints. In our prototype implementation, gestures are generated from this representation by editing and combining small snippets of motion captured data to meet the specification. This allows a very diverse set of gestures to be generated from a small set of input data. Gestures can be refined by simply adjusting the parameters of the schema.
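The kind of structured description the abstract outlines, multiphase gestures with repetitions, referential locations and spatial constraints, can be sketched as a small data structure. Every field name here is hypothetical, chosen only to mirror the features the abstract lists, not the paper's actual schema.

```python
# Hypothetical gesture description; all field names are assumptions.
gesture = {
    "phases": [
        {"type": "preparation", "hand": "right"},
        {"type": "stroke", "hand": "right", "repetitions": 2},
        {"type": "retraction", "hand": "right"},
    ],
    "referents": {"listener": (0.0, 1.0), "object": (0.4, 0.2)},
    "constraints": [{"palm": "up", "points_to": "object"}],
}

# A generator would walk the phases and assemble motion snippets for each,
# repeating a phase's snippet as many times as requested.
for phase in gesture["phases"]:
    print(phase["type"], phase.get("repetitions", 1))
```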
American Sign Language (ASL) fingerspelling is the act of spelling a word letter-by-letter when a specific sign does not exist to represent it. Synthesizing intelligible ASL, which includes fingerspelling as an integral part, is important to create signing virtual characters for training and communicating in virtual environments or further applications. The rhythm and speed of fingerspelling play a large role in how well fingerspelling is understood. Using motion capture technologies, we record fingerspelling and analyze timing information about letters in the words. Our goal is to identify fingerspelling timing information and use it to create fingerspelling animations that are natural and understandable.
Figure 1: Diverse locomotion outputs (in a happy style) synthesized by a linear mixture of combinatorial components decomposed from the 10 example set.
We explore the expression of personality and adaptivity through the gestures of virtual agents in a storytelling task. We conduct two experiments using four different dialogic stories. We manipulate agent personality on the extraversion scale, whether the agents adapt to one another in their gestural performance, and agent gender. Our results show that subjects are able to perceive the intended variation in extraversion between different virtual agents, independently of the story they are telling and the gender of the agent. A second study shows that subjects also prefer adaptive to nonadaptive virtual agents.
Little is known about the collaborative learning processes of interdisciplinary teams designing technology-enabled immersive learning systems. In this conceptual paper, we reflect on the role of digitally captured embodied performances as boundary objects within our heterogeneous two-team collective of learning scientists and computer scientists as we design an embodied, animated virtual tutor embedded in a physically immersive mathematics learning system. Beyond just a communicative resource, we demonstrate how these digitized, embodied performances constitute a powerful mode for both inter- and intra-team learning and innovation. Our work illustrates the utility of mobilizing the material conditions of learning.
Adaptive Agents and Multi-Agents Systems, May 6, 2013
Virtual agent research on gesture is increasingly relying on data-driven algorithms, which requir... more Virtual agent research on gesture is increasingly relying on data-driven algorithms, which require large corpora to be effectively trained. This work presents a method for automatically segmenting human motion into gesture phases based on input motion capture data. By reducing the need for manual annotation, the method allows gesture researchers to more easily build large corpora for gesture analysis and animation modeling. An effective rule set has been developed for identifying gesture phase boundaries using both joint angle and positional data of the fingers and hands. A set of Support Vector Machines trained from a database of annotated clips, is used to classify the type of each detected phase boundary into stroke, preparation or retraction. The approach has been tested on motion capture data obtained from different people with varied gesturing styles and in different moods and the results give us an indication of the extent to which variation in gesturing style affects the accuracy of segmentation.
This paper presents a pedagogical agent designed to support students in an embodied, discovery-ba... more This paper presents a pedagogical agent designed to support students in an embodied, discovery-based learning environment. Discovery-based learning guides students through a set of activities designed to foster particular insights. In this case, the animated agent explains how to use the Mathematical Imagery Trainer for Proportionality, provides performance feedback, leads students to have different experiences and provides remedial instruction when required. It is a challenging task for agent technology as the amount of concrete feedback from the learner is very limited, here restricted to the location of two markers on the screen. A Dynamic Decision Network is used to automatically determine agent behavior, based on a deep understanding of the tutorial protocol. A pilot evaluation showed that all participants developed movement schemes supporting proto-proportional reasoning. They were able to provide verbal proto-proportional expressions for one of the taught strategies, but not the other.
The expense, danger, planning and precision required to create explosions suggests that the compu... more The expense, danger, planning and precision required to create explosions suggests that the computational visual modelling of explosions is worthwhile. However, the short time scale at which explosions occur, and their sheer complexity, poses a difficult modelling challenge. After describing the basic phenomenology of explosion events, we present an efficient computational model of isotropic blast wave transport and an algorithm for fracturing objects in their wake. Our model is based on the notion of a blast curve that gives the force-loading profile of an explosive material on an object as a function of distance from the explosion's centre. We also describe a technique for fracturing materials loaded by a blast. Our approach is based on the notion of rapid fracture: that microfractures in a material together with loading forces seed a fracturing process that quickly spreads across the material and causes it to fragment.
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019
There is evidence that adding motion-tracked avatars to virtual environments increases users' sen... more There is evidence that adding motion-tracked avatars to virtual environments increases users' sense of presence. High quality motion capture systems are cost sensitive for the average user and low cost resource-constrained systems introduce various forms of error to the tracking. Much research has looked at the impact of particular kinds of error, primarily latency, on factors such as body ownership, but it is still not known what level of tracking error is permissible in these systems to afford compelling social interaction. This paper presents a series of experiments employing a sizable subject pool (n=96) that study the impact of motion tracking errors on user experience for activities including social interaction and virtual object manipulation. Diverse forms of error that arise in tracking are examined, including latency, popping (jumps in position), stuttering (positions held in time) and constant noise. The focus is on error on a person's own avatar, but some conditions also include error on an interlocutor, which appears underexplored. The picture that emerges is complex. Certain forms of error impact performance, a person's sense of embodiment, enjoyment and perceived usability, while others do not. Notably, evidence was not found that tracking errors impact social presence, even when those errors are severe.
Animation tools have benefited greatly from advances in skinning and surface deformation techniqu... more Animation tools have benefited greatly from advances in skinning and surface deformation techniques, yet it still remains difficult to author articulated character animations that display the free and highly expressive shape change that characterize hand-drawn animation. We present a new skinning representation that allows skeletal deformation and more flexible shape control to be combined in a single framework, along with an intuitive, sketch-based interface. Our approach offers the convenience of skeletal control and smooth skinning with the functionality to embed surface deformation and animation as a core component of the skinning technique. The approach binds vertices to attachment points on the skeleton, defining a vector from bone to surface. Three types of springs are defined: intervertex springs help maintain surface relationships, springs from vertices to the attachment point help maintain appropriate bone offsets, and torsion springs around these attachment vectors help w...
We compared nonverbal expressive behavior across matched and mismatched extravert/introvert pairs... more We compared nonverbal expressive behavior across matched and mismatched extravert/introvert pairs. We found that participants’ gestures changed over time, adapting to the gesture style of their partner. Results will be used as the basis for the implementation of adaptable personality expression in interactive virtual agents.
The utility of an interactive tool can be measured by how pervasively it is embedded into a user&... more The utility of an interactive tool can be measured by how pervasively it is embedded into a user's workflow. Tools for artists additionally must provide an appropriate level of control over expressive aspects of their work while suppressing unwanted intrusions due to details that are, for the moment, unnecessary. Our focus is on tools that target editing the expressive aspects of character motion. These tools allow animators to work in a way that is more expedient than modifying low-level details, and offers finer control than high level, directorial approaches. To illustrate this approach, we present three such tools, one for varying timing (succession), and two for varying motion shape (amplitude and extent). Succession editing allows the animator to vary the activation times of the joints in the motion. Amplitude editing allows the animator to vary the joint ranges covered during a motion. Extent editing allows an animator to vary how fully a character occupies space during a...
We present a corpus of 44 human-agent verbal and gestural story retellings designed to explore wh... more We present a corpus of 44 human-agent verbal and gestural story retellings designed to explore whether humans would gesturally entrain to an embodied intelligent virtual agent. We used a novel data collection method where an agent presented story components in installments, which the human would then retell to the agent. At the end of the installments, the human would then retell the embodied animated agent the story as a whole. This method was designed to allow us to observe whether changes in the agent’s gestural behavior would result in human gestural changes. The agent modified its gestures over the course of the story, by starting out the first installment with gestural behaviors designed to manifest extraversion, and slowly modifying gestures to express introversion over time, or the reverse. The corpus contains the verbal and gestural transcripts of the human story retellings. The gestures were coded for type, handedness, temporal structure, spatial extent, and the degree to ...
This paper presents a new corpus, the Personality Dyads Corpus, consisting of multimodal data for... more This paper presents a new corpus, the Personality Dyads Corpus, consisting of multimodal data for three conversations between three personality-matched, two-person dyads (a total of 9 separate dialogues). Participants were selected from a larger sample to be 0.8 of a standard deviation above or below the mean on the Big-Five Personality extraversion scale, to produce an Extravert-Extravert dyad, an Introvert-Introvert dyad, and an Extravert-Introvert dyad. Each pair carried out conversations for three different tasks. The conversations were recorded using optical motion capture for the body and data gloves for the hands. Dyads’ speech was transcribed and the gestural and postural behavior was annotated with ANVIL. The released corpus includes personality profiles, ANVIL files containing speech transcriptions and the gestural annotations, and BVH files containing body and hand motion in 3D.
Story-telling is a fundamental and prevalent aspect of human social behavior. In the wild, storie... more Story-telling is a fundamental and prevalent aspect of human social behavior. In the wild, stories are told conversationally in social settings, often as a dialogue and with accompanying gestures and other nonverbal behavior. This paper presents a new corpus, the Story Dialogue with Gestures (SDG) corpus, consisting of 50 personal narratives regenerated as dialogues, complete with annotations of gesture placement and accompanying gesture forms. The corpus includes dialogues generated by human annotators, gesture annotations on the human generated dialogues, videos of story dialogues generated from this representation, video clips of each gesture used in the gesture annotations, and annotations of the original personal narratives with a deep representation of story called a Story Intention Graph. Our long term goal is the automatic generation of story co-tellings as animated dialogues from the Story Intention Graph. We expect this corpus to be a useful resource for researchers intere...
Evaluating Personality Trait Attribution Based on Gestures by Virtual Agents Kris Liu University ... more Evaluating Personality Trait Attribution Based on Gestures by Virtual Agents Kris Liu University of California, Santa Cruz, Santa Cruz, CA, United States Jackson Tolins University of California, Santa Cruz, Santa Cruz, CA, United States Jean Fox Tree University of California, Santa Cruz, Santa Cruz, CA, United States Marilyn Walker University of California, Santa Cruz, Santa Cruz, CA, United States Michael Neff University of California, Davis Abstract: Can human-normed personality scales alone give us a full picture of the attribution of personality to virtual agents? In humans, personality traits are often immediately ascribed: first impressions, influenced by posture and gesture, can be strong, lasting and accurate. Is there a similar immediate attribution of personality for virtual agents that can be similarly influenced by posture and gesture? Our study uses an open-ended question alongside a traditional Big Five personality inventory to probe the perceived personality of agents...
Proceedings of the 8th ACM SIGGRAPH Conference on Motion in Games, 2015
Data-driven motion research requires effective tools to compress, index, retrieve and reconstruct captured motion data. In this paper, we present a novel method to perform these tasks using a deep learning architecture. Our deep autoencoder, a form of artificial neural network, encodes motion segments into "deep signatures". This signature is formed by concatenating signatures for functionally different parts of the body. The deep signature is a highly condensed representation of a motion segment, requiring only 20 bytes, yet still encoding high-level motion features. It can be used to produce a very compact representation of a motion database that can be effectively used for motion indexing and retrieval, with a very small memory footprint. Database searches are reduced to low-cost binary comparisons of signatures. Motion reconstruction is achieved by repairing a "deep signature" that is missing a section, using Gibbs sampling. We tested both manually and automatically segmented motion databases, and our experiments show that extracting the deep signature is fast and scales well with large databases. Given a query motion, similar motion segments can be retrieved at interactive speed with excellent match quality.
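The binary-comparison retrieval idea can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the 20-byte signatures below are hand-made stand-ins for autoencoder output, and retrieval is nearest-neighbor search under Hamming distance.

```python
def hamming(sig_a: bytes, sig_b: bytes) -> int:
    """Number of differing bits between two equal-length signatures."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sig_a, sig_b))

def retrieve(query: bytes, database: dict) -> str:
    """Return the key of the database signature closest to the query."""
    return min(database, key=lambda k: hamming(query, database[k]))

# Hypothetical 20-byte signatures standing in for encoded motion segments.
db = {
    "walk": bytes([0x0F] * 20),
    "run":  bytes([0xF0] * 20),
    "jump": bytes([0xFF] * 20),
}
query = bytes([0x0F] * 19 + [0x1F])  # differs from "walk" by one bit
print(retrieve(query, db))  # -> walk
```

Because the comparison is bitwise, a full database scan stays cheap even for large collections, which is what makes the compact signature attractive for indexing.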
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018
Embodied virtual reality faithfully renders users' movements onto an avatar in a virtual 3D environment, supporting nuanced nonverbal behavior alongside verbal communication. To investigate communication behavior within this medium, we had 30 dyads complete two tasks using a shared visual workspace: negotiating an apartment layout and placing model furniture on an apartment floor plan. Dyads completed both tasks under three different conditions: face-to-face, embodied VR with visible full-body avatars, and no-embodiment VR, in which participants shared a virtual space but had no visible avatars. Both subjective measures of users' experiences and detailed annotations of verbal and nonverbal behavior are used to understand how the media impact communication behavior. Embodied VR provides a high level of social presence, with conversation patterns that are very similar to face-to-face interaction. In contrast, providing only the shared environment was generally found to be lonely and appears to lead to degraded communication.
Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization, 2018
Sensor stream data, particularly data collected at millisecond granularity, are notoriously difficult to extract classifiable signal from. Adding to the challenge is the limited domain knowledge that exists at this level of biological sensor interaction, which prohibits a comprehensive manual feature-engineering approach to classifying these streams. In this paper, we attempt to enhance the assessment capability of a touchscreen-based ratio tutoring system by using Recurrent Neural Networks (RNNs) to predict the strategy being demonstrated by students from their 60 Hz data streams. We hypothesize that the ability of neural networks to learn representations automatically, instead of relying on human feature engineering, may benefit this classification task. Our RNN and baseline models were trained and cross-validated at several levels on historical data that had been human-coded with the task strategy believed to be exhibited by the learner. Our RNN approach to this historically difficult high-frequency data classification task moderately advances performance above the baselines, and we discuss the implications this level of assessment performance has for enabling greater adaptive supports in the tutoring system.
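The core mechanism can be sketched minimally: an RNN folds a raw stream into a hidden state and a class is read from the final state. The scalar weights and the two toy streams are invented for illustration; the paper's actual networks, features, and strategy labels are not reproduced here.

```python
import math

def rnn_classify(stream, w_in, w_rec, w_out):
    """Minimal Elman-style RNN: fold a 1-D sensor stream into a single
    hidden state, then pick the class with the highest linear score."""
    h = 0.0
    for x in stream:
        h = math.tanh(w_in * x + w_rec * h)  # recurrent update
    scores = [w * h for w in w_out]
    return scores.index(max(scores))

# Toy weights and streams (hypothetical; a real model learns these).
W_IN, W_REC, W_OUT = 1.0, 0.5, [1.0, -1.0]
print(rnn_classify([0.2, 0.4, 0.6], W_IN, W_REC, W_OUT))     # upward drift -> class 0
print(rnn_classify([-0.2, -0.4, -0.6], W_IN, W_REC, W_OUT))  # downward drift -> class 1
```

The point is that no hand-built features are needed: the recurrence itself accumulates whatever temporal pattern separates the classes.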
Proceedings of the ACM on Computer Graphics and Interactive Techniques, 2019
Style is an intrinsic, inescapable part of human motion. It complements the content of motion to convey meaning, mood, and personality. Existing state-of-the-art motion style methods require large quantities of example data and intensive computational resources at runtime. To ensure output quality, such style transfer applications are often run on desktop machines with GPUs and significant memory. In this paper, we present a fast and expressive neural-network-based motion style transfer method that generates stylized motion with quality comparable to state-of-the-art methods, but uses much less computational power and a much smaller memory footprint. Our method also allows the output to be adjusted in a latent style space, something not offered in previous approaches. Our style transfer model is implemented using three multi-layered networks: a pose network, a timing network and a foot-contact network. A one-hot style vector serves as an input control knob and determines the styli...
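The control-knob role of a style vector can be illustrated with a toy blend. Nothing here matches the paper's three-network architecture; it only shows how a one-hot vector selects one style's parameters while a soft vector interpolates between styles (all names and numbers are invented).

```python
def one_hot(index, size):
    """Style vector with a 1.0 at the chosen style."""
    v = [0.0] * size
    v[index] = 1.0
    return v

def stylize(pose, style_vec, style_offsets):
    """Add per-style pose offsets, weighted by the style vector."""
    return [p + sum(w * style_offsets[s][i] for s, w in enumerate(style_vec))
            for i, p in enumerate(pose)]

neutral = [0.0, 1.0, 0.5]        # toy 3-DOF pose
offsets = [[0.1, 0.0, 0.0],      # hypothetical style 0: "proud"
           [0.0, -0.2, 0.0]]     # hypothetical style 1: "old"
print(stylize(neutral, one_hot(0, 2), offsets))  # pure style 0
print(stylize(neutral, [0.5, 0.5], offsets))     # 50/50 blend in style space
```

Relaxing the one-hot constraint to a soft vector is what makes adjustment in a continuous style space possible.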
Gestures can take on complex forms that convey both pragmatic and expressive information. When creating virtual agents, it is necessary to make fine-grained manipulations of these forms to precisely adjust a gesture's meaning to reflect the communicative content the agent is trying to deliver, the character's mood, and the spatial arrangement of the characters and objects. This paper describes a gesture schema that affords the required, rich description of gesture form. Novel features include the representation of multiphase gestures consisting of several segments, repetitions of gesture form, a map of referential locations, and a rich set of spatial and orientation constraints. In our prototype implementation, gestures are generated from this representation by editing and combining small snippets of motion capture data to meet the specification. This allows a very diverse set of gestures to be generated from a small set of input data. Gestures can be refined by simply adjusting the parameters of the schema.
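One way to picture such a schema is as a small structured record: phases, a repetition count, and a map of referential locations that an animation back end would resolve against motion snippets. This data layout is our own hypothetical sketch, not the paper's actual format.

```python
from dataclasses import dataclass, field

@dataclass
class GesturePhase:
    hand_shape: str
    target: str            # key into the schema's referent map
    palm_orientation: str

@dataclass
class GestureSchema:
    phases: list                      # multiphase gestures list several segments
    repetitions: int = 1              # repeated beats reuse the same form
    referent_map: dict = field(default_factory=dict)  # name -> 3-D location

    def expanded_phases(self):
        """Unroll repetitions into the concrete phase sequence."""
        return self.phases * self.repetitions

# A twice-repeated pointing gesture aimed at a named referent.
point = GestureSchema(
    phases=[GesturePhase("index-extended", "listener", "palm-down")],
    repetitions=2,
    referent_map={"listener": (0.4, 1.5, 1.0)},
)
print(len(point.expanded_phases()))  # two beats of the same pointing form
```

Refining a gesture then amounts to editing fields of this record rather than re-authoring motion data.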
American Sign Language (ASL) fingerspelling is the act of spelling a word letter by letter when a specific sign does not exist to represent it. Synthesizing intelligible ASL, which includes fingerspelling as an integral part, is important for creating signing virtual characters for training and communication in virtual environments and further applications. The rhythm and speed of fingerspelling play a large role in how well fingerspelling is understood. Using motion capture technologies, we record fingerspelling and analyze timing information about the letters in words. Our goal is to identify fingerspelling timing information and use it to create fingerspelling animations that are natural and understandable.
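The kind of timing analysis described can be sketched as simple interval statistics over per-letter onset times; the timestamps below are invented for illustration, not measured capture data.

```python
def letter_intervals(onsets):
    """Inter-letter intervals (seconds) from per-letter onset times."""
    return [b - a for a, b in zip(onsets, onsets[1:])]

def letters_per_second(onsets):
    """Average fingerspelling rate over the whole word."""
    return (len(onsets) - 1) / (onsets[-1] - onsets[0])

# Hypothetical onsets (s) for the letters of a five-letter word.
onsets = [0.00, 0.25, 0.55, 0.80, 1.00]
print(letter_intervals(onsets))
print(letters_per_second(onsets))  # four letter transitions over 1.0 s
```

Rhythm (the interval pattern) and speed (the overall rate) can then be reapplied to synthesized handshapes to keep the animation natural.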
Figure 1: Diverse locomotion outputs (in a happy style) synthesized by a linear mixture of combinatorial components decomposed from the 10-example set.
We explore the expression of personality and adaptivity through the gestures of virtual agents in a storytelling task. We conduct two experiments using four different dialogic stories. We manipulate agent personality on the extraversion scale, whether the agents adapt to one another in their gestural performance, and agent gender. Our results show that subjects are able to perceive the intended variation in extraversion between different virtual agents, independently of the story being told and the gender of the agent. A second study shows that subjects also prefer adaptive to nonadaptive virtual agents.
Little is known about the collaborative learning processes of interdisciplinary teams designing technology-enabled immersive learning systems. In this conceptual paper, we reflect on the role of digitally captured embodied performances as boundary objects within our heterogeneous two-team collective of learning scientists and computer scientists as we design an embodied, animated virtual tutor embedded in a physically immersive mathematics learning system. Beyond just a communicative resource, we demonstrate how these digitized embodied performances constitute a powerful mode for both inter- and intra-team learning and innovation. Our work illustrates the utility of mobilizing the material conditions of learning.
Papers by Michael Neff