Recent studies have shown that deep learning achieves excellent performance in reconstructing 3D scenes from multiview images or videos. However, these reconstructions do not provide the identities of objects, and object identification is necessary for a scene to be functional in virtual reality or interactive applications. The objects in a scene reconstructed as one mesh are treated as a single object rather than as individual entities that can be interacted with or manipulated. Reconstructing an object-aware 3D scene from a single 2D image is challenging because projecting a 3D scene onto a 2D image is irreversible and discards a dimension. To alleviate the effects of this dimension reduction, we propose a module that generates depth features to aid the 3D pose estimation of objects. Additionally, we develop a novel approach to mesh reconstruction that combines two decoders that estimate 3D shapes with different shape representation...
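The two-decoder design can be sketched in code. The following is a minimal, illustrative PyTorch example; because the abstract does not specify the two shape representations, the sketch assumes (hypothetically) that one decoder predicts a coarse voxel occupancy grid and the other predicts per-vertex offsets for a template mesh, and the encoder, dimensions, and names are all illustrative rather than the paper's actual architecture.

```python
# Minimal sketch of a two-decoder shape estimator (assumed representations:
# voxel grid + template-mesh vertex offsets; not the paper's exact design).
import torch
import torch.nn as nn

class TwoDecoderShapeNet(nn.Module):
    def __init__(self, feat_dim=256, n_vertices=642, voxel_res=32):
        super().__init__()
        # Shared image encoder (stand-in for a real backbone such as ResNet).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Decoder A: coarse volumetric occupancy.
        self.voxel_decoder = nn.Linear(feat_dim, voxel_res ** 3)
        # Decoder B: offsets that deform a template mesh.
        self.offset_decoder = nn.Linear(feat_dim, n_vertices * 3)
        self.voxel_res = voxel_res
        self.n_vertices = n_vertices

    def forward(self, image):
        f = self.encoder(image)
        voxels = torch.sigmoid(self.voxel_decoder(f)).view(
            -1, self.voxel_res, self.voxel_res, self.voxel_res)
        offsets = self.offset_decoder(f).view(-1, self.n_vertices, 3)
        return voxels, offsets

# Usage on a dummy single image (batch of 1).
net = TwoDecoderShapeNet()
vox, off = net(torch.randn(1, 3, 128, 128))
print(vox.shape, off.shape)  # (1, 32, 32, 32) and (1, 642, 3)
```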
Three-dimensional (3D) reconstruction techniques are playing an increasingly important role in education and entertainment. Realistic and recognizable avatars can enhance the immersion and interactivity of virtual systems. In 3D face modeling, the face texture carries vital face-recognition information. Therefore, this study proposes a panoramic 3D face texture generation method for 3D face reconstruction from a single 2D face image based on a 3D Morphable Model (3DMM). Realistic and comprehensive panoramic facial textures are obtained by using generative networks as texture converters. Furthermore, we propose a low-cost data collection method for building face texture datasets. Experimental results show that the proposed method can generate panoramic face textures for 3D face meshes from a single input image, yielding textured 3D models that look realistic from different viewpoints.
Excessive training time is a major issue faced when training autonomous vehicle agents with neural networks that use images as input. This paper proposes a time-economical input image preprocessing method for training a deep Q network (DQN)-based autonomous vehicle agent in a virtual environment. Environmental information is extracted from the virtual environment, and a top-view image of the entire environment is redrawn according to this information. During training of the DQN model, the top-view image is cropped so that the vehicle agent lies at the center of the cropped image. The current frame's top-view image is then combined with the images from the previous two training iterations, and the DQN model uses this combined image as input. The experimental results indicate higher performance and shorter training time for the DQN model trained with the preprocessed images compared with the model trained without preprocessing.
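The described preprocessing can be illustrated with a short sketch. The three-frame stack follows the abstract; the crop size, array layout, and helper names below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of the agent-centered crop and three-frame stack, assuming
# a grayscale top-view map stored as a NumPy array (sizes are hypothetical).
import numpy as np
from collections import deque

CROP = 84  # side length of the cropped window (illustrative choice)

def crop_centered(top_view, agent_xy, size=CROP):
    """Crop a size x size window from the top view, centered on the agent."""
    pad = size // 2
    # Pad so crops near the map border stay valid.
    padded = np.pad(top_view, pad, mode="constant")
    cx, cy = agent_xy[0] + pad, agent_xy[1] + pad
    return padded[cy - pad:cy + pad, cx - pad:cx + pad]

class FrameStacker:
    """Stack the current crop with the two previous ones (3-channel input)."""
    def __init__(self):
        self.frames = deque(maxlen=3)

    def push(self, crop):
        while len(self.frames) < 3:   # bootstrap: repeat the first frame
            self.frames.append(crop)
        self.frames.append(crop)
        return np.stack(self.frames, axis=0)  # shape (3, CROP, CROP)

# Usage with a dummy 500x500 map and an agent at pixel (120, 240).
stacker = FrameStacker()
dqn_input = stacker.push(crop_centered(np.zeros((500, 500)), (120, 240)))
print(dqn_input.shape)  # (3, 84, 84)
```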
Recently, virtual environment-based techniques for training sensor-based autonomous driving models have been widely employed due to their efficiency. However, a simulated virtual environment must be highly similar to its real-world counterpart to ensure the applicability of such models to actual autonomous vehicles. Although advances in hardware and three-dimensional graphics engine technology have enabled the creation of realistic virtual driving environments, the myriad scenarios occurring in the real world can only be simulated to a limited extent. In this study, a scenario simulation and modeling framework that simulates the behavior of objects that may be encountered while driving is proposed to address this problem. The framework maximizes the number and variety of scenarios and enriches the driving experience in a virtual environment. Furthermore, a simulator was implemented and used to evaluate the performance of the proposed framework.
Today, multimedia technologies are playing an increasingly important role in games, movies, and live performances. In this paper, we design a flexible interactive system that integrates gesture recognition, skeleton tracking, internet communication, and content editing using multiple sensors to direct and control a stage performance. With this system, the performer can control the elements shown on stage through corresponding gestures and body movements during the performance. The system also provides an easy way for users to change the content of the performance when they intend to do so.
Applications related to smart cities require virtual cities in the experimental development stage. To build a virtual city that is close to a real city, a large number of diverse human models must be created. To reduce the cost of acquiring such models, this paper proposes a method to reconstruct 3D human meshes from single images captured with an ordinary camera. It presents a method for reconstructing the complete mesh of the human body from a single RGB image using a generative adversarial network consisting of a newly designed shape–pose-based generator (based on deep convolutional neural networks) and an enhanced multi-source discriminator. With this machine learning approach, the reliance on multiple sensors is reduced and 3D human meshes can be recovered from a single camera, thereby reducing the cost of building smart cities. The proposed method achieves an accuracy of 92.1% in body shape recovery and can process 34 images per second. The method proposed in this pape...
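The multi-source discriminator idea admits a short sketch. The version below is a minimal PyTorch illustration that assumes SMPL-style parameter dimensions (10 shape, 72 pose) and three scoring heads; the paper's actual discriminator inputs and architecture may differ.

```python
# Minimal sketch of a multi-source discriminator: separate realism scores for
# shape, pose, and the joint pair (dimensions are assumed, SMPL-style).
import torch
import torch.nn as nn

class MultiSourceDiscriminator(nn.Module):
    def __init__(self, shape_dim=10, pose_dim=72):
        super().__init__()
        self.shape_head = nn.Sequential(
            nn.Linear(shape_dim, 32), nn.ReLU(), nn.Linear(32, 1))
        self.pose_head = nn.Sequential(
            nn.Linear(pose_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.joint_head = nn.Sequential(
            nn.Linear(shape_dim + pose_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, shape, pose):
        # One realism score per source; the generator learns to fool all three.
        return torch.cat([
            self.shape_head(shape),
            self.pose_head(pose),
            self.joint_head(torch.cat([shape, pose], dim=1)),
        ], dim=1)

d = MultiSourceDiscriminator()
scores = d(torch.randn(4, 10), torch.randn(4, 72))
print(scores.shape)  # torch.Size([4, 3])
```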
In this paper, we propose a novel approach to improve the positioning accuracy of multiple low-cost drones in indoor environments. While the drones are flying, we employ sensors to check their positions in real time. If a drone moves out of its correct position, corrective instructions are sent immediately. In a separate thread, we calibrate the direction of the drones by checking the yaw value. The adjustment is repeated until the drones reach the correct position and orientation.
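The two-thread correction loop can be sketched as follows. The helper functions (read_position, read_yaw, send_command) and the tolerances are hypothetical stand-ins for the actual sensor and drone APIs, which the abstract does not name.

```python
# Minimal sketch of parallel position/yaw correction loops (all APIs stubbed).
import threading
import time

POS_TOL = 0.05   # meters (illustrative tolerance)
YAW_TOL = 2.0    # degrees (illustrative tolerance)

def correct_position(drone, target_xy, read_position, send_command):
    while True:
        x, y = read_position(drone)
        dx, dy = target_xy[0] - x, target_xy[1] - y
        if abs(dx) < POS_TOL and abs(dy) < POS_TOL:
            break
        send_command(drone, "move", dx=dx, dy=dy)  # immediate correction
        time.sleep(0.1)

def correct_yaw(drone, target_yaw, read_yaw, send_command):
    while True:
        err = target_yaw - read_yaw(drone)
        if abs(err) < YAW_TOL:
            break
        send_command(drone, "rotate", yaw=err)
        time.sleep(0.1)

def calibrate(drone, target_xy, target_yaw, read_position, read_yaw, send_command):
    # Position and yaw run on separate threads, mirroring the description above.
    t1 = threading.Thread(target=correct_position,
                          args=(drone, target_xy, read_position, send_command))
    t2 = threading.Thread(target=correct_yaw,
                          args=(drone, target_yaw, read_yaw, send_command))
    t1.start(); t2.start()
    t1.join(); t2.join()

# Usage with stub sensor/command functions standing in for real hardware.
state = {"pos": [0.0, 0.0], "yaw": 10.0}
def read_position(d): return tuple(state["pos"])
def read_yaw(d): return state["yaw"]
def send_command(d, cmd, **kw):
    if cmd == "move":
        state["pos"][0] += kw["dx"]; state["pos"][1] += kw["dy"]
    else:
        state["yaw"] += kw["yaw"]

calibrate("drone-1", (1.0, 2.0), 90.0, read_position, read_yaw, send_command)
print(state)  # converges to the target position and yaw
```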
This study focuses on reconstructing accurate meshes with high-resolution textures from single images. The reconstruction process involves two networks: a mesh-reconstruction network and a texture-reconstruction network. The mesh-reconstruction network estimates a deformation map, which is used to deform a template mesh to the shape of the target object in the input image, together with a low-resolution texture. We propose reconstructing a mesh with a high-resolution texture by enhancing the low-resolution texture with a super-resolution method. The architecture of the texture-reconstruction network resembles that of a generative adversarial network, comprising a generator and a discriminator. During the training of the texture-reconstruction network, the discriminator must focus on learning high-quality texture prediction and ignore the difference between the generated mesh and the actual mesh. To achieve this objective, we used meshes reconstructed using the mesh-reconstruction...
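The texture super-resolution step can be illustrated with a compact sketch. The three-layer, SRCNN-style residual network below is purely illustrative; the abstract does not specify the actual super-resolution architecture, so every layer and dimension here is an assumption.

```python
# Minimal sketch of enhancing a low-resolution texture with a small residual
# super-resolution network (architecture assumed, not the paper's).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureSR(nn.Module):
    def __init__(self, scale=4):
        super().__init__()
        self.scale = scale
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 3, 5, padding=2),
        )

    def forward(self, low_res_texture):
        # Upsample first, then let the network restore high-frequency detail.
        up = F.interpolate(low_res_texture, scale_factor=self.scale,
                           mode="bicubic", align_corners=False)
        return up + self.body(up)  # residual prediction

sr = TextureSR()
high_res = sr(torch.randn(1, 3, 64, 64))
print(high_res.shape)  # torch.Size([1, 3, 256, 256])
```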
Human-centric Computing and Information Sciences, 2020
To develop a realistic simulator for autonomous vehicle testing, it is necessary to simulate the various scenarios that may occur near vehicles in the real world. In this paper, we propose a new scenario generation pipeline focused on generating scenarios in a specific area near an autonomous vehicle. In this method, a scenario map is generated to define the scenario simulation area. A convolutional neural network (CNN)-based scenario agent selector is introduced to evaluate whether the selected agents can generate a realistic scenario, and a collision event detector handles collision messages to trigger accident events. The proposed event-centric action dispatcher enables agents near events to perform related actions when the events occur near the autonomous vehicle. The proposed pipeline can generate scenarios containing pedestrians, animals, and vehicles, and, advantageously, no user intervention is required during the simulation. In additi...
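The event-centric action dispatcher admits a brief sketch. The radius, event types, and the Agent interface below are illustrative assumptions; the abstract only states that agents near an event perform related actions.

```python
# Minimal sketch of an event-centric action dispatcher: agents within a radius
# of an event perform the action registered for that event type (all names
# and values are hypothetical).
import math

class ActionDispatcher:
    def __init__(self, radius=30.0):
        self.radius = radius   # agents within this range react to an event
        self.handlers = {}     # event type -> action name

    def register(self, event_type, action):
        self.handlers[event_type] = action

    def dispatch(self, event_type, event_pos, agents):
        """Make every agent near the event perform the registered action."""
        action = self.handlers.get(event_type)
        if action is None:
            return
        for agent in agents:
            if math.dist(agent.position, event_pos) <= self.radius:
                agent.react(action)

# Usage with a stub agent class.
class Agent:
    def __init__(self, position):
        self.position = position
    def react(self, action):
        print(f"agent at {self.position}: {action}")

dispatcher = ActionDispatcher()
dispatcher.register("collision", "stop_and_look")
dispatcher.dispatch("collision", (10.0, 5.0),
                    [Agent((12.0, 6.0)), Agent((90.0, 90.0))])
```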
Nowadays, deep learning methods based on virtual environments are widely applied to research and technology development for the smart sensors and devices of autonomous vehicles. Learning various driving environments in advance is important for handling unexpected situations that can arise in the real world and for continuing to drive without accidents. To train the smart sensors and devices of an autonomous vehicle well, a virtual simulator should create scenarios covering various possible real-world situations. To create reality-based scenarios, data on the real environment must be collected from a real driving vehicle, or a scenario analysis process must be conducted by experts. However, both approaches increase the time and cost of scenario generation as more scenarios are created. This paper proposes a deep learning-based scenario generation method that creates scenarios automatically for training autonomous vehicle smart sensors and devices. To generate various scenarios, the proposed method ex...
International Journal of Advanced Robotic Systems, 2018
Clustering plays an important role in processing light detection and ranging (LiDAR) points in the autonomous perception tasks of robots. Clustering usually occurs near the start of processing the three-dimensional point clouds obtained from LiDAR for detection and classification; therefore, errors caused by clustering directly affect detection and classification accuracy. In this article, a clustering method is presented that combines density-based spatial clustering of applications with noise (DBSCAN) with a two-dimensional range image composed of LiDAR scan lines ordered by generation time. The results show that the proposed method achieves state-of-the-art performance in terms of time efficiency and clustering accuracy. A scan line-based ground extraction method is also presented in this article, which has a strong ability to separate ground points from non-ground points.
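The range-image side of the method can be sketched as follows: neighboring cells whose range values are close are merged by breadth-first search, with the column dimension wrapping around a full 360° scan. The threshold and image size are illustrative, and this sketch omits the DBSCAN component that the paper combines with it.

```python
# Minimal sketch of clustering on a 2D LiDAR range image via BFS region
# growing (threshold and sizes are illustrative; DBSCAN stage omitted).
import numpy as np
from collections import deque

def cluster_range_image(rng, max_diff=0.3):
    """rng: (rows, cols) range image in meters; 0 marks empty cells.
    Returns an integer label image (0 = empty)."""
    rows, cols = rng.shape
    labels = np.zeros((rows, cols), dtype=int)
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if rng[r, c] == 0 or labels[r, c] != 0:
                continue
            next_label += 1
            labels[r, c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = cr + dr, (cc + dc) % cols  # columns wrap (360°)
                    if 0 <= nr < rows and labels[nr, nc] == 0 \
                            and rng[nr, nc] > 0 \
                            and abs(rng[nr, nc] - rng[cr, cc]) < max_diff:
                        labels[nr, nc] = next_label
                        queue.append((nr, nc))
    return labels

# Usage with a toy 4x8 range image containing two separate surfaces.
img = np.zeros((4, 8)); img[:2, :4] = 5.0; img[2:, 4:] = 12.0
print(cluster_range_image(img))  # two distinct labels
```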
Recently, diverse virtual reality devices have been developed and utilized. In particular, devices that recognize user motions, such as gripping and opening the hands, have emerged to use these motions as an input method. Traditional research on motion recognition estimates a user's motion by calculating Bayesian probabilities after measuring the orientation of the motion with a Myo, a contact-type motion recognition device. However, these estimation methods suffer from low accuracy because the orientation is defined by x, y, and z values that are each calculated considering only the corresponding axis. To improve motion estimation accuracy, motions should be estimated by considering the values of all axes. This paper proposes a method that uses a genetic algorithm to calculate weights, which are applied when estimating motions through Bayesian probability over the values of all axes after measuring the user's motions with a Myo. The proposed method consists of three steps. First, the Bayesian probability is calculated by considering the correlations of the x, y, and z components of the Myo's orientation. Second, the weights are determined by applying a genetic algorithm. Third, motions are estimated through the Bayesian probability with the determined weights. Experiments comparing the Bayesian probability of the traditional min/max-based method and the proposed method showed that the proposed method reduced the orientation difference by 32%.
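The weighted Bayesian scoring step can be illustrated briefly. The sketch below assumes Gaussian per-axis likelihoods and treats the weights as given (the paper tunes them with a genetic algorithm); the motion models and all numbers are hypothetical.

```python
# Minimal sketch of per-axis weighted Bayesian motion estimation (Gaussian
# likelihoods assumed; GA-tuned weights are supplied here as fixed values).
import math

# Hypothetical per-motion, per-axis Gaussian parameters (mean, std), as if
# learned from recorded Myo orientation data.
MODELS = {
    "grip": {"x": (0.2, 0.1), "y": (0.5, 0.2), "z": (-0.1, 0.15)},
    "open": {"x": (0.7, 0.1), "y": (0.1, 0.2), "z": (0.3, 0.15)},
}

def gaussian_pdf(v, mean, std):
    return math.exp(-((v - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def estimate_motion(orientation, weights):
    """orientation: dict with x/y/z values; weights: per-axis exponents."""
    best, best_score = None, -1.0
    for motion, axes in MODELS.items():
        score = 1.0
        for axis, (mean, std) in axes.items():
            # Weighting each axis likelihood couples all axes into one score.
            score *= gaussian_pdf(orientation[axis], mean, std) ** weights[axis]
        if score > best_score:
            best, best_score = motion, score
    return best

print(estimate_motion({"x": 0.65, "y": 0.15, "z": 0.25},
                      weights={"x": 1.2, "y": 0.8, "z": 1.0}))  # "open"
```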
Remote conferencing systems provide a shared environment where people in different locations can communicate and collaborate in real time. Currently, remote video conferencing systems present separate video images of the individual participants. To achieve a more realistic conference experience, we enhance video conferencing by integrating the remote images into a shared virtual environment. This paper proposes a collaborative client participant fusion system using a real-time foreground segmentation method. In each client system, the foreground pixels are extracted from the participant images using a feedback background modeling method. Because the segmentation results often contain noise and holes caused by adverse environmental lighting conditions and substandard camera resolution, a Markov Random Field model is applied in the morphological operations of dilation and erosion. This foreground segmentation refining process is implemented using graphics processing unit programming to facilitate real-time image processing. Subsequently, the segmented foreground pixels are transmitted to a server, which fuses the remote images of the participants into a shared virtual environment. The fused conference scene is presented as a realistic holographic projection.
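The mask-refinement step can be sketched in a few lines. The CPU-only version below uses plain morphological opening and closing to remove speckle noise and fill holes; it omits the MRF guidance and the GPU implementation described above, and the structuring-element size is an illustrative choice.

```python
# Minimal sketch of refining a noisy foreground mask: opening removes speckle
# noise, closing fills small holes (MRF guidance and GPU execution omitted).
import numpy as np
from scipy import ndimage

def refine_mask(mask, structure_size=3):
    """mask: boolean foreground mask from background subtraction."""
    structure = np.ones((structure_size, structure_size), dtype=bool)
    opened = ndimage.binary_opening(mask, structure=structure)    # drop noise
    closed = ndimage.binary_closing(opened, structure=structure)  # fill holes
    return closed

# Usage: a toy mask with one foreground blob, one hole, and one speckle.
m = np.zeros((10, 10), dtype=bool)
m[2:8, 2:8] = True   # foreground blob
m[4, 4] = False      # hole inside the blob
m[0, 9] = True       # isolated speckle
print(refine_mask(m).astype(int))
```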
2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData)
This paper proposes a photorealistic 3D city simulation method for training autonomous vehicles. The proposed method incorporates human, animal, vehicle, and traffic light simulation. To generate natural actions for humans and animals, a motivation-based approach is first applied; a Q-network is then used to select optimal goals depending on the motivations, and action plans are made based on a hierarchical task network. For vehicles, affinity propagation, data augmentation, and a convolutional neural network are employed to generate realistic driving data for vehicle movement simulation. A traffic light system is also implemented based on rules derived from real-life observation. The results of experiments in which a virtual city was created demonstrate that the proposed method can simulate city environments naturally. The proposed method can be applied to various smart city applications, such as autonomous vehicle training systems.
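The motivation-to-goal selection step can be sketched compactly: a small Q-network maps a motivation vector to one value per candidate goal, and the agent picks the highest-valued goal before handing it to the planner. The motivation dimensions, goal names, and layer sizes below are illustrative assumptions.

```python
# Minimal sketch of motivation-driven goal selection with a tiny Q-network
# (dimensions and goal names are hypothetical, not from the paper).
import torch
import torch.nn as nn

GOALS = ["eat", "rest", "wander"]

class GoalQNet(nn.Module):
    def __init__(self, n_motivations=3, n_goals=len(GOALS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_motivations, 32), nn.ReLU(),
            nn.Linear(32, n_goals),
        )

    def forward(self, motivation):
        return self.net(motivation)  # one Q-value per candidate goal

qnet = GoalQNet()
motivation = torch.tensor([[0.9, 0.2, 0.1]])  # e.g., hunger, fatigue, curiosity
goal = GOALS[qnet(motivation).argmax(dim=1).item()]
print(goal)  # the selected goal would then seed an HTN planner for actions
```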