Purpose: A typical clinical MR examination includes multiple scans to acquire images with different contrasts for complementary diagnostic information. The multicontrast scheme requires long scanning time. The combination of partially parallel imaging and compressed sensing (CS-PPI) has been used to reconstruct accelerated scans. However, several problems in existing methods remain unsolved. The goal of this work is to improve existing CS-PPI methods for multicontrast imaging, especially for two-dimensional imaging. Theory and Methods: If the same field of view is scanned in multicontrast imaging, there is a significant amount of sharable information. This study proposes using manifold sharable information among multicontrast images to enhance CS-PPI in a sequential way. Coil sensitivity information and structure-based adaptive regularization, extracted from previously reconstructed images, were applied to enhance the subsequent reconstructions. The proposed method is called Parallel-imaging and compressed-sensing Reconstruction Of Multicontrast Imaging using SharablE information (PROMISE). Results: Using L1-SPIRiT as a CS-PPI example, results on multicontrast brain and carotid scans demonstrated that a lower error level and better detail preservation can be achieved by exploiting manifold sharable information. Moreover, the advantage of PROMISE persists in the presence of interscan motion. Conclusion: Using the sharable information among multicontrast images can enhance CS-PPI with tolerance to motion.
Positron emission tomography (PET) is a widely used molecular imaging modality for various clinical applications. With magnetic resonance imaging (MRI) providing anatomical information, simultaneous PET/MR reduces the radiation risk. Both improved hardware and algorithms have been developed to further reduce the radiotracer dose, but these methods have not yet been applied at very low doses. Here, we propose a deep learning based method to enable ultra-low-dose PET denoising with multi-contrast information from simultaneous MRI. Methods: The method is implemented to denoise 18F-fluorodeoxyglucose (FDG) brain PET images from low-dose images with 200-fold dose reduction through undersampling, and evaluated on glioblastoma (GBM) patients. Comprehensive quantitative and qualitative evaluations were conducted to verify the performance and clinical applicability of the proposed method, including quantitative accuracy evaluation, visual quality evaluation, and a reader study with manual tumor segmentation to evaluate diagnostic quality. Results: The results demonstrate that the proposed method achieves superior performance and efficiency compared with state-of-the-art denoising methods. Conclusion: Though reconstructed from scans with only 0.5% of the standard dose, the denoised ultra-low-dose PET images deliver visual quality and diagnostic information similar to the standard-dose PET images. By combining PET and MR information, the proposed deep learning based method improves image quality of ultra-low-dose PET, preserves diagnostic quality, and potentially enables much safer, faster, and more cost-effective PET/MR studies.
BACKGROUND AND PURPOSE: In this prospective, multicenter, multireader study, we evaluated the impact on both image quality and quantitative image-analysis consistency of 60% accelerated volumetric MR imaging sequences processed with a commercially available, vendor-agnostic, DICOM-based deep learning tool (SubtleMR) compared with that of standard of care. MATERIALS AND METHODS: Forty subjects underwent brain MR imaging examinations on 6 scanners from 5 institutions. Standard of care and accelerated datasets were acquired for each subject, and the accelerated scans were enhanced with deep learning processing. Standard of care, accelerated, and accelerated-deep learning datasets were subjected to NeuroQuant quantitative analysis and classified by a neuroradiologist into clinical disease categories. Concordance of standard of care and accelerated-deep learning biomarker measurements was assessed. Randomized, side-by-side, multiplanar datasets (360 series) were presented blinded to 2 neuroradiologists and rated for apparent SNR, image sharpness, artifacts, anatomic/lesion conspicuity, image contrast, and gray-white differentiation to evaluate image quality. RESULTS: Accelerated-deep learning was statistically superior to standard of care for perceived quality across imaging features despite a 60% sequence scan-time reduction. Both accelerated-deep learning and standard of care were superior to accelerated scans for all features. There was no difference in quantitative volumetric biomarkers or clinical classification between standard of care and accelerated-deep learning datasets. CONCLUSIONS: Deep learning reconstruction allows a 60% sequence scan-time reduction while maintaining high volumetric quantification accuracy, consistent clinical classification, and what radiologists perceive as superior image quality compared with standard of care.
This trial supports the reliability, efficiency, and utility of deep learning-based enhancement for quantitative imaging. Shorter scan times may increase the use of volumetric quantitative MR imaging in routine clinical settings. ABBREVIATIONS: DL = deep learning; FAST = accelerated scan; HOC = hippocampal occupancy score; HV = hippocampal volumes; ILV = inferior lateral ventricles; MCI = mild cognitive impairment; SLV = superior lateral ventricles; SOC = standard of care. Deep learning (DL) is a subset of machine learning that uses convolutional neural networks to process large volumes of data.1-6 While traditional reconstruction techniques can be limited by long scan times, SNR constraints, and motion artifacts,
Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network under the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initializing the pruned parameters from zero, and retraining the whole dense network. Experiments show that DSD training can improve the performance of a wide range of CNNs, RNNs, and LSTMs on the tasks of image classification, caption generation, and speech recognition. On ImageNet, DSD improved the Top-1 accuracy of GoogLeNet by 1.1%, VGG-16 by 4.3%, ResNet-18 by 1.2%, and ResNet-50 by 1.1%. On the WSJ'93 dataset, DSD improved DeepSpeech and DeepSpeech2 WER by 2.0% and 1.1%, respectively. On the Flickr-8K dataset, DSD improved the NeuralTalk BLEU score by over 1.7. DSD is easy to use in practice: at training time, DSD incurs only one extra hyper-parameter, the sparsity ratio in the S step. At testing time, DSD does not change the network architecture or incur any inference overhead. The consistent and significant performance gain of the DSD experiments shows the inadequacy of current training methods for finding the best local optimum, while DSD effectively achieves superior optimization performance by finding a better solution. DSD models are available for download at https://songhan.github.io/DSD. * Indicates equal contribution. † Also at NVIDIA. ‡ Now at Google Brain.
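The dense-sparse-dense schedule described above can be sketched on a toy weight vector. This is an illustrative assumption-laden sketch, not the authors' implementation: the pruning rule (zero the smallest-magnitude fraction given by the sparsity ratio) follows the abstract, while the function names and toy "retraining" update are made up for demonstration.

```python
def prune_smallest(weights, sparsity):
    """Sparse step: zero the `sparsity` fraction of weights with the
    smallest magnitude; return (pruned_weights, keep_mask)."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value; the first n_prune get pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [True] * len(weights)
    for i in order[:n_prune]:
        mask[i] = False
    pruned = [w if keep else 0.0 for w, keep in zip(weights, mask)]
    return pruned, mask

def masked_update(weights, grads, mask, lr=0.1):
    """Retrain under the sparsity constraint: pruned weights stay at zero."""
    return [w - lr * g if keep else 0.0
            for w, g, keep in zip(weights, grads, mask)]

def redense(weights):
    """Re-Dense step: lift the constraint; pruned weights restart from zero
    and every weight is trainable again (the mask is simply dropped)."""
    return list(weights)

dense = [0.9, -0.05, 0.4, 0.01, -0.7]            # after the Dense step
sparse, mask = prune_smallest(dense, sparsity=0.4)
assert sparse == [0.9, 0.0, 0.4, 0.0, -0.7]      # two smallest weights pruned
sparse = masked_update(sparse, [0.1] * 5, mask)  # pruned entries remain zero
assert sparse[1] == 0.0 and sparse[3] == 0.0
final = redense(sparse)                          # all 5 weights trainable again
```

At test time nothing changes: the final network is dense again, which matches the abstract's claim of zero inference overhead.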
The Android app we developed takes video frames of a human face from the camera as input and outputs a fusion image of extracted facial features and contours together with a motion distribution map. The motion distribution map is generated from a micro-expression heatmap with special color added. The brightness of the color is scaled by the magnitude of motion in each area of the face. The client, an Android device, obtains the initial locations of the eyes and mouth. Covariance-based image registration is used to generate the motion distribution of facial features on the server side. The fusion image generated with this information is then sent back to the client for display. From this fusion image, users can learn about micro changes in facial features and thus interpret the human emotion. Since more than just key points of facial features are extracted, we expect full utilization of our data to give precise interpretation provided a robust scoring system of motions of different facial features and cont...
Vitals are vital! Patient vitals provide critical information essential to the successful diagnosis and management of patients in every healthcare setting, particularly the acute-care setting. Heart rate (HR), and to a greater degree heart rate variability, as well as overall patient mobility, can indicate the presence of severe disease states. Practical and technical constraints currently limit monitoring in healthcare settings to select (e.g., admitted) patients, leaving numerous others unmonitored and potentially at risk. We'd like to use computer vision to provide a non-invasive solution to this acute problem. To tackle this problem, we extended Haar cascade detectors and SIFT-based tracking and face recognition for robust continuous face detection and assignment. Applying independent component analysis (ICA) based blind signal decomposition, we were able to accurately extract heart rate from video. Additionally, we proposed a method based on multi-scale spatial-temporal correla...
reconstruction are optimized for multi-contrast scans. The sampling trajectory is optimized for new acquisition. The sharable information of structure and sensitivity is exploited to enhance reconstruction. PROMISE: Parallel Reconstruction with Optimized acquisition for Multi-contrast Imaging in the context of compressed Sensing. Enhao Gong, Feng Huang, Kui Ying, Wenchuan Wu, Shi Wang, Chun Yuan, and Randy Duensing. Electrical Engineering, Stanford University, Stanford, CA, United States; Philips Research Asia Shanghai, Shanghai, China; Department of Engineering Physics, Tsinghua University, Beijing, China; Center for Biomedical Imaging Research, Department of Biomedical Engineering, Tsinghua University, Beijing, China; Department of Radiology, University of Washington, Seattle, WA, United States; Philips Healthcare, Gainesville, FL, United States
IMPORTANCE Predicting infarct size and location is important for decision-making and prognosis in patients with acute stroke. OBJECTIVES To determine whether a deep learning model can predict final infarct lesions using magnetic resonance images (MRIs) acquired at initial presentation (baseline) and to compare the model with current clinical prediction methods. DESIGN, SETTING, AND PARTICIPANTS In this multicenter prognostic study, a specific type of neural network for image segmentation (U-net) was trained, validated, and tested using patients from the Imaging Collaterals in Acute Stroke (iCAS) study from April 14, 2014, to April 15, 2018, and the Diffusion Weighted Imaging Evaluation for Understanding Stroke Evolution Study-2 (DEFUSE-2) study from July 14, 2008, to September 17, 2011 (reported in October 2012). Patients underwent baseline perfusion-weighted and diffusion-weighted imaging and MRI at 3 to 7 days after baseline. Patients were grouped into unknown, minimal, partial, and major reperfusion status based on 24-hour imaging results. Baseline images acquired at presentation were inputs, and the final true infarct lesion at 3 to 7 days was considered the ground truth for the model. The model calculated the probability of infarction for every voxel, which can be thresholded to produce a prediction. Data were analyzed from July 1, 2018, to March 7, 2019. MAIN OUTCOMES AND MEASURES Area under the curve, Dice score coefficient (DSC) (a metric from 0-1 indicating the extent of overlap between the prediction and the ground truth; a DSC of ≥0.5 represents significant overlap), and volume error. Current clinical methods were compared with model performance in subgroups of patients with minimal or major reperfusion.
RESULTS Among the 182 patients included in the model (97 women [53.3%]; mean [SD] age, 65 [16] years), the deep learning model achieved a median area under the curve of 0.92 (interquartile range [IQR], 0.87-0.96), DSC of 0.53 (IQR, 0.31-0.68), and volume error of 9 (IQR, −14 to 29) mL. In subgroups with minimal (DSC, 0.58 [IQR, 0.31-0.67] vs 0.55 [IQR, 0.40-0.65]; P = .37) or major (DSC, 0.48 [IQR, 0.29-0.65] vs 0.45 [IQR, 0.15-0.54]; P = .002) reperfusion for which comparison with existing clinical methods was possible, the deep learning model had comparable or better performance. CONCLUSIONS AND RELEVANCE The deep learning model appears to have successfully predicted infarct lesions from baseline imaging without reperfusion information and achieved comparable performance to existing clinical methods. Predicting the subacute infarct lesion may help clinicians prepare for decompression treatment and aid in patient selection for neuroprotective clinical trials.
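The Dice score coefficient reported in the study is a standard overlap metric; the following minimal sketch shows how a voxel-wise infarct probability map would be thresholded and scored against a ground-truth mask. The toy values and the 0.5 probability cutoff are illustrative assumptions, not the study's data.

```python
def dice(pred, truth):
    """DSC = 2|P ∩ T| / (|P| + |T|) over binary masks (1 = infarct voxel)."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * intersection / total if total else 1.0

def threshold(probs, cutoff=0.5):
    """Binarize the model's per-voxel infarct probabilities."""
    return [1 if p >= cutoff else 0 for p in probs]

probs = [0.9, 0.7, 0.2, 0.6, 0.1, 0.8]  # toy per-voxel probabilities
truth = [1,   1,   0,   0,   0,   1]    # toy ground-truth lesion mask
pred = threshold(probs)
score = dice(pred, truth)               # 2*3 / (4+3) ≈ 0.857
assert pred == [1, 1, 0, 1, 0, 1]
assert abs(score - 6 / 7) < 1e-12
```

A DSC of 1 means perfect overlap and 0 means none, which is why the study treats ≥0.5 as significant overlap.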
Purpose: To improve the image quality of highly accelerated multi-channel MRI data by learning a joint variational network that reconstructs multiple clinical contrasts jointly. Methods: Data from our multi-contrast acquisition were embedded into the variational network architecture, where shared anatomical information is exchanged by mixing the input contrasts. Complementary k-space sampling across imaging contrasts and Bunch-Phase/Wave-Encoding were used for data acquisition to improve the reconstruction at high accelerations. At 3T, our joint variational network approach across T1w, T2w, and T2-FLAIR weighted brain scans was tested for retrospective under-sampling at R=6 (2D) and R=4x4 (3D) acceleration. Prospective acceleration was also performed for 3D data, where the combined acquisition time for whole-brain coverage at 1 mm isotropic resolution across three contrasts was less than three minutes. Results: Across all test datasets, our joint multi-contrast network better preserved fine anatomical details with reduced image blurring when compared to the corresponding single-contrast reconstructions. Improvement in image quality was also obtained through complementary k-space sampling and Bunch-Phase/Wave-Encoding, where the synergistic combination yielded the overall best performance, as evidenced by exemplary slices and quantitative error metrics. Conclusion: By leveraging shared anatomical structures across the jointly reconstructed scans, our joint multi-contrast approach learned more efficient regularizers, which helped to retain natural image appearance and avoid over-smoothing. When synergistically combined with advanced encoding techniques, the performance was further improved, enabling up to R=16-fold acceleration with good image quality. This should help pave the way to very rapid high-resolution brain exams.
Deep learning is a form of machine learning using a convolutional neural network architecture that shows tremendous promise for imaging applications. It is increasingly being adapted from its original demonstration in computer vision applications to medical imaging. Because of the high volume and wealth of multimodal imaging information acquired in typical studies, neuroradiology is poised to be an early adopter of deep learning. Compelling deep learning research applications have been demonstrated, and their use is likely to grow rapidly. This review article describes the reasons for this promise, outlines the basic methods used to train and test deep learning models, and presents a brief overview of current and potential clinical applications, with an emphasis on how they are likely to change future neuroradiology practice. Facility with these methods among neuroimaging researchers and clinicians will be important to channel and harness the vast potential of this new method.
Proceedings of the National Academy of Sciences, 2014
Significance: For complex biological processes, the formation of protein complexes is a strategy for coordinating the activities of many enzymes in space and time. It has been hypothesized that growth of the bacterial cell wall involves stable synthetic complexes, but neither the existence of such complexes nor the consequences of such a mechanism for growth efficiency have been demonstrated. Here, we use single-molecule tracking to demonstrate that the association between an essential cell wall synthesis enzyme and the cytoskeleton is highly dynamic, which allows the cell to buffer growth rate against large fluctuations in enzyme abundance. This indicates that dynamic association can be an efficient strategy for coordination of multiple enzymes, especially those for which excess abundance can be harmful to cells.
We demonstrate Optical-Resolution Photoacoustic Microscopy (OR-PAM) in which the optical field is focused and scanned using Digital Phase Conjugation through a multimode fiber. The focus is scanned across the field of view by digital means, and the induced acoustic signal is collected by a transducer. Optical-resolution photoacoustic images of a knot made of two absorptive wires are obtained, and we report a resolution finer than 1.5 μm across a 201 μm × 201 μm field of view. The use of a multimode optical fiber for the optical excitation can pave the way for miniature endoscopes that provide optical-resolution photoacoustic images at large optical depths.
Multi-contrast Magnetic Resonance Imaging (MRI) acquisitions from a single scan have tremendous potential to streamline exams and reduce imaging time. However, maintaining clinically feasible scan time necessitates significant undersampling, pushing the limits of compressed sensing and other low-dimensional techniques. One possible solution is to use undersampling designs that can effectively improve the acquisition and achieve higher reconstruction accuracy. However, existing undersampling optimization methods are time-consuming, and their limited performance prevents clinical application. In this paper, we propose an improved undersampling trajectory optimization scheme, named OUTCOMES, that generates an optimized trajectory within seconds and applies it to subsequent multi-contrast MRI datasets on a per-subject basis. By using a data-driven method combined with improved algorithm design, GPU acceleration, and more efficient computation, ...
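As background for the kind of undersampling design discussed above, the sketch below builds a simple variable-density Cartesian mask: a fully sampled central k-space band plus randomly chosen outer phase-encode lines. All parameters (acceleration factor, central fraction, random seeding) are illustrative assumptions; the paper's data-driven OUTCOMES optimization is not reproduced here.

```python
import random

def vd_mask(n_lines, accel=4, center_frac=0.08, seed=0):
    """Return a 0/1 mask over phase-encode lines at ~`accel`-fold undersampling,
    keeping a fully sampled center band of width n_lines * center_frac."""
    rng = random.Random(seed)
    center = int(n_lines * center_frac)
    lo = n_lines // 2 - center // 2
    hi = lo + center
    # Always sample the low-frequency center of k-space.
    mask = [1 if lo <= i < hi else 0 for i in range(n_lines)]
    # Randomly add outer lines until the budget n_lines/accel is met.
    budget = n_lines // accel
    outer = [i for i in range(n_lines) if not (lo <= i < hi)]
    rng.shuffle(outer)
    for i in outer:
        if sum(mask) >= budget:
            break
        mask[i] = 1
    return mask

m = vd_mask(256, accel=4)
assert sum(m) == 64                        # 4-fold undersampling overall
assert all(m[i] for i in range(118, 138))  # central 20 lines fully sampled
```

Trajectory optimization methods like the one proposed effectively replace the random outer-line choice with a learned, subject-adapted selection.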
Arterial Spin Labeling (ASL) is a popular non-invasive neuroimaging technique that uses MRI for quantitative Cerebral Blood Flow (CBF) mapping. However, ASL usually suffers from poor signal quality, and repeated measurements are typically acquired to improve signal quality through averaging, at the cost of long scan time. In this work, a deep learning algorithm is proposed to leverage both convolutional neural network (CNN) based image enhancement and complementary/mutual information from multiple tissue contrasts in the ASL acquisition. Both quantitative and qualitative evaluations demonstrate the performance and stability of the proposed algorithm and its superiority over conventional denoising algorithms and standard deep learning based denoising. The results demonstrate the feasibility of efficient, high-quality ASL measurements from average-free fast acquisition, which will enable broader clinical application of ASL.
Magnetic resonance image (MRI) reconstruction is a severely ill-posed linear inverse task demanding time- and resource-intensive computations that can substantially trade off accuracy for speed in real-time imaging. In addition, state-of-the-art compressed sensing (CS) analytics are not cognizant of the image diagnostic quality. To cope with these challenges, we put forth a novel CS framework that draws on generative adversarial networks (GANs) to train a (low-dimensional) manifold of diagnostic-quality MR images from historical patients. Leveraging a mixture of least-squares (LS) GANs and a pixel-wise ℓ1 cost, a deep residual network with skip connections is trained as the generator that learns to remove the aliasing artifacts by projecting onto the manifold. The LSGAN learns the texture details, while the ℓ1 cost controls the high-frequency noise. A multilayer convolutional neural network is then jointly trained based on diagnostic-quality images to discriminate the projection quality. The test phase performs feed-forward propagation over the generator network, which demands very low computational overhead. Extensive evaluations are performed on a large contrast-enhanced MR dataset of pediatric patients. In particular, ratings by expert radiologists corroborate that GANCS retrieves high-contrast images with detailed texture relative to conventional CS and pixel-wise schemes. In addition, it offers reconstruction within a few milliseconds, two orders of magnitude faster than state-of-the-art CS-MRI schemes. * The authors are with Stanford University, Departments of Electrical Engineering, Radiology, Radiation Oncology, and Computer Science.
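The generator objective described in the abstract, a least-squares GAN term mixed with a pixel-wise ℓ1 cost, can be sketched with scalars standing in for images. The mixing weight `lam` and all values are illustrative assumptions, not the paper's settings.

```python
def lsgan_generator_term(d_scores):
    """LSGAN generator loss: mean (D(G(x)) - 1)^2 over generated samples,
    pushing the discriminator score toward the 'real' label 1."""
    return sum((d - 1.0) ** 2 for d in d_scores) / len(d_scores)

def l1_term(recon, target):
    """Pixel-wise l1 cost between reconstruction and reference image."""
    return sum(abs(r - t) for r, t in zip(recon, target)) / len(recon)

def generator_loss(d_scores, recon, target, lam=10.0):
    # Adversarial term recovers texture; the l1 term suppresses
    # high-frequency noise, per the abstract's division of labor.
    return lsgan_generator_term(d_scores) + lam * l1_term(recon, target)

d_scores = [0.8, 0.5]        # discriminator outputs on generated images
recon = [0.2, 0.4, 0.9]      # toy reconstructed pixel values
target = [0.1, 0.4, 0.7]     # toy fully-sampled reference
loss = generator_loss(d_scores, recon, target)
assert abs(lsgan_generator_term(d_scores) - 0.145) < 1e-12
assert abs(l1_term(recon, target) - 0.1) < 1e-12
assert abs(loss - 1.145) < 1e-12
```

The least-squares form replaces the usual log-loss of a standard GAN, which is what gives LSGAN its smoother gradients for the generator.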
Studying oxygen metabolism during brain ischemia has been limited by imaging modalities that have either low resolution but good tissue penetration, or high resolution requiring invasive preparations (open-skull windows). Here, we report a study of cerebrovascular oxygen metabolism in mouse focal ischemia using a novel, minimally invasive, high-resolution technique known as optical-resolution photoacoustic microscopy (OR-PAM). Using OR-PAM, we serially imaged cortical vessels through the intact skull of Swiss Webster mice (n=3) before, during, and up to 72 hrs after a 1-hr intraluminal transient middle cerebral artery occlusion (MCAO). Vessels were imaged at 2 different wavelengths (532 and 563 nm) to quantify oxy-hemoglobin, deoxy-hemoglobin, and oxygen saturation (sO2; Fig. A-C). The oxygen extraction fraction (OEF) was calculated from venous sO2 values (assuming a constant arterial sO2 of 95%). Seventy-two hrs after ischemia, animals were sacrificed and whole brains were stained with TTC to d...
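The OEF calculation mentioned above follows directly from the stated assumption of a constant arterial sO2 of 95%: OEF = (SaO2 - SvO2) / SaO2. The venous values in this worked sketch are made-up illustrations, not the study's measurements.

```python
def oef(venous_so2, arterial_so2=0.95):
    """Oxygen extraction fraction from venous oxygen saturation,
    assuming constant arterial saturation (95% per the abstract)."""
    return (arterial_so2 - venous_so2) / arterial_so2

baseline_venous = 0.76   # hypothetical pre-occlusion venous sO2
ischemic_venous = 0.57   # hypothetical venous sO2 during MCAO

assert abs(oef(baseline_venous) - 0.2) < 1e-9
assert abs(oef(ischemic_venous) - 0.4) < 1e-9  # OEF rises as extraction increases
```

A drop in venous sO2 at fixed arterial sO2 thus reads directly as increased oxygen extraction by the tissue.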
Purpose: With rising safety concerns over the use of gadolinium-based contrast agents (GBCAs) in contrast-enhanced MRI, there is a need for dose reduction while maintaining diagnostic capability. This work proposes comprehensive technical solutions for a deep learning (DL) model that predicts contrast-enhanced images of the brain with approximately 10% of the standard dose, across different sites and scanners. Methods: The proposed DL model consists of a set of methods that improve model robustness and generalizability. The steps include multi-planar reconstruction, a 2.5D model, and enhancement-weighted L1, perceptual, and adversarial losses. The proposed model predicts contrast-enhanced images from corresponding pre-contrast and low-dose images. With IRB approval and informed consent, 640 heterogeneous patient scans (56 train, 13 validation, and 571 test) from 3 institutions consisting of 3D T1-weighted brain images were used. Quantitative metrics were computed and 50 randomly sampled te...
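One of the losses listed above, the enhancement-weighted L1, can be sketched as a pixel-wise L1 error reweighted toward regions that enhance between the pre-contrast and contrast-enhanced images. The specific weighting scheme used here (1 + alpha * enhancement) and all values are illustrative assumptions, not the paper's exact formula.

```python
def enhancement_weighted_l1(pred, target, pre_contrast, alpha=5.0):
    """Mean L1 error, upweighted where target - pre_contrast is large."""
    total = 0.0
    for p, t, pre in zip(pred, target, pre_contrast):
        enhancement = max(t - pre, 0.0)     # uptake relative to pre-contrast
        weight = 1.0 + alpha * enhancement  # emphasize enhancing pixels
        total += weight * abs(p - t)
    return total / len(pred)

pre    = [0.1, 0.2, 0.3]   # toy pre-contrast intensities
target = [0.1, 0.8, 0.3]   # middle pixel enhances strongly
pred   = [0.2, 0.6, 0.3]   # toy model prediction

plain = sum(abs(p - t) for p, t in zip(pred, target)) / 3
weighted = enhancement_weighted_l1(pred, target, pre)
assert weighted > plain    # errors on the enhancing pixel count more
```

The intent of such a weighting is that errors in enhancing lesions, the clinically critical regions, dominate the training signal rather than being averaged away over mostly non-enhancing background.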
Purpose: A typical clinical MR examination includes multiple scans to acquire images with differe... more Purpose: A typical clinical MR examination includes multiple scans to acquire images with different contrasts for complementary diagnostic information. The multicontrast scheme requires long scanning time. The combination of partially parallel imaging and compressed sensing (CS-PPI) has been used to reconstruct accelerated scans. However, there are several unsolved problems in existing methods. The target of this work is to improve existing CS-PPI methods for multicontrast imaging, especially for two-dimensional imaging. Theory and Methods: If the same field of view is scanned in multicontrast imaging, there is significant amount of sharable information. It is proposed in this study to use manifold sharable information among multicontrast images to enhance CS-PPI in a sequential way. Coil sensitivity information and structure based adaptive regularization, which were extracted from previously reconstructed images, were applied to enhance the following reconstructions. The proposed method is called Parallel-imaging and compressed-sensing Reconstruction Of Multicontrast Imaging using SharablE information (PROMISE). Results: Using L 1-SPIRiT as a CS-PPI example, results on multicontrast brain and carotid scans demonstrated that lower error level and better detail preservation can be achieved by exploiting manifold sharable information. Besides, the privilege of PROMISE still exists while there is interscan motion. Conclusion: Using the sharable information among multicontrast images can enhance CS-PPI with tolerance to motions.
Positron emission tomography (PET) is a widely used molecular imaging modality for various clinic... more Positron emission tomography (PET) is a widely used molecular imaging modality for various clinical applications. With Magnetic Resonance Imaging (MRI) providing anatomical information, simultaneous PET/MR reduces the radiation risk. Both improved hardware and algorithms have been developed to further reduce the amount of radiotracer dosage, but these methods are not yet applied to very low dose. Here, we propose a Deep Learning based method to enable ultra-low-dose PET denoising with multi-contrast information from simultaneous MRI. Methods:The method is implemented to denoise 18F-fluorodeoxyglucose (FDG) brain PET images from low-dose images with 200-fold dose reduction through undersampling, and evaluated for glioblastoma (GBM) patients. Comprehensive quantitative and qualitative evaluations were conducted to verify the performance and clinical applicability of the proposed method, including quantitative accuracy evaluation, visual quality evaluation, reader study with manual tumor segmentation to evaluate the diagnostic quality. Results:The results demonstrate that the proposed method achieves superior results in performance and efficiency comparing with the state-of-art denoising methods. Conclusion:Though reconstructed from scans with only 0.5% of the standard dose, the denoised ultra-low-dose PET images deliver similar visual quality and diagnostic information as the standard-dose PET images. By combining PET and MR information, the proposed Deep Learning based method improves image quality of ultra-low-dose PET, preserves diagnostic quality, and potentially enables much safer, faster, and more cost-effective PET/MR studies.
BACKGROUND AND PURPOSE: In this prospective, multicenter, multireader study, we evaluated the imp... more BACKGROUND AND PURPOSE: In this prospective, multicenter, multireader study, we evaluated the impact on both image quality and quantitative image-analysis consistency of 60% accelerated volumetric MR imaging sequences processed with a commercially available, vendor-agnostic, DICOM-based, deep learning tool (SubtleMR) compared with that of standard of care. MATERIALS AND METHODS: Forty subjects underwent brain MR imaging examinations on 6 scanners from 5 institutions. Standard of care and accelerated datasets were acquired for each subject, and the accelerated scans were enhanced with deep learning processing. Standard of care, accelerated scans, and accelerated-deep learning were subjected to NeuroQuant quantitative analysis and classified by a neuroradiologist into clinical disease categories. Concordance of standard of care and accelerated-deep learning biomarker measurements were assessed. Randomized, side-by-side, multiplanar datasets (360 series) were presented blinded to 2 neuroradiologists and rated for apparent SNR, image sharpness, artifacts, anatomic/lesion conspicuity, image contrast, and gray-white differentiation to evaluate image quality. RESULTS: Accelerated-deep learning was statistically superior to standard of care for perceived quality across imaging features despite a 60% sequence scan-time reduction. Both accelerated-deep learning and standard of care were superior to accelerated scans for all features. There was no difference in quantitative volumetric biomarkers or clinical classification for standard of care and accelerated-deep learning datasets. CONCLUSIONS: Deep learning reconstruction allows 60% sequence scan-time reduction while maintaining high volumetric quantification accuracy, consistent clinical classification, and what radiologists perceive as superior image quality compared with standard of care. 
This trial supports the reliability, efficiency, and utility of deep learning-based enhancement for quantitative imaging. Shorter scan times may increase the use of volumetric quantitative MR imaging in routine clinical settings. ABBREVIATIONS: DL = deep learning; FAST = accelerated scan; HOC = hippocampal occupancy score; HV = hippocampal volumes; ILV = inferior lateral ventricles; MCI = mild cognitive impairment; SLV = superior lateral ventricles; SOC = standard of care. Deep learning (DL) is a subset of machine learning that uses convolutional neural networks to process large volumes of data. 1-6 While traditional reconstruction techniques can be limited by long scan times, SNR constraints, and motion artifacts,
Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network under the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initializing the pruned parameters from zero, and retraining the whole dense network. Experiments show that DSD training can improve the performance of a wide range of CNNs, RNNs, and LSTMs on the tasks of image classification, caption generation, and speech recognition. On ImageNet, DSD improved the Top-1 accuracy of GoogLeNet by 1.1%, VGG-16 by 4.3%, ResNet-18 by 1.2%, and ResNet-50 by 1.1%, respectively. On the WSJ'93 dataset, DSD improved DeepSpeech and DeepSpeech2 WER by 2.0% and 1.1%. On the Flickr-8K dataset, DSD improved the NeuralTalk BLEU score by over 1.7. DSD is easy to use in practice: at training time, DSD incurs only one extra hyper-parameter, the sparsity ratio in the S step. At testing time, DSD doesn't change the network architecture or incur any inference overhead. The consistent and significant performance gain of the DSD experiments shows the inadequacy of current training methods for finding the best local optimum, while DSD effectively achieves superior optimization performance, finding a better solution. DSD models are available to download at https://songhan.github.io/DSD.
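The dense-sparse-dense flow can be sketched on a toy weight matrix (NumPy only; the actual paper applies this to full CNNs/RNNs trained with SGD, and the retraining passes are omitted in this sketch):

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.normal(size=(8, 8))             # "Dense": weights after initial dense training

# "Sparse": prune the smallest-magnitude fraction and keep a binary mask;
# retraining would then update only the surviving (masked-in) connections.
sparsity = 0.5                          # the single extra hyper-parameter DSD introduces
threshold = np.quantile(np.abs(W), sparsity)
mask = np.abs(W) >= threshold
W_sparse = W * mask

# "re-Dense": lift the sparsity constraint, re-initialize the pruned weights
# from zero, and retrain the whole dense network (retraining omitted here,
# so the previously pruned entries are exactly zero at this point).
W_redense = W_sparse.copy()

print(f"kept {mask.mean():.0%} of connections")
```

The key design choice is that the sparse phase acts as a regularizer that escapes a poor local optimum, and the re-dense phase restores full capacity to refine the solution.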
The Android app we developed takes video frames of a human face from the camera as input and outputs a fusion image of extracted facial features and contours together with a motion distribution map. The motion distribution map is generated from a micro-expression heat map with special color added; the brightness of the color is scaled by the magnitude of motion in each area of the face. The client, an Android device, obtains the initial location of the eyes and mouth. Covariance-based image registration is used to generate the motion distribution of facial features on the server side. The fusion image generated with this information is then sent back to the client for display. Users can learn from this fusion image about micro changes of facial features and thus interpret human emotion. Since more than just key points of facial features are extracted, we expect full utilization of our data to give precise interpretation, provided a robust scoring system of motions of different facial features and cont...
Vitals are vital! Patient vitals provide critical information essential to the successful diagnosis and management of patients in every healthcare setting, particularly the acute-care setting. Heart rate (HR), and to a greater degree heart rate variability, as well as overall patient mobility, can indicate the presence of severe disease states. Practical and technical constraints currently limit monitoring in healthcare settings to select (e.g., admitted) patients, leaving numerous others unmonitored and potentially at risk. We'd like to use computer vision to provide a non-invasive solution to this acute problem. To tackle this problem, we extended Haar cascade detectors and SIFT-based tracking and face recognition for robust continuous face detection and assignment. Applying independent component analysis (ICA) based blind signal decomposition, we were able to accurately extract heart rate from video. Additionally, we proposed a method based on multi-scale spatial-temporal correla...
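Once a pulse component has been separated from the facial video traces (e.g., by ICA), the heart rate is the dominant spectral peak within the physiological band. A minimal sketch with a synthetic pulse signal, assuming a 30 fps camera; the ICA separation step itself is omitted and the signal is synthesized directly:

```python
import numpy as np

fps = 30.0                              # assumed camera frame rate
t = np.arange(0, 20, 1 / fps)           # 20 s of video
true_bpm = 72.0
# Stand-in for the ICA-separated pulse component: a noisy sinusoid at 72 bpm.
pulse = (np.sin(2 * np.pi * (true_bpm / 60) * t)
         + 0.3 * np.random.default_rng(1).normal(size=t.size))

# Heart rate = dominant spectral peak in the physiological band (40-180 bpm).
freqs = np.fft.rfftfreq(t.size, d=1 / fps)
power = np.abs(np.fft.rfft(pulse - pulse.mean())) ** 2
band = (freqs >= 40 / 60) & (freqs <= 180 / 60)
est_bpm = 60 * freqs[band][np.argmax(power[band])]
print(f"estimated HR: {est_bpm:.1f} bpm")
```

Note the trade-off: a 20 s window gives a frequency resolution of 0.05 Hz (3 bpm), so longer windows sharpen the estimate but respond more slowly to HR changes.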
…reconstruction are optimized for multi-contrast scans. The sampling trajectory is optimized for new acquisition. The sharable information of structure and sensitivity is exploited to enhance reconstruction. PROMISE: Parallel Reconstruction with Optimized acquisition for Multi-contrast Imaging in the context of compressed Sensing. Enhao Gong, Feng Huang, Kui Ying, Wenchuan Wu, Shi Wang, Chun Yuan, and Randy Duensing. Electrical Engineering, Stanford University, Stanford, CA, United States; Philips Research Asia Shanghai, Shanghai, China; Department of Engineering Physics, Tsinghua University, Beijing, China; Center for Biomedical Imaging Research, Department of Biomedical Engineering, Tsinghua University, Beijing, China; Department of Radiology, University of Washington, Seattle, WA, United States; Philips Healthcare, Gainesville, FL, United States.
IMPORTANCE Predicting infarct size and location is important for decision-making and prognosis in patients with acute stroke. OBJECTIVES To determine whether a deep learning model can predict final infarct lesions using magnetic resonance images (MRIs) acquired at initial presentation (baseline) and to compare the model with current clinical prediction methods. DESIGN, SETTING, AND PARTICIPANTS In this multicenter prognostic study, a specific type of neural network for image segmentation (U-net) was trained, validated, and tested using patients from the Imaging Collaterals in Acute Stroke (iCAS) study from April 14, 2014, to April 15, 2018, and the Diffusion Weighted Imaging Evaluation for Understanding Stroke Evolution Study-2 (DEFUSE-2) study from July 14, 2008, to September 17, 2011 (reported in October 2012). Patients underwent baseline perfusion-weighted and diffusion-weighted imaging and MRI at 3 to 7 days after baseline. Patients were grouped into unknown, minimal, partial, and major reperfusion status based on 24-hour imaging results. Baseline images acquired at presentation were inputs, and the final true infarct lesion at 3 to 7 days was considered the ground truth for the model. The model calculated the probability of infarction for every voxel, which can be thresholded to produce a prediction. Data were analyzed from July 1, 2018, to March 7, 2019. MAIN OUTCOMES AND MEASURES Area under the curve, Dice score coefficient (DSC) (a metric from 0-1 indicating the extent of overlap between the prediction and the ground truth; a DSC of ≥0.5 represents significant overlap), and volume error. Current clinical methods were compared with model performance in subgroups of patients with minimal or major reperfusion.
RESULTS Among the 182 patients included in the model (97 women [53.3%]; mean [SD] age, 65 [16] years), the deep learning model achieved a median area under the curve of 0.92 (interquartile range [IQR], 0.87-0.96), DSC of 0.53 (IQR, 0.31-0.68), and volume error of 9 (IQR, −14 to 29) mL. In subgroups with minimal (DSC, 0.58 [IQR, 0.31-0.67] vs 0.55 [IQR, 0.40-0.65]; P = .37) or major (DSC, 0.48 [IQR, 0.29-0.65] vs 0.45 [IQR, 0.15-0.54]; P = .002) reperfusion for which comparison with existing clinical methods was possible, the deep learning model had comparable or better performance. CONCLUSIONS AND RELEVANCE The deep learning model appears to have successfully predicted infarct lesions from baseline imaging without reperfusion information and achieved comparable performance to existing clinical methods. Predicting the subacute infarct lesion may help clinicians prepare for decompression treatment and aid in patient selection for neuroprotective clinical trials.
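The Dice score coefficient used in this study is the standard overlap metric 2|A∩B|/(|A|+|B|), and thresholding the model's per-voxel infarct probability yields the predicted mask. A minimal sketch; the 0.5 threshold below is illustrative, not necessarily the study's operating point:

```python
import numpy as np

def dice(pred, truth):
    """Dice score coefficient between two binary masks (0 = no overlap, 1 = identical)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0                      # both masks empty: define as perfect agreement
    return 2 * np.logical_and(pred, truth).sum() / denom

# Thresholding the per-voxel infarct probability produces the predicted mask.
prob = np.array([[0.9, 0.8, 0.2], [0.7, 0.1, 0.05]])   # toy probability map
truth = np.array([[1, 1, 0], [0, 0, 0]], bool)          # toy ground-truth lesion
print(dice(prob > 0.5, truth))  # 2*2/(3+2) = 0.8
```

Because Dice penalizes both false positives and false negatives symmetrically, it is a stricter criterion than area under the curve for small lesions, which is consistent with the lower DSC values reported above.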
Purpose: To improve the image quality of highly accelerated multi-channel MRI data by learning a joint variational network that reconstructs multiple clinical contrasts jointly. Methods: Data from our multi-contrast acquisition were embedded into the variational network architecture, where shared anatomical information is exchanged by mixing the input contrasts. Complementary k-space sampling across imaging contrasts and Bunch-Phase/Wave-Encoding were used for data acquisition to improve the reconstruction at high accelerations. At 3T, our joint variational network approach across T1w, T2w, and T2-FLAIR-weighted brain scans was tested for retrospective under-sampling at R=6 (2D) and R=4x4 (3D) acceleration. Prospective acceleration was also performed for 3D data, where the combined acquisition time for whole-brain coverage at 1 mm isotropic resolution across three contrasts was less than three minutes. Results: Across all test datasets, our joint multi-contrast network better preserved fine anatomical details with reduced image blurring when compared to the corresponding single-contrast reconstructions. Improvement in image quality was also obtained through complementary k-space sampling and Bunch-Phase/Wave-Encoding, where the synergistic combination yielded the overall best performance, as evidenced by exemplary slices and quantitative error metrics. Conclusion: By leveraging shared anatomical structures across the jointly reconstructed scans, our joint multi-contrast approach learnt more efficient regularizers, which helped to retain natural image appearance and avoid over-smoothing. When synergistically combined with advanced encoding techniques, the performance was further improved, enabling up to R=16-fold acceleration with good image quality. This should help pave the way to very rapid high-resolution brain exams.
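Complementary k-space sampling across contrasts can be illustrated with regular undersampling combs whose sampled phase-encode lines are shifted per contrast, so the contrasts jointly cover more of k-space. This is a simplified 1D sketch with hypothetical parameters; the actual masks and Bunch-Phase/Wave-Encoding are more elaborate:

```python
import numpy as np

def complementary_masks(n_pe=128, R=4, n_contrasts=3):
    """Regular undersampling masks whose sampled phase-encode lines are
    shifted between contrasts, so together they cover more of k-space."""
    masks = np.zeros((n_contrasts, n_pe), bool)
    for c in range(n_contrasts):
        masks[c, c % R::R] = True       # shift the sampling comb per contrast
    return masks

masks = complementary_masks()
union = masks.any(axis=0)
# Each contrast is R=4 undersampled (25% of lines); jointly the three
# shifted combs cover 3/4 of the phase-encode lines.
print(masks.mean(axis=1), union.mean())
```

This is what lets a joint reconstruction exploit shared anatomy: aliasing artifacts land in different places for each contrast, so the network can use the others to disambiguate them.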
Deep learning is a form of machine learning using a convolutional neural network architecture that shows tremendous promise for imaging applications. It is increasingly being adapted from its original demonstration in computer vision applications to medical imaging. Because of the high volume and wealth of multimodal imaging information acquired in typical studies, neuroradiology is poised to be an early adopter of deep learning. Compelling deep learning research applications have been demonstrated, and their use is likely to grow rapidly. This review article describes the reasons, outlines the basic methods used to train and test deep learning models, and presents a brief overview of current and potential clinical applications with an emphasis on how they are likely to change future neuroradiology practice. Facility with these methods among neuroimaging researchers and clinicians will be important to channel and harness the vast potential of this new method.
Proceedings of the National Academy of Sciences, 2014
Significance: For complex biological processes, the formation of protein complexes is a strategy for coordinating the activities of many enzymes in space and time. It has been hypothesized that growth of the bacterial cell wall involves stable synthetic complexes, but neither the existence of such complexes nor the consequences of such a mechanism for growth efficiency have been demonstrated. Here, we use single-molecule tracking to demonstrate that the association between an essential cell wall synthesis enzyme and the cytoskeleton is highly dynamic, which allows the cell to buffer growth rate against large fluctuations in enzyme abundance. This indicates that dynamic association can be an efficient strategy for coordination of multiple enzymes, especially those for which excess abundance can be harmful to cells.
We demonstrate Optical-Resolution Photoacoustic Microscopy (OR-PAM) in which the optical field is focused and scanned using Digital Phase Conjugation through a multimode fiber. The focus is scanned across the field of view by digital means, and the induced acoustic signal is collected by a transducer. Optical-resolution photoacoustic images of a knot made of two absorptive wires are obtained, and we report a resolution smaller than 1.5 μm across a 201 μm × 201 μm field of view. The use of a multimode optical fiber for the optical excitation can pave the way for miniature endoscopes that provide optical-resolution photoacoustic images at large optical depths.
Multi-contrast Magnetic Resonance Imaging (MRI) acquisitions from a single scan have tremendous potential to streamline exams and reduce imaging time. However, maintaining clinically feasible scan times necessitates significant undersampling, pushing the limits of compressed sensing and other low-dimensional techniques. One possible solution is to use undersampling designs that effectively improve the acquisition and achieve higher reconstruction accuracy. However, existing undersampling optimization methods are time-consuming, and their limited performance prevents clinical application. In this paper, we propose an improved undersampling trajectory optimization scheme, named OUTCOMES, that generates an optimized trajectory within seconds and applies it to subsequent multi-contrast MRI datasets on a per-subject basis. By using a data-driven method combined with improved algorithm design, GPU acceleration, and more efficient computation, ...
Arterial Spin Labeling (ASL) is a popular non-invasive neuroimaging technique that uses MRI for quantitative Cerebral Blood Flow (CBF) mapping. However, ASL usually suffers from poor signal quality, and repeated measurements are typically acquired to improve signal quality through averaging, at the cost of long scan time. In this work, a deep learning algorithm is proposed to leverage both convolutional neural network (CNN) based image enhancement and complementary/mutual information from multiple tissue contrasts in the ASL acquisition. Both quantitative and qualitative evaluations demonstrate the performance and stability of the proposed algorithm and its superiority over conventional denoising algorithms and standard deep learning based denoising. The results demonstrate the feasibility of efficient, high-quality ASL measurements from averaging-free fast acquisition, which will enable broader clinical application of ASL.
Magnetic resonance image (MRI) reconstruction is a severely ill-posed linear inverse task demanding time- and resource-intensive computations that can substantially trade off accuracy for speed in real-time imaging. In addition, state-of-the-art compressed sensing (CS) analytics are not cognizant of the image diagnostic quality. To cope with these challenges, we put forth a novel CS framework that permeates benefits from generative adversarial networks (GAN) to train a (low-dimensional) manifold of diagnostic-quality MR images from historical patients. Leveraging a mixture of least-squares (LS) GANs and a pixel-wise ℓ1 cost, a deep residual network with skip connections is trained as the generator that learns to remove the aliasing artifacts by projecting onto the manifold. LSGAN learns the texture details, while the ℓ1 term controls the high-frequency noise. A multilayer convolutional neural network is then jointly trained based on diagnostic-quality images to discriminate the projection quality. The test phase performs feed-forward propagation over the generator network, which demands a very low computational overhead. Extensive evaluations are performed on a large contrast-enhanced MR dataset of pediatric patients. In particular, image ratings by expert radiologists corroborate that GANCS retrieves high-contrast images with detailed texture relative to conventional CS and pixel-wise schemes. In addition, it offers reconstruction in a few milliseconds, two orders of magnitude faster than state-of-the-art CS-MRI schemes. * The authors are with Stanford University, Departments of Electrical Engineering, Radiology, Radiation Oncology, and Computer Science.
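The generator objective described above mixes a least-squares adversarial term with a pixel-wise ℓ1 term. The sketch below shows that mixture on toy arrays; the function name and weighting value are hypothetical, and the actual method trains this end-to-end on network outputs:

```python
import numpy as np

def gancs_generator_loss(d_fake, recon, target, lam=0.9):
    """Mixture of an LSGAN adversarial term and a pixel-wise l1 cost.

    d_fake : discriminator scores for generated images (LSGAN pushes them to 1)
    recon  : generator output; target : fully sampled reference image
    lam    : hypothetical trade-off between data fidelity (l1) and texture (adversarial)
    """
    adv = np.mean((d_fake - 1.0) ** 2)          # least-squares GAN term
    l1 = np.mean(np.abs(recon - target))        # suppresses high-frequency noise
    return lam * l1 + (1 - lam) * adv

loss = gancs_generator_loss(np.array([0.5, 1.0]), np.zeros((4, 4)), np.ones((4, 4)))
print(loss)
```

The least-squares form of the adversarial term (rather than the original log-loss GAN) gives smoother gradients when the discriminator is confident, which is one reason it pairs well with a pixel-wise fidelity term.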
Studying oxygen metabolism during brain ischemia has been limited by imaging modalities that have either low resolution but good tissue penetration, or high resolution requiring invasive preparations (open-skull windows). Here, we report a study of cerebrovascular oxygen metabolism in mouse focal ischemia using a novel, minimally invasive, high-resolution technique known as optical-resolution photoacoustic microscopy (OR-PAM). Using OR-PAM, we serially imaged cortical vessels through the intact skull of Swiss Webster mice (n=3) before, during, and up to 72 hrs after a 1-hr intraluminal transient middle cerebral artery occlusion (MCAO). Vessels were imaged at 2 different wavelengths (532 and 563 nm) to quantify oxy-hemoglobin, deoxy-hemoglobin, and oxygen saturation (sO2; Fig. A-C). Oxygen extraction fraction (OEF) was calculated from venous sO2 values (assuming a constant arterial sO2 of 95%). Seventy-two hrs after ischemia, animals were sacrificed and whole brains were stained with TTC to d...
Purpose: With rising safety concerns over the use of gadolinium-based contrast agents (GBCAs) in contrast-enhanced MRI, there is a need for dose reduction while maintaining diagnostic capability. This work proposes comprehensive technical solutions for a deep learning (DL) model that predicts contrast-enhanced images of the brain with approximately 10% of the standard dose, across different sites and scanners. Methods: The proposed DL model consists of a set of methods that improve model robustness and generalizability. The steps include multi-planar reconstruction, a 2.5D model, and enhancement-weighted L1, perceptual, and adversarial losses. The proposed model predicts contrast-enhanced images from corresponding pre-contrast and low-dose images. With IRB approval and informed consent, 640 heterogeneous patient scans (56 train, 13 validation, and 571 test) from 3 institutions consisting of 3D T1-weighted brain images were used. Quantitative metrics were computed and 50 randomly sampled te...
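An enhancement-weighted L1 loss can be sketched as an L1 cost reweighted by a normalized uptake map (full-dose minus pre-contrast), so errors in enhancing regions are penalized more heavily. The weighting scheme and parameter names below are assumptions for illustration, not the paper's exact formula:

```python
import numpy as np

def enhancement_weighted_l1(pred, full_dose, pre_contrast, alpha=1.0, eps=1e-6):
    """L1 loss reweighted toward voxels that enhance after contrast.

    The enhancement map (full-dose minus pre-contrast, rectified and
    normalized to [0, 1]) boosts the penalty where gadolinium uptake occurs.
    `alpha` (hypothetical) sets how much extra weight enhancing voxels get.
    """
    enh = np.clip(full_dose - pre_contrast, 0, None)
    enh = enh / (enh.max() + eps)                   # normalize uptake to [0, 1]
    weights = 1.0 + alpha * enh                     # plain L1 everywhere, extra where enhancing
    return np.mean(weights * np.abs(pred - full_dose))

pre = np.zeros((2, 2))
full = np.array([[0.0, 0.0], [0.0, 1.0]])           # one enhancing voxel
pred = np.full((2, 2), 0.5)
loss = enhancement_weighted_l1(pred, full, pre)
print(loss)  # ≈ 0.625: the enhancing voxel's error counts roughly double
```

A plain L1 loss would average the same errors uniformly (0.5 here); the reweighting is what keeps small but clinically critical enhancing lesions from being under-penalized.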
Papers by Enhao Gong