FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features

Yufeng Yin, Di Chang, Guoxian Song, Shen Sang, Tiancheng Zhi, Jing Liu, Linjie Luo, Mohammad Soleymani
USC ICT, ByteDance

WACV 2024
Arxiv

Introduction

This is the official implementation of our WACV 2024 Algorithm Track paper: FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features.

FG-Net extracts feature maps from a StyleGAN2 model pre-trained on a large and diverse face image dataset. These features are then used to detect AUs with a Pyramid CNN Interpreter, which makes training efficient and captures essential local features. FG-Net achieves strong generalization for heatmap-based AU detection thanks to the generalizable and semantic-rich features extracted from the pre-trained generative model. Extensive experiments evaluate within- and cross-corpus AU detection on the widely used DISFA and BP4D datasets. Compared with the state of the art, the proposed method achieves superior cross-domain performance while maintaining competitive within-domain performance. In addition, FG-Net is data-efficient and achieves competitive performance even when trained on 1000 samples.

Installation

Clone repo:

git clone https://github.com/ihp-lab/FG-Net.git
cd FG-Net

The code is tested with Python == 3.7, PyTorch == 1.10.1 and CUDA == 11.3 on an NVIDIA Quadro RTX 8000. We recommend using Anaconda to manage dependencies.

conda create -n fg-net python=3.7
conda activate fg-net
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install cudatoolkit-dev=11.3
pip install pandas
pip install tqdm
pip install -U scikit-learn
pip install opencv-python
pip install dlib
pip install imutils
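
After installation, a quick import check can confirm that the environment is complete. The snippet below is a minimal sketch and not part of the repository: the helper name `find_missing_modules` is hypothetical, and the module list simply mirrors the packages installed above (scikit-learn imports as `sklearn`, opencv-python as `cv2`).

```python
import importlib.util

# Import names of the packages installed above. Note that scikit-learn is
# imported as "sklearn" and opencv-python as "cv2".
REQUIRED_MODULES = [
    "torch", "torchvision", "pandas", "tqdm",
    "sklearn", "cv2", "dlib", "imutils",
]


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates the modules and does not import them, so it runs quickly and does not touch the GPU.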

Data Structure

Download the BP4D and DISFA datasets from their official websites.

We first pre-process each input image with dlib to obtain the facial landmarks. The detected landmarks are used to crop and align the face with FFHQ-alignment. Finally, we run dlib again on the aligned images to detect the facial landmarks used to generate heatmaps.
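
To illustrate the last step, the sketch below renders landmark coordinates as a heatmap by placing one Gaussian bump per landmark. This is a generic construction under stated assumptions, not necessarily how the repository builds its heatmaps: the function name `landmarks_to_heatmap`, the `sigma` value, and the max-merge rule are all illustrative choices.

```python
import numpy as np


def landmarks_to_heatmap(landmarks, height, width, sigma=2.0):
    """Render one Gaussian bump per (x, y) landmark and merge them with a max.

    `sigma` (in pixels) is an assumed value chosen for illustration only.
    """
    # Pixel coordinate grids: ys holds row indices, xs holds column indices.
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x, y in landmarks:
        squared_dist = (xs - x) ** 2 + (ys - y) ** 2
        bump = np.exp(-squared_dist / (2.0 * sigma ** 2)).astype(np.float32)
        # Keep the strongest response at each pixel so peaks stay at 1.0.
        heatmap = np.maximum(heatmap, bump)
    return heatmap
```

Each landmark produces a peak of 1.0 at its own pixel, and the response decays smoothly with distance.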

You should get a dataset folder like below:

data
├── DISFA
│   ├── labels
│   │   ├── 0
│   │   │   ├── train.csv
│   │   │   └── test.csv
│   │   ├── 1
│   │   └── 2
│   ├── aligned_images
│   └── aligned_landmarks
└── BP4D
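
The layout can be verified before training with a short script. This is a sketch with a hypothetical helper name, and it rests on two assumptions the tree does not state explicitly: that folds 1 and 2 contain the same train.csv and test.csv files as fold 0, and that BP4D mirrors the DISFA structure.

```python
from pathlib import Path


def missing_dataset_paths(root, dataset="DISFA", folds=(0, 1, 2)):
    """Return the expected files and folders that are absent under root/dataset."""
    base = Path(root) / dataset
    expected = [base / "aligned_images", base / "aligned_landmarks"]
    for fold in folds:
        # Assumption: every fold folder holds its own train.csv and test.csv.
        expected.append(base / "labels" / str(fold) / "train.csv")
        expected.append(base / "labels" / str(fold) / "test.csv")
    return [str(path) for path in expected if not path.exists()]
```

For example, `missing_dataset_paths("data", "BP4D")` returns an empty list when everything expected is in place.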

Checkpoints

StyleGAN2: To get the PyTorch checkpoint for StyleGAN2 (stylegan2-ffhq-config-f.pt), see the section "Convert weight from official checkpoints".

StyleGAN2 Encoder: pSp encoder. Rename the pt file to encoder.pt.

AU detection: BP4D and DISFA

Put all checkpoints under the folder /code/checkpoints.

Training and Within-domain Evaluation

cd code
CUDA_VISIBLE_DEVICES=0 python train_interpreter.py --exp experiments/bp4d_0.json
CUDA_VISIBLE_DEVICES=1 python train_interpreter.py --exp experiments/disfa_0.json
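
The two commands above run one after the other when pasted into a shell. To run them in parallel on separate GPUs, a small launcher along the following lines may help. It is a sketch, not part of the repository: `build_run` and `launch` are hypothetical helpers, and only the script name, the `--exp` flag, and the config paths come from the commands above.

```python
import os
import subprocess


def build_run(script, exp_config, gpu_id):
    """Return (argv, env) for one run pinned to a single GPU."""
    # Copy the current environment and restrict the visible GPU for this run.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    argv = ["python", script, "--exp", exp_config]
    return argv, env


def launch(runs, cwd="code"):
    """Start every (script, exp_config, gpu_id) run in parallel and wait for all.

    Returns the list of exit codes in the same order as `runs`.
    """
    procs = []
    for script, exp_config, gpu_id in runs:
        argv, env = build_run(script, exp_config, gpu_id)
        procs.append(subprocess.Popen(argv, env=env, cwd=cwd))
    return [proc.wait() for proc in procs]


# Example, equivalent to the two training commands above:
# launch([
#     ("train_interpreter.py", "experiments/bp4d_0.json", 0),
#     ("train_interpreter.py", "experiments/disfa_0.json", 1),
# ])
```

The same helper works for eval_interpreter.py and single_image_inference.py by swapping the script name and config path.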

Cross-domain Evaluation

cd code
CUDA_VISIBLE_DEVICES=0 python eval_interpreter.py --exp experiments/eval_b2d.json
CUDA_VISIBLE_DEVICES=1 python eval_interpreter.py --exp experiments/eval_d2b.json

Single image inference

cd code
CUDA_VISIBLE_DEVICES=0 python single_image_inference.py --exp experiments/single_image_inference_bp4d.json
CUDA_VISIBLE_DEVICES=1 python single_image_inference.py --exp experiments/single_image_inference_disfa.json

License

Our code is distributed under the MIT License. See the LICENSE file for more information.

Citation

@article{yin2023fg,
  title={FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features},
  author={Yin, Yufeng and Chang, Di and Song, Guoxian and Sang, Shen and Zhi, Tiancheng and Liu, Jing and Luo, Linjie and Soleymani, Mohammad},
  journal={arXiv preprint arXiv:2308.12380},
  year={2023}
}

Contact

If you have any questions, please raise an issue or email Yufeng Yin ([email protected] or [email protected]).

Acknowledgments

Our code builds on several excellent repositories. We thank their authors for making their code publicly available.