PyTorch implementation for reproducing the AttnGAN results in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research.)
- Python==2.7.12
- torch==0.4.1
- torchfile==0.1.0
- torchvision==0.2.1
- scikit-image==0.14.1
- pandas==0.19.1
- easydict==1.6
- nltk==3.2.2
In addition, please add the project folder to PYTHONPATH.
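If editing the environment is inconvenient, a similar effect can be obtained at runtime. The following is a minimal sketch, not part of this repository: `ensure_on_path` is a hypothetical helper, and the directory you pass to it is whatever folder you cloned the project into.

```python
import os
import sys


def ensure_on_path(project_dir):
    """Insert project_dir at the front of sys.path unless it is already listed.

    Hypothetical helper for illustration; the repository itself only asks
    for the project folder to be on PYTHONPATH.
    """
    project_dir = os.path.abspath(project_dir)
    # Compare absolute paths so "." and the full path count as the same entry.
    known = {os.path.abspath(p) for p in sys.path if p}
    if project_dir not in known:
        sys.path.insert(0, project_dir)
    return project_dir
```

Calling `ensure_on_path(".")` at the top of a script, before importing project modules, only affects that process. Setting PYTHONPATH remains the approach this README asks for.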
Data
Download our preprocessed metadata for the coco dataset and extract the images to data/coco/
Training
- Pre-train DAMSM models:
- For coco dataset:
python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1
- Train AttnGAN models:
- For coco dataset:
python main.py --cfg cfg/coco_attn2.yml --gpu 3
- *.yml files are example configuration files for training/evaluating our models.
- Note: Passing the --gpu parameter simply enables the GPU, while omitting it disables the GPU. The code is parallel by default.
Pretrained Model
- DAMSM for coco. Download and save it to DAMSMencoders/
- AttnGAN for coco. Download and save it to models/
Data Sampling
- Configure the sampling.py script in the code folder to point to directories of your choice
- The script samples 10000 items each from the training, validation and test data without replacement
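Sampling without replacement means that no item is drawn twice within a split. The sketch below illustrates the idea with the standard library only; it is not the contents of sampling.py, and the function names, the `seed` argument and the dictionary layout are assumptions made for this example.

```python
import random


def sample_split(items, k=10000, seed=None):
    """Draw k distinct items from one split, without replacement."""
    items = list(items)
    if k > len(items):
        raise ValueError(
            "split has only %d items, cannot draw %d" % (len(items), k)
        )
    # A dedicated Random instance keeps the draw reproducible for a given seed.
    rng = random.Random(seed)
    return rng.sample(items, k)


def sample_all(splits, k=10000, seed=0):
    """Sample every named split independently.

    splits maps a split name (e.g. "train") to its list of items.
    Returns a dict with the same keys and k sampled items per split.
    """
    return {name: sample_split(items, k, seed) for name, items in splits.items()}
```

For instance, `sample_all({"train": train_files, "val": val_files, "test": test_files})` would return 10000 distinct entries per split, where the three lists are placeholders for your own file listings.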
Validation and Custom Image Generation
- Validation Image Generation
- Set the B_VALIDATION flag to True in eval_coco.yml
- Run python main.py --cfg cfg/eval_coco.yml --gpu 1 to generate examples from captions in files listed in "./data/birds/example_filenames.txt". Results are saved to models/
- Custom Image Generation
- Input your own sentence in "./data/example_captions.txt" if you want to generate images from customized sentences.
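A captions file can also be prepared programmatically. This is a rough sketch that assumes the file holds one caption per line, which this README does not state, so check the existing "./data/example_captions.txt" before relying on it. `write_captions` is a hypothetical helper, not a function from this repository.

```python
import io


def write_captions(path, captions):
    """Write captions to path, one per line, and return how many were written.

    Assumption: one caption per line. Runs of whitespace are collapsed
    and blank entries are skipped.
    """
    cleaned = [" ".join(c.split()) for c in captions]
    cleaned = [c for c in cleaned if c]
    with io.open(path, "w", encoding="utf-8") as f:
        for c in cleaned:
            f.write(c + u"\n")
    return len(cleaned)
```

For example, `write_captions("./data/example_captions.txt", ["a plate of food on a wooden table"])` overwrites the file with a single caption.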
Evaluation
- We use the Fréchet Inception Distance (FID) as an interpretable evaluation metric. The concept was first introduced in 'GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium' by Martin Heusel et al.
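For reference, FID compares Gaussian fits to the Inception features of real and generated images. The notation below is chosen for this note rather than taken from the evaluation code: $\mu_r, \Sigma_r$ are the mean and covariance of the real-image features, and $\mu_g, \Sigma_g$ those of the generated-image features.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one, and the distance is zero when both means and covariances coincide.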
Examples generated by AttnGAN [Blog]
[Figure: bird example and coco example images]
Evaluation code embedded into a callable containerized API is included in the eval\ folder.
If you find AttnGAN useful in your research, please consider citing:
@inproceedings{Tao18attngan,
author = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
title = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
year = {2018},
booktitle = {{CVPR}}
}
Reference