GitHub - JinyangMarkLiu/JPDVT: [CVPR 2024] Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers (JPDVT)
Official PyTorch Implementation

[CVPR 2024] Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

[Paper] [Arxiv]

This repository is still being organized. Stay tuned for the release of the fully functional code!

Setup

git clone https://github.com/JinyangMarkLiu/JPDVT.git
cd JPDVT
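
Before training, it can help to confirm that the Python environment has the packages the code is likely to import. The sketch below is only an illustration: the package list is an assumption (the README states that this is a PyTorch implementation, but `torchvision` is a guess and no requirements file is referenced here), and `missing_packages` is a hypothetical helper, not part of the repository.

```python
import importlib.util

# Assumed package list: "torch" follows from "PyTorch implementation";
# "torchvision" is a guess and may not match the repository's real needs.
ASSUMED_PACKAGES = ("torch", "torchvision")


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report the result without importing the packages themselves.
print("Missing packages:", missing_packages(ASSUMED_PACKAGES) or "none")
```

If the list is not empty, install the missing packages before running the training commands.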

Preparing Data

Download the datasets you need. Below are brief instructions for setting up some of the datasets we used.

ImageNet

You can use this script to download and prepare the ImageNet dataset. If you need to download the dataset, please uncomment the first part of the script.

JPwLEG-3

Download the JPwLEG-3 dataset from this Google Drive. Only the select_image part is used in our experiments.

Training

We provide scripts for training image models and video models.

Training image models

On the ImageNet dataset:

torchrun --nnodes=1 --nproc_per_node=4 train_JPDVT.py --dataset imagenet --data-path <imagenet-train-path> --image-size 192 --crop
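
With `--nnodes=1 --nproc_per_node=4`, `torchrun` starts four worker processes on one machine and tells each one its place through the environment variables `RANK`, `LOCAL_RANK` and `WORLD_SIZE`. The sketch below shows how a script could read them. It is an illustration only: `read_torchrun_env` is a hypothetical helper, and whether `train_JPDVT.py` reads these variables in this way is an assumption.

```python
import os


def read_torchrun_env(env=None):
    """Read the process layout that torchrun exports to each worker.

    Falls back to a single-process layout when the variables are absent,
    for example when the script is started with plain `python`.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),              # global index of this process
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index of this process on its node
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


# Simulate what the four workers of a single-node, 4-process launch would see.
for local_rank in range(4):
    fake_env = {
        "RANK": str(local_rank),
        "LOCAL_RANK": str(local_rank),
        "WORLD_SIZE": "4",
    }
    print(read_torchrun_env(fake_env))
```

On a single node the global rank and the local rank coincide, which is why the simulated environment sets both to the same value. To use fewer GPUs, lower `--nproc_per_node` accordingly.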

On the MET dataset:

torchrun --nnodes=1 --nproc_per_node=4 train_JPDVT.py --dataset met --data-path <met-data-path> --image-size 288 --epochs 1000
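
Both `--image-size` values above (192 and 288) divide evenly by 3. The toy sketch below shows what that means for a puzzle cut into a 3 x 3 grid, and what a "masked" puzzle looks like in terms of piece indices. The grid size, the number of hidden pieces, and both helper functions are assumptions made for illustration; this is not the repository's data pipeline.

```python
import random


def piece_size(image_size, grid=3):
    """Side length of one square piece when the image is cut into grid x grid."""
    if image_size % grid != 0:
        raise ValueError("image_size must be divisible by grid")
    return image_size // grid


def make_masked_puzzle(grid=3, num_masked=2, seed=0):
    """Return (order, masked) for a toy puzzle.

    order[i] is the original position of the piece shown in slot i;
    masked is the sorted list of original positions whose pieces are hidden.
    """
    rng = random.Random(seed)
    order = list(range(grid * grid))
    rng.shuffle(order)  # shuffled arrangement of the pieces
    masked = sorted(rng.sample(range(grid * grid), num_masked))
    return order, masked


print(piece_size(192), piece_size(288))  # 64 96
order, masked = make_masked_puzzle()
print("shuffled order:", order)
print("hidden pieces:", masked)
```

Solving the puzzle then amounts to recovering the original position of every visible piece, given the shuffled arrangement with some pieces missing.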

Testing

BibTeX

If you find our paper/project useful, please consider citing our paper:

@InProceedings{Liu_2024_CVPR,
    author    = {Liu, Jinyang and Teshome, Wondmgezahu and Ghimire, Sandesh and Sznaier, Mario and Camps, Octavia},
    title     = {Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {23009-23018}
}

Acknowledgments

Our codebase is mainly based on improved diffusion, make a video, and DiT.
