This repository contains the code for the following paper: Papež M, Rektoris M, Šmídl V, Pevný T. Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs.
An example of a PGC for undirected acyclic graphs. (a) We consider a graph
@article{papez2025probabilistic,
title={Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs},
author={Pape{\v{z}}, Milan and Rektoris, Martin and {\v{S}}m{\'\i}dl, V{\'a}clav and Pevn{\'y}, Tom{\'a}{\v{s}}},
journal={arXiv preprint arXiv:2503.12162},
year={2025}
}
Clone this repository.
git clone https://github.com/mlnpapez/PGC PGC
Go to the PGC directory.
cd PGC
Set up the environment.
conda create --name pgc python=3.10
conda activate pgc
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install rdkit==2024.3.6
pip install tqdm==4.67.0
pip install pandas==2.2.3
pip install pylatex==1.4.2
pip install scipy==1.14.1
pip install fcd_torch==1.0.7
pip install scikit-learn==1.6.0
pip install git+https://github.com/fabriziocosta/EDeN.git
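Once the packages are installed, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the helper name check_pins and the PINNED table are illustrative, with the version numbers copied from the pip commands above. EDeN is left out because it is installed from Git without a pinned version.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above (names as passed to pip).
PINNED = {
    "torch": "2.5.1",
    "rdkit": "2024.3.6",
    "tqdm": "4.67.0",
    "pandas": "2.2.3",
    "pylatex": "1.4.2",
    "scipy": "1.14.1",
    "fcd_torch": "1.0.7",
    "scikit-learn": "1.6.0",
}


def check_pins(pins):
    """Return {name: (expected, found)} for packages that are missing or differ."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # not installed in the active environment
        # Compare only the part before "+", so a local build tag such as
        # "2.5.1+cu124" still counts as matching "2.5.1".
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    # Prints nothing when every pinned package is present at the expected version.
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Run it inside the activated pgc environment; any line it prints names a package that should be reinstalled.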
The following command downloads and preprocesses the QM9 or Zinc250k dataset, in each case for five different atom orderings.
python -m utils.datasets
For example, the script will produce qm9_canonical.pt, which contains molecules in the canonical ordering of the atoms. To select the dataset, change dataset in utils/datasets.py.
config/qm9/ contains JSON files with the hyper-parameters of different PGC variants. Change the hyper-parameters based on your preferences and then run the following command.
python -m train
It will train all the PGC variants (or only the selected ones if you change the list of names in train.py).
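The hyper-parameter files in config/qm9/ can also be edited from a script before training, which is convenient when the same change has to be applied to several variants. The sketch below is an illustration only: list_configs and override_config are hypothetical helpers, and it assumes that the settings of interest are top-level keys of each JSON file, which this README does not state.

```python
import json
from pathlib import Path


def list_configs(config_dir):
    """Return the JSON files found directly in config_dir, sorted by name."""
    return sorted(Path(config_dir).glob("*.json"))


def override_config(path, overrides):
    """Load a JSON config, replace the given top-level keys, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text())
    unknown = set(overrides) - set(config)
    if unknown:
        # Refuse to add keys the file does not already contain, since a
        # misspelled key name is the most likely cause.
        raise KeyError(f"keys not present in {path.name}: {sorted(unknown)}")
    config.update(overrides)
    path.write_text(json.dumps(config, indent=4))
    return config
```

A typical use would be to loop over list_configs("config/qm9/") and call override_config on each file with a dictionary of the keys to change. Inspect one of the files first to see which keys it actually contains.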
The resulting models will be stored in results/training/model_checkpoint/, and the corresponding illustrations of unconditional molecule generation, along with the metrics assessing the performance of the models, will be stored in results/training/model_evaluation/.
Unconditional samples of molecular graphs from the PT-S variant of PGCs (pgc_marg).
gridsearch_hyperpars.py contains hyper-parameter grids for finding suitable architectures of the PGC variants. Change the hyper-parameter grids based on your preferences, and then run the following command.
nohup python -m gridsearch > gridsearch.log &
This command will run the script in the background, submitting jobs to your SLURM cluster. The resulting models, metrics, and output logs will be stored in results/gridsearch/model_checkpoint/, results/gridsearch/model_evaluation/, and results/gridsearch/model_outputlogs/, respectively.
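Because the grid search runs in the background, a rough way to follow its progress is to count the files that have appeared in the three output directories. This is a minimal sketch with hypothetical helper names (count_files, summarize). How many files each job writes is not specified here, so the counts indicate activity and are not an exact job tally.

```python
from pathlib import Path

# Output directories named in the text above.
GRIDSEARCH_DIRS = {
    "checkpoints": "results/gridsearch/model_checkpoint/",
    "evaluations": "results/gridsearch/model_evaluation/",
    "output logs": "results/gridsearch/model_outputlogs/",
}


def count_files(directory):
    """Count regular files anywhere below directory; 0 if it does not exist yet."""
    root = Path(directory)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.rglob("*") if p.is_file())


def summarize(dirs):
    """Map each label to the number of files found under its directory."""
    return {label: count_files(path) for label, path in dirs.items()}


if __name__ == "__main__":
    for label, n in summarize(GRIDSEARCH_DIRS).items():
        print(f"{label}: {n} file(s)")
```

Run it from the repository root so that the relative paths resolve.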
After all the SLURM jobs have completed, run the following command.
python -m gridsearch_evaluate
It will produce a table comparing the PGC variants with the baselines (in both the .pdf and .tex formats).
Run the following command to generate new molecules conditionally on a known molecule.
python -m conditional_sampling
To impose a known structure of the generated molecules, change patt_smls in conditional_sampling.py. Similarly, to select a model from which to generate the samples, change model_path.
Conditional samples of molecular graphs from the PT-S variant of PGCs (pgc_marg). The known part of a molecule is highlighted in blue.