[Paper] [Voxel data (Zenodo, 3GB)] [Complete dataset (450 GB)] [Triangle meshes and point clouds (partial, 27 GB)]
GPU-based fragmentation of voxelizations using OpenGL compute shaders. This project aims to generate datasets for training fragment assembly models. Although the fragmentation method can be applied to any mesh, this work focuses specifically on archaeological artefacts such as those depicted in Figure 1.
Figure 1. Assembled fragments of Iberian vessels.
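To give an idea of what fragmenting a voxelization can look like, the following is a minimal CPU sketch in C++. It assumes a Voronoi-style partition (the paper's keywords mention Voronoi), in which every solid voxel is labelled with its nearest seed point. It is not the repository's OpenGL compute shader implementation, and the names `Seed`, `VoxelGrid`, `kEmpty` and `labelFragments` are hypothetical, introduced here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical types for illustration; they are not taken from the repository.
struct Seed { float x, y, z; };

struct VoxelGrid {
    std::size_t nx, ny, nz;
    std::vector<std::uint8_t> occupied;  // 1 = solid voxel, 0 = empty voxel

    // Linear index of voxel (x, y, z), with x varying fastest.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        return x + nx * (y + ny * z);
    }
};

// Label assigned to empty voxels (assumed convention).
constexpr std::int32_t kEmpty = -1;

// Labels every solid voxel with the index of its nearest seed, using the
// squared Euclidean distance from the voxel centre. Voxels sharing a label
// form one fragment. Empty voxels keep the label kEmpty.
std::vector<std::int32_t> labelFragments(const VoxelGrid& grid,
                                         const std::vector<Seed>& seeds) {
    assert(!seeds.empty());
    assert(grid.occupied.size() == grid.nx * grid.ny * grid.nz);

    std::vector<std::int32_t> labels(grid.occupied.size(), kEmpty);
    for (std::size_t z = 0; z < grid.nz; ++z) {
        for (std::size_t y = 0; y < grid.ny; ++y) {
            for (std::size_t x = 0; x < grid.nx; ++x) {
                const std::size_t i = grid.index(x, y, z);
                if (!grid.occupied[i]) continue;

                float best = std::numeric_limits<float>::max();
                for (std::size_t s = 0; s < seeds.size(); ++s) {
                    const float dx = static_cast<float>(x) + 0.5f - seeds[s].x;
                    const float dy = static_cast<float>(y) + 0.5f - seeds[s].y;
                    const float dz = static_cast<float>(z) + 0.5f - seeds[s].z;
                    const float d2 = dx * dx + dy * dy + dz * dz;
                    if (d2 < best) {
                        best = d2;
                        labels[i] = static_cast<std::int32_t>(s);
                    }
                }
            }
        }
    }
    return labels;
}
```

Each voxel is processed independently of the others, which is the kind of workload that maps naturally to one GPU thread per voxel. Calling `labelFragments` on a grid with two seeds, for instance, splits the solid voxels into at most two fragments.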
The code in this repository depends on the following libraries:
- assimp 5.2.4.
- glew 2.2.0.
- opengl 4.6.
- glfw3 3.3.7.
- glm 0.9.9.8.
- simplify.
- MagicaVoxel File Writer.
*The last two dependencies are already included in this repo, under the folder MeshFragments/Libraries/.
The project is primarily intended to be used on Windows. The Microsoft Visual Studio project files are included in the repo, so opening it should be trivial (apart from possibly having to change the development platform kit). The project was configured as follows:
- Development platform kit: v143.
- Language standard: C++ 23.
- Integration with vcpkg. After cloning vcpkg and launching the main .bat, it can be integrated with MSVC by executing vcpkg integrate install in the command line (note that vcpkg can be registered in the system path for easier usage).
The complete fragment dataset is available at our research institute's page. However, two lighter versions have been released because the complete dataset is very large (450 GB). We encourage readers to primarily use the Zenodo dataset if their work is centred on implicit data/voxels. In summary, these are the available datasets:
- A 3 GB dataset composed only of voxel data, published in Zenodo.
- The whole fragment dataset, split into eight files of ~50 GB (totalling 450 GB) with compressed voxel data, point clouds and triangle meshes.
- A lighter version of uncompressed triangle meshes and point clouds (vessels_200_obj_ply_no_zipped.zip; 27 GB). This is mainly intended for testing the dataset, since it only contains decimated fragments of 200 models, which are not individually zipped. Note, however, that these are provided as triangle meshes and point clouds derived from marching cubes, and may contain more geometric inaccuracies.
The scripts to decompress binary grids, meshes and point clouds are available at docs/decompress. Point clouds are decompressed in C++ using the Point Cloud Library (PCL), whereas the other formats are decompressed using Python.
Figure 2. Rendering uncompressed data.
@article{LopezGenerating2024,
title = {Generating implicit object fragment datasets for machine learning},
journal = {Computers \& Graphics},
pages = {104104},
year = {2024},
issn = {0097-8493},
doi = {https://doi.org/10.1016/j.cag.2024.104104},
url = {https://www.sciencedirect.com/science/article/pii/S0097849324002395},
author = {Alfonso López and Antonio J. Rueda and Rafael J. Segura and Carlos J. Ogayar and Pablo Navarro and José M. Fuertes},
keywords = {Voxel, Fragmentation, Fracture dataset, Voronoi, GPU programming}
}