This release contains more than 800 XML files, together with their corresponding images, from 21 different 17th-century books.
It includes a Python script and a Bash script for building a dataset from those XML files, and two CSV files describing them.
It also includes some modifications made since the 4.0 release.