Data behind my Master's thesis "Using NLP to Study the Public Consultation on the AI Act Proposal"

felixrech/PC-AI

PC-AI Dataset

The Public Consultation on the AI Act proposal (PC-AI) dataset contains the results of the 2021 public consultation on the draft of the Artificial Intelligence Act (AI Act). The feedback submissions (feedbacks.csv) and their attachments (attachments.csv, attachments/) were scraped from the 'Have your Say' platform using our library hys_scraper.

Although they can be computed from the feedback itself, some summary statistics (categories.csv, countries.csv) have also been scraped. The text of each PDF attachment has been extracted using PyMuPDF and is available as a text file in the attachments/ folder.
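
As a rough illustration of how such summary statistics could be recomputed from the feedback rows, here is a minimal sketch using only the Python standard library. The column names user_type and country and the sample rows are assumptions made for this example; check the header of feedbacks.csv for the real names.

```python
import csv
import io
from collections import Counter

# Tiny in-memory stand-in for feedbacks.csv.
# The column names and values are assumptions, not the real schema.
SAMPLE = """id,user_type,country
1,company,DEU
2,ngo,FRA
3,company,DEU
4,eu_citizen,BEL
"""


def count_by(rows, column):
    """Count how many feedback rows fall into each value of `column`."""
    return Counter(row[column] for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
categories = count_by(rows, "user_type")  # analogous to categories.csv
countries = count_by(rows, "country")     # analogous to countries.csv

print(categories.most_common())
print(countries.most_common())
```

To run this against the real data, replace io.StringIO(SAMPLE) with an opened feedbacks.csv file and adjust the column names.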

Patches

We also offer a patched version (patched_feedback.csv) that makes the following changes (see patch_script.ipynb for details):

  • Reclassify several feedback submissions from the 'other' user type into their proper user types
  • Fix a typo in the 'Have your Say' API (i.e. rename the user type academic_research_instittution [sic])
  • Remove some accidental submissions and duplicates of the same feedback submitted multiple times by the same person
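
The three patch steps can be sketched roughly as follows. This is not the code from patch_script.ipynb, only an illustration under stated assumptions: the row keys (id, author, user_type, text), the corrected spelling academic_research_institution, the reclassification mapping, and the set of accidental ids are all hypothetical.

```python
TYPO = "academic_research_instittution"   # spelling as it appears in the API
FIXED = "academic_research_institution"   # assumed corrected name


def patch(rows, reclassified, accidental_ids):
    """Apply the three patch steps to a list of feedback dicts.

    rows: dicts with hypothetical keys "id", "author", "user_type", "text"
    reclassified: hypothetical mapping of feedback id -> proper user type
    accidental_ids: hypothetical set of ids judged to be accidental
    """
    patched, seen = [], set()
    for row in rows:
        # Step 3a: drop accidental submissions.
        if row["id"] in accidental_ids:
            continue
        # Step 3b: drop repeats of the same text by the same author.
        key = (row["author"], row["text"])
        if key in seen:
            continue
        seen.add(key)
        row = dict(row)  # copy so the input rows stay unchanged
        # Step 2: rename the misspelled user type.
        if row["user_type"] == TYPO:
            row["user_type"] = FIXED
        # Step 1: move manually identified rows out of "other".
        if row["user_type"] == "other" and row["id"] in reclassified:
            row["user_type"] = reclassified[row["id"]]
        patched.append(row)
    return patched


rows = [
    {"id": 1, "author": "a", "user_type": "other", "text": "We build AI."},
    {"id": 2, "author": "b", "user_type": TYPO, "text": "Our view."},
    {"id": 3, "author": "b", "user_type": TYPO, "text": "Our view."},
    {"id": 4, "author": "c", "user_type": "ngo", "text": "test"},
]
result = patch(rows, reclassified={1: "company"}, accidental_ids={4})
print([(r["id"], r["user_type"]) for r in result])
```

Keeping the reclassification mapping and the accidental ids as explicit inputs makes the manual judgment calls visible and easy to review separately from the mechanical steps.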

White paper

The public consultation on the original AI white paper is also available in the white_paper branch.
