This repository provides the peer-review-stage public data release for the manuscript:
DREAM: Deployment-Time Priority Stratification, Review Routing, and Threshold Adaptation for Generative AI Incidents
The current release contains the initial incident-level labeling table used as the manuscript input layer. It includes 1,134 generative AI incident records together with the six-dimensional scores Da, A, R, E, Di, and M.
| File | Purpose |
|---|---|
| dream_initial_labels_1134.csv | Public release of the initial 1,134-incident labeling dataset |
| DATASET_SCHEMA.md | Column-level description of the released dataset |
| RELEASE_NOTE.md | Release boundary and post-acceptance expansion note |
| DATA_AVAILABILITY.md | Submission-ready statement of what is released now and what will follow |
| CHANGELOG.md | Versioned update record for the public data release |
| LICENSE | Dataset license text for the current public release |
| CITATION.cff | Machine-readable citation metadata for the repository |
| SHA256SUMS.txt | SHA256 checksums for the public release files |
The released CSV contains:
- 1,134 incident records
- six-dimensional scores: Da, A, R, E, Di, M
- incident descriptors including date, title, description, risk domain, and risk subdomain
- completeness and alignment fields retained from the study pipeline
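After downloading, a quick sanity check is to confirm that the CSV has the stated number of records and that the six score columns are present. The following is a minimal sketch using only the Python standard library. It assumes the score columns are named exactly `Da`, `A`, `R`, `E`, `Di`, and `M` as written above, and that the file is UTF-8 encoded; both are assumptions, and DATASET_SCHEMA.md remains the authoritative reference for column names. The helper name `summarize_release` is illustrative and not part of the release.

```python
import csv
from pathlib import Path

# Assumption: the six score columns use exactly the names given in this README.
# Consult DATASET_SCHEMA.md for the actual header names before relying on this.
ASSUMED_SCORE_COLUMNS = ["Da", "A", "R", "E", "Di", "M"]
EXPECTED_ROWS = 1134  # record count stated in this README


def summarize_release(csv_path, score_columns=ASSUMED_SCORE_COLUMNS):
    """Return (row_count, missing_score_columns) for a release CSV."""
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with Path(csv_path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        row_count = sum(1 for _ in reader)
    missing = [name for name in score_columns if name not in header]
    return row_count, missing


if __name__ == "__main__":
    path = Path("dream_initial_labels_1134.csv")
    if path.exists():
        rows, missing = summarize_release(path)
        print(f"rows: {rows} (README states {EXPECTED_ROWS})")
        print(f"score columns not found under assumed names: {missing}")
```

A non-empty list of missing columns most likely means the headers differ from the assumed names rather than that the data are incomplete.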
Direct download:
Complete release package:
The repository provides SHA256SUMS.txt so users can verify that the downloaded files match the public release.
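Where a command-line checksum utility is not available, the same check can be sketched in Python. This is a minimal sketch, assuming SHA256SUMS.txt uses the common `<hex digest>  <filename>` layout written by GNU `sha256sum` and that the listed files sit next to it; the helper names `sha256_of` and `verify_checksums` are illustrative and not part of the release.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(sums_file):
    """Return {filename: bool}; a missing file counts as a failed check.

    Assumes each non-empty line is "<hex digest>  <filename>", with paths
    resolved relative to the directory that contains the checksum file.
    """
    sums_path = Path(sums_file)
    results = {}
    for line in sums_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # GNU output marks binary mode with a leading '*'
        target = sums_path.parent / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results


if __name__ == "__main__":
    if Path("SHA256SUMS.txt").exists():
        for filename, ok in verify_checksums("SHA256SUMS.txt").items():
            print(f"{filename}: {'OK' if ok else 'FAILED'}")
```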
Example verification on a local machine:
`sha256sum -c SHA256SUMS.txt`

This repository currently releases only the initial labeling table used at the manuscript input stage.
It does not yet include the full validation package, expert-panel materials, proxy-validation artifacts, or code.
All remaining data products and code will be released after article acceptance.
Additional release-context documents:
- DATA_AVAILABILITY.md
- CHANGELOG.md
If you use this repository or dataset, please cite:
- this GitHub repository release
- the associated DREAM manuscript
Citation metadata are provided in CITATION.cff.
The dataset materials currently released in this repository are provided under the Creative Commons Attribution 4.0 International license.
This license applies to the current data release. If code is added in a later post-acceptance expansion, code files may carry a separate software license where appropriate.
Corresponding author: Jiayin Qi
The Cyberspace Institute of Advanced Technology, Guangzhou University
Email: qijiayin@139.com