This repository contains the data artifacts and model outputs used for our paper on document-level Event Argument Extraction (EAE) on the MAVEN-ARG benchmark:
Schema-Constrained Document-Level Event Argument Extraction with Lightweight LLM Fine-Tuning
Pouya Sattari, Roberto Pietrantuono, Antonio Guerriero
ECML PKDD 2026, Research Track. Naples, Italy
The goal of this repository is reproducibility. It provides the preprocessed dataset files used in our experiments, together with per-model prompts, raw generations, predictions, evaluation summaries, submission files, and example training/inference notebooks.
.
├── README.md
├── .gitignore
├── data/
│   ├── README.md
│   └── maven_arg_preprocessed/
│       ├── train_preprocessed.jsonl
│       ├── valid_preprocessed.jsonl
│       └── test_preprocessed.jsonl
└── artifacts/
    ├── llama/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   └── submissions/
    ├── mistral_nemo/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   ├── metrics/
    │   └── submissions/
    ├── phi4/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   └── metrics/
    └── qwen3/
        ├── label2role.json
        ├── 1-notebooks/
        ├── prompts/
        ├── generations/
        ├── predictions/
        ├── metrics/
        └── submissions/
This directory contains the preprocessed MAVEN-ARG dataset files used in our experiments:
- train_preprocessed.jsonl
- valid_preprocessed.jsonl
- test_preprocessed.jsonl
These files were derived from the original MAVEN-ARG dataset.
The original dataset can be obtained from the official repository:
https://github.com/THU-KEG/MAVEN-Argument
See data/README.md for additional details.
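As a quick sanity check, the three splits can be read line by line as JSON Lines. The sketch below assumes only that each line holds one standalone JSON value. The record schema is not described in this README, so the code just reports the record count and top-level keys. The helper name `read_jsonl` is illustrative and not part of the repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


if __name__ == "__main__":
    data_dir = Path("data/maven_arg_preprocessed")
    for split in ("train", "valid", "test"):
        path = data_dir / f"{split}_preprocessed.jsonl"
        if not path.exists():
            print(f"{path}: not found")
            continue
        records = list(read_jsonl(path))
        # Assumption: records are JSON objects; otherwise no keys are shown.
        first = records[0] if records else None
        keys = sorted(first) if isinstance(first, dict) else []
        print(f"{path.name}: {len(records)} records, top-level keys: {keys}")
```

Run from the repository root, this prints one summary line per split, or a "not found" message if a file is missing.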
This directory contains the experiment artifacts for each evaluated model.
Each model directory includes the resources needed to reproduce the reported results.
Subdirectories may include:
- 1-notebooks/: Example notebooks demonstrating the training and inference workflow used for each model.
- prompts/: Prompt-formatted inputs used during model inference.
- generations/: Raw model outputs generated during inference.
- predictions/: Final prediction files used for evaluation or submission.
- metrics/: Evaluation summaries or metric snapshots recorded during experiments.
- submissions/: Archived submission files used for official evaluation.
- label2role.json: Event-type to role-set mappings used during prompting and argument filtering.
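To illustrate how an event-type to role-set mapping can drive argument filtering, here is a minimal sketch. It rests on two assumptions that this README does not confirm: that label2role.json is a JSON object of the form {"<event type>": ["<role>", ...]}, and that predicted arguments are (role, text) pairs. The function names and the example schema are hypothetical. The notebooks in 1-notebooks/ remain the reference for the actual procedure.

```python
import json
from pathlib import Path


def load_label2role(path):
    """Load the mapping and convert each role list to a set for fast lookup.

    Assumption: the file is a JSON object of the form
    {"<event type>": ["<role>", ...], ...}.
    """
    with Path(path).open(encoding="utf-8") as handle:
        raw = json.load(handle)
    return {event_type: set(roles) for event_type, roles in raw.items()}


def filter_arguments(event_type, arguments, label2role):
    """Keep only (role, text) pairs whose role is allowed for event_type.

    Unknown event types yield an empty list, since no role can be validated.
    """
    allowed = label2role.get(event_type, set())
    return [(role, text) for role, text in arguments if role in allowed]


if __name__ == "__main__":
    # Hypothetical schema and predictions, for illustration only.
    schema = {"Attack": {"Agent", "Patient", "Location"}}
    predicted = [("Agent", "the rebels"), ("Price", "10 dollars")]
    print(filter_arguments("Attack", predicted, schema))
```

In this example the "Price" argument is dropped because that role is not in the hypothetical role set for "Attack".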
Artifacts are provided for the following model families:
- Llama (Llama-3.1-8B)
- Mistral-Nemo (12B)
- Phi-4 (14B)
- Qwen3 (14B)
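Each model directory contains the prompts, raw generations, predictions, notebooks, and label2role.json for that model. As the directory tree above shows, metrics/ and submissions/ are not present for every model: llama/ has no metrics/ and phi4/ has no submissions/.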
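After cloning, a short script can confirm which artifact subdirectories are present for each model. The sketch below uses only the directory names listed in this README. The function name `artifact_inventory` is illustrative and not part of the repository.

```python
from pathlib import Path

# Subdirectory and model names as listed in this README.
SUBDIRS = ("1-notebooks", "prompts", "generations", "predictions", "metrics", "submissions")
MODELS = ("llama", "mistral_nemo", "phi4", "qwen3")


def artifact_inventory(root, models=MODELS, subdirs=SUBDIRS):
    """Return {model: {subdir: bool}} recording which subdirectories exist."""
    root = Path(root)
    return {
        model: {sub: (root / model / sub).is_dir() for sub in subdirs}
        for model in models
    }


if __name__ == "__main__":
    for model, present in artifact_inventory("artifacts").items():
        missing = [name for name, ok in present.items() if not ok]
        has_mapping = (Path("artifacts") / model / "label2role.json").is_file()
        print(f"{model}: missing={missing or 'none'}, label2role.json={has_mapping}")
```

Run from the repository root, this prints one line per model listing any subdirectories that are absent.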
This repository supports reproducibility by providing:
- the processed dataset used in the experiments
- prompt inputs used for model inference
- raw model generations
- final predictions used for evaluation
- evaluation summaries and submission artifacts
- example notebooks demonstrating the training and inference pipelines
Please note:
- This repository is a reproducibility artifact, not a full training codebase.
- Only the preprocessed dataset files used in the experiments are included here.
- The original raw MAVEN-ARG dataset should be obtained from the official source.
- Some files are relatively large because they contain prompt datasets and model outputs.
This repository accompanies the paper:
Schema-Constrained Document-Level Event Argument Extraction with Lightweight LLM Fine-Tuning
Pouya Sattari, Roberto Pietrantuono, Antonio Guerriero
ECML PKDD 2026, Research Track. Naples, Italy
Please refer to the original MAVEN-ARG repository for dataset licensing and usage conditions:
https://github.com/THU-KEG/MAVEN-Argument