EAE Reproducibility Repository

This repository contains the data artifacts and model outputs used for our paper on document-level Event Argument Extraction (EAE) on the MAVEN-ARG benchmark:

Schema-Constrained Document-Level Event Argument Extraction with Lightweight LLM Fine-Tuning
Pouya Sattari, Roberto Pietrantuono, Antonio Guerriero
ECML PKDD 2026, Research Track. Naples, Italy

The goal of this repository is reproducibility. It provides the preprocessed dataset files used in our experiments, together with per-model prompts, raw generations, predictions, evaluation summaries, submission files, and example training/inference notebooks.

Repository Structure

.
├── README.md
├── .gitignore
├── data/
│   ├── README.md
│   └── maven_arg_preprocessed/
│       ├── train_preprocessed.jsonl
│       ├── valid_preprocessed.jsonl
│       └── test_preprocessed.jsonl
└── artifacts/
    ├── llama/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   └── submissions/
    ├── mistral_nemo/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   ├── metrics/
    │   └── submissions/
    ├── phi4/
    │   ├── label2role.json
    │   ├── 1-notebooks/
    │   ├── prompts/
    │   ├── generations/
    │   ├── predictions/
    │   └── metrics/
    └── qwen3/
        ├── label2role.json
        ├── 1-notebooks/
        ├── prompts/
        ├── generations/
        ├── predictions/
        ├── metrics/
        └── submissions/
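
After cloning, it can be useful to confirm that the checkout matches the layout shown above (for example, that large files were fetched completely). The following is a minimal sketch, not part of the repository: the function name `missing_entries` and the `EXPECTED_SUBDIRS` table are hypothetical helpers, and the table simply transcribes the tree above.

```python
from pathlib import Path

# Per-model subdirectories, transcribed from the tree above.
# (Hypothetical helper table; it is not shipped with the repository.)
COMMON = ["1-notebooks", "prompts", "generations", "predictions"]
EXPECTED_SUBDIRS = {
    "llama": COMMON + ["submissions"],
    "mistral_nemo": COMMON + ["metrics", "submissions"],
    "phi4": COMMON + ["metrics"],
    "qwen3": COMMON + ["metrics", "submissions"],
}


def missing_entries(repo_root):
    """Return documented paths (relative, POSIX style) absent under repo_root."""
    root = Path(repo_root)
    missing = []

    # Preprocessed dataset splits.
    data_dir = root / "data" / "maven_arg_preprocessed"
    for split in ("train", "valid", "test"):
        path = data_dir / f"{split}_preprocessed.jsonl"
        if not path.is_file():
            missing.append(path.relative_to(root).as_posix())

    # Per-model artifacts.
    for model, subdirs in EXPECTED_SUBDIRS.items():
        model_dir = root / "artifacts" / model
        mapping = model_dir / "label2role.json"
        if not mapping.is_file():
            missing.append(mapping.relative_to(root).as_posix())
        for sub in subdirs:
            if not (model_dir / sub).is_dir():
                missing.append((model_dir / sub).relative_to(root).as_posix())

    return missing
```

Calling `missing_entries(".")` from the repository root returns an empty list when every documented file and directory is present.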

1-notebooks/: Example notebooks demonstrating the training and inference workflow used for each model.
prompts/: Prompt-formatted inputs used during model inference.
generations/: Raw model outputs generated during inference.
predictions/: Final prediction files used for evaluation or submission.
metrics/: Evaluation summaries or metric snapshots recorded during experiments.
submissions/: Archived submission files used for official evaluation.
label2role.json: Event-type to role-set mappings used during prompting and argument filtering.
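
To illustrate what schema-based argument filtering with such a mapping could look like, here is a minimal sketch. It rests on assumptions that this README does not confirm: that `label2role.json` decodes to a dictionary from event-type name to a list of role names, and that predictions can be represented as `(role, text)` pairs. The function names and the toy event type and roles are hypothetical; the notebooks in `1-notebooks/` show the workflow actually used.

```python
import json


def load_label2role(path):
    """Load the mapping; assumed shape: {event_type: [role, ...]}."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def filter_arguments(event_type, arguments, label2role):
    """Keep only (role, text) pairs whose role is allowed for event_type.

    Unknown event types yield an empty result, because no role is allowed.
    """
    allowed = set(label2role.get(event_type, []))
    return [(role, text) for role, text in arguments if role in allowed]


# Toy mapping for illustration only; these names are made up.
toy_mapping = {"Attack": ["Agent", "Patient", "Location"]}
predicted = [("Agent", "the rebels"), ("Color", "red"), ("Location", "the bridge")]
kept = filter_arguments("Attack", predicted, toy_mapping)
```

In this toy example the `Color` argument is dropped because it is not in the role set for `Attack`, while the other two arguments are kept in their original order.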

Models Included

Artifacts are provided for the following model families:

Llama (Llama-3.1-8B)
Mistral-Nemo (12B)
Phi-4 (14B)
Qwen3 (14B)

Each model directory contains the prompts, outputs, and notebooks for that model. The metrics/ and submissions/ directories are present only for the models where they appear in the structure above.

Purpose

This repository supports reproducibility by providing:

the processed dataset used in the experiments
prompt inputs used for model inference
raw model generations
final predictions used for evaluation
evaluation summaries and submission artifacts
example notebooks demonstrating the training and inference pipelines

Notes

This is a reproducibility artifact repository, not a full training codebase.
Only the preprocessed dataset files used in the experiments are included here.
The original raw MAVEN-ARG dataset should be obtained from the official source.
Some files are relatively large because they contain prompt datasets and model outputs.
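
Because some of these files are large, it may be preferable to read the `.jsonl` files one record at a time rather than loading them whole. The sketch below assumes only that each non-empty line is a standalone JSON value, which is what the `.jsonl` extension conventionally implies; the record fields are not documented here, so none are assumed. The helper names `iter_jsonl` and `count_records` are hypothetical.

```python
import json


def iter_jsonl(path):
    """Yield one decoded JSON value per non-empty line, without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the file position to make truncated downloads easy to spot.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def count_records(path):
    """Count records by streaming through the file once."""
    return sum(1 for _ in iter_jsonl(path))
```

For example, `count_records("data/maven_arg_preprocessed/valid_preprocessed.jsonl")` streams the validation split and returns its number of records.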

Paper

This repository accompanies the paper:

Schema-Constrained Document-Level Event Argument Extraction with Lightweight LLM Fine-Tuning
Pouya Sattari, Roberto Pietrantuono, Antonio Guerriero
ECML PKDD 2026, Research Track. Naples, Italy

License and Dataset Usage

Please refer to the original MAVEN-ARG repository for dataset licensing and usage conditions:

https://github.com/THU-KEG/MAVEN-Argument
