Code, models, and data for Zhou et al. 2022: Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts. We are currently working on making this package easier to use; any advice is welcome.
We provide the following data:
- `comparable_text_pretrain.txt.zip` (Google Drive): Distant supervision data used to pre-train DecompT5, as described in Section 3.
- `data/decomposition_train.txt`: The decomposition supervision used to train the decomposition model in DecompEntail (on top of DecompT5).
- `data/entailment_train.txt`: The entailment supervision used to train the entailment model in DecompEntail (on top of T5-3b).
- `data/strategyqa/*`: StrategyQA train/dev/test splits used in our experiments.
- `data/hotpotqa/*`: HotpotQA binary questions used in our experiments.
- `data/overnight/*`: Overnight data used in our experiments.
- `data/torque/*`: Torque data used in our experiments.
We provide several trained model weights used in our paper, hosted on the Hugging Face hub. For multi-seed experiments, we randomly selected one seed to release.
- `CogComp/l2d`: T5-large trained on `comparable_text_pretrain.txt`.
- `CogComp/l2d-decomp`: DecompT5 trained on `data/decomposition_train.txt`, used in the DecompEntail pipeline.
- `CogComp/l2d-entail`: T5-3b trained on `data/entailment_train.txt`, used in the DecompEntail pipeline.
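These checkpoints can be loaded with the standard Hugging Face transformers API. Below is a minimal sketch for `CogComp/l2d-decomp`; the example question and decoding settings are illustrative only, and the exact input format the model expects is defined by the training data and code in this repository.

```python
# Minimal sketch (not part of the released code): loading a released checkpoint
# from the Hugging Face hub and generating with it. The example question and
# decoding settings are illustrative; consult the repo for the exact input format.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "CogComp/l2d-decomp"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

question = "Would a pear sink in water?"  # hypothetical example question
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```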
The code is divided into two separate packages, each with slightly different dependencies, as provided in the corresponding requirements.txt.
The seq2seq package can be used to reproduce DecompT5 and its related experiments in Section 5 of the paper.
It is also used to train and evaluate the entailment model used in DecompEntail. We provide a few usage examples as shell scripts:
- `seq2seq/train_decompose.sh`: Train CogComp/l2d-decomp
- `seq2seq/train_entailment.sh`: Train CogComp/l2d-entail
- `seq2seq/eval_entailment.sh`: Evaluate the entailment model
In addition, we provide the generation and evaluation code for the Overnight and Torque experiments in `seq2seq/gen_seq.py`:
- To generate the top 10 candidates, use `gen_output()`.
- To evaluate the generated candidates, use `evaluate_top()`. See the code comments for more detail.
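As a rough illustration of what "top 10 candidates" means here (this is not the repo's `gen_output()`), one can ask a seq2seq model for ten beam-search hypotheses per input; the checkpoint name and input below are assumptions for illustration.

```python
# Illustrative sketch only: producing 10 candidate outputs per input with beam
# search, one common way to build a top-10 candidate list. The repo's own
# gen_output() / evaluate_top() in seq2seq/gen_seq.py implement the actual logic.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("CogComp/l2d")  # assumed checkpoint choice
model = AutoModelForSeq2SeqLM.from_pretrained("CogComp/l2d")

inputs = tokenizer("example input text", return_tensors="pt")
candidates = model.generate(
    **inputs,
    num_beams=10,
    num_return_sequences=10,
    max_length=128,
)
for seq in candidates:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```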
The DecompEntail pipeline can be run with the following steps:
- Generate decompositions given raw questions.
  - This can be done by `generate_decomposition()` in `decompose/gen_facts.py`. See comments for more detail.
- Format the generated decompositions into l2d-entail readable forms.
  - This can be done by `format_to_entailment_model()` in `decompose/gen_facts.py`.
- Run l2d-entail to get entailment scores.
  - This can be done by `seq2seq/eval_entailment.sh`, replacing the input file with the output file from the previous step.
  - If you run an aggregation with different seeds, concatenate the output files into one file and use it as the input to the script.
- Majority vote to derive final labels based on entailment scores (see the sketch after this list).
  - The previous step will output two files, `eval_probs.txt` and `eval_results_lm.txt`. Replace the path in `decompose/evaluator.py` and compute accuracy.
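For the majority-vote step, the exact file formats are defined by `seq2seq/eval_entailment.sh` and `decompose/evaluator.py`; the following is only a hedged sketch of the aggregation itself, assuming one predicted label per example per seed.

```python
# Hedged sketch of a majority vote over per-seed entailment predictions.
# The actual file formats and label names are defined by the repo's scripts;
# here we simply assume each seed contributes one label per example.
from collections import Counter

def majority_vote(predictions_per_seed):
    """predictions_per_seed: list of lists, one list of labels per seed."""
    final_labels = []
    for labels in zip(*predictions_per_seed):
        final_labels.append(Counter(labels).most_common(1)[0][0])
    return final_labels

# Hypothetical usage with three seeds' predictions for four examples.
seed_a = ["entail", "not_entail", "entail", "entail"]
seed_b = ["entail", "entail", "entail", "not_entail"]
seed_c = ["not_entail", "entail", "entail", "entail"]
print(majority_vote([seed_a, seed_b, seed_c]))
# -> ['entail', 'entail', 'entail', 'entail']
```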
See the following paper:
```
@inproceedings{ZRYR22,
    author    = {Ben Zhou and Kyle Richardson and Xiaodong Yu and Dan Roth},
    title     = {Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts},
    booktitle = {EMNLP},
    year      = {2022},
}
```