Large Language Model Attribution Aligned Training (LAAT)

License: GPL-3.0

Use LLMs as training regularizers for small, differentiable models and significantly improve their generalization ability when trained on few-shot and skewed datasets. Read more in our paper: Large Language Models as Attribution Regularizers for Efficient Model Training

Quickstart 🚀

You can quickly train a model on a specified dataset using LLM attribution guidance:

import torch
from langchain_openai import ChatOpenAI

from laat.datasets import LAATDataset
from laat.splitters import NShotSplitter
from laat.models.laat import LAATLAATModel, LAATClassifier, TorchLogisticRegression
# partial_class (used below) pre-binds constructor keyword arguments;
# it is assumed to be importable from the laat package.


# load the dataset
dataset = LAATDataset("breast-ljubljana", "laat/data")
# create a 5-shot training split and a held-out test split
X_train, X_test, y_train, y_test = NShotSplitter.split(dataset.X, dataset.y, shot=5)

# define training parameters
model_kwargs = {
    "lr": 1e-2,
    "max_epochs": 200,
    "train_split": None,
    "optimizer": torch.optim.Adam,
    "optimizer__weight_decay": 1e-2,
    "verbose": False,
    "device": "cuda",
}

# instantiate the model
model = LAATLAATModel(
    model_name="laat_gpt-4o-mini_lr",
    model_class=partial_class(
        LAATClassifier,
        module=TorchLogisticRegression,
        **model_kwargs,
    ),
    pandas_to_numpy_mapper=dataset.to_numpy,
    dataset=dataset,
    reasoning_llm=ChatOpenAI(model="gpt-4o-mini"),
    parsing_llm=ChatOpenAI(model="gpt-4o-mini", temperature=0.0),
    gamma=100.0,
    n_estimates=5,
)

# train the model
model.train(X_train, y_train)
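
Here gamma weighs the attribution-alignment term and n_estimates is the number of LLM attribution estimates that are aggregated. For intuition only, the snippet below sketches what an attribution-aligned loss can look like: a standard task loss plus a penalty that pulls the model's per-feature attributions toward LLM-provided importance scores. It is an illustrative sketch under assumed choices (gradient-based attributions, cosine alignment), not the implementation used in laat.models.laat.

import torch
import torch.nn.functional as F

def attribution_aligned_loss(model, X, y, llm_scores, gamma=100.0):
    # Illustrative sketch only. llm_scores is assumed to be a 1D tensor of
    # per-feature importance scores elicited from the LLM (e.g. averaged over
    # n_estimates queries); the attribution method and alignment measure below
    # are placeholder choices, not necessarily those used by LAAT.
    X = X.clone().requires_grad_(True)
    logits = model(X).squeeze(-1)
    task_loss = F.binary_cross_entropy_with_logits(logits, y.float())
    # simple gradient-based attribution per feature, averaged over the batch
    grads = torch.autograd.grad(logits.sum(), X, create_graph=True)[0]
    attributions = grads.mean(dim=0)
    # penalize misalignment between the model's attributions and the LLM scores
    alignment = F.cosine_similarity(attributions, llm_scores, dim=0)
    return task_loss + gamma * (1.0 - alignment)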

To train a model, you need a .csv dataset and a metadata .json file that describes the task and lists descriptions of all features. You can write the metadata file yourself (see the illustrative sketch after the snippet below) or generate one automatically by providing a dataset and a task description:

from laat.datasets import LAATDataset
from langchain_openai import ChatOpenAI

# a lightweight model is used here to generate the feature descriptions
model = ChatOpenAI(model="gpt-4.1-nano")

# generate a metadata .json file for the dataset under the given data root
LAATDataset.generate_metadata(
    dataset_name="indian_liver",
    dataset_task_description="Predict whether the patient has a liver disease. Yes or no?",
    model=model,
    data_root="laat/data",
)
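
If you prefer to define the metadata yourself, the sketch below shows the kind of content it holds: a task description plus a short description of every feature. The key names, example features, and file path are assumptions for illustration only; check the files generated under laat/data for the exact schema the package expects.

import json

# Hypothetical example only: the keys and the file path below are assumptions,
# not the package's documented schema.
metadata = {
    "task_description": "Predict whether the patient's breast cancer will recur. Yes or no?",
    "feature_descriptions": {
        "age": "Patient age group in years",
        "tumor-size": "Greatest diameter of the excised tumor in millimeters",
        "inv-nodes": "Number of axillary lymph nodes containing metastatic cancer",
    },
}

with open("laat/data/breast-ljubljana/metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)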

Citation

If you find this repository or the paper useful in your research, please cite us using the following BibTeX entry:

@misc{vukadin2025largelanguagemodelsattribution,
      title={Large Language Models as Attribution Regularizers for Efficient Model Training}, 
      author={Davor Vukadin and Marin Šilić and Goran Delač},
      year={2025},
      eprint={2502.20268},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.20268}, 
}

Author

👤 Davor Vukadin
