Synthetic causal training data generator for LLM fine-tuning. 16 industry domains, 200+ mechanisms, ~100K samples in 10 seconds. Pure Python.
Updated Jan 18, 2026 - Python
Collecting training data for fine-tuning a model and building a pipeline in the cloud to train it.
Machine Learning for Letter Recognition