QBot is a pipeline for training large language models (LLMs) on QQ chat data. The project covers the full workflow: decrypting chat records, extracting and preprocessing data, LoRA fine-tuning, reward model training, RLHF, and Hugging Face-based demos.
    pip install -r requirements.txt

Please quit Linux QQ before running the scripts.
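To avoid running the scripts while QQ is still open, a small pre-flight check can help. The sketch below is not part of QBot: the function name `find_processes` is hypothetical, and the process name `qq` is an assumption about how the Linux QQ client shows up in `/proc`, so adjust it to whatever your installation reports.

```python
import os


def find_processes(target_names, proc_root="/proc"):
    """Return sorted PIDs whose command name matches one of target_names.

    Reads the Linux /proc/<pid>/comm files. Returns an empty list when
    proc_root does not exist (for example on non-Linux systems).
    """
    targets = {name.lower() for name in target_names}
    if not os.path.isdir(proc_root):
        return []

    pids = []
    for entry in os.listdir(proc_root):
        # Only numeric directory names are process IDs.
        if not entry.isdigit():
            continue
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path, encoding="utf-8", errors="replace") as f:
                name = f.read().strip().lower()
        except OSError:
            # The process exited or is not readable; skip it.
            continue
        if name in targets:
            pids.append(int(entry))
    return sorted(pids)
```

Calling `find_processes({"qq"})` before the dump step and stopping if the result is non-empty is one way to use it. Note that `comm` holds a truncated command name, so a client launched under a different binary name would not be detected.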
    python scripts/dump_qq_data.py
    python scripts/extract.py --profile_db data/raw/path-to-profile_info.decrypt.db --msg_db data/raw/path-to-nt_msg.decrypt.db
    python scripts/preprocess.py /path/to/data /path/to/sensitive_words.txt

TBD