The current curriculum covers topics from basic NLP techniques to the most modern ones that may be helpful for custom training of LLMs (a short tokenization sketch follows the list):
- NLP Basics: tokenization, text preprocessing, text representations
- Text & Language Models: embeddings, n-gram models, RNNs, LSTMs, seq2seq, attention
- Transformers & LLMs: Transformer, pre-training (MLM/CLM), prompting, fine-tuning, PEFT
- Scaling & Optimization: distributed training, MoE, efficient inference, quantization
- Retrieval & Agents: Information Retrieval, RAG, agent-based systems
- Post-training: alignment, RLHF, DPO
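As a quick taste of the very first topic, here is a minimal subword tokenization sketch. It assumes the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint, which are illustrative choices and not prescribed by the course materials:

```python
from transformers import AutoTokenizer

# Load a pretrained subword tokenizer (BERT's WordPiece vocabulary);
# any checkpoint would do, this one is just a common default.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits raw text into subword units."
tokens = tokenizer.tokenize(text)              # e.g. ['token', '##ization', 'splits', ...]
ids = tokenizer.convert_tokens_to_ids(tokens)  # vocabulary indices the model consumes

print(tokens)
print(ids)
```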
- German Gritsai @grgera
- Anastasiia Vozniuk @natriistorm
- Ildar Khabutdinov @depinwhite
| Week # | Date | Topic | Lecture | Seminar | Recording |
|---|---|---|---|---|---|
| 1 | February 10 | Intro to NLP | slides, slides with notes | ipynb | record |
| 2 | TBA | TBA | | | |
| 3 | TBA | TBA | | | |
| 4 | TBA | TBA | | | |
- Probability Theory + Statistics
- Machine Learning
- Python
- Basic knowledge of NLP
We expect students to know the basics of Natural Language Processing, as the course focuses on more advanced topics. If you are unsure about the basics, we recommend watching these lectures / reading these materials: