Reorganize pre-training doc #3778
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,4 +7,5 @@ hidden: | |
| --- | ||
| development/update_dependencies.md | ||
| development/contribute_docs.md | ||
| development/hlo_diff_testing.md | ||
| ``` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,9 +44,12 @@ ______________________________________________________________________ | |
|
|
||
| When intended architectures transformations alter graph lowering, reference file baselines require updates. | ||
|
|
||
| > [!IMPORTANT]\ | ||
| > While running the update script locally is not the end of the world, **relying on local execution can cause remote CI tests to fail.** | ||
| > The PR verification pipelines run the tests in a strictly locked GitHub Actions environment. The smallest discrepancies in local library installations will introduce slight backend lowering graph deviations. If your local execution leads to a remote CI check failure, rely on the GitHub Action trigger described below to generate environment-matching baselines. | ||
| ```{important} | ||
|
Collaborator
Author
Added syntax to show this note correctly on readthedocs.
||
|
|
||
| While running the update script locally is not the end of the world, **relying on local execution can cause remote CI tests to fail.** | ||
|
|
||
| The PR verification pipelines run the tests in a strictly locked GitHub Actions environment. The smallest discrepancies in local library installations will introduce slight backend lowering graph deviations. If your local execution leads to a remote CI check failure, rely on the GitHub Action trigger described below to generate environment-matching baselines. | ||
| ``` | ||
|
|
||
| ### Method 1: Run the manual GitHub Action Workflow (Highly Recommended) | ||
|
|
||
|
|
@@ -66,13 +69,14 @@ Alternatively, you can trigger the remote workflow via terminal CLI execution: | |
| gh workflow run update_reference_hlo.yml --ref <branch> | ||
| ``` | ||
|
|
||
| > [!NOTE] | ||
| > A successful run of the manual update workflow will add a new commit to your Pull Request branch. Once complete, you must: | ||
| > | ||
| > 1. Pull the new commit from remote. | ||
| > 2. Squash the commits in your branch once again to keep your PR history clean. | ||
| > 3. Push the squashed commit to remote. | ||
| > 4. Retry the `tpu-integration` workflow to verify tests pass on your PR. | ||
| ```{note} | ||
|
Collaborator
Author
Added syntax to show this note correctly on readthedocs.
||
| A successful run of the manual update workflow will add a new commit to your Pull Request branch. Once complete, you must: | ||
|
|
||
| 1. Pull the new commit from remote. | ||
| 2. Squash the commits in your branch once again to keep your PR history clean. | ||
| 3. Push the squashed commit to remote. | ||
| 4. Retry the `tpu-integration` workflow to verify tests pass on your PR. | ||
| ``` | ||
|
|
||
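The four follow-up steps in the note above can be sketched as a small shell script. This is an illustrative sketch, not a script shipped with the repository: the `BRANCH`, `BASE`, `RUN_ID`, and `DRY_RUN` variables and the function names are hypothetical, the squash uses an interactive rebase (one of several ways to squash), and it assumes the remote is named `origin` and that the `tpu-integration` run is retried by run ID with `gh run rerun`. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: variable names, defaults, and the rebase-based squash are
# assumptions made for illustration, not part of the MaxText docs.
set -eu

BRANCH="${BRANCH:-my-feature-branch}"  # hypothetical PR branch name
BASE="${BASE:-main}"                   # branch the PR targets
RUN_ID="${RUN_ID:-}"                   # ID of the tpu-integration run to retry
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

sync_after_baseline_update() {
  # 1. Pull the commit that the update workflow added to the PR branch.
  run git pull origin "$BRANCH"
  # 2. Squash the branch again via an interactive rebase onto the base branch.
  run git fetch origin "$BASE"
  run git rebase -i "origin/$BASE"
  # 3. Push the rewritten history without clobbering unseen remote commits.
  run git push --force-with-lease origin "$BRANCH"
  # 4. Retry the tpu-integration run, if its run ID was provided.
  if [ -n "$RUN_ID" ]; then
    run gh run rerun "$RUN_ID"
  else
    echo "RUN_ID not set; retry tpu-integration from the PR checks page."
  fi
}

sync_after_baseline_update
```

Once the printed commands look right, setting `DRY_RUN=0` along with `BRANCH` and `RUN_ID` executes them; `gh run list --branch <branch>` lists candidate run IDs.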
| ### Method 2: Local Execution | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,58 +18,59 @@ | |
|
|
||
| Explore our how-to guides for optimizing, debugging, and managing your MaxText workloads. | ||
|
|
||
| ::::{grid} 1 2 2 2 | ||
| :gutter: 2 | ||
|
|
||
| :::{grid-item-card} ⚡ Optimization | ||
| ````{grid} 1 2 2 2 | ||
|
Collaborator
Author
Both the colon and backticks syntax can be used here, but the markdown linter we are using prefers backticks, so I converted the syntax here to be compatible with the linter.
||
| --- | ||
| gutter: 2 | ||
| --- | ||
| ```{grid-item-card} ⚡ Optimization | ||
| :link: guides/optimization | ||
| :link-type: doc | ||
|
|
||
| Techniques for maximizing performance, including sharding strategies, Pallas kernels, and benchmarking. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 💾 Data Pipelines | ||
| ```{grid-item-card} 💾 Data Pipelines | ||
| :link: guides/data_input_pipeline | ||
| :link-type: doc | ||
|
|
||
| Configure input pipelines using **Grain** (recommended for determinism), **HuggingFace**, or **TFDS**. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 🔄 Checkpointing | ||
| ```{grid-item-card} 🔄 Checkpointing | ||
| :link: guides/checkpointing_solutions | ||
| :link-type: doc | ||
|
|
||
| Manage GCS checkpoints, handle preemption with emergency checkpointing, and configure multi-tier storage. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 🔍 Monitoring & Debugging | ||
| ```{grid-item-card} 🔍 Monitoring & Debugging | ||
| :link: guides/monitoring_and_debugging | ||
| :link-type: doc | ||
|
|
||
| Tools for observability: goodput monitoring, hung job debugging, and Vertex AI TensorBoard integration. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 🐍 Python Notebooks | ||
| ```{grid-item-card} 🐍 Python Notebooks | ||
| :link: guides/run_python_notebook | ||
| :link-type: doc | ||
|
|
||
| Interactive development guides for running MaxText on Google Colab or local JupyterLab environments. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 🌱 Model Bringup | ||
| ```{grid-item-card} 🌱 Model Bringup | ||
| :link: guides/model_bringup | ||
| :link-type: doc | ||
|
|
||
| A step-by-step guide for the community to help expand MaxText's model library. | ||
| ::: | ||
| ``` | ||
|
|
||
| :::{grid-item-card} 🎓 Distillation | ||
| ```{grid-item-card} 🎓 Distillation | ||
| :link: guides/distillation | ||
| :link-type: doc | ||
|
|
||
| How online distillation works in MaxText: loss anatomy, α / β / temperature schedule tuning, layer indices, monitoring metrics, and troubleshooting. | ||
| ::: | ||
| :::: | ||
| ``` | ||
| ```` | ||
|
|
||
| ```{toctree} | ||
| --- | ||
|
|
||
This file had not been added to any ToCs, and so was not reachable.