diff --git a/docs/en/TOC.md b/docs/en/TOC.md
index b7e18c3671e..c6053251899 100644
--- a/docs/en/TOC.md
+++ b/docs/en/TOC.md
@@ -10,7 +10,8 @@
 + Get Started
     - [Quick Start](userguide/get_started.md)
     - [Installation](userguide/install.md)
-    - [Trubleshooting](userguide/troubleshooting.md)
+    - [Configuration Best Practices](userguide/config_best_practices.md)
+    - [Troubleshooting](userguide/troubleshooting.md)
 + Dataset
     + Creation
         - [Accelerate Data Accessing(via POSIX)](samples/accelerate_data_accessing.md)
diff --git a/docs/en/userguide/config_best_practices.md b/docs/en/userguide/config_best_practices.md
new file mode 100644
index 00000000000..93019062c0f
--- /dev/null
+++ b/docs/en/userguide/config_best_practices.md
@@ -0,0 +1,75 @@
+# Fluid Configuration Guide: Best Practices and Tuning
+
+This document serves as a deep-dive into the configuration knobs of Fluid. While Fluid works out-of-the-box with sensible defaults, achieving production-grade performance requires tuning based on your specific storage backend and workload characteristics.
+
+## 1. Dataset: The Foundation
+
+The `Dataset` resource defines **where** your data lives and **how** it should be accessed.
+
+### Key Considerations
+* **Mount Point Naming**: When mounting multiple sources, use explicit `name` fields. Fluid uses these names to create the internal directory structure. Without them, you risk path collisions if two sources have similar root structures.
+* **Read-Only vs. Read-Write**: For most AI training workloads, set `readOnly: true` in your mounts. This allows the caching engine (like Alluxio) to optimize for read-heavy traffic and avoid the overhead of consistency checks for writes.
+
+| Config Point | Why it matters |
+| :--- | :--- |
+| `spec.placement: Exclusive` | **Performance Isolation.** Prevents other datasets from "stealing" cache space on the same node. Essential for low-latency requirements. |
+| `spec.nodeAffinity` | **Disk Type Targeting.** If your cluster has a mix of HDD and NVMe nodes, use affinity to ensure Fluid only caches data on the high-speed nodes. |
+
+---
+
+## 2. AlluxioRuntime: High-Performance Caching
+
+Alluxio is the "engine" for most Fluid deployments. Its configuration determines your data-plane throughput.
+
+### Tuning the Memory Tier (MEM)
+For the fastest possible access, use `/dev/shm` (Ramdisk).
+* **Best Practice**: Ensure your `tieredstore` levels point to a medium of type `MEM`.
+* **Gotcha**: If your node runs out of RAM, the Alluxio Worker might be OOMKilled. Always set `resources.limits.memory` slightly higher than your total `quota`.
+
+### JVM Heap Management
+Since Alluxio is Java-based, `jvmOptions` are critical. If you have millions of small files, the Master node needs more heap space to track metadata.
+```yaml
+# Example: Increasing Master Heap for large metadata
+master:
+  jvmOptions:
+    - "-Xms4g"
+    - "-Xmx4g"
+```
+
+---
+
+## 3. JuiceFSRuntime: Cloud-Native POSIX
+
+JuiceFS is excellent for environments where POSIX compliance is a hard requirement.
+
+### Metadata vs. Data
+JuiceFS separates metadata (Redis/MySQL/TiKV) from data (S3/OSS).
+* **Optimization**: Use the `attr-cache` option in `spec.fuse.options`. Setting this to `60s` or higher can drastically reduce the load on your metadata service during repetitive tasks like `ls -R`.
+* **Worker Caching**: Use the `--cache-size` flag in `spec.worker.options` to limit how much local disk JuiceFS uses. Without this, it might fill up the node's root partition.
+
+---
+
+## 4. JindoRuntime: Alibaba Cloud Optimization
+
+If you are running in ACK (Alibaba Cloud Container Service), JindoRuntime provides native optimizations for OSS.
+
+* **Credential Management**: Avoid hardcoding AK/SK in the YAML. Use `hadoopConfig` to reference a Secret containing `core-site.xml` with your OSS credentials.
+* **Log Bloat**: Jindo can be chatty. Set `spec.fuse.logConfig` to `level: warn` for stable production environments to save disk space on logs.
+
+---
+
+## 5. ThinRuntime: The "Universal" Adapter
+
+ThinRuntime is intended for storage systems that don't have a dedicated Fluid controller (e.g., NFS, Ceph).
+
+* **Standardization**: Leverage `ThinRuntimeProfile`. It allows you to define the "how-to-mount" logic once and reuse it across multiple datasets.
+* **Health Probes**: Since ThinRuntime relies on external FUSE binaries, always define `livenessProbe`. This allows Kubernetes to auto-restart the FUSE pod if the mount point becomes "stale" or "transport endpoint is not connected."
+
+---
+
+## Common Production Checklist
+
+1. **Resource Quotas**: Never run workers without `limits`. A caching engine will naturally try to consume all available resources.
+2. **Pull Secrets**: If your images are in a private registry, `imagePullSecrets` must be defined at the spec level so the Master, Worker, and Fuse pods can all pull successfully.
+3. **Tiered Locality**: Use `storage-network` labels if your storage and compute are on separate network planes to avoid cross-switch bottlenecking.
+
diff --git a/docs/zh/TOC.md b/docs/zh/TOC.md
index 42977c21539..48e46d11868 100644
--- a/docs/zh/TOC.md
+++ b/docs/zh/TOC.md
@@ -14,6 +14,7 @@
 + 入门
     - [安装](userguide/install.md)
     - [快速开始](userguide/get_started.md)
+    - [配置最佳实践](userguide/config_best_practices.md)
     - [问题诊断](userguide/troubleshooting.md)
 + 数据集使用
     + 创建
diff --git a/docs/zh/userguide/config_best_practices.md b/docs/zh/userguide/config_best_practices.md
new file mode 100644
index 00000000000..ce9b4b34ea3
--- /dev/null
+++ b/docs/zh/userguide/config_best_practices.md
@@ -0,0 +1,75 @@
+# Fluid 配置指南:最佳实践与性能调优
+
+本文档旨在深入探讨 Fluid 的各项配置参数。虽然 Fluid 提供了开箱即用的默认值,但在生产环境中,针对特定的存储后端和工作负载特性进行调优是确保高性能的关键。
+
+## 1. Dataset: 核心基础
+
+`Dataset` 资源定义了数据的**来源**以及**访问方式**。
+
+### 关键注意事项
+* **挂载点命名**: 在挂载多个数据源时,请务必指定清晰的 `name` 字段。Fluid 会根据这些名称构建内部目录结构。如果不指定名称,当多个数据源具有相似的根目录结构时,可能会发生路径冲突。
+* **只读与读写**: 对于大多数 AI 训练任务,建议将挂载点设置为 `readOnly: true`。这允许像 Alluxio 这样的缓存引擎针对纯读流量进行优化,并避免维护写入一致性带来的额外开销。
+
+| 配置项 | 核心价值 |
+| :--- | :--- |
+| `spec.placement: Exclusive` | **性能隔离。** 防止同一节点上的其他数据集“挤占”缓存空间,是低延迟要求的保障。 |
+| `spec.nodeAffinity` | **精准定位。** 如果集群中包含 HDD 和 NVMe 混合节点,通过亲和性确保 Fluid 只在高速节点上配置缓存。 |
+
+---
+
+## 2. AlluxioRuntime: 高性能分布式缓存
+
+Alluxio 是 Fluid 中应用最广泛的缓存引擎,其配置直接决定了数据层(Data-Plane)的吞吐量。
+
+### 内存级缓存调优 (MEM)
+为了获得极速访问,通常使用 `/dev/shm`(内存盘)。
+* **最佳实践**: 确保 `tieredstore` 层级设置中,介质类型指向 `MEM`。
+* **风险提示**: 如果节点内存不足,Alluxio Worker 可能会因 OOM 被 kill。务必将 `resources.limits.memory` 设置为略高于 `配额`。
+
+### JVM 堆内存管理
+由于 Alluxio 基于 Java 开发,`jvmOptions` 至关重要。如果存在数百万个小文件,Master 节点需要更多的堆内存来跟踪元数据。
+```yaml
+# 示例:为元数据较多的场景增加 Master 堆内存
+master:
+  jvmOptions:
+    - "-Xms4g"
+    - "-Xmx4g"
+```
+
+---
+
+## 3. JuiceFSRuntime: 云原生 POSIX 存储
+
+JuiceFS 非常适合那些对 POSIX 兼容性有硬性要求的环境。
+
+### 元数据与性能
+JuiceFS 将元数据与数据物理隔离。
+* **优化建议**: 利用 `spec.fuse.options` 中的 `attr-cache` 选项。将其设置为 `60s` 或更长,可以显著减轻元数据服务在执行 `ls -R` 等高频扫描任务时的压力。
+* **空间配额**: 使用 `spec.worker.options` 中的 `--cache-size` 限制本地磁盘占用,防止其填满宿主机的根分区。
+
+---
+
+## 4. JindoRuntime: 阿里云生态优化
+
+在阿里云 ACK 环境中,JindoRuntime 针对 OSS 提供了原生加速。
+
+* **凭据安全**: 避免在 YAML 中硬编码 AK/SK。推荐使用 `hadoopConfig` 引用包含 `core-site.xml` 的 Secret。
+* **日志控制**: Jindo 在默认情况下日志量可能较大。生产环境中建议设置 `spec.fuse.logConfig` 为 `level: warn`,以节省节点日志存储空间。
+
+---
+
+## 5. ThinRuntime: 通用适配器
+
+ThinRuntime 专为尚未内置在 Fluid 中的存储系统(如 NFS、Ceph)而设计。
+
+* **标准化部署**: 充分利用 `ThinRuntimeProfile`。您可以一次性定义挂载逻辑,并在多个 Dataset 中复用。
+* **健康检查**: 由于 ThinRuntime 依赖外部 FUSE 进程,务必定义 `livenessProbe`。这能确保在挂载点出现“传输端点未连接”等异常时,Kubernetes 能自动重启 FUSE Pod。
+
+---
+
+## 生产环境 Checklist
+
+1. **资源配额**: 严禁在不设置 `limits` 的情况下运行 Worker。缓存引擎通常会倾向于耗尽所有可用资源。
+2. **镜像密钥**: 如果镜像存储在私有仓库,必须在 Spec 级配置 `imagePullSecrets`,以确保所有组件 Pod(Master, Worker, Fuse)都能成功拉取镜像并启动。
+3. **分层本地性**: 如果计算节点与存储节点位于不同的网络平面,建议结合网络标签(storage-network)使用,以避免跨核心交换机的流量瓶颈。
+