Commit 54ce6a6

Add checkpointStore

1 parent 4b7312a commit 54ce6a6

File tree

9 files changed: +110 -45 lines changed

docs/development.md

Lines changed: 0 additions & 1 deletion

````diff
@@ -139,7 +139,6 @@ make manifests
 
 This command generates:
 - CRD manifests in `config/crd/bases/`
-- RBAC manifests in `config/rbac/`
 
 ### Generate DeepCopy Methods
 
````

docs/en/architecture.md

Lines changed: 26 additions & 13 deletions

````diff
@@ -25,6 +25,7 @@ High-level flow:
 - **Errors**: optional sink for messages that fail to be written to the main sink.
 - **Resources**: optional CPU/memory for the processor pod.
 - **Scheduling**: optional `nodeSelector`, `affinity`, `tolerations`.
+- **CheckpointPersistence**: optional; defaults to `true`. When enabled, polling sources (PostgreSQL, ClickHouse, Trino) persist read position to a ConfigMap, reducing duplicates on restart. Set to `false` to disable.
 
 Secrets can be referenced via `SecretRef` in the spec; the operator resolves them before writing the spec into the ConfigMap.
 
@@ -46,7 +47,9 @@ For each DataFlow `<name>` in a namespace:
 | Resource | Name | Purpose |
 |----------|------|---------|
 | ConfigMap | `dataflow-<name>-spec` | Holds `spec.json` (resolved spec with secrets inlined). |
+| ConfigMap | `dataflow-<name>-checkpoint` | Stores read position for polling sources (default). Omitted when `checkpointPersistence: false`. |
 | Deployment | `dataflow-<name>` | One replica; pod runs the **processor** container. |
+| ServiceAccount, Role, RoleBinding | `dataflow-<name>-processor` | RBAC for processor to read/write checkpoint ConfigMap (default). Omitted when `checkpointPersistence: false`. |
 
 The processor container:
 
@@ -65,8 +68,9 @@ The operator uses a **ClusterRole** (and **ClusterRoleBinding** to its ServiceAc
 - Create/patch **events**.
 - Read **secrets** (for resolution).
 - Create/update/delete **ConfigMaps** and **Deployments** in the same namespaces as DataFlow resources.
+- When checkpoint persistence is enabled: create **ServiceAccounts**, **Roles**, and **RoleBindings** for processor pods to access the checkpoint ConfigMap.
 
-See the Helm templates (e.g. `clusterrole.yaml`, `clusterrolebinding.yaml`) and the manifest under `config/rbac/` for the exact rules.
+See the Helm templates (e.g. `clusterrole.yaml`, `clusterrolebinding.yaml`) for the exact rules.
 
 ### Optional: GUI
 
@@ -92,18 +96,21 @@ flowchart LR
     API["API Server"]
     CRD["DataFlow CRD"]
     Operator["Operator Pod"]
-    CM["ConfigMap"]
+    CMSpec["ConfigMap spec"]
+    CMCheckpoint["ConfigMap checkpoint"]
     Dep["Deployment"]
     Proc["Processor Pod"]
    Ext["Kafka / PostgreSQL / Trino / Nessie"]
 
     User -->|"apply DataFlow"| API
     API --> CRD
     Operator -->|watch| CRD
-    Operator -->|create/update| CM
+    Operator -->|create/update| CMSpec
+    Operator -->|create/update| CMCheckpoint
     Operator -->|create/update| Dep
     Dep --> Proc
-    Proc -->|mount spec| CM
+    Proc -->|mount spec| CMSpec
+    Proc -->|read/write checkpoint| CMCheckpoint
     Proc -->|connect| Ext
 ```
 
@@ -114,33 +121,39 @@ flowchart LR
 For each DataFlow, the controller runs the following steps (on create, update, or when owned resources change):
 
 1. **Get DataFlow**
-   If not found, return. If **DeletionTimestamp** is set: delete the Deployment and ConfigMap (cleanup), update status to `Stopped`, then return.
+   If not found, return. If **DeletionTimestamp** is set: delete the Deployment, ConfigMaps (spec and checkpoint), and processor RBAC (cleanup), update status to `Stopped`, then return.
 
 2. **Resolve secrets**
    Use **SecretResolver** to substitute all `SecretRef` fields in the spec with values from Kubernetes Secrets. Result: **resolved spec**.
 
 3. **ConfigMap**
    Create or update the ConfigMap `dataflow-<name>-spec` with key `spec.json` = JSON of the resolved spec. Set controller reference to the DataFlow.
 
-4. **Deployment**
-   Create or update the Deployment `dataflow-<name>`: processor image, volume from that ConfigMap, args and env as above. Use resources/affinity from DataFlow spec if set. Set controller reference to the DataFlow.
+4. **Checkpoint ConfigMap and RBAC** (when `checkpointPersistence` is not `false`, default: enabled)
+   Create ConfigMap `dataflow-<name>-checkpoint` and RBAC (ServiceAccount, Role, RoleBinding) so the processor pod can read/write the checkpoint. The processor persists source read position (lastReadID, lastReadChangeTime) there, reducing duplicates on restart.
 
-5. **Deployment status**
+5. **Deployment**
+   Create or update the Deployment `dataflow-<name>`: processor image, volume from the spec ConfigMap, args and env as above. When checkpoint persistence is enabled, set `serviceAccountName` so the pod uses the dedicated ServiceAccount. Use resources/affinity from DataFlow spec if set. Set controller reference to the DataFlow.
+
+6. **Deployment status**
   Read the Deployment; set DataFlow status **Phase** and **Message** from it (e.g. `Running` when `ReadyReplicas > 0`, `Pending` when replicas are starting, `Error` when no replicas).
 
-6. **Update DataFlow status**
+7. **Update DataFlow status**
   Write Phase, Message, and other status fields back to the DataFlow resource (with retry on conflict).
 
 ### Reconcile Loop Diagram
 
 ```mermaid
 flowchart TD
     A[Get DataFlow] --> B{Deleted?}
-    B -->|Yes| C[Cleanup Deployment and ConfigMap]
+    B -->|Yes| C[Cleanup Deployment, ConfigMaps, RBAC]
     C --> D[Update Status Stopped]
     B -->|No| E[Resolve Secrets]
     E --> F[Create or Update ConfigMap]
-    F --> G[Create or Update Deployment]
+    F --> F2{CheckpointPersistence?}
+    F2 -->|Yes| F3[Create Checkpoint ConfigMap and RBAC]
+    F2 -->|No| G
+    F3 --> G[Create or Update Deployment]
     G --> H[Read Deployment Status]
     H --> I[Update DataFlow Status]
 ```
@@ -164,7 +177,7 @@ It reads the spec from the file, builds a **Processor** from it, and runs `Proce
 
 The **Processor** (in `internal/processor/processor.go`) is built from the spec and contains:
 
-- **Source**: a **SourceConnector** (Kafka, PostgreSQL, Trino, or Nessie) — `Connect`, `Read`, `Close`.
+- **Source**: a **SourceConnector** (Kafka, PostgreSQL, Trino, or Nessie) — `Connect`, `Read`, `Close`. By default, polling sources load initial checkpoint from ConfigMap and save it after each successful sink write (debounced). Disable with `checkpointPersistence: false`.
 - **Sink**: a **SinkConnector** for the main destination — `Connect`, `Write`, `Close`.
 - **Error sink** (optional): another SinkConnector for failed writes.
 - **Transformations**: an ordered list of **Transformer** implementations (timestamp, flatten, filter, mask, router, select, remove, snakeCase, camelCase).
@@ -220,6 +233,6 @@ flowchart LR
 
 ## Summary
 
-- **Kubernetes**: You declare a **DataFlow** CR; the **operator** reconciles it into a **ConfigMap** (spec) and a **Deployment** (processor pod). RBAC and optional GUI complete the picture.
+- **Kubernetes**: You declare a **DataFlow** CR; the **operator** reconciles it into a **ConfigMap** (spec) and a **Deployment** (processor pod). By default, a second ConfigMap and RBAC are created for checkpoint storage (set `checkpointPersistence: false` to disable). RBAC and optional GUI complete the picture.
 - **Reconciliation**: Get DataFlow → resolve secrets → update ConfigMap → update Deployment → reflect Deployment status in DataFlow status.
 - **Runtime**: Each **processor** pod runs a single pipeline: source → read channel → transformations → write to main (and optionally error and router) sinks, using pluggable connectors and a fixed set of transformations.
````
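The commit is titled "Add checkpointStore", and the architecture changes above describe loading a checkpoint at startup and saving it after successful sink writes. Below is a minimal Go sketch of what such a store abstraction could look like; the `Checkpoint` fields follow the field names mentioned in the docs (lastReadID, lastReadChangeTime), while `CheckpointStore` and `memoryStore` are hypothetical names, not the operator's actual types. A real implementation would back the store with the `dataflow-<name>-checkpoint` ConfigMap via the Kubernetes API.

```go
package main

import (
	"fmt"
	"sync"
)

// Checkpoint captures the read position of a polling source.
// Field names follow the docs; the operator's real struct may differ.
type Checkpoint struct {
	LastReadID         int64  `json:"lastReadID"`
	LastReadChangeTime string `json:"lastReadChangeTime"`
}

// CheckpointStore is a hypothetical interface for persisting the
// checkpoint. Load reports whether a checkpoint exists yet.
type CheckpointStore interface {
	Load() (Checkpoint, bool, error)
	Save(Checkpoint) error
}

// memoryStore models checkpointPersistence: false — state lives only
// in process memory and is lost on restart.
type memoryStore struct {
	mu sync.Mutex
	cp *Checkpoint
}

func (m *memoryStore) Load() (Checkpoint, bool, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.cp == nil {
		return Checkpoint{}, false, nil
	}
	return *m.cp, true, nil
}

func (m *memoryStore) Save(cp Checkpoint) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.cp = &cp
	return nil
}

func main() {
	var store CheckpointStore = &memoryStore{}
	if _, ok, _ := store.Load(); !ok {
		fmt.Println("no checkpoint: read from beginning")
	}
	store.Save(Checkpoint{LastReadID: 42, LastReadChangeTime: "2024-01-01T00:00:00Z"})
	cp, _, _ := store.Load()
	fmt.Println("resume after", cp.LastReadID)
}
```

A ConfigMap-backed implementation of the same interface is what the persistence path would use; the in-memory variant keeps state only as long as the process runs.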

docs/en/connectors.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -414,7 +414,7 @@ source:
 - **Change Tracking**: By default tracks changes via `updated_at` column (or `changeTrackingColumn`), captures both INSERTs and UPDATEs
 - **Auto-create Table**: When `autoCreateTable: true`, creates the table with CDC-friendly schema (`id SERIAL PRIMARY KEY`, `created_at`, `updated_at`) if it doesn't exist. Creation happens at Connect time.
 - **Schema notation**: Table name supports `schema.table` format (e.g. `public.products`)
-- **In-memory state**: Read position (lastReadChangeTime) is stored only in memory. On pod/connector restart, the table is fully re-read. For pg→pg flows, enable `upsertMode: true` in sink to update duplicates instead of inserting them again.
+- **Checkpoint persistence**: By default, read position (lastReadChangeTime) is persisted to ConfigMap; on restart, reading resumes from the last position. Set `checkpointPersistence: false` in spec to store only in memory. For pg→pg flows, enable `upsertMode: true` in sink to update duplicates instead of inserting them again.
 
 ### Sink
 
````
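The connectors change above says the PostgreSQL source resumes from the persisted `lastReadChangeTime` after restart. A hedged sketch of how a polling query might incorporate a restored checkpoint; the function name, table, and column are illustrative (not the connector's actual code), and real code should use parameterized queries rather than string interpolation.

```go
package main

import "fmt"

// buildPollQuery sketches checkpoint-aware polling: when a checkpoint
// exists, filter on the change-tracking column so only rows modified
// after the last read position are returned. Illustrative only.
func buildPollQuery(table, changeCol, lastReadChangeTime string, batchSize int) string {
	q := fmt.Sprintf("SELECT * FROM %s", table)
	if lastReadChangeTime != "" {
		// Checkpoint restored: resume instead of re-reading everything.
		// NOTE: a real connector would bind this value as a parameter.
		q += fmt.Sprintf(" WHERE %s > '%s'", changeCol, lastReadChangeTime)
	}
	return q + fmt.Sprintf(" ORDER BY %s LIMIT %d", changeCol, batchSize)
}

func main() {
	// First run: no checkpoint, full read.
	fmt.Println(buildPollQuery("public.products", "updated_at", "", 500))
	// After restart with a persisted checkpoint: resume from last position.
	fmt.Println(buildPollQuery("public.products", "updated_at", "2024-06-01T12:00:00Z", 500))
}
```

With `checkpointPersistence: false` the second case never occurs after a restart, which is why the docs recommend an idempotent sink for polling sources.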

docs/en/development.md

Lines changed: 0 additions & 1 deletion

````diff
@@ -105,7 +105,6 @@ make manifests
 
 This command generates:
 - CRD manifests in `config/crd/bases/`
-- RBAC manifests in `config/rbac/`
 
 ### Generate DeepCopy Methods
 
````

docs/en/fault-tolerance.md

Lines changed: 28 additions & 7 deletions

````diff
@@ -12,9 +12,9 @@ DataFlow Operator processes messages with **at-least-once** delivery semantics.
 | Source | State storage | On restart |
 |--------|---------------|------------|
 | **Kafka** | Consumer group (Kafka) | Resumes from last committed offset. No duplicates if offset was committed after sink write. |
-| **PostgreSQL** | In-memory (lastReadChangeTime) | State lost. Re-reads from beginning. Duplicates or gaps possible. |
-| **ClickHouse** | In-memory (lastReadID, lastReadTime) | State lost. Re-reads from beginning. Duplicates possible. |
-| **Trino** | In-memory (lastReadID) | State lost. Re-reads from beginning. Duplicates possible. |
+| **PostgreSQL** | ConfigMap (default); in-memory when `checkpointPersistence: false` | By default resumes from last position. Without persistence: re-reads from beginning. |
+| **ClickHouse** | ConfigMap (default); in-memory when `checkpointPersistence: false` | By default resumes from last position. Without persistence: re-reads from beginning. |
+| **Trino** | ConfigMap (default); in-memory when `checkpointPersistence: false` | By default resumes from last position. Without persistence: re-reads from beginning. |
 
 ### Kafka Source
 
@@ -26,12 +26,14 @@ The Kafka consumer commits offset **only after** the message is successfully wri
 
 ### Polling Sources (PostgreSQL, ClickHouse, Trino)
 
-Read position (lastReadID, lastReadChangeTime) is stored **only in memory**. On pod crash:
+When checkpoint persistence is disabled, read position (lastReadID, lastReadChangeTime) is stored **only in memory**. On pod crash:
 
 - State is lost.
 - On restart, the source re-reads from the beginning (or from a wrong position).
 - **Duplicates** or **gaps** are possible depending on when the crash occurred.
 
+**Checkpoint persistence** is enabled by default. The read position is persisted to a ConfigMap. On restart, the source resumes from the last committed position, reducing duplicates. Set `checkpointPersistence: false` in spec to disable.
+
 !!! warning "Idempotent sink required"
     For polling sources, always configure an **idempotent sink** (UPSERT, ReplacingMergeTree) to handle duplicates safely.
 
@@ -107,9 +109,28 @@ On SIGTERM (e.g., pod eviction, node drain):
 
 Ensure `terminationGracePeriodSeconds` is sufficient for large batches to flush (default: 600 seconds).
 
-## Checkpoint Persistence (Future)
+## Checkpoint Persistence
+
+!!! note "Enabled by default"
+    The `checkpointPersistence` field in the DataFlow spec defaults to `true`. You do not need to set it explicitly — checkpoint persistence is enabled for all DataFlows with polling sources.
+
+Checkpoint persistence is **enabled by default**. The read position (lastReadID, lastReadChangeTime) is persisted to ConfigMap `dataflow-<name>-checkpoint`. On processor restart, polling sources (PostgreSQL, ClickHouse, Trino) resume from the last committed position, reducing duplicates.
+
+To disable, set `checkpointPersistence: false`:
+
+```yaml
+apiVersion: dataflow.dataflow.io/v1
+kind: DataFlow
+metadata:
+  name: my-dataflow
+spec:
+  checkpointPersistence: false # Disable (default: true)
+  source:
+    type: postgresql
+    # ...
+```
 
-Persisting source checkpoint (lastReadID, lastReadChangeTime) to external storage (ConfigMap or sink table) would allow polling sources to resume from the last committed position after a processor restart, reducing duplicates. This is planned for a future release. Until then, use idempotent sinks to handle duplicates safely.
+The controller creates the ConfigMap and RBAC (ServiceAccount, Role, RoleBinding) for the processor. Checkpoint is saved with debounce (every 30 seconds) and on graceful shutdown.
 
 ## Summary Checklist
 
@@ -118,5 +139,5 @@ Persisting source checkpoint (lastReadID, lastReadChangeTime) to external storag
 | PostgreSQL sink | Enable `upsertMode: true` with PRIMARY KEY or `conflictKey` |
 | ClickHouse sink | Use `ReplacingMergeTree` with `ORDER BY` on deduplication key |
 | Kafka source | Consumer group persists offset; idempotent sink recommended |
-| Polling sources | **Always** use idempotent sink; state is lost on crash |
+| Polling sources | **Always** use idempotent sink; checkpoint persistence enabled by default |
 | batchSize | Consider smaller values to reduce duplicate window |
````
