94 commits
a102d53
init commit
Feb 7, 2026
9011838
.
Feb 21, 2026
bbc803b
make metrics as client_component
Feb 24, 2026
feb3e8d
fixes
Feb 28, 2026
0e31bc1
add includes
Mar 2, 2026
23414c1
fixes and add metric tests
Mar 15, 2026
aa80faa
add table metrics
Mar 27, 2026
c7e65b5
fix semconv
Mar 29, 2026
51757d6
add tracing implementation and its demo launch
Mar 17, 2026
1fd13a6
make spans nested in traces
Mar 25, 2026
650b039
[draft]
Mar 29, 2026
27fb2a6
fix test
Apr 3, 2026
a1a9c01
[C++ SDK] Supported TCP_NODELAY option for grpc sockets (#32631)
Gazizonoki Feb 12, 2026
e35be7e
Enable direct read without consumer (LOGBROKER-9364) (#32846)
Feb 12, 2026
e455288
Add TSimpleBlockingFederatedWriteSession (#32794)
Feb 12, 2026
299f856
Access levels in whoami (#32907)
azevaykin Feb 12, 2026
1345e7e
LOGBROKER-10206 KWS (#31966)
kuzin57 Feb 12, 2026
0bed519
LOGBROKER-9686 Add limits to in flight bytes per partition (#33024)
kuzin57 Feb 12, 2026
694ec90
Fix typos 'recieve' -> 'receive' (#33517)
flown4qqqq Feb 12, 2026
f5a6301
[EXT-1921] Proto annotation-based masking (#33259)
StekPerepolnen Feb 12, 2026
044be49
Add smart mode to CMS (#33297)
pixcc Feb 12, 2026
88aa044
LOGBROKER-10274 Restart if got unknown ack from server (#33934)
kuzin57 Feb 12, 2026
2d5d349
LOGBROKER-10206 Add keyed write session workload & KWS fixes (#33427)
kuzin57 Feb 12, 2026
2746ce9
Update import generation: 35
github-actions[bot] Feb 12, 2026
eaeaa93
Added future/subscription lib
Gazizonoki Feb 13, 2026
a9073ef
Fixed topic cmake build
Gazizonoki Feb 13, 2026
cb8db74
Update version to v3.14.0
Gazizonoki Feb 13, 2026
8fa5e72
LOGBROKER-10206 Fix after review (#34064)
kuzin57 Mar 10, 2026
932ddcd
[NBS-6956] Add nbs partition in-mem (#33486)
vazhem Mar 10, 2026
d212e36
PQv1: passthrough the '_advanced_monitoring' attribute (#34128)
ubyte Mar 10, 2026
9dc92c2
LOGBROKER-10206 Add include (#34147)
kuzin57 Mar 10, 2026
7a421cd
Consumers partition metrics: add unittest for the '_advanced_monitori…
ubyte Mar 10, 2026
7fba250
remove unused TThrRefBase base class from the public interface ISimpl…
ubyte Mar 10, 2026
ee6a962
LOGBROKER-10206 Add metrics & some fixes (#34200)
kuzin57 Mar 10, 2026
0675af0
NBS2: add "get load actor adapter actor id" request to dstool; return…
BarkovBG Mar 10, 2026
550c277
Fixed sanitizer error in ydb/core/persqueue/pqtablet/partition/mlp/ut…
nshestakov Mar 10, 2026
5c09588
LOGBROKER-7430 fixed partition reading hanging in topic sdk (#34362)
GrigoriyPA Mar 10, 2026
c992db0
Fixed SDK oss build (#34423)
Gazizonoki Mar 10, 2026
447cb92
LOGBROKER-7430 added unit test on reading with restarts (#34487)
GrigoriyPA Mar 10, 2026
4d2b605
[C++ SDK] Fixed tsan fail in kqp/ut/effects (#34319)
Gazizonoki Mar 10, 2026
6916363
[C++ SDK] Created CHANGELOG.md (#34599)
Gazizonoki Mar 10, 2026
84fe814
[C++ SDK] Added missing include in topics (#34591)
Gazizonoki Mar 10, 2026
8a15344
Support ALTER TABLE COMPACT in kqp and rpc_alter_table (#34516)
lex007in Mar 10, 2026
66cd23e
Allow to specify impl table proto settings for fulltext_relevance ind…
vitalif Mar 10, 2026
433eff0
Columnshard bloom skip index support (#34458)
xyliganSereja Mar 10, 2026
bc2d75a
Specify user SID in change (CDC) records (#33262)
kseleznyov Mar 10, 2026
bdeb6f6
Support include_index_data in ExportFs (#34641)
stanislav-shchetinin Mar 10, 2026
3982cf1
[NBS-6905] Make nbs2 volume and partition tablets (#34315)
vazhem Mar 10, 2026
449218c
Support cancel/forget/list forced compaction methods in RPC (#34676)
lex007in Mar 10, 2026
3dcfb17
Support index_population_mode in ImportFs (#34832)
stanislav-shchetinin Mar 10, 2026
80588e4
[NBS2]: introduce batch sync and erase (#34613)
BarkovBG Mar 10, 2026
6572e76
Remove Layout (FLAT/FLAT_RELEVANCE) from fulltext settings (#34802)
vitalif Mar 10, 2026
50943b0
YQ-5091 rename PQ partitions balancer and fix registration (#34667)
GrigoriyPA Mar 10, 2026
b75187a
Revert "YQ-5091 rename PQ partitions balancer and fix registration" (…
maximyurchuk Mar 10, 2026
ba8cddd
Add a note about removed full text Layout field (#34998)
vitalif Mar 10, 2026
65a6e7e
Revert "Revert "YQ-5091 rename PQ partitions balancer and fix registr…
GrigoriyPA Mar 10, 2026
6641062
Fixed TSAN error in tests (#35091)
nshestakov Mar 10, 2026
7abd00a
[C++ SDK] Added grpc load balancing policy option (#35137)
Gazizonoki Mar 10, 2026
53a6fec
Added changelog entry for grpc load balancing policy (#35150)
Gazizonoki Mar 10, 2026
fdaf0ef
[C++ SDK] Moved private executors to adapters (#35197)
Gazizonoki Mar 10, 2026
8bae806
LOGBROKER-10206 New interface (#34369)
kuzin57 Mar 10, 2026
ce374f2
SchemeShard: add metrics and status handling for forced compaction (#…
lex007in Mar 10, 2026
12556cc
LOGBROKER-10314 Fix race in read session (#35470)
kuzin57 Mar 10, 2026
33d015b
Fix: TTableClient destructor can hang indefinitely on Drain().Wait() …
Copilot Mar 10, 2026
e92250e
Update import generation: 36
github-actions[bot] Mar 10, 2026
bce3a46
Update CMake files
pnv1 Mar 10, 2026
8ba9df8
Remove EnableOltpSink flag from tests (Part 1) (#35499)
nikvas0 Mar 23, 2026
e592180
Make TServer driver private (#35592)
n00bcracker Mar 23, 2026
19b5eaa
Transfer counters and metrics (#30345)
FloatingCrowbar Mar 23, 2026
25055e0
[NBS-6805] NBS 2 tests (#35566)
vazhem Mar 23, 2026
199cd80
LOGBROKER-10206 Add seqNo initialization (#35653)
kuzin57 Mar 23, 2026
0ac0962
Add JSON inverted index type (#35061)
vitalif Mar 23, 2026
74884de
LOGBROKER-10206 Fix include (#35825)
kuzin57 Mar 23, 2026
8dce6ad
LOGBROKER-10206 Add check in tests (#35861)
kuzin57 Mar 23, 2026
b2b29d5
Removed EnableTopicServiceTx, EnableTopicSplitMerge, EnableTopicMessa…
nshestakov Mar 23, 2026
cc926ac
LOGBROKER-9648 Fix partition session id conflict (#35929)
kuzin57 Mar 23, 2026
6e20bd5
LOGBROKER-10206 Better interface (changes for support) (#35932)
kuzin57 Mar 23, 2026
cb2e580
LOGBROKER-10206 Fix after support (#36161)
kuzin57 Mar 23, 2026
8d4325d
YQ-5187 fixed hanging in PQ read session (#36220)
GrigoriyPA Mar 23, 2026
a3cb3d7
Add session stop reason hint (#35434)
Ane1y Mar 23, 2026
6241414
Update import generation: 37
github-actions[bot] Mar 23, 2026
6c625f2
fixes
Feb 28, 2026
e194ba8
fixes and add metric tests
Mar 15, 2026
bb87881
[C++ SDK] Supported TCP_NODELAY option for grpc sockets (#32631)
Gazizonoki Feb 12, 2026
0861029
fixes
Feb 28, 2026
76591da
fixes and add metric tests
Mar 15, 2026
5e069e5
Merge remote-tracking branch 'upstream/main' into TRACE-0
Apr 27, 2026
a3fc311
Enable direct read without consumer (LOGBROKER-9364) (#32846)
Feb 12, 2026
f5e0a28
Add TSimpleBlockingFederatedWriteSession (#32794)
Feb 12, 2026
591fc55
Access levels in whoami (#32907)
azevaykin Feb 12, 2026
6e7e435
fix test
Apr 27, 2026
bf693d7
fix test
Apr 27, 2026
dc1df0a
update
Apr 28, 2026
6fb7b3d
update
Apr 28, 2026
2 changes: 1 addition & 1 deletion .github/last_commit.txt
@@ -1 +1 @@
fd4d02d6b3ebb3565fec58f5a585b44e09e6899e
fd4d02d6b3ebb3565fec58f5a585b44e09e6899e
4 changes: 4 additions & 0 deletions examples/CMakeLists.txt
@@ -10,3 +10,7 @@ add_subdirectory(topic_writer/producer/basic_write)
add_subdirectory(ttl)
add_subdirectory(vector_index)
add_subdirectory(vector_index_builtin)

if (YDB_SDK_ENABLE_OTEL_TRACE AND YDB_SDK_ENABLE_OTEL_METRICS)
  add_subdirectory(otel_tracing)
endif()
41 changes: 41 additions & 0 deletions examples/otel_tracing/CMakeLists.txt
@@ -0,0 +1,41 @@
add_executable(otel_tracing_example)

target_link_libraries(otel_tracing_example PUBLIC
  yutil
  getopt
  YDB-CPP-SDK::Query
  YDB-CPP-SDK::Table
  YDB-CPP-SDK::Params
  YDB-CPP-SDK::Driver
  YDB-CPP-SDK::OpenTelemetryTrace
  YDB-CPP-SDK::OpenTelemetryMetrics
  opentelemetry-cpp::otlp_http_exporter
  opentelemetry-cpp::otlp_http_metric_exporter
)

target_sources(otel_tracing_example PRIVATE
  main.cpp
)

vcs_info(otel_tracing_example)

if (CMAKE_SYSTEM_PROCESSOR STREQUAL "x86_64" OR CMAKE_SYSTEM_PROCESSOR STREQUAL "AMD64")
  target_link_libraries(otel_tracing_example PUBLIC
    cpuid_check
  )
endif()

if (CMAKE_SYSTEM_NAME STREQUAL "Linux")
  target_link_options(otel_tracing_example PRIVATE
    -ldl
    -lrt
    -Wl,--no-as-needed
    -lpthread
  )
elseif (CMAKE_SYSTEM_NAME STREQUAL "Darwin")
  target_link_options(otel_tracing_example PRIVATE
    -Wl,-platform_version,macos,11.0,11.0
    -framework
    CoreFoundation
  )
endif()
194 changes: 194 additions & 0 deletions examples/otel_tracing/README.md
@@ -0,0 +1,194 @@
# YDB C++ SDK: OpenTelemetry Demo

A demonstration of tracing and metrics for QueryService and TableService operations,
visualized in **Grafana**, **Jaeger**, and **Prometheus**.

## Architecture

```
┌──────────────┐      OTLP/HTTP      ┌──────────────────┐
│ C++ demo     │ ──────────────────> │ OTel Collector   │
│ application  │                     │ :4328 (HTTP)     │
└──────────────┘                     └────────┬─────────┘
                                              │
                           traces ┌───────────┴───────────┐ metrics
                                  ▼                       ▼
                             ┌──────────┐           ┌────────────┐
                             │  Jaeger  │           │ Prometheus │
                             │  :16686  │           │   :9090    │
                             └────┬─────┘           └─────┬──────┘
                                  └───────────┬───────────┘
                                              ▼
                                         ┌──────────┐
                                         │ Grafana  │
                                         │  :3000   │
                                         └──────────┘
```

## Quick start

### 1. Start the infrastructure

```bash
cd examples/otel_tracing
docker compose up -d
```

Wait until YDB is ready:

```bash
docker compose logs ydb -f
# Wait for the line "Database started successfully"
```

### 2. Build the SDK with OTel and tests

From the repository root:

```bash
mkdir -p build && cd build

cmake .. \
-DYDB_SDK_TESTS=ON \
-DYDB_SDK_ENABLE_OTEL_TRACE=ON \
-DYDB_SDK_ENABLE_OTEL_METRICS=ON

cmake --build . --target otel_tracing_example -j$(nproc)
```

### 3. Run the demo

```bash
./examples/otel_tracing/otel_tracing_example \
--endpoint localhost:2136 \
--database /local \
--otlp http://localhost:4328 \
--iterations 20 \
--retry-workers 6 \
--retry-ops 30
```

#### Available flags

| Flag                 | Default                 | Description                                              |
|----------------------|-------------------------|----------------------------------------------------------|
| `--endpoint`, `-e`   | `localhost:2136`        | YDB gRPC endpoint                                        |
| `--database`, `-d`   | `/local`                | Database name                                            |
| `--otlp`             | `http://localhost:4328` | OTLP/HTTP endpoint of the collector                      |
| `--iterations`, `-n` | `20`                    | Iterations in the Query and Table workloads              |
| `--retry-workers`    | `6`                     | Parallel workers in the retry workload (`0` to skip it)  |
| `--retry-ops`        | `30`                    | Operations per retry worker                              |
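
The defaults in the table can be pictured as a plain options struct. The sketch below is illustrative only: `DemoOptions` and `ParseOptions` are hypothetical names, and the hand-written loop is an assumption made to keep the snippet self-contained. It is not the parser the example itself uses (the example links against the `getopt` library). Only the flag names and default values come from the table.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the flags listed in the table above.
struct DemoOptions {
    std::string endpoint = "localhost:2136";
    std::string database = "/local";
    std::string otlp = "http://localhost:4328";
    int iterations = 20;
    int retryWorkers = 6;  // 0 skips the retry workload
    int retryOps = 30;
};

// Reads "flag value" pairs; unknown flags are ignored in this sketch.
DemoOptions ParseOptions(const std::vector<std::string>& args) {
    DemoOptions opts;
    for (std::size_t i = 0; i + 1 < args.size(); i += 2) {
        const std::string& flag = args[i];
        const std::string& value = args[i + 1];
        if (flag == "--endpoint" || flag == "-e") {
            opts.endpoint = value;
        } else if (flag == "--database" || flag == "-d") {
            opts.database = value;
        } else if (flag == "--otlp") {
            opts.otlp = value;
        } else if (flag == "--iterations" || flag == "-n") {
            opts.iterations = std::stoi(value);
        } else if (flag == "--retry-workers") {
            opts.retryWorkers = std::stoi(value);
        } else if (flag == "--retry-ops") {
            opts.retryOps = std::stoi(value);
        }
    }
    return opts;
}
```

Calling `ParseOptions({"--retry-workers", "12", "--retry-ops", "80"})` would override only the two retry settings and leave the other defaults in place.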

#### Demonstrating real retries

The third built-in scenario, `RunRetryWorkload`, deliberately provokes
**SERIALIZABLE conflicts**: N parallel workers run
`SELECT → sleep → UPSERT → COMMIT` against the same "hot" row
(`id = 9999`) inside `RetryQuerySync`. YDB returns `ABORTED`
to the losing transactions, and the SDK retries them transparently.

The traces will contain:

```
ydb.RunWithRetry                    (INTERNAL, ydb.retry.count=N)
├── ydb.Try                         (INTERNAL)  # first attempt: ydb.retry.attempt and ydb.retry.backoff_ms are absent
│   ├── ydb.CreateSession
│   ├── ydb.ExecuteQuery
│   └── ydb.Commit                  db.response.status_code=ABORTED, error.type=ydb_error, exception event
├── ydb.Try  ydb.retry.attempt=1    (INTERNAL, ydb.retry.backoff_ms=...)
│   └── ...                         db.response.status_code=ABORTED, error.type=ydb_error
└── ydb.Try  ydb.retry.attempt=N    (INTERNAL, ydb.retry.backoff_ms=...)
    └── ...                         db.response.status_code=SUCCESS
```

To provoke more conflicts, increase the number of workers and operations:

```bash
./examples/otel_tracing/otel_tracing_example \
--retry-workers 12 --retry-ops 80
```

At the end, the program prints the number of observed aborts; each one
corresponds to one automatic SDK retry.
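
The conflict pattern can be reproduced without a database. The following is a minimal simulation, not the code of `RunRetryWorkload`: it assumes a `std::atomic` counter as a stand-in for the hot row and a compare-and-swap as a stand-in for the serializable commit, and the names `WorkloadResult` and `RunSimulatedRetryWorkload` are hypothetical. A failed compare-and-swap plays the role of `ABORTED` and is retried immediately.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical result type for this simulation.
struct WorkloadResult {
    std::uint64_t finalValue;  // value of the "hot row" after all workers finish
    std::uint64_t aborts;      // failed commit attempts; each one was retried
};

WorkloadResult RunSimulatedRetryWorkload(int workers, int opsPerWorker) {
    std::atomic<std::uint64_t> hotRow{0};
    std::atomic<std::uint64_t> aborts{0};

    auto worker = [&] {
        for (int op = 0; op < opsPerWorker; ++op) {
            for (;;) {
                // "SELECT": read the current value of the row.
                std::uint64_t seen = hotRow.load();
                // "sleep": widen the window in which another worker can commit.
                std::this_thread::sleep_for(std::chrono::microseconds(50));
                // "UPSERT + COMMIT": succeeds only if the row is unchanged.
                if (hotRow.compare_exchange_strong(seen, seen + 1)) {
                    break;
                }
                // "ABORTED": count the conflict and retry immediately.
                aborts.fetch_add(1);
            }
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back(worker);
    }
    for (auto& t : threads) {
        t.join();
    }
    return {hotRow.load(), aborts.load()};
}
```

Every operation eventually commits, so `finalValue` equals `workers * opsPerWorker`, while `aborts` grows with the number of workers competing for the same value.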

> **Important:** for the `ABORTED` status the SDK uses the
> `RetryImmediately` policy (see `src/client/impl/internal/retry/retry.h`),
> so the `ydb.retry.backoff_ms` attribute will be `0`;
> this is by design. To see `backoff_ms > 0`, you need the
> `UNAVAILABLE` status (FastBackoff, 5 ms slot) or `OVERLOADED` /
> `CLIENT_RESOURCE_EXHAUSTED` (SlowBackoff, 1 s slot). The simplest way
> to get them is to briefly restart YDB while the example is running:
>
> ```bash
> ./examples/otel_tracing/otel_tracing_example --retry-workers 8 --retry-ops 100 &
> sleep 5
> docker compose -f examples/otel_tracing/docker-compose.yml restart ydb
> wait
> ```
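
The status-to-policy mapping in the note can be written as a small lookup. This is an illustrative sketch, not the contents of `retry.h`: the enums `Status` and `BackoffKind` and the functions `PolicyFor` and `SlotFor` are hypothetical names. Only the status names, the policy names, and the slot durations are taken from the note, and how the delay grows or is randomized across attempts is not modeled.

```cpp
#include <chrono>

// Hypothetical types for illustration; not the SDK's real declarations.
enum class Status { Success, Aborted, Unavailable, Overloaded, ClientResourceExhausted };
enum class BackoffKind { None, RetryImmediately, FastBackoff, SlowBackoff };

// Maps a status to the retry policy named in the note above.
constexpr BackoffKind PolicyFor(Status s) {
    switch (s) {
        case Status::Aborted:
            return BackoffKind::RetryImmediately;
        case Status::Unavailable:
            return BackoffKind::FastBackoff;
        case Status::Overloaded:
        case Status::ClientResourceExhausted:
            return BackoffKind::SlowBackoff;
        case Status::Success:
            return BackoffKind::None;
    }
    return BackoffKind::None;
}

// Base slot per policy: 5 ms for FastBackoff, 1 s for SlowBackoff, 0 otherwise.
constexpr std::chrono::milliseconds SlotFor(BackoffKind k) {
    switch (k) {
        case BackoffKind::FastBackoff:
            return std::chrono::milliseconds(5);
        case BackoffKind::SlowBackoff:
            return std::chrono::milliseconds(1000);
        default:
            return std::chrono::milliseconds(0);
    }
}
```

With this mapping, `SlotFor(PolicyFor(Status::Aborted))` is zero, which matches the `backoff_ms = 0` seen in the conflict workload.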

### 4. Open the dashboards

| Service    | URL                    | Description                       |
|------------|------------------------|-----------------------------------|
| Grafana    | http://localhost:3000  | "YDB QueryService" dashboard      |
| Jaeger     | http://localhost:16686 | Trace search by service           |
| Prometheus | http://localhost:9090  | `db_client_operation_*` metrics   |

**Grafana**: login `admin` / password `admin`.

### 5. What to look at

#### In Grafana ("YDB QueryService" dashboard):
- **Request Rate by Operation**: RPS per operation (ExecuteQuery, ExecuteDataQuery, CreateSession, Commit, Rollback)
- **Error Rate by Operation**: error frequency
- **Duration p50/p95/p99**: distribution of operation durations
- **Error Ratio**: percentage of errors
- **Recent Traces**: a table of traces from Jaeger

#### In the Jaeger UI:
- Select the `ydb-cpp-sdk-demo` service.
- RPC spans (`SpanKind = CLIENT`):
  `ydb.CreateSession`, `ydb.ExecuteQuery`, `ydb.ExecuteDataQuery`,
  `ydb.BeginTransaction`, `ydb.Commit`, `ydb.Rollback`,
  `ydb.ExecuteSchemeQuery`, `ydb.BulkUpsert`.
- Retry spans (`SpanKind = INTERNAL`):
  - `ydb.RunWithRetry`: a wrapper around the whole retryable logic.
    When retries actually happen, it carries the `ydb.retry.count` attribute
    (the total number of retries performed, `>= 1`).
  - `ydb.Try`: one span per attempt. On retry attempts it carries the
    `ydb.retry.attempt` (`1..N`) and `ydb.retry.backoff_ms` attributes
    (the sleep duration before that attempt). On the first (non-retry) attempt
    these attributes are not set.
- Common attributes on all YDB spans:
  - `db.system.name = ydb`
  - `db.namespace` (database name)
  - `server.address`, `server.port` (balancer endpoint)
  - `network.peer.address`, `network.peer.port` (the actual cluster node)
- On errors, the following are added:
  - `db.response.status_code`: the YDB status as a string (for example, `ABORTED`)
  - `error.type`: the category of the error source, either `ydb_error` (an error
    returned by YDB) or `transport_error` (a transport-level error)
  - an `exception` event with `exception.type` and `exception.message`
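
The attribute rules in the list above can be summarized as two small functions over a string map. This is a sketch under stated assumptions, not the SDK's span code: `Attributes`, `CallOutcome`, `CommonAttributes`, and `AddErrorAttributes` are hypothetical names, the outcome is reduced to a status string plus a transport flag, and the `network.peer.*` attributes and the `exception` event are left out. Following the list, the status code is attached only on errors.

```cpp
#include <map>
#include <string>

using Attributes = std::map<std::string, std::string>;

// Hypothetical summary of a finished call.
struct CallOutcome {
    std::string status;     // for example "SUCCESS" or "ABORTED"
    bool transportFailure;  // true when the failure came from the transport layer
};

// Attributes that every YDB span carries, per the list above.
Attributes CommonAttributes(const std::string& database,
                            const std::string& host, int port) {
    return {
        {"db.system.name", "ydb"},
        {"db.namespace", database},
        {"server.address", host},
        {"server.port", std::to_string(port)},
    };
}

// Adds the error-only attributes; successful calls are left untouched.
void AddErrorAttributes(Attributes& attrs, const CallOutcome& outcome) {
    if (outcome.status == "SUCCESS") {
        return;
    }
    attrs["db.response.status_code"] = outcome.status;
    attrs["error.type"] = outcome.transportFailure ? "transport_error" : "ydb_error";
}
```

Keeping `error.type` to two values while the exact status goes into `db.response.status_code` separates a low-cardinality category from the detailed code.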

#### In Prometheus:
- `db_client_operation_duration_seconds_bucket`: a duration histogram
  (OTel Semantic Conventions). Labels: `db.system.name`, `db.namespace`,
  `db.operation.name` (with the `ydb.` prefix), `ydb.client.api`
  (`Query` / `Table`). For errors, `db.response.status_code`
  (the exact YDB status, for example `ABORTED`) and `error.type` are added;
  `error.type` is a low-cardinality category of the error source: `ydb_error`
  (YDB server statuses) or `transport_error` (client/transport statuses).
- `db_client_operation_requests_total`: a counter of started operations
  (including every retry attempt).
- `db_client_operation_errors_total`: a counter of failed attempts.
  It is useful to compare it with `requests_total`: for the retry workload on
  the same "hot" row the error ratio will be very high, which is exactly
  the indicator that retries are at work.
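
The comparison of the two counters amounts to dividing their increases over the same time window. The sketch below shows that arithmetic only. `CounterSample` and `ErrorRatio` are hypothetical names, and treating a counter that went backwards (for example after a restart) as "no data" is an assumption of this sketch, not documented behavior of any tool.

```cpp
#include <cstdint>

// Hypothetical snapshot of the two counters at one point in time.
struct CounterSample {
    std::uint64_t requests;  // db_client_operation_requests_total
    std::uint64_t errors;    // db_client_operation_errors_total
};

// Share of failed attempts between two snapshots; 0.0 when there is no usable data.
double ErrorRatio(const CounterSample& earlier, const CounterSample& later) {
    if (later.requests <= earlier.requests || later.errors < earlier.errors) {
        return 0.0;  // no new requests, or a counter was reset
    }
    const double newRequests = static_cast<double>(later.requests - earlier.requests);
    const double newErrors = static_cast<double>(later.errors - earlier.errors);
    return newErrors / newRequests;
}
```

For instance, going from 100 requests and 10 errors to 300 requests and 110 errors gives 100 new errors over 200 new requests.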

### 6. Stop

```bash
cd examples/otel_tracing
docker compose down -v
```
71 changes: 71 additions & 0 deletions examples/otel_tracing/docker-compose.yml
@@ -0,0 +1,71 @@
services:
  ydb:
    image: cr.yandex/yc/yandex-docker-local-ydb:latest
    platform: linux/amd64
    hostname: localhost
    ports:
      - "2136:2136"
      - "8765:8765"
    environment:
      - GRPC_TLS_PORT=2135
      - GRPC_PORT=2136
      - MON_PORT=8765
      - YDB_DEFAULT_LOG_LEVEL=NOTICE
      - YDB_USE_IN_MEMORY_PDISKS=true
    volumes:
      - ydb-data:/ydb_data
    healthcheck:
      test: /bin/sh -c "/ydb -e grpc://localhost:2136 -d /local scheme ls"
      interval: 5s
      timeout: 5s
      retries: 20

  jaeger:
    image: jaegertracing/all-in-one:1.76.0
    ports:
      - "16686:16686"
      - "4317:4317"
      - "4318:4318"
    environment:
      - COLLECTOR_OTLP_ENABLED=true

  prometheus:
    image: prom/prometheus:v2.53.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
      - otel-collector

  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.110.0
    ports:
      - "4327:4317"
      - "4328:4318"
      - "8889:8889"
    volumes:
      - ./otel-collector/config.yml:/etc/otelcol-contrib/config.yaml:ro
    depends_on:
      - jaeger

  grafana:
    image: grafana/grafana:11.1.0
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
      - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
      - grafana-data:/var/lib/grafana
    depends_on:
      - jaeger
      - prometheus

volumes:
  ydb-data:
  grafana-data: