Merged
10 changes: 10 additions & 0 deletions docs/en/changes/changes.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,16 @@
* Extend the `GET /inspect/entities` admin API to inspect a metric persisted by **any** OAP, even one this node does not define locally. When the metric is unknown to the local registry, the caller supplies `valueColumn` + `valueType` and the storage backend resolves the physical index/table/group from its own running config (no DB schema/table-metadata read): ES uses the merged `metrics-all` index + `metric_table` discriminator, JDBC probes the node's function tables by the `table_name` discriminator, and BanyanDB synthesizes a read-only measure schema. Scope is no longer required — the `entity_id` is decoded structurally (service / 2nd-level / relations) with a generic `name` leaf. Locally-defined metrics keep the exact field names, scope, and `mqeEntity` as before.
* Add the `POST /inspect/values` admin API — read the value series of a metric persisted by **another** OAP (one this node does not define locally) by supplying its `{valueColumn, valueType}`. The real MQE engine runs over a request-scoped `InspectQueryContext` overlay (provide-if-absent — the local catalog always wins) that makes the foreign metric look registered to every read path: `ValueColumnMetadata` resolves its value column / type / scope, and the storage location registries resolve where it lives (`MetadataRegistry` synthesizes a BanyanDB measure schema, `IndexController` resolves the ES `metrics-all` index, `TableHelper` probes the JDBC function tables), so the read returns the native MQE `ExpressionResult` with no per-DAO special-casing. Admin-only (a forced read this OAP cannot validate); not mirrored onto the public REST / GraphQL surface. See the [Inspect API](../setup/backend/admin-api/inspect.md).
* Remove the always-on alarm-to-event conversion (`EventHookCallback`). A triggered alarm is no longer synthesized into the events pipeline as an `Alarm`/`AlarmRecovery` event; events now originate only from real event sources (agents, SkyWalking CLI, Kubernetes Event Exporter). Alarms remain available through the alarm store (`getAlarm`/`queryAlarms`) and the configured alarm hooks. This drops a documented "Known Event" and removes 1-2 synthetic event records per alarm fire.
* **TLS for all OAP HTTP/REST servers, with cert hot-reload.** Adds the
`restSSLEnabled` / `restSSLKeyPath` / `restSSLCertChainPath` config structure to every
OAP HTTP server — core REST, sharing-server, admin, PromQL, LogQL, TraceQL and Zipkin
query/receiver — each with its own dedicated environment variables (`SW_CORE_REST_SSL_*`,
`SW_RECEIVER_SHARING_REST_SSL_*`, `SW_ADMIN_SERVER_REST_SSL_*`, `SW_PROMQL_REST_SSL_*`,
`SW_LOGQL_REST_SSL_*`, `SW_TRACEQL_REST_SSL_*`, `SW_QUERY_ZIPKIN_REST_SSL_*`,
`SW_RECEIVER_ZIPKIN_REST_SSL_*`). The shared Armeria `HTTPServer` reloads the key pair
from disk on rotation (via `TlsProvider.ofScheduled`) so refreshed certificates are
picked up without restarting the OAP, matching the existing gRPC SSL hot-reload
behavior. HTTP TLS is server-side only (no mTLS).
* **New `queryAlarms` GraphQL query — entity / layer / rule filters for alarms.** Adds
a comprehensive alarm query API alongside the legacy `getAlarm`. The new
`queryAlarms(condition: AlarmQueryCondition!): Alarms` accepts a single input type
Expand Down
21 changes: 21 additions & 0 deletions docs/en/setup/backend/configuration-vocabulary.md

Large diffs are not rendered by default.

35 changes: 35 additions & 0 deletions docs/en/setup/backend/grpc-security.md
Original file line number Diff line number Diff line change
Expand Up @@ -89,3 +89,38 @@ receiver-sharing-server:

You can still use this [script](../../../../tools/TLS/tls_key_generate.sh) to generate the CA certificate and the key files for the server side (OAP Server) and the client side (Agent/Satellite).
Note that the server-side and client-side keys must all be issued from the same CA certificate.

## TLS on OAP HTTP/REST servers

Besides gRPC, the OAP server exposes several HTTP/REST servers. Each one is configured
independently and shares the same `restSSL*` configuration structure, with **dedicated
environment variables** per server:

| HTTP server | `application.yml` section | Environment variables |
|-------------|---------------------------|-----------------------|
| Core REST (GraphQL query API) | `core/default` | `SW_CORE_REST_SSL_ENABLED` / `SW_CORE_REST_SSL_KEY_PATH` / `SW_CORE_REST_SSL_CERT_CHAIN_PATH` |
| Sharing REST receiver | `receiver-sharing-server/default` | `SW_RECEIVER_SHARING_REST_SSL_ENABLED` / `SW_RECEIVER_SHARING_REST_SSL_KEY_PATH` / `SW_RECEIVER_SHARING_REST_SSL_CERT_CHAIN_PATH` |
| Admin server | `admin-server/default` | `SW_ADMIN_SERVER_REST_SSL_ENABLED` / `SW_ADMIN_SERVER_REST_SSL_KEY_PATH` / `SW_ADMIN_SERVER_REST_SSL_CERT_CHAIN_PATH` |
| PromQL | `promql/default` | `SW_PROMQL_REST_SSL_ENABLED` / `SW_PROMQL_REST_SSL_KEY_PATH` / `SW_PROMQL_REST_SSL_CERT_CHAIN_PATH` |
| LogQL | `logql/default` | `SW_LOGQL_REST_SSL_ENABLED` / `SW_LOGQL_REST_SSL_KEY_PATH` / `SW_LOGQL_REST_SSL_CERT_CHAIN_PATH` |
| TraceQL | `traceQL/default` | `SW_TRACEQL_REST_SSL_ENABLED` / `SW_TRACEQL_REST_SSL_KEY_PATH` / `SW_TRACEQL_REST_SSL_CERT_CHAIN_PATH` |
| Zipkin query | `query-zipkin/default` | `SW_QUERY_ZIPKIN_REST_SSL_ENABLED` / `SW_QUERY_ZIPKIN_REST_SSL_KEY_PATH` / `SW_QUERY_ZIPKIN_REST_SSL_CERT_CHAIN_PATH` |
| Zipkin receiver | `receiver-zipkin/default` | `SW_RECEIVER_ZIPKIN_REST_SSL_ENABLED` / `SW_RECEIVER_ZIPKIN_REST_SSL_KEY_PATH` / `SW_RECEIVER_ZIPKIN_REST_SSL_CERT_CHAIN_PATH` |
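
Because every server follows the same `<PREFIX>_REST_SSL_{ENABLED,KEY_PATH,CERT_CHAIN_PATH}` pattern, a deployment can be sanity-checked before the OAP starts. The sketch below is a hypothetical pre-flight helper and not part of SkyWalking: `RestSslEnvCheck` and `findProblems` are illustrative names, and it assumes that an unset `*_ENABLED` variable means TLS is off, whereas the real default comes from `application.yml`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RestSslEnvCheck {
    // Prefixes copied from the table above; the three suffixes are shared by every server.
    static final List<String> PREFIXES = List.of(
        "SW_CORE", "SW_RECEIVER_SHARING", "SW_ADMIN_SERVER", "SW_PROMQL",
        "SW_LOGQL", "SW_TRACEQL", "SW_QUERY_ZIPKIN", "SW_RECEIVER_ZIPKIN");

    /** Hypothetical helper: list servers that enable TLS without both file paths. */
    static List<String> findProblems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();
        for (String prefix : PREFIXES) {
            String base = prefix + "_REST_SSL_";
            // Assumption: an unset or non-"true" value is treated as disabled.
            if (!Boolean.parseBoolean(env.get(base + "ENABLED"))) {
                continue;
            }
            for (String suffix : List.of("KEY_PATH", "CERT_CHAIN_PATH")) {
                String value = env.get(base + suffix);
                if (value == null || value.isBlank()) {
                    problems.add(base + suffix + " is required when " + base + "ENABLED=true");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Check the environment of the current process and print any findings.
        findProblems(System.getenv()).forEach(System.out::println);
    }
}
```

The check only looks at variable names and emptiness; it does not verify that the files exist or parse.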

For example, to enable TLS on the core REST server under `application.yml/core/default`:

```yaml
restSSLEnabled: ${SW_CORE_REST_SSL_ENABLED:true}
restSSLKeyPath: ${SW_CORE_REST_SSL_KEY_PATH:/path/to/server.pem}
restSSLCertChainPath: ${SW_CORE_REST_SSL_CERT_CHAIN_PATH:/path/to/server.crt}
```

* `restSSLKeyPath` is the path to the private key file, either PKCS#8 (PEM) or PKCS#1 (DER).
* `restSSLCertChainPath` is the path to the X.509 certificate chain file.
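
To see which encoding a file on disk actually uses before pointing these settings at it, the PEM header label is usually enough. The following is a minimal sketch based on the conventional PEM labels (RFC 7468 plus the traditional OpenSSL ones); `PemLabel` is a hypothetical helper, and it says nothing about which of these formats the OAP accepts.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PemLabel {
    // Matches the first PEM boundary line, e.g. "-----BEGIN PRIVATE KEY-----".
    private static final Pattern BEGIN = Pattern.compile("-----BEGIN ([A-Z0-9 ]+)-----");

    /** Describe the first PEM object found in the given text. */
    static String describe(String text) {
        Matcher m = BEGIN.matcher(text);
        if (!m.find()) {
            return "not PEM (possibly DER or another binary encoding)";
        }
        switch (m.group(1)) {
            case "PRIVATE KEY":           return "PKCS#8, unencrypted";
            case "ENCRYPTED PRIVATE KEY": return "PKCS#8, encrypted";
            case "RSA PRIVATE KEY":       return "PKCS#1 (RSA)";
            case "EC PRIVATE KEY":        return "SEC 1 (EC)";
            case "CERTIFICATE":           return "X.509 certificate";
            default:                      return "other PEM object: " + m.group(1);
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PemLabel.java /path/to/server.pem
        System.out.println(describe(Files.readString(Path.of(args[0]))));
    }
}
```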

Each server can point at its own certificate, or you can point several of them at the same
mounted certificate (for example a single Kubernetes secret). The HTTP servers present a
server certificate only (no client certificate verification / mTLS).

**When the certificate files are rotated in place (for example, a refreshed Kubernetes
secret), the server re-reads them from disk on a 10-second schedule, so you do not have
to restart the OAP instance.**
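
The reload behavior can be pictured as a loader that is re-invoked on a fixed schedule while the last successfully loaded value keeps being served. The sketch below uses only the JDK and is not Armeria's implementation: `ScheduledReloader` is a hypothetical class, and keeping the previous value when a reload fails (for example when a file is caught mid-rotation) is a design assumption of the sketch, not a documented guarantee of the OAP.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class ScheduledReloader<T> implements AutoCloseable {
    private final Supplier<T> loader;
    private final AtomicReference<T> current;
    private final ScheduledExecutorService scheduler;

    ScheduledReloader(Supplier<T> loader, Duration interval) {
        this.loader = loader;
        // Load eagerly so a wrong path fails at startup rather than on first use.
        this.current = new AtomicReference<>(loader.get());
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "reload-sketch");
            t.setDaemon(true);
            return t;
        });
        long millis = interval.toMillis();
        scheduler.scheduleWithFixedDelay(this::refresh, millis, millis, TimeUnit.MILLISECONDS);
    }

    /** Re-run the loader; on failure keep the last good value and retry on the next tick. */
    void refresh() {
        try {
            current.set(loader.get());
        } catch (RuntimeException e) {
            // Swallowed on purpose: an uncaught exception would cancel the periodic task.
        }
    }

    T get() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Stand-in for a key-pair loader: reads a file, wrapping I/O errors as unchecked. */
    static String read(Path path) {
        try {
            return Files.readString(path);
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("cert", ".pem");
        Files.writeString(file, "first");
        try (ScheduledReloader<String> reloader =
                 new ScheduledReloader<>(() -> read(file), Duration.ofSeconds(10))) {
            Files.writeString(file, "second"); // rotate in place
            reloader.refresh();                // what the scheduler would do on its next tick
            System.out.println(reloader.get());
        }
    }
}
```

A rotated file becomes visible only after the next refresh, so with a 10-second interval a new certificate can take up to roughly that long to be served.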
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,15 @@ public class AdminServerModuleConfig extends ModuleConfig {
private int acceptQueueSize = 0;
private int httpMaxRequestHeaderSize = 8192;

/**
* TLS settings for the admin HTTP server (env vars {@code SW_ADMIN_SERVER_REST_SSL_*}).
* The certificate and key are read from disk and reloaded on rotation without a
* restart. Server-side TLS only (no mTLS).
*/
private boolean restSSLEnabled = false;
private String restSSLKeyPath = "";
private String restSSLCertChainPath = "";

/**
* Bind address for the admin-internal gRPC host that carries peer-to-peer
* cluster RPCs for admin features (dsl-debugging install/collect/stop,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,9 @@ public void prepare() {
.acceptQueueSize(moduleConfig.getAcceptQueueSize())
.idleTimeOut(moduleConfig.getIdleTimeOut())
.maxRequestHeaderSize(moduleConfig.getHttpMaxRequestHeaderSize())
.enableTLS(moduleConfig.isRestSSLEnabled())
.tlsKeyPath(moduleConfig.getRestSSLKeyPath())
.tlsCertChainPath(moduleConfig.getRestSSLCertChainPath())
.build();
httpServer = new HTTPServer(httpServerConfig);
httpServer.setBlockingTaskName("admin-http");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,15 @@ public class CoreModuleConfig extends ModuleConfig {
private int restMaxThreads = 200;
private long restIdleTimeOut = 30000;
private int restAcceptQueueSize = 0;
/**
* TLS settings for the core REST server (GraphQL query API). Every OAP HTTP server
* exposes the same {@code restSSL*} structure under its own module with dedicated
* environment variables. The certificate and private key are read from disk and
* reloaded on rotation without a restart.
*/
private boolean restSSLEnabled = false;
private String restSSLKeyPath;
private String restSSLCertChainPath;

private String gRPCHost;
private int gRPCPort;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -272,6 +272,9 @@ public void prepare() throws ServiceNotProvidedException, ModuleStartException {
moduleConfig.getRestAcceptQueueSize())
.maxRequestHeaderSize(
moduleConfig.getHttpMaxRequestHeaderSize())
.enableTLS(moduleConfig.isRestSSLEnabled())
.tlsKeyPath(moduleConfig.getRestSSLKeyPath())
.tlsCertChainPath(moduleConfig.getRestSSLCertChainPath())
.build();
setBootingParameter("oap.external.http.host", moduleConfig.getRestHost());
setBootingParameter("oap.external.http.port", moduleConfig.getRestPort());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@
import com.linecorp.armeria.common.HttpMethod;
import com.linecorp.armeria.common.HttpResponse;
import com.linecorp.armeria.common.HttpStatus;
import com.linecorp.armeria.common.TlsKeyPair;
import com.linecorp.armeria.common.TlsProvider;
import com.linecorp.armeria.common.util.EventLoopGroups;
import com.linecorp.armeria.server.Route;
import com.linecorp.armeria.server.ServerBuilder;
Expand Down Expand Up @@ -114,6 +116,13 @@ public class HTTPServer implements Server {
     */
    private static final EventLoopGroup SHARED_WORKER_GROUP;

    /**
     * How often the TLS key pair is re-read from disk. Certificates mounted from a
     * Kubernetes secret are rotated in place, so the server reloads them periodically
     * without a restart. Mirrors the gRPC server's file-change monitor cadence.
     */
    private static final Duration TLS_RELOAD_INTERVAL = Duration.ofSeconds(10);

    static {
        final int cores = Runtime.getRuntime().availableProcessors();
        SHARED_WORKER_GROUP = EventLoopGroups.newEventLoopGroup(Math.min(5, cores));
Expand Down Expand Up @@ -164,12 +173,11 @@ public void initialize() {
             sb.https(new InetSocketAddress(
                 config.getHost(),
                 config.getPort()));
-            try (InputStream cert = new FileInputStream(config.getTlsCertChainPath());
-                 InputStream key = PrivateKeyUtil.loadDecryptionKey(config.getTlsKeyPath())) {
-                sb.tls(cert, key);
-            } catch (IOException e) {
-                throw new IllegalArgumentException(e);
-            }
+            // Reload the key pair from disk on a schedule so rotated certificates
+            // (e.g. a refreshed Kubernetes secret) are picked up without a restart.
+            sb.tlsProvider(TlsProvider.ofScheduled(
+                () -> loadTlsKeyPair(config.getTlsKeyPath(), config.getTlsCertChainPath()),
+                TLS_RELOAD_INTERVAL));
         } else {
             sb.http(new InetSocketAddress(
                 config.getHost(),
Expand Down Expand Up @@ -237,6 +245,24 @@ public void start() {
        sb.build().start().join();
    }

    /**
     * Read the private key and certificate chain from disk into a {@link TlsKeyPair}.
     * Invoked once at startup and then periodically by the TLS provider, so rotated
     * files on disk are reflected on the next read.
     *
     * @param tlsKeyPath file path of the private key (PKCS#8 PEM or PKCS#1 DER).
     * @param tlsCertChainPath file path of the X.509 certificate chain.
     * @return the key pair loaded from the current contents of the files.
     */
    static TlsKeyPair loadTlsKeyPair(final String tlsKeyPath, final String tlsCertChainPath) {
        try (InputStream key = PrivateKeyUtil.loadDecryptionKey(tlsKeyPath);
             InputStream cert = new FileInputStream(tlsCertChainPath)) {
            return TlsKeyPair.of(key, cert);
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    private String transformAbsoluteURI(final String uri) {
        if (uri.startsWith("https://")) {
            return uri.substring(uri.indexOf("/", 8));
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/

package org.apache.skywalking.oap.server.library.server.http;

import com.linecorp.armeria.common.TlsKeyPair;
import io.netty.handler.ssl.util.SelfSignedCertificate;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class HTTPServerTLSTest {

    @Test
    void shouldLoadKeyPairFromDisk(@TempDir Path dir) throws Exception {
        final SelfSignedCertificate cert = new SelfSignedCertificate("localhost");
        final Path keyPath = dir.resolve("server.key");
        final Path certPath = dir.resolve("server.crt");
        Files.copy(cert.privateKey().toPath(), keyPath);
        Files.copy(cert.certificate().toPath(), certPath);

        final TlsKeyPair keyPair =
            HTTPServer.loadTlsKeyPair(keyPath.toString(), certPath.toString());

        assertThat(keyPair.privateKey()).isNotNull();
        assertThat(keyPair.certificateChain()).isNotEmpty();
    }

    /**
     * The TLS provider re-invokes {@link HTTPServer#loadTlsKeyPair} on a schedule, so
     * overwriting the files in place (as happens when a Kubernetes secret is rotated)
     * must yield the new certificate on the next read.
     */
    @Test
    void shouldPickUpRotatedCertificate(@TempDir Path dir) throws Exception {
        final Path keyPath = dir.resolve("server.key");
        final Path certPath = dir.resolve("server.crt");

        final SelfSignedCertificate first = new SelfSignedCertificate("localhost");
        Files.copy(first.privateKey().toPath(), keyPath);
        Files.copy(first.certificate().toPath(), certPath);
        final TlsKeyPair before =
            HTTPServer.loadTlsKeyPair(keyPath.toString(), certPath.toString());

        // Rotate: overwrite the same paths with a freshly generated certificate.
        final SelfSignedCertificate second = new SelfSignedCertificate("localhost");
        Files.copy(second.privateKey().toPath(), keyPath, StandardCopyOption.REPLACE_EXISTING);
        Files.copy(second.certificate().toPath(), certPath, StandardCopyOption.REPLACE_EXISTING);
        final TlsKeyPair after =
            HTTPServer.loadTlsKeyPair(keyPath.toString(), certPath.toString());

        assertThat(after.certificateChain())
            .as("rotated certificate should be read back from disk")
            .isNotEqualTo(before.certificateChain());
    }

    @Test
    void shouldFailWhenFilesMissing() {
        assertThatThrownBy(() -> HTTPServer.loadTlsKeyPair("/no/such.key", "/no/such.crt"))
            .isInstanceOf(IllegalArgumentException.class);
    }
}
Original file line number Diff line number Diff line change
Expand Up @@ -30,4 +30,11 @@ public class LogQLConfig extends ModuleConfig {
private String restContextPath;
private long restIdleTimeOut = 30000;
private int restAcceptQueueSize = 0;
/**
* TLS settings for this HTTP server. The certificate and key are read from disk and
* reloaded on rotation without a restart. Server-side TLS only (no mTLS).
*/
private boolean restSSLEnabled = false;
private String restSSLKeyPath;
private String restSSLCertChainPath;
}
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,9 @@ public void start() throws ServiceNotProvidedException, ModuleStartException {
.contextPath(config.getRestContextPath())
.idleTimeOut(config.getRestIdleTimeOut())
.acceptQueueSize(config.getRestAcceptQueueSize())
.enableTLS(config.isRestSSLEnabled())
.tlsKeyPath(config.getRestSSLKeyPath())
.tlsCertChainPath(config.getRestSSLCertChainPath())
.build();

httpServer = new HTTPServer(httpServerConfig);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,13 @@ public class PromQLConfig extends ModuleConfig {
private String restContextPath;
private long restIdleTimeOut = 30000;
private int restAcceptQueueSize = 0;
/**
* TLS settings for this HTTP server. The certificate and key are read from disk and
* reloaded on rotation without a restart. Server-side TLS only (no mTLS).
*/
private boolean restSSLEnabled = false;
private String restSSLKeyPath;
private String restSSLCertChainPath;

// The following configs are used to build `/api/v1/status/buildinfo` API response.
private String buildInfoVersion = "2.45.0"; // Declare compatibility with 2.45 LTS version APIs.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,9 @@ public void start() throws ServiceNotProvidedException, ModuleStartException {
.contextPath(config.getRestContextPath())
.idleTimeOut(config.getRestIdleTimeOut())
.acceptQueueSize(config.getRestAcceptQueueSize())
.enableTLS(config.isRestSSLEnabled())
.tlsKeyPath(config.getRestSSLKeyPath())
.tlsCertChainPath(config.getRestSSLCertChainPath())
.build();

httpServer = new HTTPServer(httpServerConfig);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,13 @@ public class TraceQLConfig extends ModuleConfig {
private String restContextPathSkywalking;
private long restIdleTimeOut = 30000;
private int restAcceptQueueSize = 0;
/**
* TLS settings for this HTTP server. The certificate and key are read from disk and
* reloaded on rotation without a restart. Server-side TLS only (no mTLS).
*/
private boolean restSSLEnabled = false;
private String restSSLKeyPath;
private String restSSLCertChainPath;
private long lookback = 86400000L;
private String zipkinTracesListResultTags = ZIPKIN_TRACES_LIST_RESULT_TAGS;
private String skywalkingTracesListResultTags = SKYWALKING_TRACES_LIST_RESULT_TAGS;
Expand Down