Skip to content

Intermittent failures causing node to panic #443

@danielloader

Description

@danielloader

Bit of a strange one but figured I'd ask, I've got an issue where if a nodejs microservice is idle for some time (not known how long yet for a pattern) and then receives a request which requires a hit on the /zitadel.management.v1.ManagementService/GetMyOrg method, there's a high chance it'll crash the nodejs process with an uncaught exception.

  • Happens using Node 22 and Node 20
  • Happens with the latest @zitadel/node package version.

If you need any more information I'll try to provide it for you.

I've enabled the following environment variables for information:

GRPC_TRACE=all
GRPC_VERBOSITY=DEBUG

Debug level log for the failed call:

D 2024-08-21T12:49:03.834Z | v1.10.9 28 | channel | (1) dns:login.example.io:443 createResolvingCall [4600] method="/zitadel.management.v1.ManagementService/GetMyOrg", deadline=Infinity
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] Created
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] Deadline: Infinity
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] start called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] startRead called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] write() called with message of length 0
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] halfClose called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | channel | (1) dns:login.quotech.io:443 createRetryingCall [4601] method="/zitadel.management.v1.ManagementService/GetMyOrg"
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | resolving_call | [4600] Created child [4601]
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | retrying_call | [4601] start called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | channel | (1) dns:login.example.io:443 createLoadBalancingCall [4602] method="/zitadel.management.v1.ManagementService/GetMyOrg"
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | retrying_call | [4601] Created child call [4602] for attempt 1
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | load_balancing_call | [4602] start called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | load_balancing_call | [4602] Pick called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | load_balancing_call | [4602] Pick result: COMPLETE subchannel: (8) 104.21.88.227:443 status: undefined undefined
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | retrying_call | [4601] startRead called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | load_balancing_call | [4602] startRead called
D 2024-08-21T12:49:03.834Z | v1.10.9 28 | transport_flowctrl | (3) 104.21.88.227:443 local window size: 61879 remote window size: 2147483407
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | transport_internals | (3) 104.21.88.227:443 session.closed=false session.destroyed=false session.socket.destroyed=false
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | load_balancing_call | [4602] Created child call [4603]
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | retrying_call | [4601] write() called with message of length 5
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | load_balancing_call | [4602] write() called with message of length 5
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | subchannel_call | [4603] write() called with message of length 5
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | subchannel_call | [4603] sending data chunk of length 5
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | retrying_call | [4601] halfClose called
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | load_balancing_call | [4602] halfClose called
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | subchannel_call | [4603] end() called
D 2024-08-21T12:49:03.835Z | v1.10.9 28 | subchannel_call | [4603] calling end() on HTTP/2 stream
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | subchannel_call | [4603] HTTP/2 stream closed with code 8
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | subchannel_call | [4603] ended with status: code=1 details="Call cancelled"
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | load_balancing_call | [4602] Received status
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | load_balancing_call | [4602] ended with status: code=1 details="Call cancelled" start time=2024-08-21T12:49:03.834Z
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | retrying_call | [4601] Received status from child [4602]
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | retrying_call | [4601] state=TRANSPARENT_ONLY handling status with progress PROCESSED from child [4602] in state ACTIVE
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | retrying_call | [4601] ended with status: code=1 details="Call cancelled" start time=2024-08-21T12:49:03.834Z
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | resolving_call | [4600] Received status
D 2024-08-21T12:49:03.836Z | v1.10.9 28 | resolving_call | [4600] ended with status: code=1 details="Call cancelled"
node:internal/process/promises:391
triggerUncaughtException(err, true /* fromPromise */);
^
ClientError: /zitadel.management.v1.ManagementService/GetMyOrg CANCELLED: Call cancelled
at wrapClientError (/app/node_modules/nice-grpc/lib/client/wrapClientError.js:9:16)
at Object.callback (/app/node_modules/nice-grpc/lib/client/createUnaryMethod.js:27:66)
at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:193:36)
at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)
at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)
at /app/node_modules/@grpc/grpc-js/build/src/resolving-call.js:129:78
at process.processTicksAndRejections (node:internal/process/task_queues:77:11) {
path: '/zitadel.management.v1.ManagementService/GetMyOrg',
code: 1,
details: 'Call cancelled'
}
Node.js v22.6.0
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Debug level log for a successful call of the same endpoint (after hitting refresh a few times after the container is live again):

D 2024-08-21T12:53:49.368Z | v1.10.9 28 | channel | (1) dns:login.example.io:443 createResolvingCall [12] method="/zitadel.management.v1.ManagementService/ListUserMemberships", deadline=Infinity
D 2024-08-21T12:53:49.368Z | v1.10.9 28 | resolving_call | [12] Created
D 2024-08-21T12:53:49.368Z | v1.10.9 28 | resolving_call | [12] Deadline: Infinity
D 2024-08-21T12:53:49.368Z | v1.10.9 28 | resolving_call | [12] start called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | resolving_call | [12] startRead called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | resolving_call | [12] write() called with message of length 20
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | resolving_call | [12] halfClose called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | channel | (1) dns:login.example.io:443 createRetryingCall [13] method="/zitadel.management.v1.ManagementService/ListUserMemberships"
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | resolving_call | [12] Created child [13]
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | retrying_call | [13] start called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | channel | (1) dns:login.example.io:443 createLoadBalancingCall [14] method="/zitadel.management.v1.ManagementService/ListUserMemberships"
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | retrying_call | [13] Created child call [14] for attempt 1
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] start called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] Pick called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] Pick result: COMPLETE subchannel: (1) 104.21.88.227:443 status: undefined undefined
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | retrying_call | [13] startRead called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] startRead called
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | transport_flowctrl | (1) 104.21.88.227:443 local window size: 64976 remote window size: 2147483612
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | transport_internals | (1) 104.21.88.227:443 session.closed=false session.destroyed=false session.socket.destroyed=false
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] Created child call [15]
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | retrying_call | [13] write() called with message of length 25
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | load_balancing_call | [14] write() called with message of length 25
D 2024-08-21T12:53:49.369Z | v1.10.9 28 | subchannel_call | [15] write() called with message of length 25
D 2024-08-21T12:53:49.370Z | v1.10.9 28 | subchannel_call | [15] sending data chunk of length 25
D 2024-08-21T12:53:49.370Z | v1.10.9 28 | retrying_call | [13] halfClose called
D 2024-08-21T12:53:49.370Z | v1.10.9 28 | load_balancing_call | [14] halfClose called
D 2024-08-21T12:53:49.370Z | v1.10.9 28 | subchannel_call | [15] end() called
D 2024-08-21T12:53:49.370Z | v1.10.9 28 | subchannel_call | [15] calling end() on HTTP/2 stream
D 2024-08-21T12:53:49.416Z | v1.10.9 28 | subchannel_call | [15] Received server headers:
:status: 200
date: Wed, 21 Aug 2024 12:53:49 GMT
content-type: application/grpc
cache-control: no-store
expires: Wed, 21 Aug 2024 12:53:49 GMT
grpc-metadata-cache-control: no-store
grpc-metadata-expires: Wed, 21 Aug 2024 12:53:49 GMT
grpc-metadata-pragma: no-cache
pragma: no-cache
trailer: Grpc-Status, Grpc-Message, Grpc-Status-Details-Bin
x-robots-tag: none
traceparent: 00-18f66a253324a3a3908ec887209acdf3-83531992162624ce-00
x-cloud-trace-context: 18f66a253324a3a3908ec887209acdf3/9462935357263062222
via: 1.1 google
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-cache-hit: uncacheable
alt-svc: h3=":443"; ma=86400
cf-cache-status: DYNAMIC
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=qCHEhO4Kiu8j1%2Ff7KzH4dNSa9W2ZC3cy3Voh5Ii%2BKyYQu4irGrfEnxa9RvQ4QgrEr34Du7W8Ql3RcaPgZl4ewrl%2FzUuNbq%2FlMVWmxR7jFvTtP6Aa7CKKIED5HKGZVyf45WzgG0CiLw%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
expect-ct: max-age=86400, enforce
referrer-policy: same-origin
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
server: cloudflare
cf-ray: 8b6ac4a78b5c6545-LHR
D 2024-08-21T12:53:49.416Z | v1.10.9 28 | load_balancing_call | [14] Received metadata
D 2024-08-21T12:53:49.416Z | v1.10.9 28 | retrying_call | [13] Received metadata from child [14]
D 2024-08-21T12:53:49.416Z | v1.10.9 28 | retrying_call | [13] Committing call [14] at index 0
D 2024-08-21T12:53:49.416Z | v1.10.9 28 | resolving_call | [12] Received metadata
D 2024-08-21T12:53:49.474Z | v1.10.9 28 | subchannel_call | [15] receive HTTP/2 data frame of length 355
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | subchannel_call | [15] parsed message of length 355
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | subchannel_call | [15] pushing to reader message of length 355
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | load_balancing_call | [14] Received message
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | retrying_call | [13] Received message from child [14]
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | resolving_call | [12] Received message
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | resolving_call | [12] Finished filtering received message
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | subchannel_call | [15] Received server trailers:
grpc-status: 0
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | subchannel_call | [15] received status code 0 from server
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | subchannel_call | [15] ended with status: code=0 details=""
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | load_balancing_call | [14] Received status
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | load_balancing_call | [14] ended with status: code=0 details="" start time=2024-08-21T12:53:49.369Z
D 2024-08-21T12:53:49.475Z | v1.10.9 28 | retrying_call | [13] Received status from child [14]
D 2024-08-21T12:53:49.476Z | v1.10.9 28 | retrying_call | [13] state=COMMITTED handling status with progress PROCESSED from child [14] in state ACTIVE
D 2024-08-21T12:53:49.476Z | v1.10.9 28 | retrying_call | [13] ended with status: code=0 details="" start time=2024-08-21T12:53:49.369Z
D 2024-08-21T12:53:49.476Z | v1.10.9 28 | resolving_call | [12] Received status
D 2024-08-21T12:53:49.476Z | v1.10.9 28 | resolving_call | [12] ended with status: code=0 details=""
D 2024-08-21T12:53:49.476Z | v1.10.9 28 | subchannel_call | [15] HTTP/2 stream closed with code 0
{"level":30,"time":1724244829480,"pid":28,"hostname":"risks-586d7dd84b-x9lhh","req":{"id":"afbc2f33-d4ff-47ac-989b-8e1c8a4357f0","method":"GET","url":"/insurance-products"},"res":{"statusCode":200},"responseTime":4,"msg":"request completed"}
D 2024-08-21T12:53:58.862Z | v1.10.9 28 | subchannel_refcount | (2) [2606:4700:3031::6815:58e3]:443 refcount 1 -> 0
D 2024-08-21T12:53:58.862Z | v1.10.9 28 | subchannel_refcount | (3) 172.67.153.180:443 refcount 1 -> 0
D 2024-08-21T12:53:58.862Z | v1.10.9 28 | subchannel_refcount | (4) [2606:4700:3033::ac43:99b4]:443 refcount 1 -> 0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions