-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat(NODE-7121): prevent connection churn on backpressure errors when establishing connections #4800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f4f55f6 to
e14a3ef
Compare
ad9299d to
151b986
Compare
541a8e0 to
70c1333
Compare
70c1333 to
d3ec74a
Compare
test/integration/server-discovery-and-monitoring/server_discovery_and_monitoring.prose.test.ts
Outdated
Show resolved
Hide resolved
test/unit/assorted/server_discovery_and_monitoring.spec.test.ts
Outdated
Show resolved
Hide resolved
|
This file is missing from the sync: |
tadjik1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @baileympearson, great work on syncing and improving tests! Just a few small comments from my side.
| "servers": { | ||
| "a:27017": { | ||
| "type": "Unknown", | ||
| "topologyVersion": null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a corresponding updated yml file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how I missed this? I copy the full directory in when syncing tests 🤷
Fixed.
src/cmap/connect.ts
Outdated
| if (error instanceof MongoError) { | ||
| error.addErrorLabel(MongoErrorLabel.HandshakeError); | ||
| } | ||
| // If we encounter an error executing the initial handshake, apply backpressure labels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec says that the labels should be applied "to network errors or network timeouts" - can we update the comment accordingly? And, in this case, are we expecting all errors to be network errors? is it worth explicitly checking to make sure that the error is a network error?
There's also this wording:
For errors that the driver can distinguish as never occurring due to server overload,
such as DNS lookup failures, TLS related errors, or errors encountered establishing a connection to a socks5 proxy,
the driver MUST clear the connection pool and MUST mark the server Unknown for these error types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are non network errors possible here but they're very unlikely and/or would show up and be handled in non-prod environments.
some examples:
- MongoParseErrors, if we fail to parse a wire protocol response (arguably a network error?)
- Server errors, if the handshake doc is invalid (would most likely show up non-prod, would also prevent any connections being established and connect would fail).
I've gone ahead and just filtered out only network errors though (network timeout errors subclass network errors) though.
src/cmap/connect.ts
Outdated
| socket = await connectedSocket; | ||
| return socket; | ||
| } catch (error) { | ||
| // If we encounter a SystemOverloaded error while establishing a socket, apply the backpressure labels to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the above, the spec says that the labels should be applied "to network errors or network timeouts" - can we update this comment accordingly? I assume in this case we do expect all errors to be network errors, but is it worth double checking that this is the case and that there aren't any that we could reliably determine aren't due to server overload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've clarified the comment here. lmk what you think
| }); | ||
| }); | ||
|
|
||
| context('Connection Pool Backpressure', function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could we annotate the test with the corresponding prose wording from the spec (as we do for other prose tests to make them easier to follow & audit)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tadjik1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @baileympearson!
Description
Summary of Changes
This PR adopts the backpressure changes from mongodb/specifications#1860 and mongodb/specifications#1855.
Notes for Reviewers
What is the motivation for this change?
Release Highlight
Release notes highlight
Double check the following
npm run check:lint)type(NODE-xxxx)[!]: descriptionfeat(NODE-1234)!: rewriting everything in coffeescript