GH-50009: [R] FinalizeS3 segfaults for stale connection #50081
thisisnic wants to merge 2 commits into
Pull request overview
This PR aims to prevent R session crashes/segfaults on S3 shutdown by ensuring the AWS SDK installs a SIGPIPE handler during S3 initialization, addressing stale-connection SIGPIPE behavior reported in GH-50009 (and related SIGPIPE reports like #32026).
Changes:
- Switch R’s `S3FileSystem` creation path from `EnsureS3Initialized()` to `InitializeS3()` with `install_sigpipe_handler = true`.
- Tolerate `InitializeS3()` returning `Invalid` when S3 is already initialized.
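The second change is an "initialize, but tolerate an earlier initialization" pattern. The following is a minimal sketch of that control flow only. `Status`, `Initialize`, `IsInitialized` and `InitializeTolerant` are hypothetical stand-ins written for this illustration; they are not Arrow's API and do not touch S3.

```cpp
#include <string>

// Hypothetical stand-in for a status type; not arrow::Status.
struct Status {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Hypothetical global init state for the sketch.
static bool g_initialized = false;

bool IsInitialized() { return g_initialized; }

// Hypothetical initializer: succeeds once, then reports an error on every
// later call, mirroring "returns Invalid if already initialized".
Status Initialize(bool install_sigpipe_handler) {
  if (g_initialized) {
    return Status{false, "already initialized"};
  }
  (void)install_sigpipe_handler;  // the option is only recorded on first init
  g_initialized = true;
  return Status{true, ""};
}

// Tolerant wrapper: an error is only propagated when initialization really
// did not happen; "already initialized" is treated as success.
Status InitializeTolerant() {
  Status status = Initialize(/*install_sigpipe_handler=*/true);
  if (!status.ok() && !IsInitialized()) {
    return status;
  }
  return Status{true, ""};
}
```

In this sketch, calling `InitializeTolerant()` twice succeeds both times, but the options passed on the second call have no effect because the first initialization wins.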
      // We need to ensure that S3 is initialized before we start messing with the
    - // options
    - StopIfNotOk(fs::EnsureS3Initialized());
    + // options. We use InitializeS3() rather than EnsureS3Initialized() so we can
    + // enable the SIGPIPE handler - without it, stale connections in the SDK's
    + // connection pool can trigger SIGPIPE during Aws::ShutdownAPI(), which causes
    + // R's signal handler to longjmp out of the teardown and segfault (GH-50009).
    + fs::S3GlobalOptions options = fs::S3GlobalOptions::Defaults();
    + options.install_sigpipe_handler = true;
    + auto status = fs::InitializeS3(options);
    + // InitializeS3 returns Invalid if already initialized - that's fine
    + if (!status.ok() && !fs::IsS3Initialized()) {
    +   StopIfNotOk(status);
    + }
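For readers unfamiliar with the underlying mechanism, here is a small POSIX-only sketch of why SIGPIPE handling matters when writing to a connection whose peer has gone away. It uses a pipe with a closed read end as a stand-in for a stale socket. The helper names `IgnoreSigpipe` and `WriteToClosedPipe` are made up for this illustration, and ignoring the signal is only one possible handling strategy; what the AWS SDK's handler actually does may differ.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Ignore SIGPIPE process-wide. Returns true on success.
bool IgnoreSigpipe() {
  struct sigaction sa = {};
  sa.sa_handler = SIG_IGN;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGPIPE, &sa, nullptr) == 0;
}

// Write one byte to a pipe whose read end is already closed.
// Returns 0 if the write succeeded, otherwise the errno from pipe()/write().
// With the default SIGPIPE disposition the process would be terminated by
// the signal before this function could return.
int WriteToClosedPipe() {
  int fds[2];
  if (pipe(fds) != 0) {
    return errno;
  }
  close(fds[0]);  // no reader is left, like a peer that dropped the connection

  const char byte = 'x';
  ssize_t written = write(fds[1], &byte, 1);
  int err = (written == -1) ? errno : 0;

  close(fds[1]);
  return err;
}
```

Once `IgnoreSigpipe()` has been called, `WriteToClosedPipe()` returns `EPIPE` as an ordinary error code instead of the process receiving a fatal signal.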
jonkeane left a comment
Is it possible to write a test that triggers that segfault? It might be too tricky, but it would be lovely to know that we don't accidentally revert this behavior (for example if the options elsewhere change).
Otherwise this looks good
This is super hard to test: I've tried reproducing it but am having difficulty doing so, so I'm a bit unsure about merging this now.
For: the change is reasonably well supported by similar issues folks have had, and tests pass on CI.
Against: I'm unable to reproduce the original issue and haven't tested the code beyond CI.
I think maybe let's merge?
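Short of reproducing the crash itself, one cheaper check would be to assert on the process's SIGPIPE disposition after initialization. The sketch below is a POSIX-only illustration of such a probe. `SigpipeIsDefault` is a hypothetical helper, not part of Arrow or the PR, and it only shows that some handler is installed, not that the segfault is fixed.

```cpp
#include <signal.h>

// Returns true if SIGPIPE is still at its default disposition, i.e. it would
// terminate the process. Returns false if it is ignored or has a handler.
bool SigpipeIsDefault() {
  struct sigaction current = {};
  // Passing nullptr as the new action only queries the current one.
  if (sigaction(SIGPIPE, nullptr, &current) != 0) {
    return false;  // query failed; treat as "unknown, not default"
  }
  if (current.sa_flags & SA_SIGINFO) {
    return false;  // a three-argument handler is installed
  }
  return current.sa_handler == SIG_DFL;
}
```

A regression test could call a probe like this after creating an S3 filesystem and fail if it still returns `true`.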
Rationale for this change
A user reported the process crashing when reading from or writing to S3. The cause looks like a stale connection triggering SIGPIPE. See also #32026.
What changes are included in this PR?
Install the SIGPIPE handler during S3 initialization so that a SIGPIPE no longer kills the process.
Are these changes tested?
No - and I'm not sure how I can really test this out.
Are there any user-facing changes?
No