Skip to content

Conversation

@Pranav2612000
Copy link

@Pranav2612000 Pranav2612000 commented Dec 30, 2025

According to golang/go#11100 (comment), the -Wl,-z,nodelete does not work on macos. This causes dlclose to be called when the shared library is dropped, which results in unexpected behaviour ( I've observed the program hangs because Go cannot kill the goroutines and so waits for them to close )

On linux, since the -Wl,-z,nodelete is respected, we do not see this issue.

This PR fixes the problem by ensuring that we explicitly open the library with RTLD_NODELETE flag which ensures that dlclose is not called during unload

Fixes #3840

@Pranav2612000
Copy link
Author

Thank you @savannahar68 for debugging the issue and helping with the fix

@Pranav2612000 Pranav2612000 force-pushed the fix/fix-hang-during-dlopen-mac branch from 5fdead7 to 4570524 Compare December 30, 2025 16:13
// By default, go builds the libraries with '-Wl -z nodelete' which does not
// unload the go runtime. This isn't respected on mac ( https://github.com/golang/go/issues/11100#issuecomment-932638093 )
// so we need to explicitly load the library with RTLD_NODELETE( which prevents unloading )
if cfg!(target_os = "macos") {
Copy link
Contributor

@eitsupi eitsupi Dec 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the cfg attribute #[cfg(...)] is better than cfg! here.
(In this case, cfg! case will compile on any OS.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Updated the code to use #[cfg(...)]

@Pranav2612000 Pranav2612000 force-pushed the fix/fix-hang-during-dlopen-mac branch from 4570524 to 0bfa253 Compare December 30, 2025 16:46
Comment on lines 121 to 123
#[cfg(target_os = "macos")]
use libloading::os::unix::Library;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(target_os = "macos")]
use libloading::os::unix::Library;

I think we can remove this and just qualify the identifier below.

Comment on lines 395 to 400
Library::open(
Some(filename.as_ref()),
RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE,
)
.map(Into::into)
.map_err(libloading_error_to_adbc_error)?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Library::open(
Some(filename.as_ref()),
RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE,
)
.map(Into::into)
.map_err(libloading_error_to_adbc_error)?
libloading::os::unix::Library::open(
Some(filename.as_ref()),
RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE,
)
.map_err(libloading_error_to_adbc_error)?
.into()

This feels a little simpler and more idiomatic.

Comment on lines 392 to 394
const RTLD_LAZY: i32 = 0x1;
const RTLD_LOCAL: i32 = 0x4;
const RTLD_NODELETE: i32 = 0x80;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could pull these from the libc crate, is it worth it though? cc @eitsupi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that RTLD_LAZY and RTLD_LOCAL are also defined in libloading? 🤔
https://github.com/nagisa/rust_libloading/blob/dab97c569b33bd515e16637b8dedbdc696d9ec9c/src/os/unix/consts.rs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Only those two, but not RTLD_NODELETE. I don't feel super stressed about hard-coding them here to avoid taking on libc as a dep.

@eitsupi
Copy link
Contributor

eitsupi commented Dec 30, 2025

I don't fully understand the nature of the problem, so this may be off the mark, but is this a problem specific to the Rust Driver Manager?
Is there no need to address this in implementations in other languages?

(Just to be clear, I'm not saying everything should be addressed in this PR, just a question.)

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we should just std::mem::forget the library (bypassing its Drop implementation) and not dlclose anything at all - it's just not safe. (Or remove the platform-specific cfg and do this for all platforms.)

@Pranav2612000
Copy link
Author

Pranav2612000 commented Dec 31, 2025

I don't fully understand the nature of the problem, so this may be off the mark, but is this a problem specific to the Rust Driver Manager? Is there no need to address this in implementations in other languages?

(Just to be clear, I'm not saying everything should be addressed in this PR, just a question.)

I haven't tried with other languages, but yeah, we'll probably need to do the same for other languages as well. I'm thinking of getting the rust fix in first. Doing the same for other languages should be trivial.

@Pranav2612000
Copy link
Author

Actually, I think we should just std::mem::forget the library (bypassing its Drop implementation) and not dlclose anything at all - it's just not safe. (Or remove the platform-specific cfg and do this for all platforms.)

I've tested with std:mem::forget and it works too. I'm don't have a strong preference to any approach. Do you think we should use the drop approach? I'll modify the PR to do this.

@lidavidm
Copy link
Member

IIRC, we don't dlclose in the C++ driver manager, or C#. So Rust may be the only one affected.

forget does technically introduce a small leak - I think RTLD_NODELETE is preferable. Maybe (separately) we can try to get this flag defined upstream?

@Pranav2612000
Copy link
Author

Makes sense. I'll open a separate PR upstream soon to add support for RTLD_NODELETE.

Are there any other changes I should do to make this PR mergeable

@lidavidm
Copy link
Member

I think we can remove the #[cfg] and just have this behavior on all platforms (maybe it doesn't work on Windows?)

@Pranav2612000
Copy link
Author

I'm not sure about windows. Let me see the current implementation and try te replicate it.

@Pranav2612000
Copy link
Author

I think we can remove the #[cfg] and just have this behavior on all platforms (maybe it doesn't work on Windows?)

Yeah. Libloading does have separate definitions for windows and unix ( https://github.com/nagisa/rust_libloading/blob/dab97c569b33bd515e16637b8dedbdc696d9ec9c/src/safe.rs#L6 ).
While we get the changes merged upstream, I'm updating this PR to mirror this - cfg conditions on whether the os is windows or unix ( rather than on mac )

@Pranav2612000 Pranav2612000 force-pushed the fix/fix-hang-during-dlopen-mac branch from 0bfa253 to 252ccd3 Compare December 31, 2025 10:55
@Pranav2612000
Copy link
Author

Pushed. Can you folks take another look? Meanwhile, I'll work on getting the flag, and some other improvements merged upstream to libloading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rust driver_manager: initializing the driver creates some dangling resources which hangs application during shutdown

4 participants