Discussion: Force global DB when using a single thread anyway #179

Merged
staticlibs merged 1 commit into duckdb:main from jprafael:jprafael/force_global_db on Mar 16, 2026
Conversation

@jprafael
Contributor

Context
This is a follow-up to #177.

In my setup that PR was enough to produce the reported 10x speedup.
After some tweaking in "user" code, the speedup was no longer observable.
My current observation is that the running transaction is now treated as read-only by the transaction->IsReadOnly() check, which causes GetScanFunction(...) to skip the global_db. Instead, each worker creates its own database connection, which 1) adds even more overhead and 2) degrades cache effectiveness.

Discussion
I've tested removing the if statement and always setting the global_db, and that brings back the 10x speedup. Always reusing the global DB, instead of creating a new connection for each worker, is clearly beneficial for my use case.

In this PR I've added a half-way solution that checks whether the code is going to run in single-threaded mode anyway and, if so, avoids the re-connection. While that is probably good (or at least not bad) for everyone, I'm not sure it is the best option.

Alternatives include:

  • Always use the global DB. Is there a benchmark suite that covers the benefit of the current implementation?
  • Allow the caller to specify the behavior with a flag, which ducklake would then use for the short-lived metadata reads.
  • The implementation in this PR.

Any suggestions? @staticlibs

@staticlibs
Collaborator

@jprafael

Hi, thanks for the PR!

Is there a benchmark suite that covers the benefit of this implementation?

There is no SQLite testing code I am aware of outside of this repo, so I think the answer is "no".

The PR in its current form makes sense to me, as it avoids some overhead when running single-threaded. But I would think that the vast majority of usage will be multithreaded simply because it is the default. Whether many users rely on multithreaded SQLite scans is an open question; I would think some do rely on parallel scans.

I think variant 2 from your list, adding an option that forces global_db and keeping that option false by default, may be the best approach here.

BTW, we are now under time pressure to get the SQLite changes into 1.5.1; basically the changes need to be wrapped up today, or tomorrow at the latest. So if you can add a new option check (in addition to the "threads" check) to this PR, I can merge it promptly and finalize this for 1.5.1.

@jprafael
Contributor Author

@staticlibs FYI, I won't be able to get that done today.

@staticlibs
Collaborator

@jprafael

Would it be okay if I merge this PR as-is and then add the new option check to the same if condition in a follow-up commit?

@jprafael
Contributor Author

Yes, that would be perfect! Thank you.

@staticlibs staticlibs merged commit e4769ec into duckdb:main Mar 16, 2026
27 checks passed
staticlibs added a commit to staticlibs/duckdb-sqlite that referenced this pull request Mar 16, 2026
This PR is a follow-up to the duckdb#179 PR. It adds an option:

```sql
SET sqlite_disable_multithreaded_scans = FALSE;
```

When set to TRUE, it has the same effect on SQLite scans as the setting
`threads = 1`.

Testing: basic check added that the option can be set.
staticlibs added a commit that referenced this pull request Mar 16, 2026
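Assuming the follow-up option lands as described, enabling it in a session might look like the sketch below. The database and table names are placeholders; `sqlite_scan` is the extension's table function for reading a SQLite table.

```sql
-- Placeholder database/table names; assumes the follow-up option exists.
SET sqlite_disable_multithreaded_scans = TRUE;
-- Scans now behave as if threads = 1, reusing a single connection
-- instead of opening one per worker.
SELECT count(*) FROM sqlite_scan('metadata.db', 'files');
```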