chore: Updating VectorStore batch size to improve performance #182
jamie-ons wants to merge 6 commits into
…ngle source of truth
        return result_df

    def search(self, query: VectorStoreSearchInput, n_results=10, batch_size=8) -> VectorStoreSearchOutput:  # noqa: C901, PLR0912, PLR0915
I think we'd like to retain the option for users to specify a different batch size at this point, but we'd want the default behaviour to follow the single source of truth.
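One way this could look is sketched below. It is a minimal illustration only, not the actual VectorStore code: the class name `VectorStoreSketch`, the helper `_resolve_batch_size`, and the simplified `search` signature are all hypothetical. The store-level value is set once at construction, and each call falls back to it unless the caller passes an explicit override.

```python
from typing import List, Optional


class VectorStoreSketch:
    """Hypothetical stand-in for VectorStore; not the real class."""

    def __init__(self, batch_size: int):
        # Single source of truth: the default batch size lives here only.
        self.batch_size = batch_size

    def _resolve_batch_size(self, batch_size: Optional[int] = None) -> int:
        # Hypothetical helper: prefer the per-call override when one is given,
        # otherwise fall back to the store-level value.
        resolved = self.batch_size if batch_size is None else batch_size
        if resolved < 1:
            raise ValueError("batch_size must be a positive integer")
        return resolved

    def search(
        self,
        queries: List[str],
        n_results: int = 10,
        batch_size: Optional[int] = None,
    ) -> List[List[str]]:
        size = self._resolve_batch_size(batch_size)
        # Placeholder for the real search: only show how queries are batched.
        return [queries[i:i + size] for i in range(0, len(queries), size)]


store = VectorStoreSketch(batch_size=4)
queries = [f"query {i}" for i in range(10)]
default_batches = store.search(queries)                # uses the store's value
override_batches = store.search(queries, batch_size=5)  # per-call override
```

Using `None` as the per-call default means no method signature carries its own number, so changing the store's value changes every method's default behaviour at once.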
A few things to note for the updates: we're moving to having all (default) batch sizes inherit from the VectorStore's - so we'll need to make sure that
Have we done any testing for this with the On-Net machines? If not, it would be a good idea to test and confirm these findings, since our current main user base uses these machines.
Yes - the Mac in the graph is the On-Net machine. It performs best on the On-Net machine, which is good.
If you mean the ThinkPad, then as the compute of the ThinkPad is far greater than that of the chosen GCP instances, I would assume it will also perform well.

✨ Summary
VectorStore previously exposed batch_size as a repeated parameter on individual methods, creating multiple independent sources of truth. This PR consolidates that to a single value set at construction time.
To inform the choice of default, a profiling analysis was run across the target GCP instance range at batch sizes from 2 to 250. The default has been updated to the value that minimises search time without risking OOM on the smallest supported instances.
Constraints: must not break or perform significantly worse on 2 vCPU instances; optimised for typical cloud deployments at 4–8 vCPUs.
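A sweep of this kind can be sketched as below. This is a rough illustration under stated assumptions, not the profiling code used for this PR: `time_batch_sizes` and `fake_search` are hypothetical names, the workload is a placeholder, and the intermediate batch sizes are made up (only the 2 to 250 range comes from the description above).

```python
import time
from typing import Callable, Dict, Iterable


def time_batch_sizes(
    run_search: Callable[[int], object],
    batch_sizes: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) observed per batch size."""
    results: Dict[int, float] = {}
    for size in batch_sizes:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_search(size)
            timings.append(time.perf_counter() - start)
        # Keep the minimum to reduce the influence of transient noise.
        results[size] = min(timings)
    return results


def fake_search(batch_size: int) -> int:
    # Placeholder workload (assumption): count the batches needed for 1000 items.
    # A real run would call the store's search with this batch size instead.
    items = list(range(1000))
    return sum(1 for _ in range(0, len(items), batch_size))


# Intermediate sizes are illustrative; the PR only states the 2 to 250 range.
timings = time_batch_sizes(fake_search, [2, 8, 32, 128, 250])
best = min(timings, key=timings.get)
```

Timing alone does not cover the OOM constraint; peak memory on the smallest instance would need to be checked separately.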
📜 Changes Introduced
✅ Checklist
🔍 How to Test
In DEMO/general_workflow_demo.ipynb, confirm it all runs as usual with a batch size of 250. Additionally, test this on one GCP instance (workstation or other) and locally to ensure it works on both kinds of hardware.