feat(medcat): CU-869cw9zmj Improve inference speed by mart-r · Pull Request #410 · CogStack/cogstack-nlp

mart-r · 2026-04-13T15:24:50Z

This PR improves inference speed somewhat (around 10%) by:

- ~~Using a more efficient unitvec calculation~~
  - This gave a nice boost to speed (around 20% on this segment)
  - But it made the change in place, which polluted other parts of the code and broke a bunch of stuff
  - This segment also accounts for only a very small share of the overall time
  - So the overall speedup from this would have been closer to 1%
- Reusing smaller context vectors multiple times
  - We've got multiple context windows (small, medium, large, extra large)
  - So far we've been recalculating the vectors from scratch for every context window
  - However, the smaller windows are always a subset of the bigger ones
  - So this change takes advantage of that
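To illustrate the reuse idea, here is a minimal sketch rather than the actual MedCAT code. `token_vec`, `CALLS`, and both `context_vectors_*` functions are hypothetical names, and a plain mean stands in for whatever weighting the real context vectors use. It only shows that, with windows processed from smallest to largest, each window needs vectors only for the tokens the previous window did not already cover.

```python
import numpy as np

# Counts how often the (hypothetically expensive) per-token step runs.
CALLS = {"n": 0}


def token_vec(token: str, dim: int = 4) -> np.ndarray:
    """Hypothetical stand-in for the per-token vector preprocessing."""
    CALLS["n"] += 1
    seed = sum(ord(ch) for ch in token)  # deterministic per token
    return np.random.default_rng(seed).standard_normal(dim)


def context_vectors_naive(tokens, center, windows):
    """Recompute every token vector for every window size."""
    out = {}
    for name, size in windows.items():
        left = tokens[max(0, center - size):center]
        right = tokens[center + 1:center + 1 + size]
        out[name] = np.mean([token_vec(t) for t in left + right], axis=0)
    return out


def context_vectors_reuse(tokens, center, windows):
    """Walk windows smallest-first and only vectorise the new tokens."""
    out = {}
    left_vecs = []   # ordered nearest-to-entity first
    right_vecs = []  # ordered nearest-to-entity first
    for name, size in sorted(windows.items(), key=lambda kv: kv[1]):
        # Reverse the left slice so a bigger window only appends at the end.
        left = tokens[max(0, center - size):center][::-1]
        right = tokens[center + 1:center + 1 + size]
        left_vecs.extend(token_vec(t) for t in left[len(left_vecs):])
        right_vecs.extend(token_vec(t) for t in right[len(right_vecs):])
        out[name] = np.mean(left_vecs + right_vecs, axis=0)
    return out


if __name__ == "__main__":
    text = ("the patient was diagnosed with chronic kidney "
            "disease stage three last year")
    tokens = text.split()
    windows = {"short": 1, "medium": 2, "long": 4}
    for fn in (context_vectors_naive, context_vectors_reuse):
        CALLS["n"] = 0
        fn(tokens, 6, windows)
        print(fn.__name__, "token_vec calls:", CALLS["n"])
```

Both functions produce the same vectors up to floating-point rounding. The reuse variant does one per-token computation for each token in the largest window, instead of one per token per window.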

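The in-place problem mentioned for the struck-through unit-vector item can be reproduced in a few lines. This is a generic NumPy illustration, not the code from the reverted commits. `unitvec_inplace`, `unitvec_copy`, and the `store` dictionary are hypothetical names chosen for the example.

```python
import numpy as np


def unitvec_inplace(vec: np.ndarray) -> np.ndarray:
    """Normalise in place: avoids an allocation but mutates the caller's array."""
    norm = np.linalg.norm(vec)
    if norm > 0:
        vec /= norm
    return vec


def unitvec_copy(vec: np.ndarray) -> np.ndarray:
    """Normalise into a new array and leave the input untouched."""
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec.copy()


# Hypothetical shared store of raw vectors that other code also reads.
store = {"kidney": np.array([3.0, 4.0])}

safe = unitvec_copy(store["kidney"])
raw_after_copy = store["kidney"].copy()     # still the raw vector

fast = unitvec_inplace(store["kidney"])
raw_after_inplace = store["kidney"].copy()  # now overwritten with the unit vector
```

After the in-place call, every other reader of `store["kidney"]` sees the normalised values instead of the raw ones, which is the kind of leak the description refers to.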

alhendrickson · 2026-04-24T08:43:35Z

+            prev_right = tokens_right
+
+            # Center is identical for all window sizes, only compute once
+            if center_vecs is None:


Minor feeling that this might flow better: could the centre tokens be handled on their own, outside of the loop, instead of every window doing the check?

    prev_left_vecs: list[np.ndarray] = []
    prev_right_vecs: list[np.ndarray] = []
    # Calc center vecs once outside the loop?
    if not self.config.context_ignore_center_tokens:
        tokens_center = self.get_context_tokens(
            entity, doc, sorted_contexts[0],
            per_doc_valid_token_cache)[1]  # or whatever is the best way
        center_vecs = list(
            self._preprocess_center_tokens(cui, tokens_center))
    else:
        center_vecs = []
    for context_type, window_size in sorted_contexts:
        # do everything other than the center_vecs calculation

Yeah, good point. If it's only done once, why do it in the loop at all!

Well, turns out the reason is that the centre tokens aren't available before you get to the loop!
But I do think it logically belongs outside the loop still.


github-actions Bot added 7 commits April 13, 2026 14:57

CU-869cw9zmj: Use faster way to calculate unit vector

12b7c52

CU-869cw9zmj: Speed up context vector obtaining

93fe94c

CU-869ctq789: Avoid leaking normalized vectors

7856c12

Revert "CU-869ctq789: Avoid leaking normalized vectors"

7759c65

This reverts commit 7856c12.

Revert "CU-869cw9zmj: Use faster way to calculate unit vector"

916b7a5

This reverts commit 12b7c52.

CU-869cw9zmj: Fix usage of tokens in the correct order

b391b25

CU-869cw9zmj: Add small comment on left token slicing

201c94e

alhendrickson reviewed Apr 24, 2026


alhendrickson approved these changes Apr 24, 2026


github-actions Bot added 3 commits April 24, 2026 10:00

CU-869cw9zmj: Separate centre context vectors calculation outside the loop

675d5b4

CU-869cw9zmj: Get centre tokens separately only if they're required

e3c5fa2

CU-869cw9zmj: Fix linting issue

0281c22

mart-r merged commit cfdbab0 into main Apr 24, 2026
22 checks passed

mart-r deleted the feat/medcat/CU-869cw9zmj-improve-inference-speed branch April 24, 2026 11:15

mart-r merged 10 commits into main from feat/medcat/CU-869cw9zmj-improve-inference-speed
