Angular Separation and Chord Distance: fix cosine similarity calculation #142

Open
taserz wants to merge 1 commit into evllabs:master from taserz:fix/angular-chord-cosine

taserz commented May 12, 2026

Fixes #140

Both functions had the wrong denominator for cosine similarity. They were summing raw frequencies instead of squared frequencies, which for a normalized histogram means the denominator collapses to sum(xi) * sum(yi) = 1 * 1 = 1. Effectively both were computing the dot product with no normalization at all.

Fixed by accumulating sum(xi^2) and sum(yi^2) in the loop and using sqrt(sumXX * sumYY) as the denominator, which is the standard L2 norm approach.
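For illustration only, here is a small Python sketch of the difference. It is not the project's code: the function names `cosine_old` and `cosine_fixed`, the dict-based histograms, the exact form of the old denominator, and the zero-vector guard are all assumptions made for this example.

```python
import math

def cosine_old(x, y):
    """Assumed pre-fix behaviour: denominator built from raw frequency sums."""
    keys = set(x) | set(y)
    dot = sum(x.get(k, 0.0) * y.get(k, 0.0) for k in keys)
    # For normalized histograms both sums are 1, so this is just the dot product.
    return dot / (sum(x.values()) * sum(y.values()))

def cosine_fixed(x, y):
    """Cosine similarity with the L2-norm denominator sqrt(sumXX * sumYY)."""
    keys = set(x) | set(y)
    dot = sum_xx = sum_yy = 0.0
    for k in keys:
        xi = x.get(k, 0.0)
        yi = y.get(k, 0.0)
        dot += xi * yi
        sum_xx += xi * xi
        sum_yy += yi * yi
    denom = math.sqrt(sum_xx * sum_yy)
    # Guard for an all-zero histogram (an assumption, not taken from the PR).
    return dot / denom if denom > 0.0 else 0.0

# Hypothetical normalized histogram: ten events, each with frequency 0.1.
hist = {f"w{i}": 0.1 for i in range(10)}
print(cosine_old(hist, hist))    # roughly 0.1, the bare dot product
print(cosine_fixed(hist, hist))  # roughly 1.0, as expected for identical vectors
```

With identical inputs the old form reports a similarity of about 0.1, while the L2-normalized form reports about 1.0.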

Angular Separation also now returns arccos(cosine) / pi instead of 1 - cosine. The old formula is a cosine distance approximation, not an actual angular measure. The arccos version gives a properly normalized angle in the range [0, 1].
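A minimal Python sketch of the two return values, for comparison. The function names are hypothetical, and the clamping step is an assumption added here to keep rounding error from pushing the cosine just outside [-1, 1]; it is not something this PR states.

```python
import math

def angular_separation_old(cosine):
    """Previous return value: cosine distance, not an angle."""
    return 1.0 - cosine

def angular_separation_new(cosine):
    """New return value: the angle between the vectors, scaled by pi into [0, 1]."""
    clamped = max(-1.0, min(1.0, cosine))  # assumed guard against rounding error
    return math.acos(clamped) / math.pi

for c in (1.0, 0.5, 0.0, -1.0):
    print(c, angular_separation_old(c), angular_separation_new(c))
```

Since frequency vectors have no negative components, their cosine is never below 0, so in practice the new value should fall in [0, 0.5] for histogram inputs.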

Chord Distance's sqrt(2 - 2 * cosine) formula was already mathematically correct and only needed an accurate cosine value. Test expected values are updated to match the corrected output: identical vectors now return 0.0 for both functions instead of 0.9 and sqrt(1.8).
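The old expected values can be reproduced with a short Python sketch. The actual test vectors are not shown in this description, so the un-normalized dot product of 0.1 used below (what a uniform ten-event histogram would give against itself) is an assumption, as is the helper name `chord_distance`.

```python
import math

def chord_distance(cosine):
    """Chord length between two unit vectors whose cosine similarity is given."""
    # max() guards against a tiny negative value from rounding (an assumption).
    return math.sqrt(max(0.0, 2.0 - 2.0 * cosine))

# Assumed un-normalized "cosine" for identical vectors before the fix.
dot_only = 0.1

old_angular = 1.0 - dot_only           # old Angular Separation result
old_chord = chord_distance(dot_only)   # old Chord Distance result
new_chord = chord_distance(1.0)        # with a correct cosine of 1.0

print(old_angular, old_chord, new_chord)
```

Under that assumption the old results come out as 0.9 and sqrt(1.8), and the corrected cosine of 1.0 gives a chord distance of 0.0.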

Fixes evllabs#140. Both functions had the wrong denominator for cosine similarity.
They were accumulating the sum of raw frequencies instead of the sum of
squared frequencies. For a normalized histogram sum(xi) = 1, so the
denominator collapsed to 1 * 1 = 1 and both functions were effectively
returning the dot product with no normalization.

Fixed by accumulating sum(xi^2) and sum(yi^2) in the loop and using
sqrt(sumXX * sumYY) as the denominator, which is the standard L2 norm.

Angular Separation now returns arccos(cosine) / pi instead of 1 - cosine.
The old formula is a cosine distance approximation, not an actual angular
measure. The arccos version gives a properly normalized angle in [0, 1].

Chord Distance's sqrt(2 - 2 * cosine) formula was already mathematically
correct and just needed an accurate cosine value to work properly.

Test expected values updated: identical vectors now correctly return 0.0
for both functions instead of 0.9 and sqrt(1.8).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
taserz force-pushed the fix/angular-chord-cosine branch from c1cec4d to b01052a on May 12, 2026 23:48
Successfully merging this pull request may close these issues.

Angular Separation and Chord Distance
