Explain cost function in README #185

Merged
sebschmi merged 5 commits into main from readme-cost-function on May 19, 2026

Conversation

@sebschmi
Collaborator

Added detailed instructions for modifying the cost function in tsalign, including TSM base costs, jump costs, and gap-affine edit costs.

Copilot AI review requested due to automatic review settings May 19, 2026 09:42
Updated README to link to main repository for cost function parameters.
Contributor

Copilot AI left a comment

Pull request overview

This PR expands the README documentation to explain how to customize tsalign’s alignment cost model, including TSM base costs, geometry/jump costs, and the different gap-affine edit cost sections used across alignment regions.

Changes:

  • Adds a new “Modifying the cost function” section with a step-by-step walkthrough of config.tsa.
  • Documents TSM base cost naming, jump cost functions (including constraints), and gap-affine edit cost tables/vectors.
  • Updates the Features bullet to reference the four-point model paper.
Comments suppressed due to low confidence (3)

README.md:133

  • Typo: “referes” should be “refers”.
In direction, the letter `f` referes to a repeat, and `r` to a TSM.

README.md:153

  • Grammar: “the first input value that the constant cost applies” is missing “to” (e.g., “applies to”). Consider rephrasing this sentence for clarity about what each row represents.
Each TSM additionally incurs cost based on its geometry.
The costs are a piecewise constant function, where the first row is the first input value that the constant cost applies, and the second row is the constant cost.
The cost functions must be V-shaped, i.e. there must be some input value X such that the function is non-ascending before X and non-descending after X.
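
To make the two-row format and the V-shape requirement concrete, here is a minimal Python sketch. It is not tsalign code: the names `evaluate` and `is_v_shaped` and the example values are hypothetical, and the two lists only stand in for the two rows described above (first row: start values, second row: constant costs).

```python
from bisect import bisect_right


def evaluate(starts, costs, x):
    """Return the cost at x for a piecewise constant function.

    starts[i] is the first input value to which costs[i] applies.
    Assumes starts is sorted in ascending order.
    """
    # Index of the last start value that is <= x.
    i = bisect_right(starts, x) - 1
    if i < 0:
        raise ValueError("x lies before the first start value")
    return costs[i]


def is_v_shaped(costs):
    """True if costs is non-ascending up to some position and non-descending after it."""
    i = 0
    # Walk down the non-ascending part.
    while i + 1 < len(costs) and costs[i + 1] <= costs[i]:
        i += 1
    # Everything from position i onwards must be non-descending.
    return all(costs[j] <= costs[j + 1] for j in range(i, len(costs) - 1))


# Hypothetical example values, not taken from any real config.tsa.
starts = [float("-inf"), -2, 0, 3]
costs = [5, 2, 0, 4]
```

Because the function is constant on each piece, checking the sequence of costs is enough to decide whether the whole function is V-shaped. A purely non-ascending or purely non-descending sequence also passes, with X at one end.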

README.md:159

  • Grammar/pluralization: these sentences read incorrectly (“Length are…”, “LengthDifference are…”, “ForwardAntiPrimaryGap are…”). Consider using singular phrasing (e.g., “Length is…”) or “ costs are…”.
`Length` are costs based on the length of the 2-3-alignment of the TSM.
`LengthDifference` are costs based on the difference between the length of the 2-3-alignment and the difference between points 1 and 4.
`ForwardAntiPrimaryGap` are costs based on the difference between points 1 and 4, specifically `SP4 - SP1`.

Comment thread README.md Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

README.md:153

  • The cost-function format has additional strict constraints that aren’t mentioned here: parsing requires the first index to equal the type’s minimum value (e.g. -inf for isize-based functions and 0 for usize-based ones like Length), indices must be strictly increasing, and the number of indices must match the number of costs. Calling these out would help users avoid configs that fail to parse/verify.
Each TSM additionally incurs cost based on its geometry.
The costs are a piecewise constant function, where the first row is the first input value that the constant cost applies to, and the second row is the constant cost.
The cost functions must be V-shaped, i.e. there must be some input value X such that the function is non-ascending before X and non-descending after X.
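
The stricter format constraints named in this comment can be sketched as a small checker. This is a Python illustration under stated assumptions, not tsalign's parser: the function name `check_cost_function`, the error messages, and the use of `float("-inf")` to stand for the `-inf` minimum are all hypothetical.

```python
def check_cost_function(indices, costs, minimum):
    """Return a list of problems found in a piecewise constant cost function.

    indices: first row, the start value of each constant piece.
    costs:   second row, the constant cost of each piece.
    minimum: smallest value of the input type, e.g. 0 for a usize-based
             function such as Length, or -inf for an isize-based one.
    """
    problems = []
    if len(indices) != len(costs):
        problems.append("number of indices does not match number of costs")
    if not indices or indices[0] != minimum:
        problems.append("first index must equal the type's minimum value")
    # Compare each index with its successor; equal values are also rejected.
    if any(b <= a for a, b in zip(indices, indices[1:])):
        problems.append("indices must be strictly increasing")
    return problems
```

For example, `check_cost_function([0, 5, 10], [3, 1, 2], 0)` returns an empty list, while starting a usize-based function at 1 instead of 0 reports the first-index problem.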

Comment thread README.md Outdated
Comment thread README.md Outdated
sebschmi and others added 2 commits May 19, 2026 12:54
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@sebschmi enabled auto-merge May 19, 2026 09:55
@sebschmi merged commit 734bd50 into main May 19, 2026
28 checks passed
@sebschmi deleted the readme-cost-function branch May 19, 2026 09:57
@jeeeesper
Collaborator

@sebschmi Do you think it would make sense to cross-link this from https://version.helsinki.fi/kraujasp/twitcher/-/blob/main/docs/costs.md and vice versa? Or would it be best to maintain both versions independently, as they target slightly different user profiles? Or should we perhaps align the technical vocabulary of the two versions a bit?

@sebschmi
Collaborator Author

Ah, I totally forgot that you wrote a nice description already. Ideally the descriptions would be the same. We also thought about making an mdbook or so for tsalign, and we could even make a combined one with twitcher, and host that on readthedocs or so.

About the vocabulary, the terms chosen in tsalign are often not very accurate or contradict definitions from the original TSM papers. I wanted to change that at some point, and will probably do so when writing this book. With that, I could also move the cost function definition file to a better, more flexible, human-writable standard file format such as TOML.

Well, there is a large front-end makeover necessary for tsalign. So let's delay this a bit, and I will continue working on this in the next month or so.

3 participants