-
-
Notifications
You must be signed in to change notification settings - Fork 77
Add overloads for overdetermined system derivatives #842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
tests fail |
|
@ChrisRackauckas only failure I see that could come from this is the adjoint test for lts, but as far as I can tell those are all square systems, so this PR shouldn't affect those. |
ext/LinearSolveForwardDiffExt.jl
Outdated
|
|
||
| # Solve A'A · dx/dθ = rhs for each partial | ||
| # Create a cache for the normal equations and reuse the factorization | ||
| AtA = A' * A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is numerically unstable. Instead just solve A * dx/dθ = rhs, the methods like QR factorization and GMRES will naturally do this without squaring the eigenvalues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of rows don't match in that case though?
infil> rhs_list[1]
3-element Vector{Float64}:
-0.49033292918119054
-0.4863208414656224
-0.49187865599734093
infil> A
10×3 Matrix{Float64}:
0.8753 0.651302 0.9616
0.636808 0.304259 0.0872156
0.542556 0.359615 0.943729
0.0258567 0.0241303 0.070481
0.275383 0.282139 0.866992
0.0623616 0.338967 0.687406
0.481159 0.0197438 0.996632
0.905349 0.555788 0.215757
0.0858424 0.944214 0.506554
0.452954 0.033606 0.38146There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it's just A' \ rhs_list[i], which then solves via QR/GMRES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think I get it. Is it better now?
ext/LinearSolveForwardDiffExt.jl
Outdated
| A = cache.linear_cache.A | ||
| is_overdetermined = size(A, 1) > size(A, 2) | ||
|
|
||
| if is_overdetermined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the difference between this and the other branch? Can't they just be the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't getting correct derivatives from the other branch.
The original branch is based off of A(p)x(p) = b(p). You can take the derivative of both sides, use the product rule, and rearrange to get a linear solve for dx/dp.
But when we do a \ or a solve a LinearProblem that's overdetermined/underdetermined, we're not really solving Ax=b, because you get an x that doesn't satisfy the equation.
In those cases we're actually getting an answer for A'(p)A(p)x(p) = A'(p)b(p). So we need to do the derivative calculation based on that, so you get a different linear system to solve to find dx/dp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch should give the same answer for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to just check !issquare(A) now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, because the not square branch has to do two linear solves for each partial column, but that's unnecessary for the square case.
Checklist
contributor guidelines, in particular the SciML Style Guide and
COLPRAC.
Additional context
Add any other context about the problem here.