Add more brokers (Degiro, Revolut, eToro) + interest rate taxes from abroad (revolut, VUB)#10
Add more brokers (Degiro, Revolut, eToro) + interest rate taxes from abroad (revolut, VUB)#10BJMg wants to merge 31 commits intodstrupl:mainfrom
Conversation
…nts from multiple employers and brokers
The upstream fix in a42f282 added a test that asserted on the single-currency DividendReport shape (printableDividendList, totalTaxDollar, totalTaxCrown directly on the report). After rebasing the multi-currency refactor on top, those fields live on DividendReport.sections[Currency.USD] instead, and the ExchangeRateProvider SAM now takes (LocalDate, Currency). The fix itself (mutable per-date tax candidate list, remove on match) is retained in DividentReportPreparation.kt across both buildCurrencySection and buildCzkSection.
Previously dividends without a same-date tax record were silently dropped from the §7 report, leading to under-reporting (e.g. Revolut Stocks parsed with whtRate=0.0 emits no TaxRecord, so the dividend would never have appeared on the form). Now both buildCurrencySection and buildCzkSection throw an IllegalStateException with a descriptive message identifying the offending dividend (date, broker, symbol, amount, currency) and suggesting two concrete remedies: emit an explicit zero TaxRecord for genuine 0% WHT cases, or check that the broker statement's tax row matches the dividend date. Test added: dividendWithoutMatchingTaxRecordFailsWithDescriptiveMessage
CNB occasionally publishes the very last fixings of December a few business days late, and any retroactive corrections appear in early January. Caching a year file on Jan 8 risked freezing in a still-growing year. Treat year N as complete only once today() is at least 30 days into year N+1 (i.e. ~Jan 31). That comfortably clears any late-December corrections while still being well before the late-March / early-April Czech tax-filing window, so previous-year rates are downloaded fresh at most a handful of times per filing season.
The 'fixed' branch printed a wrong diff value: the last subtrahend was profit2024WhenUsedDynamicRate instead of profitWhenUsedFixedRate, so the displayed difference between the two strategies was nonsense (it contained the dynamic 2024 figure twice and never subtracted the fixed post-2024 figure). The else branch already had the correct expression (fixedTotal - dynamicTotal); this commit makes the if branch symmetrical (dynamicTotal - fixedTotal) so the printed diff is consistent with the comparison that selected the branch.
The literal '2024' was duplicated across ReportGenerator (twice, in the DateInterval.year() arguments for the rsuReport2024/esppReport2024 fields) and CockroachMain (twice in PDF output filenames and twice in the recommendation banner strings). The number is not arbitrary: it is the Czech tax-law cutover year for unsold RSU/ESPP equity, after which the old vs. new methodology comparison no longer needs to be emitted. Introduce ReportGenerator.LEGISLATIVE_TRANSITION_YEAR and route every caller through it. Variable and report-field names that mention 2024 are intentionally left alone -- they refer to the cutover year specifically, so a literal name carries useful information. Behaviour is unchanged.
…Record The production data classes deliberately require broker / symbol / product / country to be supplied at every call site, so that real parsers cannot silently lose attribution information. Several test call sites were still using the pre-change zero-arg form and stopped compiling. Introduce TestRecords.kt with two tiny factory functions (dividendRecord, interestRecord) that supply harmless defaults for the metadata fields tests do not care about (symbol=TEST, broker=TestBroker, product=TestProduct, country=IE). Migrate the affected test files to use them; tests that intentionally exercise specific metadata (e.g. country=SK for VÚB) keep their explicit arguments. No production code is changed.
The previous usage message only listed the positional Schwab/E-Trade form and gave a one-line nod to the YAML form, despite the YAML config being the only way to feed Degiro, Revolut, eToro and VUB into the report. Users that ran 'cockroach' with no arguments could not discover those options, and a YAML file (or empty Schwab export) with no broker sources crashed with an IllegalArgumentException stack trace from require() in CockroachMain.report(). - Rewrite the help text to lead with the YAML form, list every supported broker key (schwab, etrade, etradeBenefitHistory, degiro, revolut.stocks, revolut.savings, revolut.whtRate, etoro, vub) and keep the legacy positional form clearly marked as Schwab/E-Trade only. - Wrap main() so any IllegalArgumentException (including the existing 'No input sources provided. Specify at least one of: schwab, etrade, degiro, revolut, etoro, vub.' guard) is reported as a one-line stderr message and the process exits with code 1, instead of dumping a stack trace. - Switch System.exit calls to kotlin.system.exitProcess for consistency. No production behaviour changes for valid inputs; tests still pass.
broker (Schwab, E-Trade, Degiro, Revolut, eToro, VUB), each owning its own input shape (single file / list / directory / WHT rate / year). The small file-loading helper that several of them need moves to a top-level loadText() in BrokerSource.kt; the matching helper inside CockroachMain is removed.
There was a problem hiding this comment.
Summary
Reviewed PR #10 end-to-end. The PR is a substantial and well-organized expansion of the tool: a clean BrokerSource interface, four new broker integrations (Degiro, Revolut, eToro, VÚB), interest-income reporting, a YAML config, and an HTTP CNB rate fetcher. Tests build and pass locally on mvn clean install (a couple of tests assumeTrue against private files and skip — that is by design).
I'm posting this as a comment review, not request-changes — none of the findings block the merge, but several are worth fixing before tax season because they will produce wrong numbers or fail silently for users other than yourself.
What I liked
BrokerSourceis a clean abstraction;CockroachMainbecomes a thin orchestrator and the YAML branch is easy to follow.CockroachConfig+ kaml is a much better UX than the positional-argument mode and is well-documented in the README.HttpCnbYearRatesSourcecorrectly distinguishes complete vs. incomplete years and uses an atomic temp-file rename when caching. The chunk-splitting for the 2022 HRK transition is a nice touch.- Test coverage is good for the new parsers; the redacted XLS/XLSX/CSV/PDF fixtures under
src/test/resourcesare a great pattern to keep. - README updates are thorough — every broker gets clear acquisition steps + the new
input/layout.
Findings
Likely-incorrect-output (recommend fixing)
-
Wrong broker name on E-Trade RSU/ESPP PDF reports (pre-existing, but more visible now).
RsuPdfParserandEsppPdfParserhardcodeBROKER_NAME = "Charles Schwab & Co."even when invoked on E-Trade PDFs. After this PR these records are reported throughETradeBrokerSource, so dividends/sales correctly say E-Trade but RSU/ESPP rows still say Schwab. Pass the broker name into the parser (or detect from PDF header). -
VubInterestPdfParseruses the YAMLyearinstead of the statement year.
VubBrokerSource(files, year = config.year)means a December 2024 → January 2025 statement bundled into the 2024 run will book January interest with a 2024 date. Either parse the year from the statement header (Period:row) or require single-year statements explicitly. -
eToro dividends are unconditionally classified as US-source and grossed up at 15%.
The README warns about this, but the code itself doesn't refuse non-US tickers (e.g.VOD.L). Two safer options: (a) require the user to confirm a per-symbol country map in the YAML, or (b) at least log a clear warning per non-US ticker rather than silently labeling it US. -
Revolut
whtRateis per-broker, not per-issuer.
This is fine for accounts that only hold US-domiciled stock under W-8BEN. If the user holds a non-US dividend payer through Revolut, the gross-up will be wrong. Worth a note inRevolutBrokerSourceand in the report so the user spots it. -
DividentReportPreparation15% match heuristic.
The same-day tax-to-dividend matching uses a 15% ratio. With Degiro/eToro/Revolut now in the mix, dividends with 0% (treaty exemption) or non-15% WHT can be mis-paired. Consider matching by(symbol, payDate, currency)when available and using the ratio only as a tiebreaker.
Robustness / correctness
-
HttpCnbYearRatesSourcehas no HTTP timeouts.
URI(...).toURL().openStream()with noconnectTimeout/readTimeoutwill block indefinitely if cnb.cz is slow. UseURLConnection.setConnectTimeout(...)/setReadTimeout(...)(e.g. 10s/60s). -
No offline fallback to
ClasspathCnbYearRatesSource.
You shiprates_*.txtfor completed past years, but the HTTP source ignores them and downloads fresh. If the user is offline or CNB is down, the run fails. Consider: try classpath first for completed years; download only when missing. -
No content validation on the CNB download.
If CNB returns an HTML error page (5xx, captive portal, …),splitByHeaderwill throwno header line found. That's acceptable but don't cache the HTML accidentally — currently you only cache viadownloadToFilewhich is fine, but a sanity check that the response starts withDate|/Datum|would prevent caching garbage if CNB ever changes the URL semantics. -
RevolutParsersavings parser only recognizes EnglishInterest PAID.
Localized statements (the user lives in CZ/SK) would silently report zero interest. Either accept a small allow-list of localized strings or fail loudly when anInterest *row is encountered that the parser doesn't understand. -
EtoroXlsxParseruses hand-written regex over the unzipped XLSX.
The README explains why (POI choked on the file), but the parser hard-codes column letters. A brittle approach. At minimum, validate the header row (column titles) and fail with a clear message if the layout differs from the tested fixture.
Code hygiene
-
Unused imports (CI doesn't fail on these, but easy nits):
ETradeBrokerSource.kt—org.apache.poi.openxml4j.opc.internal.FileHelper(this also pulls in a POI internal package — fragile across POI versions, please remove)JsonExportParser.kt—kotlinx.serialization.builtins.serializerCockroachMain.kt—java.util.logging.Logger
-
CockroachMain.runCockroachis large and procedural.
It does YAML detection, validation ("at least one source must be present"), and then both the YAML and legacy paths inline. Splitting into two private functions (runFromYaml,runFromArgs) plus a small "detect mode" helper would be easier to test. -
Mixed naming.
DividentReportPreparationkeeps the original typo (Divident/Dividend). Fine to leave for diff hygiene; if you ever do a rename pass, fix this and the matching templates together. -
Numbers as
Double.
Pre-existing, not introduced here. Worth flagging for the future: the gross-up + matching logic in Revolut/Degiro accumulates floating-point error; the existing tests pin specific values, butBigDecimalat money boundaries would be the proper fix once the layout stabilizes.
Tests
-
Build passes (
mvn clean install) onkotlin.version=2.3.20with JUnit 5 + vintage engine. Surefire reports 50 tests run, 3 skipped (theassumeTrue(file.exists())ones that need private statements). No failures. -
There's no end-to-end test that runs
CockroachMain.reportagainst a fully-populatedParsedExportand asserts the PDF/HTML output is generated without error. Given how many new code paths this PR touches, an integration test that wires up two or three brokers using the existing fixtures would catch template breakage.
CLAUDE.md
I added a CLAUDE.md at the repo root that summarizes the project, build, conventions, per-broker patterns, and the known issues above. This is intended for future AI agents (or a fresh re-read after a year). Feel free to amend / move it to AGENTS.md if you prefer that name.
— Reviewed using dstrupl (gh auth confirmed before posting).
|
Zdar, nezajdem nekdy na obed to probrat? Sorry, ze jsem nechal udelat to review automatem .... |
…unused imports * HttpCnbYearRatesSource: switch to URLConnection with explicit connectTimeout=10s / readTimeout=60s so a CNB outage no longer hangs the whole run; require the response to contain a 'Date|'/'Datum|' header line before returning or caching it, so a captive-portal page or HTML error body fails loudly instead of being silently cached as empty rates. * EtoroXlsxParser: log a per-row WARNING when an instrument name carries a non-US exchange suffix (e.g. VOD.L, SAP.DE, MC.PA), since country is hard-coded to US and such tickers would be misclassified as US-source dividends. * ETradeBrokerSource, CockroachMain: drop genuinely unused imports (org.apache.poi.openxml4j.opc.internal.FileHelper and java.util.logging.Logger respectively).
…t rows A1 (likely-incorrect output): RsuPdfParser and EsppPdfParser previously hardcoded broker = "Charles Schwab & Co." because the PDF templates are identical for Schwab and E-Trade/Morgan Stanley. After the E-Trade source was added, RSU/ESPP records imported via ETradeBrokerSource still showed 'Charles Schwab & Co.' on Priloha c. 3 / sec. 6, conflicting with the sales/dividend rows from the same source. Pass brokerName explicitly through parse / parseDirectory / parseFromText; ETradeBrokerSource now stamps 'Morgan Stanley & Co.' to match what ETradeBenefitHistoryParser and ETradeGainLossParser already emit. B2 (robustness): RevolutParser.parseSavings only recognises English 'Interest PAID' / 'Interest Reinvested' descriptions. A localised statement (CZ/SK) or a new Revolut row type whose description starts with 'Interest ' would previously fall into the generic 'unrecognised row' warning branch and be silently dropped, under-reporting sec. 8 interest income. Add a fail-loud check that throws IllegalStateException on any 'Interest *' row not matching the known prefixes, with guidance to re-export the statement in English. Tests: RsuPdfParserTest / EsppPdfParserTest updated to thread brokerName through; new RevolutParserTest case asserts the IllegalStateException fires on 'Interest Accrued ...'. mvn test: 51 run, 0 failures, 1 skipped (external-PDF assumeTrue).
…e year A2 (likely-incorrect output): VubInterestPdfParser stamped every posting with the year passed in by the caller (typically tax_YYYY.yaml's year), even though VUB's table only prints DD/MM. Pointing the configuration at the wrong year (e.g. running with year=2024 against a 2025 statement, or vice versa around the December->January boundary) would silently mis-stamp every InterestRecord. Fix: extract the statement year from the 'Account balance as at DD/MM/YYYY' lines in the PDF header (max year across opening and closing balances = closing year = statement year), and require it to match the caller's configured year. Throw IllegalStateException with actionable guidance on mismatch. Tests: new VubInterestPdfParserTest covers extractStatementYear (synthetic text fixture for the closing-balance-pick rule and the no-balance-line error path) and exercises the cross-check via the bundled real PDF using assumeTrue, asserting both the mismatch (year=2024 -> throws) and the happy path (year=2025 -> records all carry 2025, CZK, broker='VUB'). mvn test: 55 run, 0 failures, 1 skipped.
…ot per-issuer A3 (likely-incorrect output): Revolut Stocks reports only net dividends and the parser performs a mathematical gross-up at a single per-broker whtRate (default 0.15 for the US/CZ treaty rate). Today this is correct because Revolut Stocks is a US-only product, but if Revolut adds non-US listings or pays out an ADR whose underlying issuer-country withholds at a different rate, the per-row gross-up will not match what was actually withheld and the user will silently mis-report taxes credited. Fix: - RevolutParser.parseStocks now throws IllegalStateException on any ticker carrying a non-US exchange suffix (.L, .DE, .PA, ...) instead of silently treating it as US, mirroring the existing fail-loud behaviour for unrecognised Interest rows in parseSavings. The error spells out the offending ticker, the per-broker whtRate that would have been mis-applied, and the actionable workaround (split the row out manually, or extend the parser with per-issuer routing). - KDoc on parseStocks spells out the per-broker (vs per-issuer) limitation. - README 'Revolut Stocks' notes call out the same limitation explicitly, next to the existing whtRate / gross-up explanation. Tests: new stocksFailsLoudlyOnNonUsTickerSuffix in RevolutParserTest asserts the exception type and message contents for a VOD.L row. mvn test: 56 run, 0 failures.
…) explicitly
Replace the 15%-of-amount heuristic that picked the closest tax row on the same date
with an explicit composite key (broker, symbol, date) within the per-currency section.
Multiple tax rows that share the same key (e.g. Schwab's same-day correction adjustments)
are now summed before being attached to the dividend bucket, and multiple dividend rows
sharing the same key collapse into a single printable line with summed amounts.
Two new fail-loud invariants:
- any dividend without a matching tax bucket throws IllegalStateException with
broker/symbol/date/amount and a remediation hint;
- any tax row without a matching dividend bucket throws IllegalStateException
with the same diagnostic.
To support keying, TaxRecord and TaxReversalRecord gain non-nullable symbol+broker
fields; all production parsers (Degiro, eToro, Revolut, Morgan Stanley DividendXlsx,
Schwab JsonExport) populate them from the underlying statement, and TestRecords.kt
provides defaulted helpers so existing tests do not have to spell them out.
Updated DividentReportPreparationTest with three scenarios covering the new key:
two brokers paying the same symbol on the same day, multiple tax rows summed against
one dividend, and an orphaned tax row failing loudly.
mvn test: 58 run, 0 failures, 3 skipped.
Past tax years remain reproducible offline and survive cnb.cz outages once their year.txt is shipped under src/main/resources/cz/solutions/cockroach/rates_<year>.txt. * ClasspathCnbYearRatesSource: expose hasYear(year). * New ClasspathOrHttpCnbYearRatesSource: prefers bundled snapshot when available; only consults HTTP for years not on the classpath (typically the current/future year). * CockroachMain.report wraps HttpCnbYearRatesSource with the composite. * README: documents the lookup order and how to pin a new year. * CnbYearRatesSourceTest covers hasYear, bundled-preferred, HTTP delegation, and propagated HTTP failure. mvn test: 62 run, 0 failures, 3 skipped.
Pulls YAML and positional CLI source-list construction out of the top-level dispatcher into invocationFromYaml() and invocationFromPositionalArgs(), each returning a small private Invocation(year, outputDir, sources). runCockroach now only picks the form, parses, and calls CockroachMain.report. No behavior change. mvn test: 62 run, 0 failures, 3 skipped.
Replaces the assumeTrue dependency on a private input/BenefitHistory.xlsx with a checked-in synthetic fixture under src/test/resources/cz/solutions/cockroach/. The fixture is built by BenefitHistoryFixtureGenerator, which mirrors the structural layout of a real Morgan Stanley Stock Plan Connect export (sheet names, headers including the duplicated 'Vested Qty.' column, and the Grant/Vest Schedule/Tax Withholding row sequence) but uses anonymised symbol (ACME), grant ids (G-1001, G-1002, G-1003), quantities, and amounts. G-1003 has zero vested shares and must be excluded by the parser, exercising the same path the production file does. mvn test: 62 run, 0 failures, 1 skipped (was 3 skipped).
V pohode. Uz jsem na to poslal sveho automata a vyresil vetsinu tech poznamek. Co jsem koukal, tak nic z toho by nemelo byt uplne zasadni. Jinak je hlavne uprava pro me, abych z toho dostal kompletni danove priznani. Hodil jsem to sem jako PR, jestli to nahodou nebude chtit pouzit jeste nekdo. Obed klidne muzem dat. Treba ctvrtek bych mel mit relativne volny zatim. |
Uh oh!
There was an error while loading. Please reload this page.