Improve unit test parser error handling by hakonhagland · Pull Request #6933 · OPM/opm-simulators

hakonhagland · 2026-03-12T16:10:14Z

Problem

When readDeck() encounters parse errors in unit tests, it calls std::exit(1) which bypasses Boost.Test entirely. This produces a silent failure (exit code 1, no error message) because:

OpmLog has no registered backend in test fixtures, so error messages are discarded
std::exit() skips Boost.Test's exception handling, so no test failure is reported

Solution

Add throwOnError parameter to readDeck() and FlowGenericVanguard::readDeck() (defaults to false for backward compatibility)
When enabled, throws std::runtime_error with the failure message instead of calling std::exit(1)
Register an OpmLog stderr backend in test fixture constructors so parser diagnostics are printed
Enable both in SimulatorFixture, test_equil, and test_RestartSerialization

Before / after

	Before	After
Error message	Silently discarded	Printed to stderr
Test result	Silent exit code 1	Boost.Test reports `std::runtime_error` with message
Production behavior	Unchanged	Unchanged (`throwOnError` defaults to `false`)

hakonhagland · 2026-03-12T16:10:32Z

jenkins build this please

Copilot

Pull request overview

This PR improves unit-test diagnostics when Eclipse deck parsing fails by avoiding std::exit() (which bypasses Boost.Test) and ensuring parser errors are actually emitted to stderr during tests.

Changes:

Add an optional throwOnError parameter to Opm::readDeck() and FlowGenericVanguard::readDeck() to throw std::runtime_error instead of calling std::exit().
Register an OpmLog stderr backend in relevant test fixtures so parser diagnostics are visible in unit test output.
Enable throwOnError in SimulatorFixture, test_equil, and test_RestartSerialization.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/test_equil.cpp	Enables `throwOnError` and adds stderr logging backend in the test fixture.
tests/test_RestartSerialization.cpp	Enables `throwOnError` across restart serialization tests and adds stderr logging backend.
tests/SimulatorFixture.hpp	Adds stderr logging backend and enables `throwOnError` for simulator-based tests.
opm/simulators/utils/readDeck.hpp	Extends `readDeck()` API with `throwOnError` (default `false`).
opm/simulators/utils/readDeck.cpp	Implements `throwOnError` by throwing instead of finalizing + exiting.
opm/simulators/flow/FlowGenericVanguard.hpp	Extends `FlowGenericVanguard::readDeck()` API with `throwOnError` (default `false`).
opm/simulators/flow/FlowGenericVanguard.cpp	Plumbs `throwOnError` through to `Opm::readDeck()`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

bska · 2026-03-12T16:35:59Z

I'll be honest, I think this is a lot of complexity for a very narrow use case. There aren't supposed to be parse errors in the unit tests.

hakonhagland · 2026-03-12T16:46:19Z

There aren't supposed to be parse errors in the unit tests.

@bska It has happened to me several times when adding unit tests with new datafiles to PRs, and then it is very confusing to track down what is happening.

bska · 2026-03-12T16:48:43Z

It has happened to me several times when adding unit tests with new datafiles to PR

Then don't add them as unit test until you've verified that there are no parse errors. You don't need MPI support for that.

When readDeck() encounters parse errors in unit tests, it calls std::exit(1) which bypasses Boost.Test entirely, producing a silent failure with no error message (OpmLog has no backend in tests). Add a throwOnError parameter to readDeck() that throws std::runtime_error instead of calling std::exit(1). Register an OpmLog stderr backend in test fixtures so parser diagnostics are visible. Enable both in SimulatorFixture, test_equil, and test_RestartSerialization.

hakonhagland · 2026-03-12T17:07:47Z

jenkins build this please

hakonhagland · 2026-03-12T17:21:42Z

@bska Thanks for the feedback. A few thoughts:

The change is small (50 insertions) and fully backward-compatible, throwOnError defaults to false, so existing behavior is unchanged.
To clarify: this is not related to MPI. The throwOnError parameter is only used in single-process unit tests. The issue is that readDeck() calls std::exit(1) on parse errors, which bypasses Boost.Test's error reporting. Without the OpmLog backend, the error message is also silently discarded, so you get exit code 1 with no indication of what went wrong.
"Verify first" is good practice, but during development you iterate on both the test code and the data file. When a typo in e.g WELLDIMS causes a silent exit, you end up reaching for GDB with break exit to find the error message. That's the workflow this PR improves.
Even without throwOnError, the OpmLog backend setup alone is valuable: it surfaces parser warnings during test runs that are currently invisible.

If you feel throwOnError is too much, I could reduce the PR to just the OpmLog backend registration in the test fixtures. That alone would make parse errors visible before std::exit(1), which is the most important improvement.

bska · 2026-03-12T17:35:52Z

The change is small (50 insertions) and fully backward-compatible, throwOnError defaults to false, so existing behavior is unchanged.

The number of lines is the only salient point of this.

You're adding a conditional, to a function that's already heavily nested, that will never be needed in a real simulation run yet you're asking every simulation run to pay for that complexity.

"Verify first" is good practice, but during development you iterate on both the test code and the data file. When a typo in e.g WELLDIMS causes a silent exit, you end up reaching for GDB with break exit to find the error message. That's the workflow this PR improves.

Checking if the data file is okay amounts to having something like

BOOST_AUTO_TEST_CASE(Check_Data_File)
{
    const auto deck = Parser{}.parseFile(your_filename_here);
}

and if you're developing the test you're already in the context of a set of unit tests it doesn't cost anything to have that. You can even put logging there if needed.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

atgeirr · 2026-03-13T07:35:54Z

Not judging the merits of this PR here, but it made me wonder: why do we exit() instead of throw from readDeck() in the first place? Normally I would assume that throwing is what we want to do.

Edit: ... and would this not be dealt with by the ParseContext and ErrorGuard?

hakonhagland · 2026-03-13T11:21:33Z

Normally I would assume that throwing is what we want to do.

@atgeirr Good question. I looked into this, ParseContext and ErrorGuard handle errors during parsing, but they don't cover the final termination step. Here's the flow:

ParseContext::handleError() dispatches per error category: THROW_EXCEPTION throws immediately, DELAYED_EXIT1 accumulates in ErrorGuard, WARN/IGNORE just log.
After parsing completes, readDeck() checks if (*errorGuard) for accumulated errors, extracts the messages, clears the guard (to prevent its destructor from also exiting), and then calls MPI_Finalize() + std::exit(EXIT_FAILURE).

So ErrorGuard collects the errors but readDeck() handles the actual termination. And it chose std::exit() rather than throwing, presumably because of the MPI cleanup requirement. But MPI_Finalize() is actually wrong on an error path (it requires cooperative shutdown from all ranks); MPI_Abort() would be more appropriate.

I agree that throwing is the right thing to do here. I've opened #6935 as a simpler alternative to this PR. It just replaces MPI_Finalize() + std::exit() with throw std::runtime_error() in readDeck(), and widens the catch in Main::initialize_() from std::invalid_argument to std::exception (which already calls MPI_Abort()).

bska · 2026-03-13T11:32:14Z

why do we exit() instead of throw from readDeck() in the first place? Normally I would assume that throwing is what we want to do.

All of this was introduced in #2785 and whatever we do we need to be sure that we don't reintroduce the problems that prompted that change.

hakonhagland · 2026-03-13T13:13:52Z

we need to be sure that we don't reintroduce the problems that prompted that change

@bska Thanks for the pointer to #2785. I've reviewed it and the underlying issue #2764.

The problem #2785 solved was that failureMessage was never printed before MPI_Abort(). The error ended up in failureMessage (via the catch block), not in ErrorGuard, so errorGuard->dump() printed nothing. #2785 fixed this by adding OpmLog::error(failureMessage) before termination and moving to MPI_Finalize() + std::exit().

Both #6933 and the simpler alternative #6935 preserve all of #2785's improvements. #6933 adds an optional throwOnError flag before the existing exit path (default false, so production is unchanged). #6935 takes the cleaner approach of replacing std::exit() with throw unconditionally.

blattms · 2026-03-17T11:13:07Z

why do we exit() instead of throw from readDeck() in the first place? Normally I would assume that throwing is what we want to do.

All of this was introduced in #2785

No, it was not. there was a hard MPI_Abort there before, which caused logging multiple times and with that PR this is done only once via MPI_Finalize and exit. The exit s there to have a proper return code from the program for calling scripts.

blattms · 2026-03-17T11:17:53Z

I do think that the subject of this PR does not reflect the real problem that this PR reveals:

MPI_Finalize needs to be called on all processes and this is not the case anymore for reservoir coupling which parses different decks where only some of them might have errors. Maybe it is time to test coupling with some decks being erroneous now before really fixing and merging this.

Hence I think we need to resolve this and this PR might be a good first step.

bska · 2026-03-17T11:23:10Z

All of this was introduced in #2785

No, it was not. there was a hard MPI_Abort there before, which caused logging multiple times and with that PR this is done only once via MPI_Finalize and exit. The exit s there to have a proper return code from the program for calling scripts.

As far as this PR is concerned which talks about removing exit, "this" (cooperative finalisation and shutdown) was indeed all introduced in #2785.

bska · 2026-03-17T11:24:43Z

Maybe it is time to test coupling with some decks being erroneous now before really fixing and merging this.

We're not merging this. The follow-up alternative in #6935 is a much better approach.

blattms · 2026-03-17T11:25:17Z

But MPI_Finalize() is actually wrong on an error path (it requires cooperative shutdown from all ranks); MPI_Abort() would be more appropriate.

I disagree. If you know that all the processes get there MPI_Finalize is the correct and clean way. This was the case here.
MPI_Abort should only be used as a last resort if something really goes wrong. The problem is the amount of MPI error messages that you get with MPI_Abort (one for each process). This has annoyed users quite a bit. You will hardly catch the real problem because of the all the noise.

hakonhagland · 2026-03-19T07:34:47Z

MPI_Finalize needs to be called on all processes and this is not the case anymore for reservoir coupling which parses different decks where only some of them might have errors. Maybe it is time to test coupling with some decks being erroneous now before really fixing and merging this.

@blattms I've created a test case for this scenario in #6951. It runs a reservoir coupling master that spawns a slave whose DATA file has a deliberate parse error (missing INCLUDE file). The slave exits immediately after the parse failure, but the master hangs on MPI_Recv waiting for the slave's initial data exchange.

The test currently passes by detecting the expected (broken) behavior:

MPICH (v4.3.2): hydra kills the master immediately when the slave exits (exit code 9)
OpenMPI (v5.0.8) / system MPICH (v4.2.1): master hangs until a 10s timeout (exit code 124)

The plan for a follow-up PR is to have the slave send a "parse failed" signal as its first MPI message after spawn (before sendAndReceiveInitialData()), so the master can detect the failure and shut down cleanly. Once that fix lands, the test will be updated to expect a clean exit with code 1 on all MPI implementations.

hakonhagland added the manual:irrelevant This PR is a minor fix and should not appear in the manual label Mar 12, 2026

hakonhagland requested review from Copilot March 12, 2026 16:13

Copilot started reviewing on behalf of hakonhagland March 12, 2026 16:14 View session

Copilot AI reviewed Mar 12, 2026

View reviewed changes

Comment thread tests/SimulatorFixture.hpp

Comment thread opm/simulators/utils/readDeck.cpp

hakonhagland force-pushed the improve_ut_parser_err branch from 3104e26 to a9d58b3 Compare March 12, 2026 17:07

Copilot AI reviewed Mar 12, 2026

View reviewed changes

hakonhagland mentioned this pull request Mar 13, 2026

Throw from readDeck() instead of std::exit() #6935

Merged

blattms closed this in #6935 Mar 24, 2026

Conversation

hakonhagland commented Mar 12, 2026

Problem

Solution

Before / after

Uh oh!

hakonhagland commented Mar 12, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

bska commented Mar 12, 2026

Uh oh!

hakonhagland commented Mar 12, 2026

Uh oh!

bska commented Mar 12, 2026

Uh oh!

hakonhagland commented Mar 12, 2026

Uh oh!

hakonhagland commented Mar 12, 2026

Uh oh!

bska commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

atgeirr commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hakonhagland commented Mar 13, 2026

Uh oh!

bska commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hakonhagland commented Mar 13, 2026

Uh oh!

blattms commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

blattms commented Mar 17, 2026

Uh oh!

bska commented Mar 17, 2026

Uh oh!

bska commented Mar 17, 2026

Uh oh!

blattms commented Mar 17, 2026

Uh oh!

hakonhagland commented Mar 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

bska commented Mar 12, 2026 •

edited

Loading

atgeirr commented Mar 13, 2026 •

edited

Loading

bska commented Mar 13, 2026 •

edited

Loading

blattms commented Mar 17, 2026 •

edited

Loading