[RF] HS3 export: continuous pdf paired with binned dataset has unrecoverable evaluation semantics

**Context:** We are implementing an independent HS3 consumer ([pyhs3](https://github.com/scipp-atlas/pyhs3)) and validating it against quickFit NLL values on an ATLAS diHiggs (bbγγ) workspace exported via RooFit's HS3 JSON export. This issue asks that the evaluation semantics of a binned dataset paired with a continuous pdf be made recoverable from the exported file.

cc @cburgard @Phmonski

## The situation

The exported datasets are binned:

```json
{ "name": "AsimovData_0_Run2HM_1", "type": "binned",
  "axes": [{ "name": "atlas_invMass_Run2HM_1",
             "min": 105.0, "max": 160.0, "nbins": 220 }],
  "contents": [0.344, 0.342, "..."] }   // Σ ≈ 40.7 events
```

while the paired channel pdf is a continuous `mixture_dist` over the same variable. This is presumably faithful to the original workspace (a binned Asimov `RooDataHist`), so it is not an export bug per se. However, the file does not say which of the following RooFit uses to evaluate this pairing:

```
(A) bin centers:    log L = Σ_b c_b · log pdf(x_b^center)
(B) bin integrals:  log L = Σ_b c_b · log ∫_bin_b pdf(x) dx
```

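Which one applies presumably depends on options such as `IntegrateBins` or the binned-likelihood attributes. To leading order, (B) equals (A) plus the constant Σ_b c_b · log(bin width), which would cancel in NLL differences; the remainder depends on the shape of the pdf within each bin and therefore on the parameters. Because the two conventions differ by a parameter-dependent amount, an independent consumer reproducing the NLL curve cannot know which to implement. We have filed a matching HS3-spec issue asking for the semantics to be definable: hep-statistics-serialization-standard/hep-statistics-serialization-standard#93.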
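
To illustrate the size of the effect, here is a minimal sketch comparing (A) and (B) on a toy model. Everything in it is an assumption made for illustration: the Gaussian-plus-exponential shape, the parameter values, and the function names are hypothetical and are not taken from the exported workspace or from RooFit. Only the axis (105 to 160, 220 bins) and the approximate total of 40.7 events come from the dataset above.

```python
import math

LO, HI, NBINS = 105.0, 160.0, 220      # axis from the exported dataset
WIDTH = (HI - LO) / NBINS              # 0.25 per bin
N_EVENTS = 40.7                        # approximate total quoted above

def _gauss_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def _expo_cdf(x, slope):
    # CDF of a falling exponential truncated to [LO, HI]
    return (1.0 - math.exp(-slope * (x - LO))) / (1.0 - math.exp(-slope * (HI - LO)))

def pdf(x, sigma, mu=125.0, frac=0.3, slope=0.03):
    """Toy mixture density (hypothetical shape), normalized on [LO, HI]."""
    g_norm = _gauss_cdf(HI, mu, sigma) - _gauss_cdf(LO, mu, sigma)
    gauss = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (
        sigma * math.sqrt(2.0 * math.pi) * g_norm
    )
    expo = slope * math.exp(-slope * (x - LO)) / (1.0 - math.exp(-slope * (HI - LO)))
    return frac * gauss + (1.0 - frac) * expo

def bin_integral(b, sigma, mu=125.0, frac=0.3, slope=0.03):
    """Exact integral of the toy density over bin b."""
    lo, hi = LO + b * WIDTH, LO + (b + 1) * WIDTH
    g_norm = _gauss_cdf(HI, mu, sigma) - _gauss_cdf(LO, mu, sigma)
    gauss = (_gauss_cdf(hi, mu, sigma) - _gauss_cdf(lo, mu, sigma)) / g_norm
    expo = _expo_cdf(hi, slope) - _expo_cdf(lo, slope)
    return frac * gauss + (1.0 - frac) * expo

# Asimov-like toy counts generated at sigma = 1.5
counts = [N_EVENTS * bin_integral(b, sigma=1.5) for b in range(NBINS)]

def logl_centers(sigma):
    # Convention (A): density evaluated at the bin center
    return sum(c * math.log(pdf(LO + (b + 0.5) * WIDTH, sigma))
               for b, c in enumerate(counts))

def logl_integrals(sigma):
    # Convention (B): density integrated over the bin
    return sum(c * math.log(bin_integral(b, sigma)) for b, c in enumerate(counts))

def gap(sigma):
    """Difference (A) - (B) at a given parameter value."""
    return logl_centers(sigma) - logl_integrals(sigma)

# If the two conventions differed only by a constant, this would be zero.
shift = gap(0.5) - gap(1.5)
print(f"gap(1.5) = {gap(1.5):.4f}, gap(0.5) = {gap(0.5):.4f}, shift = {shift:.4f}")
```

In this toy setup the gap between the conventions changes when the signal width is scanned, so the two NLL curves are not related by a constant offset. The size of the shift in a real workspace depends on how narrow the pdf features are relative to the bin width.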

## Suggestions (increasing order of usefulness)

- document the evaluation convention the export assumes;
- include the relevant evaluation options (e.g. `IntegrateBins`, binned-likelihood attributes) in the export when they are set;
- optionally support exporting the dataset in the representation the fit actually used.

## Related question

For absolute-NLL comparisons: quickFit/RooFit NLLs include data-only constants (e.g. the −log N! of the extended term and constraint normalization constants) that are not part of the serialized model. A short statement in the RooFitHS3 docs of which constants RooFit's `createNLL` includes would make cross-tool validation much less archaeological.
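
As a small illustration of why such constants matter only for absolute comparisons, the sketch below evaluates a Poisson extended term with and without the log N! piece. The function name and the yields are hypothetical, `math.lgamma` is used to generalize log N! to a non-integer Asimov total, and nothing here asserts which convention `createNLL` actually follows.

```python
import math

def extended_term(nu, n_obs, include_constant):
    """Negative log of the Poisson factor for expected yield nu and observed n_obs."""
    nll = nu - n_obs * math.log(nu)
    if include_constant:
        # log(N!) generalized to non-integer N via the log-gamma function
        nll += math.lgamma(n_obs + 1.0)
    return nll

n_obs = 40.7  # approximate Asimov total quoted in the dataset above

# Offset between the two conventions at several hypothetical expected yields
offsets = [
    extended_term(nu, n_obs, True) - extended_term(nu, n_obs, False)
    for nu in (30.0, 40.7, 55.0)
]
print(offsets)
```

The offset is the same at every expected yield, so it cancels in NLL differences but shifts every absolute NLL value by the same amount, which is what makes an undocumented constant hard to diagnose in cross-tool validation.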

