Skip to content

CLDR-19409 clean up ownership of ICUServiceBuilder#5780

Open
srl295 wants to merge 5 commits into
unicode-org:mainfrom
srl295:cldr-19409/icu-service-builder
Open

CLDR-19409 clean up ownership of ICUServiceBuilder#5780
srl295 wants to merge 5 commits into
unicode-org:mainfrom
srl295:cldr-19409/icu-service-builder

Conversation

@srl295
Copy link
Copy Markdown
Member

@srl295 srl295 commented May 30, 2026

CLDR-19409

Completely get rid of ICUServiceBuilder.forLocale()

Instead, ICUServiceFactory contains the cache, and, every Factory ownes one.

Perf should be the same or better.

Also fixes CLDR-19518 and CLDR-19166

ALLOW_MANY_COMMITS=true

@srl295 srl295 force-pushed the cldr-19409/icu-service-builder branch from addd4c0 to 31ad52d Compare May 30, 2026 21:57
@srl295
Copy link
Copy Markdown
Member Author

srl295 commented May 30, 2026

and it works
image

and reports update instantly

Uploading image.png…

@jira-pull-request-webhook
Copy link
Copy Markdown

Notice: the branch changed across the force-push!

  • tools/cldr-apps/src/main/java/org/unicode/cldr/web/DataPage.java is different
  • tools/cldr-apps/src/main/java/org/unicode/cldr/web/SurveyAjax.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/test/BuildIcuCompactDecimalFormat.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/tool/GenerateExampleDependencies.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/tool/ListGrammarData.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/tool/ShowPlurals.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/util/ICUServiceBuilder.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/util/NameGetter.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/util/TimezoneFormatter.java is different
  • tools/cldr-code/src/main/java/org/unicode/cldr/util/VerifyCompactNumbers.java is different
  • tools/cldr-code/src/test/java/org/unicode/cldr/tool/GenerateDateTimeTestDataTest.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/tool/ListProblemDates.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestCompactNumbers.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestDateOrder.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestExampleGenerator.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestLocale.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestPathHeader.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/TestPersonNameFormatter.java is now changed in the branch
  • tools/cldr-code/src/test/java/org/unicode/cldr/unittest/UnicodeSetPrettyPrinterTest.java is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@srl295 srl295 requested review from btangmu and macchiati May 30, 2026 21:59
return longid;
}
TimezoneFormatter tzf = new TimezoneFormatter(cldrFile);
TimezoneFormatter tzf = new TimezoneFormatter(null, cldrFile);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLDRFile constructor initializes a NameGetter which pulls in TimezoneFormatter which pulls in ICUServiceBuilder…… !!!!

I kind of think we should move NameGetter into ICUServiceBuilder. Then we could get rid of this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is reasonable. We might think about whether or not it makes sense to incrementally load ICUBuilder. For example, if we are building a number formatter, maybe we just build that. If the ICU builder gets flushed in the meantime, we haven't spent time building something we don't need.

If we are building a date formatter, then that would need a number formatter. It would call a getter, which would cause the number formatter data to be loaded.

I don't know whether this would be productive or not; it would depend on how much time it takes to build an ICUServiceBuilder.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, splitting up functionality.

This PR makes the ownership clear. So the builder could simply initialize its own functionality only as needed.

Comment on lines +58 to +61
public static final ICUServiceBuilder inefficientSingletonServiceBuilder(
final CLDRFile resolvedFile) {
return new ICUServiceBuilder(resolvedFile, defaultCollationFactory);
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return cldrFile;
}

public static ICUServiceBuilder forLocale(CLDRLocale locale) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note, this is a really old function.

I suspect even outside of the current issues, there may be latent bugs this was hiding.

Comment on lines 113 to 119
{"en", "one", "〖❬1 ❭Bermudan dollar〗", "〖❬1❭ ❬US dollar❭〗〖❬1❭ ❬euro❭〗"},
{
"en",
"other",
"〖❬1.23 ❭Bermudan dollars〗〖❬0.00 ❭Bermudan dollars〗",
"〖❬1.23❭ ❬US dollars❭〗〖❬1.23❭ ❬euros❭〗〖❬0.00❭ ❬US dollars❭〗〖❬0.00❭ ❬euros❭〗"
},
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test data looks suspicious here.

Now fails with:

    testCurrency {
      Error: (TestExampleGenerator.java:134) : en-one-BMD: expected "〖❬1 ❭Bermudan dollar〗", got "〖❬1 ❭value-one〗"
      Error: (TestExampleGenerator.java:139) : en-one-pat: expected "〖❬1❭ ❬US dollar❭〗〖❬1❭ ❬euro❭〗", got "〖❬1❭_❬US dollar❭〗〖❬1❭_❬euro❭〗"
      Error: (TestExampleGenerator.java:134) : en-other-BMD: expected "〖❬1.23 ❭Bermudan dollars〗〖❬0.00 ❭Bermudan dollars〗", got "〖❬1.23 ❭value-other〗〖❬0.00 ❭value-other〗"
      Error: (TestExampleGenerator.java:139) : en-other-pat: expected "〖❬1.23❭ ❬US dollars❭〗〖❬1.23❭ ❬euros❭〗〖❬0.00❭ ❬US dollars❭〗〖❬0.00❭ ❬euros❭〗", got "〖❬1.23❭_❬US dollars❭〗〖❬1.23❭_❬euros❭〗〖❬0.00❭_❬US dollars❭〗〖❬0.00❭_❬euros❭〗"
    } (0.001s) FAILED (4 failure(s))

But the value we're using is "value-one". If anything it looks like it was using the wrong results.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. the following looks reasonable to me. other means plural, all all the examples are plural.

                "en",
                "other",
                "〖❬1.23 ❭Bermudan dollars〗〖❬0.00 ❭Bermudan dollars〗",
                "〖❬1.23❭ ❬US dollars❭〗〖❬1.23❭ ❬euros❭〗〖❬0.00❭ ❬US dollars❭〗〖❬0.00❭ ❬euros❭〗"

Copy link
Copy Markdown
Member Author

@srl295 srl295 Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is the scenario the test sets up: the English value has been changed to value-one and value-other respectively. Just as with French.

I simulated the situation with en_CA locally:

image

Why does the test expect Bermudan dollars to be the value? I think it was passing because of ownership errors in the ExampleGenerator. Compare to the French test in this same unit test, where the values are value-one and value-other.

(locIsExceptional || fileIsRecording)
? new ICUServiceBuilder(resolvedCldrFile, locIsExceptional)
: ICUServiceBuilder.forLocale(CLDRLocale.getInstance(localeId));
this.icuServiceBuilder = cldrFactory.getICUServiceBuilder(CLDRLocale.getInstance(localeId));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails TestCLDRPaths.testGenerateExampleDependencies I recently added in #5761 . The value of boolean fileIsRecording can't be ignored; see the comment a few lines above. If an ExampleGenerator has a RecordingCLDRFile, then its ICUServiceBuilder needs to use that same RecordingCLDRFile.

The boolean locIsExceptional is also getting ignored with this change. It's not clear to me whether that's OK; maybe it is if the proposed cldrFactory.getICUServiceBuilder handles exceptional locales correctly, unlike ICUServiceBuilder.forLocale. If locIsExceptional isn't needed anymore it should be removed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at that now. Yes exceptional should be removed, no more exceptional locales.

Copy link
Copy Markdown
Member Author

@srl295 srl295 Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@btangmu take a look at the last update. This preserves the recording CLDRFile.

Even cleaner would be to have a recording Factory, that would work correctly (and would even cache the service builder by returning the same recording file…). Perhaps a future update.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "the last update". I see 4 commits; is there a 5th commit pending that I don't see yet?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, commited but not pushed. pushed it now.

use singleton ICUSertviceBuilder when using a recording file
@srl295 srl295 requested a review from btangmu June 1, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants