From 0732fd6a99dc0c6c896c664bd2c238019cbc89f1 Mon Sep 17 00:00:00 2001 From: Luke Sneeringer Date: Tue, 2 Feb 2021 10:38:46 -0800 Subject: [PATCH 1/3] =?UTF-8?q?feat:=20AIP-143=20=E2=80=93=20Standardized?= =?UTF-8?q?=20codes?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- aip/general/0143/aip.md | 97 +++++++++++++++++++++++++++++++++++++++ aip/general/0143/aip.yaml | 7 +++ 2 files changed, 104 insertions(+) create mode 100644 aip/general/0143/aip.md create mode 100644 aip/general/0143/aip.yaml diff --git a/aip/general/0143/aip.md b/aip/general/0143/aip.md new file mode 100644 index 00000000..030192ce --- /dev/null +++ b/aip/general/0143/aip.md @@ -0,0 +1,97 @@ +# Standardized codes + +Many common concepts, such as spoken languages, countries, currency, and so on, +have common codes (usually formalized by the [International Organization for +Standardization][iso]) that are used in data communication and processing. +These codes address the issue that there are often different ways to express +the same concept in written language (for example, "United States" and "USA", +or "EspaƱol" and "Spanish"). + +## Guidance + +For concepts where a standardized code exists and is in common use, fields +representing these concepts **should** use the standardized code for both input +and output. + +```typescript +// A message representing a book. +interface Book { + // Other fields... + + // The IETF BCP-47 language code representing the language in which + // the book was originally written. + // https://en.wikipedia.org/wiki/IETF_language_tag + languageCode: string; +} +``` + +- Fields representing standardized concepts **must** use the appropriate data + type for the standard code (usually `string`). + - Fields representing standardized concepts **should not** use enums, even if + they only allow a small subset of possible values. Using enums in this + situation often leads to frustrating lookup tables when using multiple APIs + together. + - Fields representing standardized concepts **must** indicate which standard + they follow, preferably with a link (either to the standard itself, the + Wikipedia description, or something similar). +- The field name **should** end in `_code` or `_type` unless the concept has an + obviously clearer suffix. +- When accepting values provided by users, validation **should** be + case-insensitive unless this would introduce ambiguity (for example, accept + both `en-gb` and `en-GB`). When providing values to users, APIs **should** + use the canonical case (in the example above, `en-GB`). + +### Content types + +Fields representing a content or media type **must** use [IANA media types][]. +For legacy reasons, the field **should** be called `mime_type`. + +### Countries and regions + +Fields representing individual countries or nations **must** use the [Unicode +CLDR region codes][cldr] ([list][]), such as `US` or `CH`, and the field +**must** be called `region_code`. + +**Important:** We use `region_code` and not `country_code` to include regions +distinct from any country, and avoid political disputes over whether or not +some regions are countries. + +### Currency + +Fields representing currency **must** use [ISO-4217 currency codes][iso-4217], +such as `USD` or `CHF`, and the field **must** be called `currency_code`. + +**Note:** For representing an amount of money in a particular currency, rather +than the currency code itself, use [`google.protobuf.Money`][money]. + +### Language + +Fields representing spoken languages **must** use [IETF BCP-47 language +codes][bcp-47] ([list][]), such as `en-US` or `de-CH`, and the field **must** +be called `language_code`. + +### Time zones + +Fields representing a time zone **should** use the [IANA TZ][] codes, and the +field **must** be called `time_zone`. + +Fields also **may** represent a UTC offset rather than a time zone (note that +these are subtly different). In this case, the field **must** use the [ISO-8601 +format][] to represent this, and the field **must** be named `utc_offset`. + +## Changelog + +- **2020-05-12**: Replaced `country_code` guidance with `region_code`, + correcting an original error. + + +[bcp-47]: https://en.wikipedia.org/wiki/IETF_language_tag +[cldr]: http://cldr.unicode.org/ +[iana media types]: https://www.iana.org/assignments/media-types/media-types.xhtml +[iana tz]: http://www.iana.org/time-zones +[iso]: https://www.iso.org/ +[iso-4217]: https://en.wikipedia.org/wiki/ISO_4217 +[iso-8601 format]: https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC +[list]: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry +[money]: https://github.com/googleapis/api-common-protos/blob/master/google/type/money.proto + diff --git a/aip/general/0143/aip.yaml b/aip/general/0143/aip.yaml new file mode 100644 index 00000000..0ace23ee --- /dev/null +++ b/aip/general/0143/aip.yaml @@ -0,0 +1,7 @@ +--- +id: 143 +state: approved +created: 2019-07-24 +placement: + category: fields + order: 40 From 6498613fd3d027d4419ee995069008abbf0694ea Mon Sep 17 00:00:00 2001 From: Luke Sneeringer Date: Tue, 18 May 2021 09:34:13 -0700 Subject: [PATCH 2/3] Incorporate group feedback. --- aip/general/0143/aip.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/aip/general/0143/aip.md b/aip/general/0143/aip.md index 030192ce..afce2cbd 100644 --- a/aip/general/0143/aip.md +++ b/aip/general/0143/aip.md @@ -27,10 +27,6 @@ interface Book { - Fields representing standardized concepts **must** use the appropriate data type for the standard code (usually `string`). - - Fields representing standardized concepts **should not** use enums, even if - they only allow a small subset of possible values. Using enums in this - situation often leads to frustrating lookup tables when using multiple APIs - together. - Fields representing standardized concepts **must** indicate which standard they follow, preferably with a link (either to the standard itself, the Wikipedia description, or something similar). @@ -40,6 +36,12 @@ interface Book { case-insensitive unless this would introduce ambiguity (for example, accept both `en-gb` and `en-GB`). When providing values to users, APIs **should** use the canonical case (in the example above, `en-GB`). +- Standardized code fields **may** have a default; if they do, the sentinel + value **must** be the omission of the field. + +**Note:** The string itself _is_ the immutable, canonical code, used on the +wire format. Services **should not** create a separate enum with a different +wire format, because doing so makes it difficult to use multiple APIs together. ### Content types From df4d51b5eabb2676927fe107624295182f17e99e Mon Sep 17 00:00:00 2001 From: Luke Sneeringer Date: Tue, 18 May 2021 09:36:53 -0700 Subject: [PATCH 3/3] Incorporate @hudlow feedback. --- aip/general/0143/aip.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/aip/general/0143/aip.md b/aip/general/0143/aip.md index afce2cbd..65cbcccb 100644 --- a/aip/general/0143/aip.md +++ b/aip/general/0143/aip.md @@ -41,7 +41,8 @@ interface Book { **Note:** The string itself _is_ the immutable, canonical code, used on the wire format. Services **should not** create a separate enum with a different -wire format, because doing so makes it difficult to use multiple APIs together. +wire format, and **should not** use company-specific strings, because doing so +makes it difficult to use multiple APIs together. ### Content types