gibson042
left a comment
👍👍 for exponential canonicalization, but 🤨 for ∅. It might be best to split this PR.
Returns an exponential notation digit string together with the unit,
surrounded by square brackets (for example, `"[1.23+0 kilogram]"`).
If the Amount does not have a unit,
the null sign `∅` (U+2205) is used in place of the unit (for example, `"[4.2+1 ∅]"`).
U+2205 is specifically the code point named "EMPTY SET", and is potentially confusable with U+2300 DIAMETER SIGN ⌀, U+00D8 LATIN CAPITAL LETTER O WITH STROKE Ø, and U+00F8 LATIN SMALL LETTER O WITH STROKE ø. I value the result having something there to avoid being conditionally valid JSON (and restricting the set of accepted units to be disjoint with both this placeholder and the other syntax elements), but I'm not convinced that ∅ is the right placeholder. Other possibilities just from ASCII: #, *, -, =, ?, ~.
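To make the "conditionally valid JSON" concern concrete, here is a minimal sketch. It assumes that a unitless Amount serialized without any placeholder would come out as just the bracketed number (the string `[1.234560e+2]` is hypothetical, not taken from the PR), and `isValidJson` is an illustrative helper, not part of the proposal.

```javascript
// Hypothetical helper: reports whether a string parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const results = {
  // Assumed unitless form with no placeholder: a JSON array holding one number.
  bareBrackets: isValidJson("[1.234560e+2]"),
  // With a unit, the content is no longer a JSON value.
  withUnit: isValidJson("[1.234560e+2 kilogram]"),
  // With any placeholder (here `~`), the unitless form is not JSON either.
  withPlaceholder: isValidJson("[1.234560e+2 ~]"),
};

console.log(results);
// { bareBrackets: true, withUnit: false, withPlaceholder: false }
```

Under that assumption, only the placeholder-free unitless form would parse as JSON, which is the conditional behavior a placeholder avoids.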
(also U+2D41 TIFINAGH LETTER BERBER ACADEMY YAH ⵁ according to https://unicode.org/Public/17.0.0/security/confusables.txt)
I like using non-ASCII because I think people should stop assuming identifiers are ASCII, but I am also receptive to arguments that this is the wrong place to move the needle.
"?" seems fine, because it means "I don't know what the unit is", and we use ? in other contexts with null values, like the `?.` operator.
My intent with ∅ was to pick something that best communicated "no unit". In so doing I did not look into what other characters appear similar, because I do not expect this serialization to need to be produced in any other place except Amount.prototype.toString(), and because the exact same concern can be expressed about literally any other character. To illustrate that, #, ∗, −, =, ︖, and ⁓ are all visually similar characters to the ASCII possibilities mentioned above.
If we do want to consider some ASCII character here, I don't think that ? is really appropriate, as its implicit "I don't know what the unit is" meaning is not always correct -- sometimes, an Amount is exactly correct to be unitless.
My next best alternative after ∅ would be ~, given how it's already the canonical YAML serialization of null. Would that work for you?
> My intent with `∅` was to pick something that best communicated "no unit". In so doing I did not look into what other characters appear similar, because I do not expect this serialization to need to be produced in any other place except `Amount.prototype.toString()`, and because the exact same concern can be expressed about literally any other character. To illustrate that, `#`, `∗`, `−`, `=`, `︖`, and `⁓` are all visually similar characters to the ASCII possibilities mentioned above.
At the specification level, ECMA-262 and derivatives already resolve potential lookalike issues by mapping into Basic Latin where applicable. But more importantly, none of those other lookalikes are identifier characters that could be a well-formed unit identifier.
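As a rough check of the identifier point, the sketch below approximates "could be an identifier character" with the Unicode `ID_Start` property. That approximation is an assumption made for illustration; the thread does not state which rule governs well-formed unit identifiers.

```javascript
// Assumption: approximate "identifier character" with Unicode ID_Start.
const isIdStart = (ch) => /^\p{ID_Start}$/u.test(ch);

// Lookalikes of U+2205 EMPTY SET named in the review comments.
const lookalikesOfEmptySet = {
  "U+2300 DIAMETER SIGN": "\u2300",
  "U+00D8 LATIN CAPITAL LETTER O WITH STROKE": "\u00D8",
  "U+00F8 LATIN SMALL LETTER O WITH STROKE": "\u00F8",
  "U+2D41 TIFINAGH LETTER BERBER ACADEMY YAH": "\u2D41",
};

// Some lookalikes of the ASCII candidates.
const lookalikesOfAscii = {
  "U+2217 ASTERISK OPERATOR": "\u2217",
  "U+2212 MINUS SIGN": "\u2212",
  "U+2053 SWUNG DASH": "\u2053",
};

for (const [name, ch] of Object.entries({ ...lookalikesOfEmptySet, ...lookalikesOfAscii })) {
  console.log(`${name}: ID_Start = ${isIdStart(ch)}`);
}
```

With this approximation, the three letters among the `∅` lookalikes (Ø, ø, ⵁ) report `true`, while `∅` itself, the diameter sign, and the listed lookalikes of the ASCII candidates report `false`.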
> My next best alternative after `∅` would be `~`, given how it's already the canonical YAML serialization of null. Would that work for you?
Yes, I think that would be better than ∅. But I'd really love to find an approach that represents absence of unit as an empty string.
- a.toString(); // "123.4560[]"
+ a.toString(); // "[1.234560e+2 ∅]"
It's interesting that the previous syntax actually would work, even for sequence units if we privilege -and- as their separator (e.g., 6e+0[foot] 7e+0[inch]), and mapping an absent unit to the empty string (e.g., 1.234560e+2[]) is fundamentally more intuitive than picking a placeholder. It's unfortunate that it suffers from being accepted as numeric by parseFloat.
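The `parseFloat` behavior is easy to confirm. A minimal sketch, using the two example strings from this thread (the variable names are illustrative only):

```javascript
// Suffix-bracket form with an empty unit, as discussed for the previous syntax.
const previousForm = "1.234560e+2[]";
// Bracketed form with a placeholder, as proposed in this PR.
const proposedForm = "[1.234560e+2 ∅]";

// parseFloat reads the longest numeric prefix and ignores the rest.
const parsedPrevious = parseFloat(previousForm);
// A leading "[" means there is no numeric prefix at all.
const parsedProposed = parseFloat(proposedForm);
// Number() is stricter and rejects trailing characters.
const strictPrevious = Number(previousForm);

console.log(parsedPrevious); // 123.456
console.log(parsedProposed); // NaN
console.log(strictPrevious); // NaN
```

Only the lenient `parseFloat` accepts the suffix-bracket form; `Number()` rejects both strings.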
As noted before, I'm concerned about losing the flexible annotation syntax, because it lets us add annotations for things like error bars without inventing new syntax, but I don't consider the syntax to be a Stage 2 blocker.
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Fixes #97
See also #99
Applies the solution proposed in #97 (comment), introducing serializations like `[1.23+0 kilogram]` and `[4.2+1 ∅]`, i.e. with `∅` used in place of a missing unit, and `[]` square brackets.

Includes updates to both the spec and the readme, though the spec update leaves behind a TODO that should be filled in when the spec implementation for #86 lands.
CC @gibson042, @ljharb, @sffc