feat(dom): multi-width unicode character support #121

Open
Wybxc wants to merge 1 commit into ratatui:main from Wybxc:unicode

Wybxc (Contributor) commented Aug 18, 2025

This Pull Request introduces functionality to properly display multi-width Unicode characters (such as Chinese characters and Japanese kana) in DomBackend. It sets the cell following a multi-width Unicode character to display: none, ensuring the correct width is maintained.
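The hiding rule can be sketched as follows. This is a minimal illustration, not the code from this PR: `hidden_cells` is a hypothetical name, and `approx_width` is a deliberately crude stand-in for `UnicodeWidthStr::width` from the unicode-width crate, used only so the sketch needs no external dependency.

```rust
/// Hypothetical, crude stand-in for `unicode_width::UnicodeWidthStr::width`.
/// It only knows a few wide East Asian ranges; everything else counts as 1.
fn approx_width(symbol: &str) -> usize {
    symbol
        .chars()
        .map(|c| match c as u32 {
            0x1100..=0x115F
            | 0x2E80..=0xA4CF
            | 0xAC00..=0xD7A3
            | 0xF900..=0xFAFF
            | 0xFF00..=0xFF60
            | 0xFFE0..=0xFFE6 => 2,
            _ => 1,
        })
        .sum()
}

/// For each cell in a row, report whether it should be hidden
/// (for example rendered with `display: none`) because the preceding
/// wide symbol already occupies its column.
fn hidden_cells(row: &[&str]) -> Vec<bool> {
    let mut skip = 0usize;
    let mut hidden = Vec::with_capacity(row.len());
    for symbol in row {
        // A cell is hidden if an earlier symbol still covers this column.
        hidden.push(skip > 0);
        // Columns still covered after this cell.
        skip = skip.max(approx_width(symbol)).saturating_sub(1);
    }
    hidden
}
```

For a row holding `你`, a filler cell, `好`, a filler cell, `a`, and `b`, the two filler cells are the ones reported as hidden.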

Screenshot

Limitations:

  1. This implementation depends on specific fonts that must render CJK characters and Latin letters in an exact 2:1 width ratio. The font used in the example, Maple Mono CN, supports this feature for Chinese characters and Japanese kana but does not maintain the 2:1 ratio for Korean Hangul or emojis. As a result, misalignment may occur when displaying Korean text or emojis.
  2. Occasionally, misalignment may occur after resizing the window. The cause of this issue remains unclear.

junkdog (Member) commented Sep 6, 2025

neat, i'll take a closer look at it tomorrow.

> does not maintain the 2:1 ratio for Korean Hangul or emojis

are you positive these glyphs are actually coming from Maple Mono CN and not a fallback? i've had similar issues when processing fonts - when the requested glyph doesn't exist, it'll look it up in related fonts (and this tends to result in mismatched font metrics).

Wybxc (Contributor, Author) commented Sep 6, 2025

> are you positive these glyphs are actually coming from Maple Mono CN and not a fallback?

Well, Maple Mono CN does not contain glyphs for Korean Hangul or emojis, which is the real cause of the misalignment. However, it still highlights the issue that users must choose fonts carefully to ensure proper font metrics.

Member

@junkdog junkdog left a comment

thanks for submitting! this is indeed addressing a pretty severe shortcoming in ratzilla.

i left a couple of comments, but overall it looks good!

thiserror = "2.0.12"
bitvec = { version = "1.0.1", default-features = false, features = ["alloc", "std"] }
beamterm-renderer = "0.1.1"
unicode-width = "0.2.0"
Member

there's a 0.2.1 release of unicode-width

}

pre {
font-family: "Maple Mono NF CN", monospace;
Member

i wasn't sure this would work on my computer, but it looked fine (not sure what it's using though)

image

"你好,世界!",
"世界、こんにちは。",
// "헬로우 월드!",
// "👨💻👋🌐",
Member

why are the emoji disabled?

Member

ah, ofc "As a result, misalignment may occur when displaying Korean text or emojis." - what about enforcing the width as width * 2 for double-width symbols instead of calculating it from metrics - how does it look?
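one possible reading of "enforcing the width", sketched under assumptions: pin the span to a fixed number of cells with an inline style instead of trusting the glyph's own advance. `forced_width_style` is a hypothetical helper, not part of ratzilla, and it assumes one cell is `1ch` wide in the monospace font in use.

```rust
/// Hypothetical helper: inline CSS that pins a span to `cell_width` cells.
/// Returns `None` for single-width (or empty) symbols, which need no override.
/// Assumes one terminal cell is `1ch` wide in the chosen monospace font.
fn forced_width_style(cell_width: usize) -> Option<String> {
    if cell_width < 2 {
        return None;
    }
    Some(format!(
        "display: inline-block; width: {}ch; overflow: hidden;",
        cell_width
    ))
}
```

whether a fallback glyph clipped or padded to `2ch` looks acceptable is something only a visual check can answer.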

// "헬로우 월드!",
// "👨💻👋🌐",
]
.join("\n"),
Member

neat and compact :)

  let mut line_cells: Vec<Element> = Vec::new();
- let mut hyperlink: Vec<Cell> = Vec::new();
+ let mut hyperlink: Vec<(Cell, bool)> = Vec::new();
+ let mut skip = 0;
Member

shouldn't skip be a boolean value? either the next cell renders as usual or it is skipped/hidden.

are there any symbols extending more than double-width - if we stick to terminals?

  }
  } else {
- let span = create_span(&self.document, cell)?;
+ let span = create_span(&self.document, cell, overwritten)?;
Member

maybe it would be clearer to have the old create_span() as-is and add a create_hidden_span(document); it also doesn't require a ref to cell, since it's only used for reading Cell::symbol. what do you think?

let mut skip = 0;
for (i, cell) in line.iter().enumerate() {
let overwritten = skip > 0;
skip = std::cmp::max(skip, cell.symbol().width()).saturating_sub(1);
Member

would it make sense to cache the width of all encountered symbols? cell.symbol().width() looks like it could be fairly expensive if, for example, scrolling large portions of text at once.
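a rough sketch of what such a cache could look like, with an ASCII fast path in front of it. all names here are hypothetical: `WidthCache` is not a ratzilla type, and `approx_width` is a crude stand-in for `UnicodeWidthStr::width` so the snippet has no external dependency. whether a hash lookup actually beats calling `width()` directly would need measuring.

```rust
use std::collections::HashMap;

/// Hypothetical, crude stand-in for `unicode_width::UnicodeWidthStr::width`.
fn approx_width(symbol: &str) -> usize {
    symbol
        .chars()
        .map(|c| match c as u32 {
            0x1100..=0x115F | 0x2E80..=0xA4CF | 0xAC00..=0xD7A3 | 0xFF00..=0xFF60 => 2,
            _ => 1,
        })
        .sum()
}

/// Hypothetical cache of symbol widths, keyed by the cell's symbol string.
struct WidthCache {
    widths: HashMap<String, usize>,
}

impl WidthCache {
    fn new() -> Self {
        Self { widths: HashMap::new() }
    }

    fn width(&mut self, symbol: &str) -> usize {
        // Fast path: assumes ASCII symbols are printable, so each byte
        // occupies exactly one column and no lookup is needed.
        if symbol.is_ascii() {
            return symbol.len();
        }
        if let Some(&cached) = self.widths.get(symbol) {
            return cached;
        }
        let computed = approx_width(symbol);
        self.widths.insert(symbol.to_owned(), computed);
        computed
    }
}
```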

/// accordingly.
fn update_grid(&mut self) -> Result<(), Error> {
for (y, line) in self.buffer.iter().enumerate() {
let mut skip = 0;
Member

same comment applies here about skip maybe being a boolean

junkdog added the feature (New feature or request) and DOM (DOM backend related) labels Sep 7, 2025

junkdog (Member) commented Sep 7, 2025

> 2. Occasionally, misalignment may occur after resizing the window. The cause of this issue remains unclear.

does this behavior only trigger when there are double-width symbols? would be nice if we could get it resolved before merging.

benoitlx added a commit to benoitlx/ratzilla that referenced this pull request Dec 1, 2025
junkdog pushed a commit that referenced this pull request Jan 19, 2026
* wip: fix misalignment and glitch with fullwidth char for DOM back

see #135 (comment)

* fix: multiple width glyph support for DomBackend

for now, breaking cursor and resize
#135

* feat: get buffer size from utils

* fix: fix resize for DomBackend

#135

* fix: cursor for DomBackend

* feat(examples): add fullwidth glyph in demo2 emails

* feat: unicode example from #121

* chore: don't mind hyperlinks - expand unicode example

* fix: avoiding vertical flickering

* fix: unicode example name and removing external css

* fix: removing unnecessary cursor attribute

* refactor: custom type for css attribute

* perf: avoid calling width method for ascii char

* fix: prevent OOB access

* style: update rustdoc comment

* refactor: remove unnecessary temp vector

* refactor: simplify the update_css_field function

* test: update_css_field util function

* ops: test targeting wasm32 using wasm-pack

* fix build for unicode example

* style: style test

* ops: install wasm-pack from source

* refactor: remove unnecessary import

* ops: install wasm-pack with taiki-e install action

* style: import to the top, docstrings

* build: don't specify a version range for unicode-width

* refactor: get cells by reference instead of cloning them
