Simplify insert/2 common case #19

Merged: fhunleth merged 1 commit into main from simplify-insert on May 10, 2026

@fhunleth (Collaborator)

This simplifies insertion for the case where the buffer is full and an
element is dropped. This is usually the common case in long-running
real-world applications.

A simple performance test showed a ~13% speed increase on an M4 MBP.
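
For anyone who wants to try a comparable measurement, here is a minimal timing sketch. It is not the test that produced the number above: the `InsertTiming` module, the plain-map buffer with `:a` and `:b` keys, and the `match_style` function are all hypothetical stand-ins. It only shows one way to time many consecutive full-buffer inserts with `:timer.tc/1`.

```elixir
defmodule InsertTiming do
  # Hypothetical helper, not part of this PR.
  # Times `n` consecutive inserts into a buffer whose :b list starts with
  # `n` entries, so every call takes the "buffer is full" path.
  def run(insert_fun, n) when is_function(insert_fun, 2) and n > 0 do
    buffer = %{a: [], b: Enum.to_list(1..n)}

    # :timer.tc/1 returns {elapsed_microseconds, result_of_fun}.
    :timer.tc(fn ->
      Enum.reduce(1..n, buffer, fn item, acc -> insert_fun.(acc, item) end)
    end)
  end
end

# Assumed stand-in for the pattern-matching insert: drop the oldest entry
# from :b and push the new item onto :a.
match_style = fn %{a: a, b: [_oldest | rest]} = cb, item ->
  %{cb | a: [item | a], b: rest}
end

{micros, final} = InsertTiming.run(match_style, 100_000)
IO.puts("100000 inserts took #{micros} µs, #{length(final.a)} items kept")
```

A single run like this is noisy; repeating it several times and comparing against the older implementation under the same conditions gives a more trustworthy ratio.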

The following is a Claude-assisted diff of the bytecode, which shows that
a stack frame and a BIF call were eliminated:

```
Before (~20 instructions, 4-slot stack frame, BIF call):

  get_map_elements ... [b: x2]         # extract :b
  is_ne_exact x2, nil                  # guard b != []
  allocate 4, 3                        # ⬅︎ stack frame
  init_yregs y0
  move x2 → y1
  move x1 → y2
  move x0 → y3                         # ⬅︎ spill all three live regs
  get_map_elements ... [a: y0]         # extract :a (with deopt fallback)
  ... (deopt path through elixir_erl_pass.no_parens_remote/2) ...
  bif :tl                              # ⬅︎ tl(b) as a BIF call
  test_heap 2,2
  put_list y2, x0, x0                  # cons [item | a]
  put_map_exact ... [a, b]
  deallocate 4
  return

After (7 instructions, no stack frame, decomposition opcode):

  get_map_elements ... [b: x3, a: x2]  # ⬅︎ both fields in one shot
  is_nonempty_list x3                  # ⬅︎ single tag check
  test_heap 2,4
  get_tl x3 → x3                       # ⬅︎ tail-of opcode, no BIF call
  put_list x1, x2, x1                  # cons [item | a]
  put_map_exact ... [a, b]
  return
```
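
For readers who do not read BEAM assembly, the sketch below shows Elixir source that would plausibly compile to the two shapes in the listing. It is a reconstruction, not the actual diff: the `SketchBuffer` module and both function names are hypothetical, and only the `:a` and `:b` field names come from the listing. The first function guards on `b`, then reads `cb.a` and calls `tl/1` in the body. The second destructures both fields in the function head.

```elixir
defmodule SketchBuffer do
  # Hypothetical two-list buffer. Only the :a and :b field names are taken
  # from the bytecode listing; everything else is an assumption.
  defstruct a: [], b: []

  # "Before" shape: guard that b is non-empty, then access cb.a and call
  # tl/1 inside the body.
  def insert_guard(%__MODULE__{b: b} = cb, item) when b != [] do
    %{cb | a: [item | cb.a], b: tl(b)}
  end

  # "After" shape: match both fields in the head and take the tail of b
  # with a [_ | rest] pattern, so no tl/1 call is needed.
  def insert_match(%__MODULE__{a: a, b: [_oldest | rest]} = cb, item) do
    %{cb | a: [item | a], b: rest}
  end
end

full = %SketchBuffer{a: [3], b: [1, 2]}

# Both shapes drop the oldest entry from :b and push the item onto :a.
IO.inspect(SketchBuffer.insert_guard(full, 4))
IO.inspect(SketchBuffer.insert_match(full, 4))
```

Both calls return the same struct, with `a: [4, 3]` and `b: [2]`; the difference is only in how the compiler can lower the match.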
@fhunleth merged commit 93d40c9 into main on May 10, 2026
14 checks passed
@fhunleth deleted the simplify-insert branch on May 10, 2026, 21:38