Right now Insn is 120 bytes (gasp!) and InsnId is 64 bits. ruby#16685 does some boxing and bit packing to get it down to 48 bytes but a) that's still pretty big and b) it's still a very ad-hoc representation.

    enum Insn {
        Const { val: Const },
        Param,
        LoadArg { idx: u32, id: ID, val_type: Type },
        Entries { targets: Vec<BlockId> },
        StringCopy { val: InsnId, chilled: bool, state: InsnId },
        StringIntern { val: InsnId, state: InsnId },
        // ...
        FixnumAdd { left: InsnId, right: InsnId, state: InsnId },
        // ...
    }

If we copy a little bit of e.g. Cranelift's homework, we can instead get both a more regular representation and a much smaller Insn (say, 24 bytes, assuming InsnId is u32):

    pub struct InsnData {
        opcode: Opcode,
        args: [InsnId; 4],
        data: DataRef,
    }

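As a sanity check on the 24-byte figure, here is a minimal sketch. The `Opcode` variants, the `u8` repr, and the `u32` newtypes for `InsnId` and `DataRef` are assumptions for illustration, not the real definitions.

```rust
use std::mem::size_of;

/// Hypothetical opcode tag; one byte is enough for 133 opcodes.
#[allow(dead_code)]
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode {
    Const,
    Param,
    FixnumAdd,
    // ...
}

/// Assumed 32-bit instruction id (it is 64 bits today).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InsnId(pub u32);

/// Assumed 32-bit index into some side table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DataRef(pub u32);

#[allow(dead_code)]
pub struct InsnData {
    opcode: Opcode,
    args: [InsnId; 4],
    data: DataRef,
}

/// Size in bytes of the sketched instruction:
/// 1 (opcode) + 16 (args) + 4 (data), rounded up to 4-byte alignment.
pub fn insn_data_size() -> usize {
    size_of::<InsnData>()
}
```

With these assumed widths the struct comes out at 24 bytes. A wider `DataRef` or 64-bit `InsnId` would change that.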
We can get a lot out of this representation. We have 133 opcodes (wow), of which:
- 14 have no InsnId at all (10%)
- 24 hold 1 InsnId (18%)
- 24 hold 2 InsnId (18%)
- 45 hold 3 InsnId (33%)
- 1 holds 4 InsnId (<1%)
- 1 has an optional InsnId
- 24 are variadic (18%)
- 59 have some kind of other data attached (44%)
The majority (81%) of all opcodes could store their operands inline in this new representation, with the variadic instructions referencing an operand pool.
Data-holding instructions would reference separate typed data pools (imagine interning Types, a CCallData pool, an Invariant pool, ...).
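To make the pool idea a bit more concrete, here is a rough sketch of both kinds of pool. Everything here (`OperandList`, `OperandPool`, `InternPool`, and their methods) is a hypothetical name, and a real design might well pick different handle widths or skip interning for some data.

```rust
use std::collections::HashMap;
use std::hash::Hash;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InsnId(pub u32);

/// Hypothetical handle to a contiguous run of operands in the pool.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OperandList {
    start: u32,
    len: u32,
}

/// All variadic operands live in one flat vector.
#[derive(Default)]
pub struct OperandPool {
    items: Vec<InsnId>,
}

impl OperandPool {
    /// Append a list of operands and return a small handle to it.
    pub fn push_list(&mut self, operands: &[InsnId]) -> OperandList {
        let start = self.items.len() as u32;
        self.items.extend_from_slice(operands);
        OperandList { start, len: operands.len() as u32 }
    }

    /// Resolve a handle back to its operand slice.
    pub fn get(&self, list: OperandList) -> &[InsnId] {
        let start = list.start as usize;
        &self.items[start..start + list.len as usize]
    }
}

/// Hypothetical interning pool: equal values share one 32-bit reference.
pub struct InternPool<T> {
    items: Vec<T>,
    index: HashMap<T, u32>,
}

impl<T: Clone + Eq + Hash> InternPool<T> {
    pub fn new() -> Self {
        InternPool { items: Vec::new(), index: HashMap::new() }
    }

    /// Return the existing reference for `value`, or store it and mint a new one.
    pub fn intern(&mut self, value: T) -> u32 {
        if let Some(&id) = self.index.get(&value) {
            return id;
        }
        let id = self.items.len() as u32;
        self.items.push(value.clone());
        self.index.insert(value, id);
        id
    }

    pub fn get(&self, id: u32) -> &T {
        &self.items[id as usize]
    }
}
```

A variadic instruction would then keep an `OperandList` (or an index to one) where the enum keeps a `Vec` today, and a data-holding instruction would keep the `u32` returned by `intern`.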
Some APIs become cleaner: the for_each_... variants no longer need to go case by case, and the Display implementation could probably get cleaned up.
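For example, operand iteration could collapse into one slice walk. This is only a sketch: `num_args`, `operands`, `for_each_operand`, and `for_each_operand_mut` are made-up names, the three opcodes are a tiny subset, and unused `args` slots are assumed to hold a placeholder id.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InsnId(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode {
    Param,
    StringIntern,
    FixnumAdd,
}

impl Opcode {
    /// Hypothetical per-opcode table of how many inline operands are in use.
    pub fn num_args(self) -> usize {
        match self {
            Opcode::Param => 0,
            Opcode::StringIntern => 2, // val, state
            Opcode::FixnumAdd => 3,    // left, right, state
        }
    }
}

pub struct InsnData {
    pub opcode: Opcode,
    pub args: [InsnId; 4],
}

impl InsnData {
    /// The operands actually in use, as a slice of the inline array.
    pub fn operands(&self) -> &[InsnId] {
        &self.args[..self.opcode.num_args()]
    }

    /// Visit every operand without matching on the instruction shape.
    pub fn for_each_operand(&self, mut f: impl FnMut(InsnId)) {
        for &arg in self.operands() {
            f(arg);
        }
    }

    /// Same walk, but allows rewriting operands in place.
    pub fn for_each_operand_mut(&mut self, mut f: impl FnMut(&mut InsnId)) {
        let n = self.opcode.num_args();
        for arg in &mut self.args[..n] {
            f(arg);
        }
    }
}
```

The only per-opcode knowledge left is the `num_args` table. Variadic instructions would need an extra branch to walk their operand pool entries.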
Now, there are some problems with this:
- We have a lot of ergonomic niceties from having enum-of-structs. We match a lot.
- The migration would be rough. That being said, we could probably make it incremental by adding an expand instruction that gives us the enum variant back as a temporary holdover.
- It's a big experimental change. I imagine it might improve compile times and shrink native stack frames. I imagine it might make value numbering easier. But I'm not sure.
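The incremental path from the migration bullet above could look roughly like this. It is a sketch under assumptions: `expand` is written here as a method on the compact struct, the operand order in `args` is invented, and only three opcodes are shown.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InsnId(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode {
    Param,
    StringIntern,
    FixnumAdd,
}

/// Compact form (the data reference is omitted to keep the sketch short).
pub struct InsnData {
    pub opcode: Opcode,
    pub args: [InsnId; 4],
}

/// Legacy-style view, rebuilt on demand so existing `match` code keeps working.
#[derive(Debug, PartialEq, Eq)]
pub enum Insn {
    Param,
    StringIntern { val: InsnId, state: InsnId },
    FixnumAdd { left: InsnId, right: InsnId, state: InsnId },
}

impl InsnData {
    /// Temporary holdover: decode the compact form back into the enum.
    /// The slot assignment (args[0], args[1], ...) is an assumed convention.
    pub fn expand(&self) -> Insn {
        let a = &self.args;
        match self.opcode {
            Opcode::Param => Insn::Param,
            Opcode::StringIntern => Insn::StringIntern { val: a[0], state: a[1] },
            Opcode::FixnumAdd => Insn::FixnumAdd { left: a[0], right: a[1], state: a[2] },
        }
    }
}
```

Passes that still want `match insn.expand() { Insn::FixnumAdd { left, right, .. } => ... }` could keep that shape and be ported one at a time.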
I'm not wed to any particular new representation but I do have an interest in shrinking Insn and InsnId and generally speeding up the compiler.
This could probably be applied to LIR too, which suffers from similar problems (the LIR Insn is 192 bytes).