Implement Tail Call Optimize into ReturnCall Instruction#400
Implement Tail Call Optimize into ReturnCall Instruction#400makachanm wants to merge 2 commits intoSamsung:mainfrom
Conversation
|
Please run |
|
I applied. thanks. |
zherczeg
left a comment
There was a problem hiding this comment.
This sounds like a good idea. I would simplify the parameter storing: the parameters are simply stored in a linear buffer, and copied by a memcpy.
| } | ||
|
|
||
| std::vector<size_t> tco_paramStore; | ||
| size_t tco_paramSize; |
There was a problem hiding this comment.
Vector should have a size.
There was a problem hiding this comment.
I recommend using the predefined Vector structure as shown below.
Also, regarding naming conventions, the Walrus project typically uses member variables that start with m_ and avoids underscores in the middle of names.
Vector<size_t> m_tcoParamStore;If you are using a fixed-length vector, please use VectorWithFixedSize instead:
VectorWithFixedSize<size_t> m_tcoParamStore;
m_tcoParamStore.reserve(newSize); // allocate array
m_tcoParamStore.clear(); // reset array| return reinterpret_cast<ByteCodeStackOffset*>(reinterpret_cast<size_t>(this) + sizeof(Call)); | ||
| } | ||
|
|
||
| uint16_t parameterOffsetsSize() const |
There was a problem hiding this comment.
Correct me if I wrong, but as far as I remember parameters are always stored in a linear memory area. This means their offsets do not need to be stored, they can be computed during running. The size of the tco buffer could be stored so the buffer could be pre-allocated.
|
Parameters are stored in a linear area: Then the tco buffer could be copied here without changes. |
|
I'll rework parameter recover code with memcpy and Walrus's own vector implementation. Thanks for the feedback. |
|
I think lack of JIT version of TCO implement causes CI failure. I'll seperate it for future works. |
| void destroyTCO() | ||
| { | ||
| m_tcoparamStore.clear(); | ||
| m_tcofunctionTarget = nullptr; |
There was a problem hiding this comment.
The names are wrong. They should be m_tcoParamStore and m_tcoFunction store.
|
I've verified that all tests in the walrus suite are passing in the x64 local machine at interpret mode. |
|
There is a conflict in |
67bc7f8 to
daa44c1
Compare
…nCallRef instruction
|
CI is failing when Clang compiler is used. I had no problem with GCC. |
| Table* table = instance->table(code->tableIndex()); | ||
|
|
||
| uint32_t idx = readValue<uint32_t>(bp, code->calleeOffset()); | ||
| if (idx >= table->size()) { |
There was a problem hiding this comment.
Add UNLIKELY to this check code
| Trap::throwException(state, "uninitialized element " + std::to_string(idx)); | ||
| } | ||
| const FunctionType* ft = target->functionType(); | ||
| if (!ft->equals(code->functionType())) { |
| Trap::throwException(state, "null function reference"); | ||
| } | ||
| const FunctionType* ft = target->functionType(); | ||
| if (!ft->equals(code->functionType())) { |
| if (newState.hasTCO()) { | ||
| state.m_tcoParamStore.reserve(newState.m_tcoParamStore.size()); | ||
| VectorCopier<size_t>::copy(state.m_tcoParamStore.data(), newState.m_tcoParamStore.data(), newState.m_tcoParamStore.size()); | ||
| state.m_tcoFunctionTarget = newState.m_tcoFunctionTarget; | ||
| return; | ||
| } |
There was a problem hiding this comment.
At this point, the vector needs to be copied into the upper ExecutionState.
Instead of this copy, what about allocating one global vector and reusing this vector?
This approach could reduce copy overhead.
There was a problem hiding this comment.
I don't have a idea for where I declare a global TCO buffer. I think addition of new execution state manager will be affect the code's consistency. and It can be harder to decide when the timing the destroy a TCO buffer.
There was a problem hiding this comment.
I think that Store would be a good place to keep the TCO buffer.
Store can be accessed through instance->module()->store() in the interpreter.
| uint8_t* bp, | ||
| Instance* instance); | ||
|
|
||
| static void returnCallOperation(ExecutionState& state, |
There was a problem hiding this comment.
It seems that this function is declared but never used
| if (state.hasTCO()) { | ||
| VectorCopier<size_t>::copy((size_t*)functionStackBase, state.m_tcoParamStore.data(), state.m_tcoParamStore.size()); | ||
| state.destroyTCO(); |
There was a problem hiding this comment.
When interpreter is called again due to tail call,
state.destroyTCO(); has been called so state doesn't have any valid TCO data at this point.
Would you check it again?
|
Lastly, have you been able to test this patch for recursive tail calls? |
I'll do it through recursive fibonacci test. Thanks for the mention. |
|
I found fibonacci test makes call stack exhaustion. I'm analyzing what's problem with. |
I implemented a tail call optimize into WebAssembly 3's ReturnCall recursive instruction.
It currently unfinished, (the issue with type checking in the standard's spec, indirect call is not completed..) but it passes Return Call Test code.