Add SIMD alloca* scratch allocation APIs with ParparVM stack lowering and benchmark/compliance coverage #4772
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/fe7ec42c-9a02-4f55-a18b-2f038764b2ef Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
shai-almog
left a comment
We have a bytecode verification stage in the Maven build that checks that bytecode isn't using methods it isn't allowed to use. We can use that code to verify that the value returned by an alloca method isn't returned from the calling method, isn't stored into a variable, and isn't passed to any method that isn't on the Simd class.
/// Allocates a scratch byte array for temporary SIMD work.
/// On ParparVM this may be lowered to a stack-backed faux array, so callers
/// should keep it method-local and avoid returning or storing it beyond the
Change the wording to MUST and mark the methods as deprecated, with a note that developers need to be cautious about using them. Don't use the @Deprecated annotation, which I don't think we support.
Updated in 4f33c2c. The wording now uses MUST, and the methods are marked deprecated via JavaDoc only with a cautionary note instead of using the @Deprecated annotation.
extern JAVA_OBJECT allocArray(CODENAME_ONE_THREAD_STATE, int length, struct clazz* type, int primitiveSize, int dim);
extern JAVA_OBJECT allocArrayAligned(CODENAME_ONE_THREAD_STATE, int length, struct clazz* type, int primitiveSize, int dim, int alignment);
extern JAVA_OBJECT allocMultiArray(int* lengths, struct clazz* type, int primitiveSize, int dim);
#define CN1_SIMD_STACK_ARRAY(length, arrayClass, primitiveSize, dim, alignment) \
alloca arrays are always single dimension and are always primitive. Does it make sense to align them always on the same value too?
This code should be simplified.
Simplified in e16c0ee. The stack-allocation macro is now fixed to a single primitive, single-dimension layout with a named SIMD alignment constant instead of carrying dimension/alignment parameters.
Compared 37 screenshots: 37 matched.
✅ Native Android screenshot tests passed.
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/90bc6a2c-a647-42c2-b5fb-b753d4f84d2e Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Implemented in 9d015d3. I extended the Maven bytecode-compliance scan with SIMD alloca usage checks, so alloca scratch arrays are flagged if they escape via returns, field stores, or object-array stores, or if they are passed to non-Simd methods.
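To make the escape rules concrete, here is a minimal sketch of one usage pattern that should pass the scan and three that should be flagged. `SimdSketch`, `allocaByte`, `fill` and `sum` are hypothetical stand-ins invented for this illustration; they are not the signatures added by this PR, and the sketch allocates on the heap rather than the stack.

```java
// Hypothetical stand-in for the Simd class; names and signatures are assumptions.
final class SimdSketch {
    // A real ParparVM build may lower this to stack memory; here it is a plain heap array.
    static byte[] allocaByte(int len) {
        return new byte[len];
    }

    static void fill(byte[] dst, byte value, int len) {
        for (int i = 0; i < len; i++) {
            dst[i] = value;
        }
    }

    static int sum(byte[] src, int len) {
        int total = 0;
        for (int i = 0; i < len; i++) {
            total += src[i];
        }
        return total;
    }
}

final class AllocaUsage {
    static byte[] leaked; // exists only to demonstrate a field-store violation

    // Expected to pass: the scratch array stays method-local and is only
    // handed to methods on the (stand-in) Simd class.
    static int compliant(int len) {
        byte[] scratch = SimdSketch.allocaByte(len);
        SimdSketch.fill(scratch, (byte) 2, len);
        return SimdSketch.sum(scratch, len);
    }

    // Expected to be flagged: the scratch array is returned to the caller.
    static byte[] escapesByReturn(int len) {
        return SimdSketch.allocaByte(len);
    }

    // Expected to be flagged: the scratch array is stored into a field.
    static void escapesByFieldStore(int len) {
        leaked = SimdSketch.allocaByte(len);
    }

    // Expected to be flagged: the scratch array is passed to a non-Simd method.
    static void escapesByForeignCall(int len) {
        byte[] scratch = SimdSketch.allocaByte(len);
        java.util.Arrays.fill(scratch, (byte) 1);
    }
}
```

The three flagged methods are harmless with the heap-backed stand-in above; they would only become unsafe once the allocation is actually stack-backed, which is why a static check is useful.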
✅ ByteCodeTranslator Quality Report
Test & Coverage
Benchmark Results
Static Analysis
Generated automatically by the PR CI workflow.
}
/**
 * @deprecated This is a special scratch-allocation API. On ParparVM this may be lowered to a
Use Java 25 markdown Javadoc comments like the rest of the code.
Updated in bf3dde7. The alloca*() docs now use the Java 25 markdown JavaDoc /// style used in this codebase, while keeping the MUST/caution wording and the deprecation guidance.
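For readers unfamiliar with the style, a `///` markdown doc comment carrying the MUST wording and a doc-level deprecation note might look roughly like the sketch below. The class name, method name, parameter name and exact wording are assumptions for illustration, not the text committed in bf3dde7.

```java
final class SimdDocSketch {
    /// Allocates a scratch byte array for temporary SIMD work.
    ///
    /// On ParparVM this may be lowered to a stack-backed faux array, so the
    /// result MUST stay method-local: it MUST NOT be returned or stored
    /// anywhere that outlives the calling method.
    ///
    /// @param size number of bytes to allocate (hypothetical parameter name)
    /// @return a scratch array intended for use only inside the calling method
    /// @deprecated This is a special scratch-allocation API; use it with caution.
    ///     The deprecation is expressed through this doc tag rather than
    ///     the `@Deprecated` annotation.
    static byte[] allocaByte(int size) {
        return new byte[size]; // plain heap allocation, for illustration only
    }
}
```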
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/bb5fa00f-a32c-4546-b631-d21141725593 Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Compared 37 screenshots: 37 matched.
shai-almog
left a comment
This looks generally good in terms of code, but performance has regressed:
| Metric | Duration |
| --- | --- |
| Base64 payload size | 8192 bytes |
| Base64 benchmark iterations | 6000 |
| Base64 native encode | 916.000 ms |
| Base64 CN1 encode | 1418.000 ms |
| Base64 encode ratio (CN1/native) | 1.548x (54.8% slower) |
| Base64 native decode | 678.000 ms |
| Base64 CN1 decode | 914.000 ms |
| Base64 decode ratio (CN1/native) | 1.348x (34.8% slower) |
| Base64 SIMD encode | 416.000 ms |
| Base64 encode ratio (SIMD/native) | 0.454x (54.6% faster) |
| Base64 encode ratio (SIMD/CN1) | 0.293x (70.7% faster) |
| Base64 SIMD decode | 419.000 ms |
| Base64 decode ratio (SIMD/native) | 0.618x (38.2% faster) |
| Base64 decode ratio (SIMD/CN1) | 0.458x (54.2% faster) |
| Image encode benchmark iterations | 250 |
| Image PNG encode (SIMD off) | 359.000 ms |
| Image PNG encode (SIMD on) | 454.000 ms |
| Image PNG encode ratio (SIMD on/off) | 1.265x (26.5% slower) |
| Image JPEG encode (SIMD off) | 215.000 ms |
| Image JPEG encode (SIMD on) | 308.000 ms |
| Image JPEG encode ratio (SIMD on/off) | 1.433x (43.3% slower) |
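The ratio rows above appear to follow a simple rule: divide the candidate time by the baseline time, then report the distance from 1.0 as a percentage. The sketch below reproduces that arithmetic; `BenchmarkRatio` and its methods are hypothetical names for this illustration and are not the benchmark harness used by the CI workflow.

```java
import java.util.Locale;

final class BenchmarkRatio {
    // Ratio of candidate time to baseline time; values above 1.0 mean the candidate is slower.
    static double ratio(double candidateMs, double baselineMs) {
        return candidateMs / baselineMs;
    }

    // Formats a ratio in the same shape as the report rows, e.g. "1.548x (54.8% slower)".
    static String describe(double candidateMs, double baselineMs) {
        double r = ratio(candidateMs, baselineMs);
        double percent = Math.abs(r - 1.0) * 100.0;
        String direction = r > 1.0 ? "slower" : "faster";
        return String.format(Locale.ROOT, "%.3fx (%.1f%% %s)", r, percent, direction);
    }
}
```

For example, `BenchmarkRatio.describe(454, 359)` corresponds to the PNG row, where enabling SIMD made encoding slower rather than faster.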
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/9a8f2cf5-e212-48b0-ab89-e71cd2519b4a Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Fixed in 6e7b209. The image SIMD paths were still reading the |
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/f3eb3bb6-bb5f-492d-80ff-adc83cce9d51 Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/220180d6-f744-495c-9d75-8899303421b8 Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/b34e40cc-4df9-45c8-bc5e-059771a6a271 Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/72f1389b-a120-4788-959d-fc0e175b8bfa Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/919c6090-2fc0-4540-904b-1d0e8fba19b3 Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
… benchmark image to 256x256 Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/9aba781a-23d2-4935-b086-6e0334cecb8f Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
…imitives Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/3e643d42-5622-4542-a45c-e0cc1835193a Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
Continue the SIMD fusion work started with `blendByMaskTestNonzero`. Two remaining `Image.java` SIMD paths still issue multiple primitive calls (and therefore multiple native dispatches + multiple passes over the buffer). Fuse them into one new SIMD primitive each.

Plan

#1: Fuse `applyMask` (4 passes → 1)
- `replaceTopByteFromUnsignedBytes(int[] rgbSrc, int rgbSrcOff, byte[] alphaSrc, int alphaSrcOff, int[] dst, int dstOff, int len)` computing `dst[i] = (rgbSrc[i] & 0x00ffffff) | ((alphaSrc[i] & 0xff) << 24)` to `Simd.java` (Java fallback).
- `IOSSimd.java`.
- `IOSSimd.m` (`vmovl_u8` → `vshlq_n_u32(24)` → `vorrq(vandq, …)`).
- `JavaSESimd.java`.
- `Image.applyMask(Object)` to a single call to the new primitive (eliminating the unpack/shl/and/or chain and the int-scratch `alloca`).
- `SimdTest.java` (fallback semantics + registered-array round-trip).

#2: Fuse `removeColor` path (3 passes → 1)
- `blendByMaskTestNonzeroSubstituteOnKeepEq(int[] src, int srcOff, int testMask, int trueKeepMask, int trueOrValue, int removeMatch, int removeValue, int[] dst, int dstOff, int len)` computing `dst[i] = (src[i] & testMask) == 0 ? src[i] : ((src[i] & trueKeepMask) == removeMatch ? removeValue : (src[i] & trueKeepMask) | trueOrValue)` (Java fallback in `Simd.java`, validating overrides, NEON impl on iOS).
- `Image.replaceAlphaPreserveTransparentRemoveColorSimd` to a single call (eliminating the `tmp`/`removeMask` scratch buffers and the `cmpEq` + `select` follow-up passes).
- `SimdTest.java`.

Validation
- `mvn clean verify -DunitTests=true -pl core-unittests -am -Dmaven.javadoc.skip=true -Plocal-dev-javase` → BUILD SUCCESS.
- `ImageTest`, `IndexedImageTest`, `DynamicImageTest`, `RGBImageTest`, `LabelFeatureTest`, `ComponentImageTest`, `SimdTest` pass.
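As a rough illustration of what the two fused primitives compute, here is a scalar sketch of the per-element formulas from the plan, each done in a single pass. The method signatures follow the plan, but the enclosing class name is hypothetical, and applying the `*Off` parameters as simple index offsets is an assumption (the formulas in the plan are written without offsets). This is not the `Simd.java` fallback from the PR.

```java
final class FusedSimdFallbackSketch {
    // One pass: keep the low 24 bits of each source pixel and take the top byte
    // from the matching alpha byte, treated as unsigned.
    static void replaceTopByteFromUnsignedBytes(int[] rgbSrc, int rgbSrcOff,
                                                byte[] alphaSrc, int alphaSrcOff,
                                                int[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            int rgb = rgbSrc[rgbSrcOff + i] & 0x00ffffff;
            int alpha = (alphaSrc[alphaSrcOff + i] & 0xff) << 24;
            dst[dstOff + i] = rgb | alpha;
        }
    }

    // One pass: pixels failing the test mask pass through unchanged; otherwise the
    // kept bits are either replaced by removeValue (on a match) or OR-ed with trueOrValue.
    static void blendByMaskTestNonzeroSubstituteOnKeepEq(int[] src, int srcOff,
                                                         int testMask, int trueKeepMask,
                                                         int trueOrValue, int removeMatch,
                                                         int removeValue,
                                                         int[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            int value = src[srcOff + i];
            int result;
            if ((value & testMask) == 0) {
                result = value;
            } else {
                int kept = value & trueKeepMask;
                result = (kept == removeMatch) ? removeValue : (kept | trueOrValue);
            }
            dst[dstOff + i] = result;
        }
    }
}
```

Because each element is read once and written once, neither loop needs an intermediate scratch buffer, which matches the plan's goal of removing the scratch `alloca` and the `tmp`/`removeMask` buffers.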