CL/aarch64: implement the wasm SIMD v128.load{32,64}_zero instructi…#2355
Conversation
1efb50d to
1899aa7
Compare
abrown
left a comment
There was a problem hiding this comment.
The scalar_to_vector change looks good to me, as does the Wasm-to-CLIF translation; I'm unfamiliar with the aarch64 backend, though, so someone else should take a look there. Re: tests, I would hope that adding filetests for this type of thing (especially tests that benefit all backends) should not be too difficult. A test compile test (e.g. this one) would verify that the scalar_to_vector lowers to the VCode you expect in aarch64 but a test run test (e.g. this one) should be runnable on all backends if we remove the x64-specific parts (?).
|
@abrown Regarding tests, I expect that |
Agreed, I still think it is valuable to test this at the CLIF level as well, especially in the interim. |
cfallin
left a comment
There was a problem hiding this comment.
LGTM for the AArch64 changes. Echoing that it would be nice to have a test case; an aarch64 vcode test should be pretty simple, I think.
|
I added a filetest for the lowering of |
…ons. This patch implements, for aarch64, the following wasm SIMD extensions. v128.load32_zero and v128.load64_zero instructions WebAssembly/simd#237 The changes are straightforward: * no new CLIF instructions. They are translated into an existing CLIF scalar load followed by a CLIF `scalar_to_vector`. * the comment/specification for CLIF `scalar_to_vector` has been changed to match the actual intended semantics, per consulation with Andrew Brown. * translation from `scalar_to_vector` to aarch64 `fmov` instruction. This has been generalised slightly so as to allow both 32- and 64-bit transfers. * special-case zero in `lower_constant_f128` in order to avoid a potentially slow call to `Inst::load_fp_constant128`. * Once "Allow loads to merge into other operations during instruction selection in MachInst backends" (bytecodealliance#2340) lands, we can use that functionality to pattern match the two-CLIF pair and emit a single AArch64 instruction. * A simple filetest has been added. There is no comprehensive testcase in this commit, because that is a separate repo. The implementation has been tested, nevertheless.
1899aa7 to
3713dcc
Compare
| }, | ||
|
|
||
| /// Move from a GPR to a scalar FP register. | ||
| /// Move from a GPR to a vector register. The scalar value is parked in the lowest lane |
There was a problem hiding this comment.
BTW I don't mind the comment at all, but this operation is not special - virtually any instruction that operates on S or D registers (e.g. Inst::FpuRR) has exactly the same behaviour.
cfallin
left a comment
There was a problem hiding this comment.
Updated version looks good; thanks!
…ons.
This patch implements, for aarch64, the following wasm SIMD extensions.
v128.load32_zero and v128.load64_zero instructions
WebAssembly/simd#237
The changes are straightforward:
no new CLIF instructions. They are translated into an existing CLIF scalar
load followed by a CLIF
scalar_to_vector.the comment/specification for CLIF
scalar_to_vectorhas been changed tomatch the actual intended semantics, per consulation with Andrew Brown.
translation from
scalar_to_vectorto the obvious aarch64 insns.special-case zero in
lower_constant_f128in order to avoid apotentially slow call to
Inst::load_fp_constant128.Once "Allow loads to merge into other operations during instruction
selection in MachInst backends"
(Allow loads to merge into other operations during instruction selection in MachInst backends #2340) lands,
we can use that functionality to pattern match the two-CLIF pair and
emit a single AArch64 instruction.
There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.