Skip to content

Conversation

@uweigand
Copy link
Member

@uweigand uweigand commented Apr 6, 2025

This is a follow-on to #10502 implementing the same logic for s390x.

Like other platforms, we now no longer emit any instructions handling return values after the call instruction; instead, everything is done within a pseudo Call instruction. Unlike other platforms, this also has to handle vector lane swapping when calling between ABIs that mix BE and LE lane orders. The pseudo Call instruction needs enough type information to make this choice, therefore I had to add a type field to the RetLocation::Reg case (the ::Stack case already had one).

All other changes are contained within the s390x back-end. To simplify the implementation, the patch also implements a number of clean-ups:

  • Introduce a MemArg::SpillOffset variant
  • Remove the (unnecessary on this platform) island code size check
  • Merge the CallInd instructions into the base Call instructions, using a new CallInstDest type to carry the call destination

@uweigand uweigand requested a review from a team as a code owner April 6, 2025 17:31
@uweigand uweigand requested review from cfallin and removed request for a team April 6, 2025 17:31
@uweigand
Copy link
Member Author

uweigand commented Apr 6, 2025

Hi @cfallin , this is the s390x implementation of the new return-value logic. Note that I'm seeing some slight code-gen (performance) regressions, e.g. in vec-abi-128.clif:

 ;   basr %r14, %r4
-;   vl %v20, 0xb0(%r15)
+;   vl %v1, 0xb0(%r15)
+;   vst %v1, 0xc0(%r15)
 ;   lgr %r2, %r6
-;   vst %v20, 0(%r2)
-;   lmg %r6, %r15, 0xf0(%r15)
+;   vl %v19, 0xc0(%r15)
+;   vst %v19, 0(%r2)
+;   lmg %r6, %r15, 0x100(%r15)
 ;   br %r14

Looks like the register allocator does not allocate the vector return value into a register but to a spill slot, even though most registers would be available. Maybe the problem is that on the platform all vector registers are clobbered by calls, and therefore in the clobber list of the pseudo Call instruction. But that shouldn't prevent the allocator from chosing the register as output - not sure if this can be modeled ...

This is a follow-on to bytecodealliance#10502
implementing the same logic for s390x.

Like other platforms, we now no longer emit any instructions handling
return values after the call instruction; instead, everything is done
within a pseudo Call instruction.  Unlike other platforms, this also
has to handle vector lane swapping when calling between ABIs that mix
BE and LE lane orders.  The pseudo Call instruction needs enough type
information to make this choice, therefore I had to add a type field
to the RetLocation::Reg case (the ::Stack case already had one).

All other changes are contained within the s390x back-end.  To simplify
the implementation, the patch also implements a number of clean-ups:
- Introduce a MemArg::SpillOffset variant
- Remove the (unnecessary on this platform) island code size check
- Merge the CallInd instructions into the base Call instructions,
  using a new CallInstDest type to carry the call destination
@uweigand uweigand force-pushed the s390x-retval-loads branch from b4fd6a5 to 03b3b09 Compare April 6, 2025 17:40
@bjorn3
Copy link
Contributor

bjorn3 commented Apr 6, 2025

A register can only be in the clobber list or the output list, not both at the same time.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen isle Related to the ISLE domain-specific language labels Apr 6, 2025
@github-actions
Copy link

github-actions bot commented Apr 6, 2025

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I will fully admit I am least familiar with s390x out of all of our backends.

Not sure if you also want to take a look at this, @cfallin.

@cfallin
Copy link
Member

cfallin commented Apr 7, 2025

Yep, this is on my queue as well; I'll make a pass over it. Thanks Ulrich for getting this together so quickly!

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me as well. Thanks!

@cfallin cfallin added this pull request to the merge queue Apr 7, 2025
Merged via the queue into bytecodealliance:main with commit 8001451 Apr 7, 2025
71 checks passed
@uweigand uweigand deleted the s390x-retval-loads branch April 9, 2025 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants