[AArch64][GlobalISel] Fix incorrect codegen for FPR16/FPR8 to GPR copies #171499

IanButterworth · 2025-12-09T20:37:18Z

github-actions · 2025-12-09T20:37:35Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-12-09T20:38:05Z

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Ian Butterworth (IanButterworth)

Changes

I reported #171494 and thought I would see if Claude (opus 4.5) was able to suggest a fix.

This PR is the product of that.

Please note that I am not familiar with LLVM development, and this should be reviewed with that in mind.

Claude-generated explanation:

Previously, copyPhysReg() was missing handlers for copies between FPR16/FPR8 and GPR32/GPR64 register classes. These cases fell through to the NZCV handler, which incorrectly generated 'mrs Rd, NZCV' instead of the proper FMOV instruction.

This caused incorrect code generation for patterns like:
%ival = bitcast half %val to i16
store atomic i16 %ival, ptr %addr release, align 2

Which generated 'mrs w8, NZCV' instead of 'fmov w8, h0'.

The fix adds proper copy handlers:

FPR16 <-> GPR32: Use FMOVHWr/FMOVWHr with FullFP16, otherwise promote to FPR32 super-register and use FMOVSWr/FMOVWSr
FPR16 <-> GPR64: Use FMOVHXr/FMOVXHr with FullFP16, otherwise promote to FPR64 super-register and use FMOVDXr/FMOVXDr
FPR8 <-> GPR32: Promote to FPR32 and use FMOVSWr/FMOVWSr
FPR8 <-> GPR64: Promote to FPR64 and use FMOVDXr/FMOVXDr
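
The opcode choice listed above can be summarised as a small decision function. The following is only a toy model of that table, not LLVM code: the `RegClass` and `Opcode` enums and `selectCopyOpcode` are hypothetical names made up for illustration, and only the opcode names themselves are taken from the description. Note that later commits in this PR drop the copyPhysReg approach, so this sketch reflects the initial proposal only.

```cpp
// Toy model of the copy-opcode table from the PR description.
// Everything here is hypothetical except the opcode names.
enum class RegClass { GPR32, GPR64, FPR8, FPR16 };
enum class Opcode {
  FMOVWHr, FMOVHWr, FMOVXHr, FMOVHXr, // direct FPR16 forms (need FullFP16)
  FMOVWSr, FMOVSWr, FMOVXDr, FMOVDXr, // forms used after promoting to S/D
  None
};

constexpr bool isGPR(RegClass RC) {
  return RC == RegClass::GPR32 || RC == RegClass::GPR64;
}

constexpr Opcode selectCopyOpcode(RegClass Dst, RegClass Src,
                                  bool HasFullFP16) {
  const bool ToFPR = isGPR(Src) && !isGPR(Dst);
  const bool ToGPR = isGPR(Dst) && !isGPR(Src);
  if (!ToFPR && !ToGPR)
    return Opcode::None; // not a cross-bank copy covered by the table

  const RegClass GPR = ToFPR ? Src : Dst;
  const RegClass FPR = ToFPR ? Dst : Src;
  const bool Is64 = GPR == RegClass::GPR64;

  // FPR16 with FullFP16: use the half-precision FMOV directly.
  if (FPR == RegClass::FPR16 && HasFullFP16) {
    if (Is64)
      return ToFPR ? Opcode::FMOVXHr : Opcode::FMOVHXr;
    return ToFPR ? Opcode::FMOVWHr : Opcode::FMOVHWr;
  }

  // Otherwise (FPR8, or FPR16 without FullFP16): promote to the
  // FPR32/FPR64 super-register and use the S or D form.
  if (Is64)
    return ToFPR ? Opcode::FMOVXDr : Opcode::FMOVDXr;
  return ToFPR ? Opcode::FMOVWSr : Opcode::FMOVSWr;
}

// Compile-time spot checks against the table in the description.
static_assert(selectCopyOpcode(RegClass::GPR32, RegClass::FPR16, true) ==
                  Opcode::FMOVHWr,
              "FPR16 -> GPR32 with FullFP16");
static_assert(selectCopyOpcode(RegClass::GPR32, RegClass::FPR16, false) ==
                  Opcode::FMOVSWr,
              "FPR16 -> GPR32 without FullFP16");
```

The first check corresponds to the `fmov w8, h0` case from the bug report; the second corresponds to `fmov w8, s0`.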

Fixes #171494

Full diff: https://github.com/llvm/llvm-project/pull/171499.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+96)
(added) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-store-fp16.ll (+109)

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index f82180fc57b99..5027acec851cf 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -5851,6 +5851,102 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
     return;
   }
 
+  // Copies between GPR32 and FPR16.
+  if (AArch64::FPR16RegClass.contains(DestReg) &&
+      AArch64::GPR32RegClass.contains(SrcReg)) {
+    if (Subtarget.hasFullFP16()) {
+      BuildMI(MBB, I, DL, get(AArch64::FMOVWHr), DestReg)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    } else {
+      MCRegister DestRegS =
+          RI.getMatchingSuperReg(DestReg, AArch64::hsub, &AArch64::FPR32RegClass);
+      BuildMI(MBB, I, DL, get(AArch64::FMOVWSr), DestRegS)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    }
+    return;
+  }
+  if (AArch64::GPR32RegClass.contains(DestReg) &&
+      AArch64::FPR16RegClass.contains(SrcReg)) {
+    if (Subtarget.hasFullFP16()) {
+      BuildMI(MBB, I, DL, get(AArch64::FMOVHWr), DestReg)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    } else {
+      MCRegister SrcRegS =
+          RI.getMatchingSuperReg(SrcReg, AArch64::hsub, &AArch64::FPR32RegClass);
+      BuildMI(MBB, I, DL, get(AArch64::FMOVSWr), DestReg)
+          .addReg(SrcRegS, RegState::Undef)
+          .addReg(SrcReg, RegState::Implicit | getKillRegState(KillSrc));
+    }
+    return;
+  }
+
+  // Copies between GPR64 and FPR16.
+  if (AArch64::FPR16RegClass.contains(DestReg) &&
+      AArch64::GPR64RegClass.contains(SrcReg)) {
+    if (Subtarget.hasFullFP16()) {
+      BuildMI(MBB, I, DL, get(AArch64::FMOVXHr), DestReg)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    } else {
+      MCRegister DestRegD =
+          RI.getMatchingSuperReg(DestReg, AArch64::hsub, &AArch64::FPR64RegClass);
+      BuildMI(MBB, I, DL, get(AArch64::FMOVXDr), DestRegD)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    }
+    return;
+  }
+  if (AArch64::GPR64RegClass.contains(DestReg) &&
+      AArch64::FPR16RegClass.contains(SrcReg)) {
+    if (Subtarget.hasFullFP16()) {
+      BuildMI(MBB, I, DL, get(AArch64::FMOVHXr), DestReg)
+          .addReg(SrcReg, getKillRegState(KillSrc));
+    } else {
+      MCRegister SrcRegD =
+          RI.getMatchingSuperReg(SrcReg, AArch64::hsub, &AArch64::FPR64RegClass);
+      BuildMI(MBB, I, DL, get(AArch64::FMOVDXr), DestReg)
+          .addReg(SrcRegD, RegState::Undef)
+          .addReg(SrcReg, RegState::Implicit | getKillRegState(KillSrc));
+    }
+    return;
+  }
+
+  // Copies between GPR32 and FPR8.
+  if (AArch64::FPR8RegClass.contains(DestReg) &&
+      AArch64::GPR32RegClass.contains(SrcReg)) {
+    MCRegister DestRegS =
+        RI.getMatchingSuperReg(DestReg, AArch64::bsub, &AArch64::FPR32RegClass);
+    BuildMI(MBB, I, DL, get(AArch64::FMOVWSr), DestRegS)
+        .addReg(SrcReg, getKillRegState(KillSrc));
+    return;
+  }
+  if (AArch64::GPR32RegClass.contains(DestReg) &&
+      AArch64::FPR8RegClass.contains(SrcReg)) {
+    MCRegister SrcRegS =
+        RI.getMatchingSuperReg(SrcReg, AArch64::bsub, &AArch64::FPR32RegClass);
+    BuildMI(MBB, I, DL, get(AArch64::FMOVSWr), DestReg)
+        .addReg(SrcRegS, RegState::Undef)
+        .addReg(SrcReg, RegState::Implicit | getKillRegState(KillSrc));
+    return;
+  }
+
+  // Copies between GPR64 and FPR8.
+  if (AArch64::FPR8RegClass.contains(DestReg) &&
+      AArch64::GPR64RegClass.contains(SrcReg)) {
+    MCRegister DestRegD =
+        RI.getMatchingSuperReg(DestReg, AArch64::bsub, &AArch64::FPR64RegClass);
+    BuildMI(MBB, I, DL, get(AArch64::FMOVXDr), DestRegD)
+        .addReg(SrcReg, getKillRegState(KillSrc));
+    return;
+  }
+  if (AArch64::GPR64RegClass.contains(DestReg) &&
+      AArch64::FPR8RegClass.contains(SrcReg)) {
+    MCRegister SrcRegD =
+        RI.getMatchingSuperReg(SrcReg, AArch64::bsub, &AArch64::FPR64RegClass);
+    BuildMI(MBB, I, DL, get(AArch64::FMOVDXr), DestReg)
+        .addReg(SrcRegD, RegState::Undef)
+        .addReg(SrcReg, RegState::Implicit | getKillRegState(KillSrc));
+    return;
+  }
+
   if (DestReg == AArch64::NZCV) {
     assert(AArch64::GPR64RegClass.contains(SrcReg) && "Invalid NZCV copy");
     BuildMI(MBB, I, DL, get(AArch64::MSR))
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-store-fp16.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-store-fp16.ll
new file mode 100644
index 0000000000000..0109b4d7c4c08
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-store-fp16.ll
@@ -0,0 +1,109 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=arm64-apple-ios -global-isel -global-isel-abort=1 -verify-machineinstrs | FileCheck %s --check-prefix=CHECK-NOFP16
+; RUN: llc < %s -mtriple=arm64-apple-ios -mattr=+fullfp16 -global-isel -global-isel-abort=1 -verify-machineinstrs | FileCheck %s --check-prefix=CHECK-FP16
+
+; Test for https://github.com/llvm/llvm-project/issues/171494
+; Atomic store of bitcast half to i16 was generating incorrect code (mrs instead of fmov).
+
+define void @atomic_store_half(ptr %addr, half %val) {
+; CHECK-NOFP16-LABEL: atomic_store_half:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    fmov w8, s0
+; CHECK-NOFP16-NEXT:    stlrh w8, [x0]
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_store_half:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    fmov w8, h0
+; CHECK-FP16-NEXT:    stlrh w8, [x0]
+; CHECK-FP16-NEXT:    ret
+  %ival = bitcast half %val to i16
+  store atomic i16 %ival, ptr %addr release, align 2
+  ret void
+}
+
+define half @atomic_load_half(ptr %addr) {
+; CHECK-NOFP16-LABEL: atomic_load_half:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    ldarh w8, [x0]
+; CHECK-NOFP16-NEXT:    fmov s0, w8
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_load_half:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    ldarh w8, [x0]
+; CHECK-FP16-NEXT:    fmov h0, w8
+; CHECK-FP16-NEXT:    ret
+  %ival = load atomic i16, ptr %addr acquire, align 2
+  %val = bitcast i16 %ival to half
+  ret half %val
+}
+
+define void @atomic_store_bfloat(ptr %addr, bfloat %val) {
+; CHECK-NOFP16-LABEL: atomic_store_bfloat:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    fmov w8, s0
+; CHECK-NOFP16-NEXT:    stlrh w8, [x0]
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_store_bfloat:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    fmov w8, h0
+; CHECK-FP16-NEXT:    stlrh w8, [x0]
+; CHECK-FP16-NEXT:    ret
+  %ival = bitcast bfloat %val to i16
+  store atomic i16 %ival, ptr %addr release, align 2
+  ret void
+}
+
+define bfloat @atomic_load_bfloat(ptr %addr) {
+; CHECK-NOFP16-LABEL: atomic_load_bfloat:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    ldarh w8, [x0]
+; CHECK-NOFP16-NEXT:    fmov s0, w8
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_load_bfloat:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    ldarh w8, [x0]
+; CHECK-FP16-NEXT:    fmov h0, w8
+; CHECK-FP16-NEXT:    ret
+  %ival = load atomic i16, ptr %addr acquire, align 2
+  %val = bitcast i16 %ival to bfloat
+  ret bfloat %val
+}
+
+; Test FPR8 to GPR32 copies (bitcast <1 x i8> to i8 for atomic store)
+define void @atomic_store_v1i8(ptr %addr, <1 x i8> %val) {
+; CHECK-NOFP16-LABEL: atomic_store_v1i8:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    fmov w8, s0
+; CHECK-NOFP16-NEXT:    stlrb w8, [x0]
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_store_v1i8:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    fmov w8, s0
+; CHECK-FP16-NEXT:    stlrb w8, [x0]
+; CHECK-FP16-NEXT:    ret
+  %ival = bitcast <1 x i8> %val to i8
+  store atomic i8 %ival, ptr %addr release, align 1
+  ret void
+}
+
+define <1 x i8> @atomic_load_v1i8(ptr %addr) {
+; CHECK-NOFP16-LABEL: atomic_load_v1i8:
+; CHECK-NOFP16:       ; %bb.0:
+; CHECK-NOFP16-NEXT:    ldarb w8, [x0]
+; CHECK-NOFP16-NEXT:    fmov s0, w8
+; CHECK-NOFP16-NEXT:    ret
+;
+; CHECK-FP16-LABEL: atomic_load_v1i8:
+; CHECK-FP16:       ; %bb.0:
+; CHECK-FP16-NEXT:    ldarb w8, [x0]
+; CHECK-FP16-NEXT:    fmov s0, w8
+; CHECK-FP16-NEXT:    ret
+  %ival = load atomic i8, ptr %addr acquire, align 1
+  %val = bitcast i8 %ival to <1 x i8>
+  ret <1 x i8> %val
+}

github-actions · 2025-12-09T23:19:25Z

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions · 2025-12-09T23:46:27Z

🐧 Linux x64 Test Results

187269 tests passed
4944 tests skipped

✅ The build succeeded and all tests passed.

github-actions · 2025-12-09T23:46:27Z

🪟 Windows x64 Test Results

128546 tests passed
2805 tests skipped

✅ The build succeeded and all tests passed.

xal-0 · 2025-12-10T05:30:31Z

I would guess this could be handled more easily in legalization, or in AArch64InstructionSelector::select, which already has some special handling for stronger-than-monotonic stores. copyPhysReg feels like the wrong place to me.

This change to legalization fixes the bug and handles a pair of TODOs:

diff --git i/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp w/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 44a148940ec9..1761b356b655 100644
--- i/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ w/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -568,6 +568,10 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
         return Query.Types[0] == s128 &&
                Query.MMODescrs[0].Ordering != AtomicOrdering::NotAtomic;
       })
+      .widenScalarIf(
+          all(scalarNarrowerThan(0, 32),
+              atomicOrderingAtLeastOrStrongerThan(0, AtomicOrdering::Release)),
+          changeTo(0, s32))
       .legalForTypesWithMemDesc(
           {{s8, p0, s8, 8},     {s16, p0, s8, 8},  // truncstorei8 from s16
            {s32, p0, s8, 8},                       // truncstorei8 from s32
diff --git i/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll w/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
index c80b18d17888..de12866fc2f4 100644
--- i/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
+++ w/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
@@ -357,12 +357,10 @@ define void @store_atomic_i128_unaligned_seq_cst(i128 %value, ptr %ptr) {
     ret void
 }
 
-; TODO: missed opportunity to emit a stlurb w/ GISel
 define void @store_atomic_i8_from_gep() {
 ; GISEL-LABEL: store_atomic_i8_from_gep:
 ; GISEL:    bl init
-; GISEL:    add x9, x8, #1
-; GISEL:    stlrb w8, [x9]
+; GISEL:    stlurb w8, [x9, #1]
 ;
 ; SDAG-LABEL: store_atomic_i8_from_gep:
 ; SDAG:    bl init
@@ -374,12 +372,10 @@ define void @store_atomic_i8_from_gep() {
   ret void
 }
 
-; TODO: missed opportunity to emit a stlurh w/ GISel
 define void @store_atomic_i16_from_gep() {
 ; GISEL-LABEL: store_atomic_i16_from_gep:
 ; GISEL:    bl init
-; GISEL:    add x9, x8, #2
-; GISEL:    stlrh w8, [x9]
+; GISEL:    stlurh w8, [x9, #2]
 ;
 ; SDAG-LABEL: store_atomic_i16_from_gep:
 ; SDAG:    bl init
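
To make the condition in the `widenScalarIf` rule above concrete, here is a minimal standalone sketch of the predicate it expresses: widen the stored value to 32 bits when it is narrower than 32 bits and the store ordering is release or stronger. This is not the LLVM implementation. `Ordering` is a simplified stand-in for `llvm::AtomicOrdering`, and `isReleaseOrStronger` and `widenedStoreBits` are hypothetical helpers; the sketch assumes that "at least release" means release, acquire-release, or sequentially consistent.

```cpp
// Simplified stand-in for llvm::AtomicOrdering (not the real type).
enum class Ordering {
  NotAtomic,
  Unordered,
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// Assumed meaning of "at least or stronger than Release" for this sketch.
constexpr bool isReleaseOrStronger(Ordering O) {
  return O == Ordering::Release || O == Ordering::AcquireRelease ||
         O == Ordering::SequentiallyConsistent;
}

// Returns the bit width the stored value would have after the rule:
// scalars narrower than 32 bits are widened to 32 for release-or-stronger
// stores; everything else is left alone.
constexpr unsigned widenedStoreBits(unsigned Bits, Ordering O) {
  return (Bits < 32 && isReleaseOrStronger(O)) ? 32u : Bits;
}

// The i16 release store from the bug report is widened to 32 bits ...
static_assert(widenedStoreBits(16, Ordering::Release) == 32, "");
// ... while a monotonic i16 store is not affected by the rule.
static_assert(widenedStoreBits(16, Ordering::Monotonic) == 16, "");
```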

IanButterworth · 2025-12-10T07:58:19Z

Thanks. I added that fix while keeping the previous one, as they seem complementary?

IanButterworth · 2025-12-10T12:43:43Z

Can the CI be run here, please?

Previously, copyPhysReg() was missing handlers for copies between FPR16/FPR8 and GPR32/GPR64 register classes. These cases fell through to the NZCV handler, which incorrectly generated 'mrs Rd, NZCV' instead of the proper FMOV instruction.

This caused incorrect code generation for patterns like:
%ival = bitcast half %val to i16
store atomic i16 %ival, ptr %addr release, align 2

Which generated 'mrs w8, NZCV' instead of 'fmov w8, h0'.

The fix adds proper copy handlers:
- FPR16 <-> GPR32: Use FMOVHWr/FMOVWHr with FullFP16, otherwise promote to FPR32 super-register and use FMOVSWr/FMOVWSr
- FPR16 <-> GPR64: Use FMOVHXr/FMOVXHr with FullFP16, otherwise promote to FPR64 super-register and use FMOVDXr/FMOVXDr
- FPR8 <-> GPR32: Promote to FPR32 and use FMOVSWr/FMOVWSr
- FPR8 <-> GPR64: Promote to FPR64 and use FMOVDXr/FMOVXDr

Fixes llvm#171494

Co-authored-by: Claude <noreply@anthropic.com>

- Fix LLVM code style: use trailing parameter alignment for getMatchingSuperReg calls instead of assignment-aligned continuation
- Update CHECK-FP16 expectations for atomic_load_half/bfloat: GlobalISel allocates FPR32 (s0) for GPR->FPR copies even with FullFP16, then narrows to h0 via kill annotation. The FullFP16 optimization only applies to FPR->GPR (store) direction where the value arrives in h0.
- Add kill annotations to CHECK-NOFP16 load tests to match actual output

Co-Authored-By: Claude <noreply@anthropic.com>

@xal-0

Per review suggestion from @xal-0, handle the atomic store issue in the legalizer instead of copyPhysReg. This adds a widenScalarIf rule for G_STORE that widens scalar types narrower than 32 bits to s32 for atomic stores with release ordering or stronger.

This approach:
- Fixes the original bug (mrs instead of fmov for half->i16 atomic store)
- Enables stlurb/stlurh codegen for GISel (removes two TODOs)
- Is a more appropriate place to handle type legalization

Co-authored-by: Claude <noreply@anthropic.com>

- Add kill annotations for store tests (def $h0 killed $h0 def $s0)
- Remove v1i8 tests: they use umov.b (vector extract), not the FPR8->GPR copy path, so they don't test the fixes in this PR

Co-authored-by: Claude <noreply@anthropic.com>

arsenm · 2025-12-10T14:30:22Z

llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp

        return Query.Types[0] == s128 &&
               Query.MMODescrs[0].Ordering != AtomicOrdering::NotAtomic;
      })
+      .widenScalarIf(


The legal cases should be listed first in the legalizer rules; any of these modifying actions should be moved closer to the end of the rule list.

It seems widenScalarIf must be before legalForTypesWithMemDesc (see failures in 07c8acc) so I moved it back here. Does that sound right?

No, that shouldn't be an issue

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-store-fp16.ll

davemgreen

Copies should only be between elements of the same size.
The legalization is maybe fine. We should be producing a GPR register in regbankselect for all atomics outside of LRCPC3. We could handle that later but we don't generate fp STLUR at the moment, so maybe it is fine to just force everything in legalization to 32bit, until we properly support it.

@arsenm

- Remove copyPhysReg changes: per @arsenm and @davemgreen, copyPhysReg shouldn't need to handle new situations for GlobalISel, and copies should only be between elements of the same size
- Move widenScalarIf rule to end of G_STORE legalizer chain: per @arsenm, legal cases should be listed first, modifying actions should be near end
- Remove -verify-machineinstrs from test RUN lines: per @arsenm suggestion

Co-authored-by: Claude <noreply@anthropic.com>

The legalizer widenScalarIf rule only applies to G_STORE with release+ ordering, so the atomic_load_half and atomic_load_bfloat tests don't actually test the fix.

Co-authored-by: Claude <noreply@anthropic.com>

IanButterworth · 2025-12-10T16:25:52Z

I've addressed the reviews; could CI be re-run, please?

The widenScalarIf rule must come BEFORE legalForTypesWithMemDesc so it can intercept atomic stores with release+ ordering before they match as legal s8/s16 types. Moving it to the end caused the rule to never fire because the stores were already matched as legal.

Co-authored-by: Claude <noreply@anthropic.com>
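
The ordering effect described in this commit message can be illustrated with a toy first-match rule list. This is a sketch under stated assumptions, not LLVM's LegalizeRuleSet: `Query`, `Rule`, `Action`, `applyRules`, and the two predicates are hypothetical, and the model assumes the first matching rule decides the action. Whether the real rule set needed this ordering was questioned by a reviewer in this thread, so treat the sketch as an illustration of the author's explanation only.

```cpp
#include <cstddef>

// Hypothetical, heavily simplified model of a first-match rule list.
enum class Action { Legal, WidenTo32, Unsupported };

struct Query {
  unsigned Bits;          // width of the stored scalar
  bool ReleaseOrStronger; // atomic ordering is release or stronger
};

struct Rule {
  bool (*Matches)(const Query &);
  Action Result;
};

// Stand-in for "this store type is in the legal list".
constexpr bool isLegalSize(const Query &Q) {
  return Q.Bits == 8 || Q.Bits == 16 || Q.Bits == 32 || Q.Bits == 64;
}

// Stand-in for the widenScalarIf predicate.
constexpr bool needsWiden(const Query &Q) {
  return Q.Bits < 32 && Q.ReleaseOrStronger;
}

// The first rule whose predicate matches decides the action.
template <std::size_t N>
constexpr Action applyRules(const Rule (&Rules)[N], const Query &Q) {
  for (std::size_t I = 0; I < N; ++I)
    if (Rules[I].Matches(Q))
      return Rules[I].Result;
  return Action::Unsupported;
}

constexpr Rule LegalFirst[] = {{isLegalSize, Action::Legal},
                               {needsWiden, Action::WidenTo32}};
constexpr Rule WidenFirst[] = {{needsWiden, Action::WidenTo32},
                               {isLegalSize, Action::Legal}};

// With the legal list first, an s16 release store is already "legal",
// so the widening rule is never reached.
static_assert(applyRules(LegalFirst, Query{16, true}) == Action::Legal, "");
// With the widening rule first, the same store is widened.
static_assert(applyRules(WidenFirst, Query{16, true}) == Action::WidenTo32,
              "");
```

In this model, non-atomic or weakly ordered narrow stores are unaffected by the order, because `needsWiden` does not match them and they fall through to the legal list.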

IanButterworth · 2025-12-11T00:47:28Z

Tests pass locally so worth running CI again.

IanButterworth · 2025-12-12T08:18:55Z

@arsenm @davemgreen does this look ok now?

davemgreen

I think this is OK for the time being. Thanks for the bug fix. LGTM.

github-actions · 2025-12-13T10:48:24Z

@IanButterworth Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

llvm-ci · 2025-12-13T13:11:21Z

LLVM Buildbot has detected a new failure on builder reverse-iteration running on hexagon-build-03 while building llvm at step 6 "check_all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/110/builds/6781

Here is the relevant piece of the build log for the reference

Step 6 (check_all) failure: test (failure)
******************** TEST 'Clang :: Interpreter/dynamic-library.cpp' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 17
cat /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/dynamic-library.cpp | env LD_LIBRARY_PATH=/local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/Inputs:$LD_LIBRARY_PATH /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.obj/bin/clang-repl | /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.obj/bin/FileCheck /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/dynamic-library.cpp
# executed command: cat /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/dynamic-library.cpp
# .---command stdout------------
# | // REQUIRES: host-supports-jit, x86_64-linux
# | 
# | // To generate libdynamic-library-test.so :
# | // clang -xc++ -o libdynamic-library-test.so -fPIC -shared
# | //
# | // extern "C" {
# | //
# | // int ultimate_answer = 0;
# | // 
# | // int calculate_answer() {
# | //   ultimate_answer = 42;
# | //   return 5;
# | // }
# | //
# | // }
# | 
# | // RUN: cat %s | env LD_LIBRARY_PATH=%S/Inputs:$LD_LIBRARY_PATH clang-repl | FileCheck %s
# | 
# | extern "C" int printf(const char* format, ...);
# | 
# | extern "C" int ultimate_answer;
# | extern "C" int calculate_answer();
# | 
# | %lib libdynamic-library-test.so
# | 
# | printf("Return value: %d\n", calculate_answer());
# | // CHECK: Return value: 5
# | 
# | printf("Variable: %d\n", ultimate_answer);
# | // CHECK-NEXT: Variable: 42
# | 
# | %quit
# `-----------------------------
# executed command: env 'LD_LIBRARY_PATH=/local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/Inputs:$LD_LIBRARY_PATH' /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.obj/bin/clang-repl
# .---command stderr------------
# | /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.obj/bin/clang-repl: error while loading shared libraries: libc++.so.1: cannot open shared object file: No such file or directory
# `-----------------------------
# error: command failed with exit status: 127
# executed command: /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.obj/bin/FileCheck /local/mnt/workspace/bots/hexagon-build-03/reverse-iteration/llvm.src/clang/test/Interpreter/dynamic-library.cpp
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
...

IanButterworth · 2025-12-14T13:59:47Z

Assuming this is unrelated, as it happened on x86_64.

llvmbot added backend:AArch64 llvm:globalisel labels Dec 9, 2025

IanButterworth force-pushed the ib/fix_171494 branch from 9454c68 to c93d641 Compare December 10, 2025 02:53

IanButterworth and others added 4 commits December 10, 2025 09:23

Fix test expectations to match actual codegen

07ca666

IanButterworth force-pushed the ib/fix_171494 branch from ceea9c9 to 07ca666 Compare December 10, 2025 14:23

arsenm reviewed Dec 10, 2025

arsenm reviewed Dec 10, 2025

davemgreen reviewed Dec 10, 2025

IanButterworth and others added 2 commits December 10, 2025 09:52

Remove atomic load tests that don't exercise the fix

07c8acc

IanButterworth and others added 2 commits December 10, 2025 15:25

Merge branch 'main' into ib/fix_171494

dc120e0

davemgreen approved these changes Dec 13, 2025

davemgreen merged commit bea172c into llvm:main Dec 13, 2025
10 checks passed

IanButterworth deleted the ib/fix_171494 branch December 13, 2025 12:45

IanButterworth mentioned this pull request Dec 13, 2025

jitlayers: Use GlobalISel on AArch64 at -O0/-O1 JuliaLang/julia#60339

Open

anonymouspc pushed a commit to anonymouspc/llvm that referenced this pull request Dec 15, 2025

[AArch64][GlobalISel] Fix incorrect codegen for FPR16/FPR8 to GPR copies (llvm#171499)

7e70bbc

Fixes llvm#171494
