Update coresight decoder#2

Open
ram9ir wants to merge 7 commits into AFLplusplus:master from ram9ir:dev


ram9ir commented Mar 18, 2026

Added/Fixed:

  • added trace decoding support for 32-bit ARM binaries;
  • added handling of the Exact Match Address packet; the decoder now updates the three address registers when it decodes address-related packets;
  • added the ability to save the instruction stream to a file in Module + Offset (modoff) format, which is compatible with Lighthouse for code coverage exploration;
  • added handling of several new address packets;
  • fixed handling of indirect branches.
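
To illustrate the address-register item above, here is a minimal sketch of how three address registers can be shifted on each address packet and then looked up by an exact-match packet. `AddressRegs` and its method names are hypothetical and are not the PR's classes. Returning an empty result for a QE value of 3 is an assumption made for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the decoder's address-register state.
struct AddressRegs {
  std::array<std::uint64_t, 3> regs{};  // regs[0] holds the most recent address

  // Shift every entry one slot toward the back (the oldest one is dropped),
  // then store the new address at index 0.
  void update(std::uint64_t address) {
    std::rotate(regs.rbegin(), regs.rbegin() + 1, regs.rend());
    regs[0] = address;
  }

  // Resolve an exact-match packet: the low two header bits (QE) select a
  // register. QE == 3 is reported as "no value" (an assumption of this sketch).
  std::optional<std::uint64_t> exactMatch(std::uint8_t header) const {
    const std::uint8_t qe = header & 0b11;
    if (qe > 2) return std::nullopt;
    return regs[qe];
  }
};
```

After three `update` calls, `regs[0]` holds the newest address and `regs[2]` the oldest, so an exact-match header with QE = 1 yields the second most recent address.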


ram9ir commented Mar 19, 2026

Hey @vanhauser-thc, we are ready for a review here, please.

vanhauser-thc requested a review from Copilot on March 19, 2026, 09:37

Copilot AI left a comment

Pull request overview

This PR updates the CoreSight (ETMv4) trace decoder to support additional packet formats and 32-bit ARM binaries, and adds an optional instruction-flow export for Lighthouse-compatible coverage exploration.

Changes:

  • Added decoding support for additional address packet types (IS1 variants, 32-bit address+context, exact match address packets) and updated address-register tracking.
  • Added optional instruction-flow collection and export to coverage.txt in module+offset format (gated by INSN_SAVE).
  • Added ARM32 (A32) disassembly mode selection (gated by ARM32) and updated branch handling.
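
As a rough illustration of the module+offset export summarized above, the sketch below writes one `module+offset` line per recorded instruction, with the offset in hexadecimal. `InsnFlow` and `writeModoff` are hypothetical names. The element type mirrors the `insn_flow` vector quoted elsewhere in this review, and the stream is passed in so the same function could target a file or a string buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Same element type as the insn_flow vector in the PR: (module name, offset).
using InsnFlow = std::vector<std::pair<std::string, std::uint64_t>>;

// Hypothetical helper: emit "module+hexoffset" lines to any output stream.
void writeModoff(std::ostream &out, const InsnFlow &flow) {
  // Save and restore the flags of the stream that is actually written to,
  // because std::hex stays in effect until it is changed back.
  const std::ios_base::fmtflags saved = out.flags();
  out << std::hex;
  for (const auto &[module, offset] : flow) {
    out << module << '+' << offset << '\n';
  }
  out.flags(saved);
}
```

Using `'\n'` rather than `std::endl` avoids flushing the stream after every line, which can matter when the instruction flow is long.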

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| src/processor.cpp | Enables instruction-flow saving via env var; passes binary/module name into MemoryImage. |
| src/process.cpp | Adds global instruction-flow buffer + export; expands packet/state handling to new address packet types. |
| src/libcsdec.cpp | Extends C API init to pass image paths into MemoryImage; enables instruction-flow saving via env var. |
| src/disassembler.cpp | Adds ARM32 mode support; logs instruction flow during disassembly; adjusts branch metadata. |
| src/decoder.cpp | Adds decoding for new address packet headers/types; implements exact-match address packets; tracks multiple address registers. |
| src/common.cpp | Extends MemoryImage to store a binary/module name. |
| include/common.hpp | Adds binary_name to MemoryImage and updates constructors. |
| include/decoder.hpp | Adds new packet types and address-register storage/APIs to Decoder. |
| include/libcsdec.h | Extends libcsdec_memory_image to include a path field used as module name. |
| HOWTO.md | Documents INSN_SAVE=1 for exporting coverage.txt. |

Comments suppressed due to low confidence (2)

src/process.cpp:555

  • PathProcess exception handling only transitions on ETM4_PKT_I_ADDR_L_64IS{0,1}. If the trace emits a short address, exact-match address, 32-bit long address, or an address-with-context packet in the exception sequence, the state machine will remain stuck in EXCEPTION_ADDR1/2. Consider broadening these checks to accept the same set of supported address packet types as the main Process decoder (including the new 32-bit/context/exact-match packets).
    case DecodeState::EXCEPTION_ADDR1: {
      if (packet.type == PacketType::ETM4_PKT_I_ADDR_L_64IS0 || packet.type == PacketType::ETM4_PKT_I_ADDR_L_64IS1) {
        this->decoder.state = DecodeState::EXCEPTION_ADDR2;
      }
      break;
    }

    case DecodeState::EXCEPTION_ADDR2: {
      if (packet.type == PacketType::ETM4_PKT_I_ADDR_L_64IS0 || packet.type == PacketType::ETM4_PKT_I_ADDR_L_64IS1) {
        this->decoder.state = DecodeState::TRACE;
      }
      break;
    }
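
One possible shape for the broadened check suggested above is a single predicate shared by both exception states. This is only a sketch: the enums are local stand-ins, and every enumerator marked "hypothetical" is an assumed name that may not match the identifiers in include/decoder.hpp.

```cpp
#include <cassert>

// Local stand-in enums for this sketch only.
enum class PacketType {
  ETM4_PKT_I_ADDR_L_64IS0,       // appears in the PR
  ETM4_PKT_I_ADDR_L_64IS1,       // appears in the PR
  ETM4_ADDR_MATCH,               // appears in the PR
  ETM4_PKT_I_ADDR_L_32IS0,       // hypothetical name
  ETM4_PKT_I_ADDR_L_32IS1,       // hypothetical name
  ETM4_PKT_I_ADDR_S_IS0,         // hypothetical name
  ETM4_PKT_I_ADDR_S_IS1,         // hypothetical name
  ETM4_PKT_I_ADDR_CTXT_L_32IS0,  // hypothetical name
  ETM4_PKT_I_ADDR_CTXT_L_32IS1,  // hypothetical name
  ETM4_PKT_I_ATOM,               // hypothetical non-address packet
};

enum class DecodeState { TRACE, EXCEPTION_ADDR1, EXCEPTION_ADDR2 };

// True for every packet type that carries or resolves to an address.
constexpr bool isAddressPacket(PacketType type) {
  switch (type) {
    case PacketType::ETM4_PKT_I_ADDR_L_64IS0:
    case PacketType::ETM4_PKT_I_ADDR_L_64IS1:
    case PacketType::ETM4_ADDR_MATCH:
    case PacketType::ETM4_PKT_I_ADDR_L_32IS0:
    case PacketType::ETM4_PKT_I_ADDR_L_32IS1:
    case PacketType::ETM4_PKT_I_ADDR_S_IS0:
    case PacketType::ETM4_PKT_I_ADDR_S_IS1:
    case PacketType::ETM4_PKT_I_ADDR_CTXT_L_32IS0:
    case PacketType::ETM4_PKT_I_ADDR_CTXT_L_32IS1:
      return true;
    default:
      return false;
  }
}

// Advance the exception sequence on any address packet; otherwise stay put.
constexpr DecodeState nextExceptionState(DecodeState state, PacketType type) {
  if (!isAddressPacket(type)) return state;
  switch (state) {
    case DecodeState::EXCEPTION_ADDR1: return DecodeState::EXCEPTION_ADDR2;
    case DecodeState::EXCEPTION_ADDR2: return DecodeState::TRACE;
    default: return state;
  }
}
```

Keeping the list of address packet types in one predicate means a newly supported packet only has to be added in one place, rather than in each state's condition.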

src/process.cpp:31

  • insn_flow is a global that accumulates entries across decoding sessions, but Process::reset() doesn't clear it. If the same process instance is reused (e.g., via repeated libcsdec_reset_edge calls), coverage.txt will contain stale data from previous runs and memory usage will grow unbounded. Clear insn_flow during reset (or at the start/end of run/final).
std::vector<std::pair<std::string, uint64_t>> insn_flow;
bool need_save_insn_flow = false;

void Process::reset(std::vector<MemoryMap> &&memory_maps,
                    const std::uint8_t target_trace_id) {
  this->data.bitmap.reset();
  this->deformatter.reset(target_trace_id);
  this->decoder.reset();
  this->state.reset(std::move(memory_maps));
}
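
A minimal sketch of the suggested fix, assuming the buffer stays a global: clear it wherever a session is reset. `ProcessSketch` is a hypothetical stand-in for `Process`, and the other resets from the snippet above are only indicated by a comment.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Same declaration as the global quoted above.
std::vector<std::pair<std::string, std::uint64_t>> insn_flow;

// Hypothetical stand-in for Process; only the part relevant here is shown.
struct ProcessSketch {
  void reset() {
    // ... the bitmap, deformatter, decoder and state resets would go here ...

    // Drop entries recorded by the previous decoding session so the next
    // export contains only data from the current run.
    insn_flow.clear();
  }
};
```

`clear()` removes the elements but keeps the allocated capacity, which avoids reallocation when the next run records a similar number of instructions.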

Comment on lines +18 to 20
extern std::vector<std::pair<std::string, uint64_t>> insn_flow;
extern bool need_save_insn_flow;
cs_insn *disassembleNextBranchInsn(const csh *handle,
Comment on lines 161 to +170
void Decoder::reset() {
  this->trace_data = std::vector<std::uint8_t>();
  this->trace_data_offset = 0;
  this->state = DecodeState::START;
}

void Decoder::update_address_regs(uint64_t address){
  std::rotate(this->address_regs.rbegin(), this->address_regs.rbegin() + 1, this->address_regs.rend());
  this->address_regs[0] = address;
}
Comment on lines 59 to 65
  std::unique_ptr<Process> process = std::make_unique<Process>(
      std::move(memory_images),
      Bitmap(reinterpret_cast<std::uint8_t *>(bitmap_addr),
             static_cast<std::size_t>(bitmap_size)),
      Cache());

  need_save_insn_flow = std::getenv("INSN_SAVE");
  // Release ownership and pass it to the C API side.
Comment on lines 92 to 96
  for (int id = 0; id < binary_file_num; ++id) {
    const std::string path = argv[4 + id * 3];
    std::vector<std::uint8_t> data = readBinaryFile(path);

-   memory_images.emplace_back(MemoryImage(std::move(data), (std::size_t)id));
+   memory_images.emplace_back(MemoryImage(std::move(data), (std::size_t)id, basename(argv[4 + id * 3])));
  }
Comment on lines +518 to +521
  // Header is correct, but packet size is incomplete.
  if (rest_data_size < 10) {
    return Packet{PacketType::PKT_INCOMPLETE, rest_data_size, 0, 0, 0};
  }
Comment on lines 24 to 28
struct libcsdec_memory_image {
  void *data; /**< Binary data of the memory image. */
  char path[PATH_MAX];
  size_t size; /**< Size of the memory image. */
};
Comment on lines 47 to 52
    usage(argv[0]);
    std::exit(EXIT_FAILURE);
  }

  need_save_insn_flow = std::getenv("INSN_SAVE");
  const std::string trace_data_filename = argv[1];
  const std::uint8_t trace_id = std::stol(argv[2], nullptr, 16);
Comment on lines +43 to +48
  std::ios_base::fmtflags f(std::cout.flags());
  for (auto el : insn_flow) {
    CoverageFile << el.first << "+" << std::hex << el.second << std::endl;
  }
  std::cout.flags(f);
  CoverageFile.close();
Comment on lines +642 to +657
Packet Decoder::decodeExactMatchAddressPacket(){
  std::uint8_t header = this->trace_data[this->trace_data_offset];
  Packet packet;
  // The QE field indicates the queue entry that contains the exact match
  switch(header & 0b11){
    case 0b00:
      packet = {PacketType::ETM4_ADDR_MATCH, 1, 0, 0, address_regs[0]};
      break;
    case 0b01:
      packet = {PacketType::ETM4_ADDR_MATCH, 1, 0, 0, address_regs[1]};
      break;
    case 0b10:
      packet = {PacketType::ETM4_ADDR_MATCH, 1, 0, 0, address_regs[2]};
      break;
  }
  return packet;
//(type == BranchType::DIRECT_BRANCH) ? offset + insn->size : 0;

@vanhauser-thc (Member)

as I am clueless about this software I activated copilot :-)
just ping me when you think this can be merged.

ram9ir commented Mar 20, 2026

Fixed the address packet size check (Address with Context 32-bit IS0/IS1 Long).
Ready to merge.
@vanhauser-thc
