ELF Support¶

The Cpu0 backend generates object files in the ELF format.

ELF (Executable and Linkable Format) is a common standard file format for executables, object code, shared libraries, and core dumps. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999 it was chosen as the standard binary file format for Unix and Unix-like systems on x86 by the x86open project. See [1].

The binary encoding of the Cpu0 instruction set in object files has been verified in previous chapters. However, we did not delve into the ELF file format, such as the ELF header and relocation records, at that time.

In this chapter, you will learn how to use tools such as llvm-objdump, llvm-readelf, and others to analyze ELF files generated by Cpu0. Through these tools, you will also understand the ELF file format itself.

This chapter introduces these tools to readers because understanding the popular ELF format and analysis tools is valuable. An LLVM compiler engineer is responsible for ensuring that their backend generates correct object files.

With these tools, you can verify the correctness of the generated ELF format.

The Cpu0 author has published a Chinese-language book titled “System Software,” which introduces topics such as assemblers, linkers, loaders, compilers, and operating systems in both concept and practice. It demonstrates how to analyze ELF files using binutils and gcc, and includes example code.

The book “System Software” [2] written by Beck is a well-known resource for explaining what the compiler, linker, and loader produce, and how they work together conceptually. You may refer to it to understand how Relocation Records work if you need a refresher or are learning this topic for the first time.

[3], [4], [5] are Chinese documents about this topic, available on the Cpu0 author’s website.

ELF format ¶

ELF is a format used in both object and executable files. Therefore, there are two views of it, as shown in Fig. 50.

_images/12.png — Fig. 50 ELF file format overview¶

As shown in Fig. 50, the “Section header table” includes sections .text, .rodata, …, .data, which are used for code, read-only data, and read/write data, respectively. The “Program header table” includes segments used at run time for code and data.

The definition of segments describes the run-time layout of code and data, while sections describe the link-time layout.

ELF header and Section header table ¶

Let’s run Chapter9_3/ with ch6_1.cpp, and dump ELF header information using llvm-readelf -h to see what the ELF header contains.

input$ ~/llvm/test/build/bin/llc -march=cpu0 \
  -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o

input$ llvm-readelf -h ch6_1.cpu0.o
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           <unknown>: 0xc9
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          176 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         8
  Section header string table index: 5
input$

input$ ~/llvm/test/build/bin/llc -march=mips \
  -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.mips.o

input$ llvm-readelf -h ch6_1.mips.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          200 (bytes into file)
  Flags:                             0x50001007, noreorder, pic, cpic, o32, mips32
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         9
  Section header string table index: 6
input$

As shown in the ELF headers above, the header contains information such as the magic number, version, ABI, and more. The Machine field for Cpu0 is listed as unknown, whereas the MIPS object file is recognized as MIPS R3000.

This happens because Cpu0 is not a CPU recognized by the llvm-readelf utility.
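
The first 16 bytes of the header (the Magic line above) already encode the class, byte order, and OS/ABI that llvm-readelf prints. As a minimal sketch of how those bytes are interpreted, the following C++ fragment decodes them by hand. The struct IdentInfo and the function decodeIdent are hypothetical names introduced for illustration; the byte positions follow the ELF specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical summary of the e_ident bytes at the start of an ELF file.
struct IdentInfo {
  bool IsElf;        // true if the file starts with 0x7f 'E' 'L' 'F'
  int Bits;          // 32 or 64, 0 if the class byte is not recognized
  bool IsBigEndian;  // true if the data encoding byte is 2
  uint8_t OsAbi;     // raw OS/ABI byte (0 = System V, 3 = GNU/Linux)
};

// Decode the identification bytes. Positions per the ELF specification:
// byte 4 = class, byte 5 = data encoding, byte 7 = OS/ABI.
IdentInfo decodeIdent(const uint8_t *Ident, size_t Size) {
  assert(Size >= 16 && "e_ident is always 16 bytes long");
  IdentInfo Info{};
  Info.IsElf = Ident[0] == 0x7f && Ident[1] == 'E' &&
               Ident[2] == 'L' && Ident[3] == 'F';
  Info.Bits = Ident[4] == 1 ? 32 : (Ident[4] == 2 ? 64 : 0);
  Info.IsBigEndian = Ident[5] == 2;
  Info.OsAbi = Ident[7];
  return Info;
}
```

For the Magic bytes shown above (7f 45 4c 46 01 02 01 03 ...), decodeIdent reports a 32-bit, big-endian file with OS/ABI value 3, which llvm-readelf prints as UNIX - GNU.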

Let’s check the ELF segments information with the following command:

input$ llvm-readelf -l ch6_1.cpu0.o

There are no program headers in this file.
input$

This result is expected: the Cpu0 object file is a relocatable file meant for linking, not execution, and we have not implemented a linker at this point. Therefore, the program header (segment) table is empty.
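
The same conclusion can be read directly from the header bytes: an ELF32 header stores e_type at byte offset 16 and e_phnum at byte offset 44, both as 16-bit values. The sketch below assumes a big-endian ELF32 header such as the Cpu0 one above and checks for a relocatable file without program headers. The names readBE16 and isRelocatableWithoutSegments are hypothetical helpers, not part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 16-bit big-endian value starting at byte offset Off.
uint16_t readBE16(const std::vector<uint8_t> &Buf, size_t Off) {
  assert(Off + 2 <= Buf.size() && "read past the end of the header");
  return static_cast<uint16_t>((Buf[Off] << 8) | Buf[Off + 1]);
}

constexpr size_t kTypeOffset = 16;   // e_type in an ELF32 header
constexpr size_t kPhnumOffset = 44;  // e_phnum in an ELF32 header
constexpr uint16_t kEtRel = 1;       // ET_REL: relocatable file

// True if the header describes a relocatable file with no program headers,
// which is what "Type: REL" and "Number of program headers: 0" report.
bool isRelocatableWithoutSegments(const std::vector<uint8_t> &Header) {
  return readBE16(Header, kTypeOffset) == kEtRel &&
         readBE16(Header, kPhnumOffset) == 0;
}
```

A 52-byte header whose e_type field holds 1 and whose e_phnum field holds 0 satisfies the check; any non-zero e_phnum does not.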

Next, let’s check the ELF sections. Each section includes offset and size information.

input$ llvm-readelf -S ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000034 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000310 000018 08      8   1  4
  [ 3] .data             PROGBITS        00000000 000068 000004 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00006c 000000 00  WA  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000 00006c 000028 00   A  0   0  4
  [ 6] .rel.eh_frame     REL             00000000 000328 000008 08      8   5  4
  [ 7] .shstrtab         STRTAB          00000000 000094 00003e 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 000264 000090 10      9   6  4
  [ 9] .strtab           STRTAB          00000000 0002f4 00001b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
input$
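
The location of each section header follows from three header fields: the start of the section header table, the size of one entry, and the number of entries. The following sketch shows the arithmetic; sectionHeaderOffset is a hypothetical helper name, and the 40-byte entry size is the "Size of section headers" value reported by llvm-readelf -h.

```cpp
#include <cassert>
#include <cstdint>

// File offset of the Index-th section header.
// ShOff   = start of the section header table
// EntSize = size of one section header
// Count   = number of section headers
// Index == Count is allowed and yields the first byte past the table.
uint32_t sectionHeaderOffset(uint32_t ShOff, uint32_t EntSize,
                             uint32_t Count, uint32_t Index) {
  assert(Index <= Count && "section index out of range");
  return ShOff + Index * EntSize;
}
```

With the values printed above (10 headers of 40 bytes each, starting at 0xd4), sectionHeaderOffset(0xd4, 40, 10, 10) gives 0x264, which is exactly the file offset the listing reports for .symtab.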

Relocation Record ¶

The Cpu0 backend translates global variables as follows:

input$ clang -target mips-unknown-linux-gnu -c ch6_1.cpp \
  -emit-llvm -o ch6_1.bc
input$ ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic \
  -filetype=asm ch6_1.bc -o ch6_1.cpu0.s
input$ cat ch6_1.cpu0.s
  .section .mdebug.abi32
  .previous
  .file "ch6_1.bc"
  .text
  ...
  .cfi_startproc
  .frame  $sp,8,$lr
  .mask   0x00000000,0
  .set  noreorder
  .cpload $t9
  ...
  lui $2, %got_hi(gI)
  addu $2, $2, $gp
  ld $2, %got_lo(gI)($2)
  ...
  .type gI,@object              # @gI
  .data
  .globl  gI
  .align  2
gI:
  .4byte  100                     # 0x64
  .size gI, 4


input$ ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic \
  -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
input$ llvm-objdump -s ch6_1.cpu0.o

ch6_1.cpu0.o:     file format elf32-big

Contents of section .text:
// .cpload machine instruction
 0000 0fa00000 0daa0000 13aa6000 ........  ..............`.
 ...
 0020 002a0000 00220000 012d0000 0ddd0008  .*..."...-......
 ...
input$

input$ llvm-readelf -tr ch6_1.cpu0.o
There are 8 section headers, starting at offset 0xb0:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000044 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002a8 000020 08   6   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 000078 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000080 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .shstrtab
       STRTAB          00000000 000080 000030 00   0   0  1
       [00000000]:
  [ 6] .symtab
       SYMTAB          00000000 0001f0 000090 10   7   5  4
       [00000000]:
  [ 7] .strtab
       STRTAB          00000000 000280 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2a8 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000805 unrecognized: 5       00000000   _gp_disp
00000004  00000806 unrecognized: 6       00000000   _gp_disp
00000020  00000616 unrecognized: 16      00000004   gI
00000028  00000617 unrecognized: 17      00000004   gI


input$ llvm-readelf -tr ch6_1.mips.o
There are 9 section headers, starting at offset 0xc8:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000038 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002f8 000018 08   7   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 00006c 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000074 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .reginfo
       MIPS_REGINFO    00000000 000074 000018 00   0   0  1
       [00000002]: ALLOC
  [ 6] .shstrtab
       STRTAB          00000000 00008c 000039 00   0   0  1
       [00000000]:
  [ 7] .symtab
       SYMTAB          00000000 000230 0000a0 10   8   6  4
       [00000000]:
  [ 8] .strtab
       STRTAB          00000000 0002d0 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2f8 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000905 R_MIPS_HI16       00000000   _gp_disp
00000004  00000906 R_MIPS_LO16       00000000   _gp_disp
0000001c  00000709 R_MIPS_GOT16      00000004   gI

As described in the section “Handle $gp register in PIC addressing mode,” the backend lowers “.cpload $reg” into the following sequence.

// Lower ".cpload $reg" to
//  "lui   $gp, %hi(_gp_disp)"
//  "ori $gp, $gp, %lo(_gp_disp)"
//  "addu  $gp, $gp, $t9"

The _gp_disp value is determined by the loader, so it is undefined in the object file. The relocation records for offsets 0 and 4 of the .text section both refer to the _gp_disp symbol.

Offsets 0 and 4 of the .text section correspond to the instructions lui $gp, %hi(_gp_disp) and ori $gp, $gp, %lo(_gp_disp), whose encoded object representations are 0fa00000 and 0daa0000, respectively.

The object file sets the %hi(_gp_disp) and %lo(_gp_disp) fields to zero, since the loader will determine the actual _gp_disp value at runtime and patch these two relocation entries accordingly.
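
To make the patching step concrete, the sketch below fills the zeroed 16-bit immediate fields of the two instruction words once a value for _gp_disp is known. It is an illustration under stated assumptions, not the Cpu0 linker or loader: the helper names are hypothetical, the value 0x00012340 used afterwards is invented, and the split into plain upper and lower halves assumes that ori does not sign-extend its immediate. A real implementation may use a different rule, for example rounding the upper half when the lower half is added by a sign-extending instruction.

```cpp
#include <cassert>
#include <cstdint>

// Replace the low 16 bits (the immediate field) of an instruction word.
uint32_t patchImm16(uint32_t Insn, uint32_t Imm) {
  assert(Imm <= 0xffffu && "immediate must fit in 16 bits");
  return (Insn & 0xffff0000u) | Imm;
}

// Assumption: the value is split into plain halves, which matches a lui
// followed by an ori that zero-extends its immediate.
uint32_t hi16(uint32_t Value) { return Value >> 16; }
uint32_t lo16(uint32_t Value) { return Value & 0xffffu; }
```

With the hypothetical value 0x00012340, patchImm16(0x0fa00000, hi16(0x00012340)) yields 0x0fa00001 and patchImm16(0x0daa0000, lo16(0x00012340)) yields 0x0daa2340; the opcode and register fields of both words are left untouched.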

You can verify the correctness of Cpu0’s handling of %hi(_gp_disp) and %lo(_gp_disp) by comparing its relocation records with the MIPS records R_MIPS_HI16 and R_MIPS_LO16 for _gp_disp: both targets use relocation types 5 and 6 at offsets 0 and 4, even though llvm-readelf does not recognize the Cpu0 types by name.

The instructions lui $2, %got_hi(gI) and ld $2, %got_lo(gI)($2) behave similarly. Because the actual address of the .data section variable gI is unknown at compile time, Cpu0 sets the address fields to 0 and creates relocation records at offsets 0x00000020 and 0x00000028 of the .text section.

The linker or loader will patch this address at link time (for static linking) or load time (for dynamic linking), depending on how the program is built.
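
The Info column of a relocation record packs two values. In a 32-bit ELF file the upper 24 bits hold the symbol table index and the low 8 bits hold the relocation type. The following sketch decodes that field and also derives the entry count of a REL section from its size; RelInfo, decodeRelInfo32, and relEntryCount are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// The two values packed into the r_info field of an Elf32_Rel entry.
struct RelInfo {
  uint32_t SymIndex;  // index into the symbol table
  uint32_t Type;      // target-specific relocation type
};

// ELF32 layout: symbol index in the upper 24 bits, type in the low 8 bits.
RelInfo decodeRelInfo32(uint32_t Info) {
  return {Info >> 8, Info & 0xffu};
}

// Number of entries in a relocation section, from its Size and ES columns.
uint32_t relEntryCount(uint32_t SectionSize, uint32_t EntrySize) {
  assert(EntrySize != 0 && SectionSize % EntrySize == 0);
  return SectionSize / EntrySize;
}
```

Applied to the Cpu0 listing above, Info 0x00000805 decodes to symbol 8 with type 5, and Info 0x00000616 decodes to symbol 6 with type 0x16 (decimal 22). The .rel.text section has size 0x20 and entry size 8, which gives the 4 entries that llvm-readelf reports.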

llvm-objdump ¶

llvm-objdump -t -r ¶

The llvm-objdump -t -r command displays the symbol table and relocation records; the relocation output is comparable to that of llvm-readelf -r.

To examine the differences, compare the output of GNU objdump with that of an llvm-objdump built with the Cpu0 backend, as shown in the following example:

118-165-83-12:input Jonathan$ clang -target mips-unknown-linux-gnu -c \
  ch9_3.cpp -emit-llvm -o ch9_3.bc
118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llc \
  -march=cpu0 -relocation-model=pic -filetype=obj ch9_3.bc -o ch9_3.cpu0.o

118-165-78-12:input Jonathan$ objdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o:     file format elf32-big

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss         00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp


RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000084 UNKNOWN           _gp_disp
00000088 UNKNOWN           _gp_disp
000000e0 UNKNOWN           _Z5sum_iiz


118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llvm-objdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o: file format ELF32-CPU0

RELOCATION RECORDS FOR [.text]:
132 R_CPU0_HI16 _gp_disp
136 R_CPU0_LO16 _gp_disp
224 R_CPU0_CALL16 _Z5sum_iiz

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss         00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp

The llvm-objdump tool can correctly display the file format and relocation record information, whereas the GNU objdump cannot. This is because the Cpu0-specific relocation record definitions have been added to ELF.h within LLVM’s source code, enabling llvm-objdump to recognize and interpret them properly.

include/llvm/Support/ELF.h

// Machine architectures
enum {
  ...
  EM_CPU0          = 998, // Document LLVM Backend Tutorial Cpu0
  EM_CPU0_LE       = 999  // EM_CPU0_LE: little endian; EM_CPU0: big endian
};

lib/Object/ELF.cpp

...

StringRef getELFRelocationTypeName(uint32_t Machine, uint32_t Type) {
  switch (Machine) {
  ...
  case ELF::EM_CPU0:
    switch (Type) {
#include "llvm/Support/ELFRelocs/Cpu0.def"
    default:
      break;
    }
    break;
  ...
  }

include/llvm/Support/ELFRelocs/Cpu0.def

#ifndef ELF_RELOC
#error "ELF_RELOC must be defined"
#endif

ELF_RELOC(R_CPU0_NONE,                0)
ELF_RELOC(R_CPU0_32,                  2)
ELF_RELOC(R_CPU0_HI16,                5)
ELF_RELOC(R_CPU0_LO16,                6)
ELF_RELOC(R_CPU0_GPREL16,             7)
ELF_RELOC(R_CPU0_LITERAL,             8)
ELF_RELOC(R_CPU0_GOT16,               9)
ELF_RELOC(R_CPU0_PC16,               10)
ELF_RELOC(R_CPU0_CALL16,             11)
ELF_RELOC(R_CPU0_GPREL32,            12)
ELF_RELOC(R_CPU0_PC24,               13)
ELF_RELOC(R_CPU0_GOT_HI16,           22)
ELF_RELOC(R_CPU0_GOT_LO16,           23)
ELF_RELOC(R_CPU0_RELGOT,             36)
ELF_RELOC(R_CPU0_TLS_GD,             42)
ELF_RELOC(R_CPU0_TLS_LDM,            43)
ELF_RELOC(R_CPU0_TLS_DTP_HI16,       44)
ELF_RELOC(R_CPU0_TLS_DTP_LO16,       45)
ELF_RELOC(R_CPU0_TLS_GOTTPREL,       46)
ELF_RELOC(R_CPU0_TLS_TPREL32,        47)
ELF_RELOC(R_CPU0_TLS_TP_HI16,        49)
ELF_RELOC(R_CPU0_TLS_TP_LO16,        50)
ELF_RELOC(R_CPU0_GLOB_DAT,           51)
ELF_RELOC(R_CPU0_JUMP_SLOT,          127)
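
The .def file works through the preprocessor: the including code defines ELF_RELOC before the #include, so the same list can expand into different constructs. The self-contained sketch below illustrates the idea with a handful of entries copied from Cpu0.def. The macro CPU0_RELOC_LIST stands in for the #include of the .def file, and the names Cpu0Reloc and cpu0RelocName are hypothetical; the exact macro bodies used inside LLVM may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// A few entries copied from Cpu0.def; this list replaces the #include so
// that the example compiles on its own.
#define CPU0_RELOC_LIST(X)                                                     \
  X(R_CPU0_NONE, 0)                                                            \
  X(R_CPU0_32, 2)                                                              \
  X(R_CPU0_HI16, 5)                                                            \
  X(R_CPU0_LO16, 6)                                                            \
  X(R_CPU0_CALL16, 11)                                                         \
  X(R_CPU0_GOT_HI16, 22)                                                       \
  X(R_CPU0_GOT_LO16, 23)

// First expansion: one enumerator per relocation type.
#define ELF_RELOC(Name, Value) Name = Value,
enum Cpu0Reloc : uint32_t { CPU0_RELOC_LIST(ELF_RELOC) };
#undef ELF_RELOC

// Second expansion: one switch case per relocation type, returning its name.
std::string cpu0RelocName(uint32_t Type) {
  switch (Type) {
#define ELF_RELOC(Name, Value) case Value: return #Name;
    CPU0_RELOC_LIST(ELF_RELOC)
#undef ELF_RELOC
  default:
    return "Unknown";
  }
}
```

With these definitions, cpu0RelocName(5) returns "R_CPU0_HI16" and cpu0RelocName(22) returns "R_CPU0_GOT_HI16", which is the kind of lookup that lets a tool print relocation names instead of raw numbers.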

include/llvm/Object/ELFObjectFile.h

template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
            ::getRelocationValueString(DataRefImpl Rel,
                      SmallVectorImpl<char> &Result) const {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
  res = symname;
  break;
  ...
}

template<support::endianness target_endianness, bool is64Bits>
StringRef ELFObjectFile<target_endianness, is64Bits>
             ::getFileFormatName() const {
  switch(Header->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    return "ELF32-CPU0";
  ...
}

template<support::endianness target_endianness, bool is64Bits>
unsigned ELFObjectFile<target_endianness, is64Bits>::getArch() const {
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
  return (target_endianness == support::little) ?
       Triple::cpu0el : Triple::cpu0;
  ...
}

In addition to llvm-objdump -t -r, the llvm-readobj -h command can be used to display the Cpu0 ELF header information, thanks to the EM_CPU0 definition added earlier.

llvm-objdump -d ¶

Run the example code from the previous chapter using the command llvm-objdump -d to disassemble the ELF file and view its contents in hexadecimal format as shown below:

JonathantekiiMac:input Jonathan$ clang -target mips-unknown-linux-gnu -c \
  ch8_1_1.cpp -emit-llvm -o ch8_1_1.bc
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llc \
  -march=cpu0 -relocation-model=pic -filetype=obj ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o: file format ELF32-unknown

Disassembly of section .text:error: no disassembler for target cpu0-unknown-unknown

To support llvm-objdump, the following code is added in Chapter10_1/. (Note: The DecoderMethod for brtarget24 was added in a previous chapter.)

lbdex/chapters/Chapter10_1/CMakeLists.txt

tablegen(LLVM Cpu0GenDisassemblerTables.inc -gen-disassembler)

  Cpu0Disassembler

add_subdirectory(Disassembler)

lbdex/chapters/Chapter10_1/Cpu0InstrInfo.td

let isBranch=1, isTerminator=1, isBarrier=1, imm16=0, hasDelaySlot = 1,
    isIndirectBranch = 1 in
class JumpFR<bits<8> op, string instr_asm, RegisterClass RC>:
  FL<op, (outs), (ins RC:$ra),
     !strconcat(instr_asm, "\t$ra"), [(brind RC:$ra)], IIBranch> {
  let rb = 0;
  let imm16 = 0;
//#if CH >= CH10_1 1.5
  let DecoderMethod = "DecodeJumpFR";
//#endif
}

  class JumpLink<bits<8> op, string instr_asm>:
    FJ<op, (outs), (ins calltarget:$target, variable_ops),
       !strconcat(instr_asm, "\t$target"), [(Cpu0JmpLink imm:$target)],
       IIBranch> {
//#if CH >= CH10_1 2
       let DecoderMethod = "DecodeJumpTarget";
//#endif
       }

lbdex/chapters/Chapter10_1/Disassembler/CMakeLists.txt

add_llvm_component_library(LLVMCpu0Disassembler
  Cpu0Disassembler.cpp

  LINK_COMPONENTS
  MCDisassembler
  Cpu0Info
  Support

  ADD_TO_COMPONENT
  Cpu0
  )

lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp

//===- Cpu0Disassembler.cpp - Disassembler for Cpu0 -------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is part of the Cpu0 Disassembler.
//
//===----------------------------------------------------------------------===//

#include "Cpu0.h"

#include "Cpu0RegisterInfo.h"
#include "Cpu0Subtarget.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCFixedLenDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/TargetRegistry.h"

using namespace llvm;

#define DEBUG_TYPE "cpu0-disassembler"

typedef MCDisassembler::DecodeStatus DecodeStatus;

namespace {

/// Cpu0DisassemblerBase - a disassembler class for Cpu0.
class Cpu0DisassemblerBase : public MCDisassembler {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0DisassemblerBase(const MCSubtargetInfo &STI, MCContext &Ctx,
                       bool bigEndian) :
    MCDisassembler(STI, Ctx),
    IsBigEndian(bigEndian) {}

  virtual ~Cpu0DisassemblerBase() {}

protected:
  bool IsBigEndian;
};

/// Cpu0Disassembler - a disassembler class for Cpu032.
class Cpu0Disassembler : public Cpu0DisassemblerBase {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0Disassembler(const MCSubtargetInfo &STI, MCContext &Ctx, bool bigEndian)
      : Cpu0DisassemblerBase(STI, Ctx, bigEndian) {
  }

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
                              ArrayRef<uint8_t> Bytes, uint64_t Address,
                              raw_ostream &CStream) const override;
};

} // end anonymous namespace

// Decoder tables for GPR register
static const unsigned CPURegsTable[] = {
  Cpu0::ZERO, Cpu0::AT, Cpu0::V0, Cpu0::V1,
  Cpu0::A0, Cpu0::A1, Cpu0::T9, Cpu0::T0, 
  Cpu0::T1, Cpu0::S0, Cpu0::S1, Cpu0::GP, 
  Cpu0::FP, Cpu0::SP, Cpu0::LR, Cpu0::SW
};

// Decoder tables for co-processor 0 register
static const unsigned C0RegsTable[] = {
  Cpu0::PC, Cpu0::EPC
};

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder);
static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder);
static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder);
static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

namespace llvm {
extern Target TheCpu0elTarget, TheCpu0Target, TheCpu064Target,
              TheCpu064elTarget;
}

static MCDisassembler *createCpu0Disassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, true);
}

static MCDisassembler *createCpu0elDisassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, false);
}

extern "C" void LLVMInitializeCpu0Disassembler() {
  // Register the disassembler.
  TargetRegistry::RegisterMCDisassembler(TheCpu0Target,
                                         createCpu0Disassembler);
  TargetRegistry::RegisterMCDisassembler(TheCpu0elTarget,
                                         createCpu0elDisassembler);
}

#if 0
#undef LLVM_DEBUG
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"

/// Read four bytes from the ArrayRef and return a 32-bit word assembled
/// according to the given endianness.
static DecodeStatus readInstruction32(ArrayRef<uint8_t> Bytes, uint64_t Address,
                                      uint64_t &Size, uint32_t &Insn,
                                      bool IsBigEndian) {
  // We want to read exactly 4 Bytes of data.
  if (Bytes.size() < 4) {
    Size = 0;
    return MCDisassembler::Fail;
  }

  if (IsBigEndian) {
    // Encoded as a big-endian 32-bit word in the stream.
    Insn = (Bytes[3] <<  0) |
           (Bytes[2] <<  8) |
           (Bytes[1] << 16) |
           (Bytes[0] << 24);
  }
  else {
    // Encoded as a little-endian 32-bit word in the stream.
    Insn = (Bytes[0] <<  0) |
           (Bytes[1] <<  8) |
           (Bytes[2] << 16) |
           (Bytes[3] << 24);
  }

  return MCDisassembler::Success;
}

DecodeStatus
Cpu0Disassembler::getInstruction(MCInst &Instr, uint64_t &Size,
                                              ArrayRef<uint8_t> Bytes,
                                              uint64_t Address,
                                              raw_ostream &CStream) const {
  uint32_t Insn;

  DecodeStatus Result;

  Result = readInstruction32(Bytes, Address, Size, Insn, IsBigEndian);

  if (Result == MCDisassembler::Fail)
    return MCDisassembler::Fail;

  // Calling the auto-generated decoder function.
  Result = decodeInstruction(DecoderTableCpu032, Instr, Insn, Address,
                             this, STI);
  if (Result != MCDisassembler::Fail) {
    Size = 4;
    return Result;
  }

  return MCDisassembler::Fail;
}

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  if (RegNo > 15)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(CPURegsTable[RegNo]));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder) {
  if (RegNo > 1)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(C0RegsTable[RegNo]));
  return MCDisassembler::Success;
}

//@DecodeMem {
static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder) {
//@DecodeMem body {
  int Offset = SignExtend32<16>(Insn & 0xffff);
  int Reg = (int)fieldFromInstruction(Insn, 20, 4);
  int Base = (int)fieldFromInstruction(Insn, 16, 4);

  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg]));
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Base]));
  Inst.addOperand(MCOperand::createImm(Offset));

  return MCDisassembler::Success;
}

static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  // Sign-extend the 16-bit field: every value from 0x8000 upward is negative.
  int BranchOffset = SignExtend32<16>(fieldFromInstruction(Insn, 0, 16));
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

/* The CBranch instruction defines $ra first and then imm24. printOperand()
prints operand 1 (operand 0 is $ra and operand 1 is imm24), so we create the
register operand first and the imm24 operand next, as follows:

// Cpu0InstrInfo.td
class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
                   list<Register> UseRegs>:
  FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
             !strconcat(instr_asm, "\t$addr"),
             [(brcond RC:$ra, bb:$addr)], IIBranch> {

// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
  case 3:
    // CMP, JEQ, JGE, JGT, JLE, JLT, JNE
    printOperand(MI, 1, O); 
    break;
*/
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  // Sign-extend the 24-bit field: every value from 0x800000 upward is negative.
  int BranchOffset = SignExtend32<24>(fieldFromInstruction(Insn, 0, 24));
  Inst.addOperand(MCOperand::createReg(Cpu0::SW));
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {

  unsigned JumpOffset = fieldFromInstruction(Insn, 0, 24);
  Inst.addOperand(MCOperand::createImm(JumpOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {
  int Reg_a = (int)fieldFromInstruction(Insn, 20, 4);
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg_a]));
// explained in http://jonathan2251.github.io/lbd/llvmstructure.html#jr-note
  if (CPURegsTable[Reg_a] == Cpu0::LR)
    Inst.setOpcode(Cpu0::RET);
  else
    Inst.setOpcode(Cpu0::JR);
  return MCDisassembler::Success;
}

static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder) {
  Inst.addOperand(MCOperand::createImm(SignExtend32<16>(Insn)));
  return MCDisassembler::Success;
}

As shown in the code above, a Disassembler directory is added to handle the reverse translation from object code to assembly. Disassembler/Cpu0Disassembler.cpp is added, and CMakeLists.txt is modified to build the Disassembler directory and to enable the generated disassembler table by setting has_disassembler = 1. Most of the decoding is handled by the tables generated from the *.td files.

Not every instruction in the *.td files can be disassembled without trouble, even though each of them can be translated into assembly and object code successfully. For those that cannot be disassembled automatically, LLVM provides the "let DecoderMethod" keyword to allow programmers to implement their own decode functions.

For example, in Cpu0, we define functions such as DecodeBranch24Target(), DecodeJumpTarget(), and DecodeJumpFR() in Cpu0Disassembler.cpp. We then inform llvm-tblgen by writing "let DecoderMethod = ..." in the corresponding instruction definitions or ISD nodes of Cpu0InstrInfo.td.

LLVM will call these DecoderMethods when the user uses disassembler tools, such as llvm-objdump -d.
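
What such a decode function does can be sketched without any LLVM dependency: it pulls bit fields out of the 32-bit instruction word and sign-extends the immediate. The code below mirrors the field layout used by DecodeMem above (ra in bits 23..20, rb in bits 19..16, a 16-bit immediate in bits 15..0). The names extractField, signExtend, MemOperands, and decodeRegRegImm16 are hypothetical and only stand in for LLVM's fieldFromInstruction() and SignExtend32<>().

```cpp
#include <cassert>
#include <cstdint>

// Extract NumBits bits of Insn starting at bit Start (bit 0 is the LSB).
uint32_t extractField(uint32_t Insn, unsigned Start, unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 32 && Start + NumBits <= 32);
  return (Insn >> Start) & ((1u << NumBits) - 1u);
}

// Interpret the low NumBits bits of Value as a two's complement number.
int32_t signExtend(uint32_t Value, unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 32);
  const uint32_t SignBit = 1u << (NumBits - 1);
  Value &= (1u << NumBits) - 1u;
  return static_cast<int32_t>(Value ^ SignBit) - static_cast<int32_t>(SignBit);
}

// Register numbers and offset of a "ra, rb, imm16" style instruction.
struct MemOperands {
  unsigned Ra;
  unsigned Rb;
  int32_t Offset;
};

MemOperands decodeRegRegImm16(uint32_t Insn) {
  return {extractField(Insn, 20, 4), extractField(Insn, 16, 4),
          signExtend(extractField(Insn, 0, 16), 16)};
}
```

For the word 0x09ddffd8, decodeRegRegImm16 returns register 13 for both ra and rb and an offset of -40. Register 13 is Cpu0::SP in CPURegsTable, so the operands correspond to $sp, $sp, -40. Note that a 16-bit field is negative from 0x8000 upward and a 24-bit field from 0x800000 upward.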

Finally, cpu032II includes all instructions from cpu032I and adds some new ones. When llvm-objdump -d is invoked, selectCpu0ArchFeature() is called through createCpu0MCSubtargetInfo(). Because llvm-objdump cannot set a CPU option the way llc -mcpu=cpu032I does, the CPU argument of selectCpu0ArchFeature() is empty in this case. We therefore set Cpu0ArchFeature to "+cpu032II" when CPU is empty, so that every Cpu0 instruction can be disassembled.

lbdex/chapters/Chapter10_1/MCTargetDesc/Cpu0MCTargetDesc.cpp

/// Select the Cpu0 Architecture Feature for the given triple and cpu name.
/// The function will be called at command 'llvm-objdump -d' for Cpu0 elf input.
static std::string selectCpu0ArchFeature(const Triple &TT, StringRef CPU) {
  std::string Cpu0ArchFeature;
  if (CPU.empty() || CPU == "generic") {
    if (TT.getArch() == Triple::cpu0 || TT.getArch() == Triple::cpu0el) {
      if (CPU.empty() || CPU == "cpu032II") {
        Cpu0ArchFeature = "+cpu032II";
      }
      else {
        if (CPU == "cpu032I") {
          Cpu0ArchFeature = "+cpu032I";
        }
      }
    }
  }
  return Cpu0ArchFeature;
}

Now, build Chapter10_1/ and run llvm-objdump -d ch8_1_1.cpu0.o to get the following result.

JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj
ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o:       file format ELF32-CPU0

Disassembly of section .text:
_Z13test_control1v:
       0: 09 dd ff d8                                   addiu $sp, $sp, -40
       4: 09 30 00 00                                   addiu $3, $zero, 0
       8: 02 3d 00 24                                   st  $3, 36($sp)
       c: 09 20 00 01                                   addiu $2, $zero, 1
      10: 02 2d 00 20                                   st  $2, 32($sp)
      14: 09 40 00 02                                   addiu $4, $zero, 2
      18: 02 4d 00 1c                                   st  $4, 28($sp)
      ...

Disassembler Structure

The flow of disassembly is shown in Fig. 51.

digraph G {
rankdir=TD;
"disassembleObject()" -> "getInstruction()" [label="1. [AsmPrinter::llvm-objdump -d]\nBytes"];
"disassembleObject()" -> "PrettyPrinter::printInst()" [label="2. MCInst,Address"];
"getInstruction()" -> "disassembleObject()" [label="MCInst"];
"PrettyPrinter::printInst()" -> "printInst()" [label="MCInst,Address"];
"getInstruction()" -> "decodeInstruction()" [label="(DecoderTableCpu032,insn,Address)"];
"decodeInstruction()" -> "getInstruction()" [label="MCInst"];
"decodeInstruction()" -> "fieldFromInstruction()";
"decodeInstruction()" -> "checkDecoderPredicate()";
"decodeInstruction()" -> "decodeToMCInst()";
"decodeToMCInst()" -> "DecodeMem()";
"decodeToMCInst()" -> "DecodeBranch16Target()";
"decodeToMCInst()" -> "DecodeBranch24Target()";
"decodeToMCInst()" -> "DecodeJumpTarget()";
"decodeToMCInst()" -> "DecodeJumpFR()";
"decodeToMCInst()" -> "DecodeSimm16()";
subgraph clusterObjdump {
label = "llvm-objdump.cpp";
"disassembleObject()";
"PrettyPrinter::printInst()";
}
subgraph clusterCpu0Dis1 {
label = "Cpu0Disassembler.cpp";
"getInstruction()";
"readInstruction32()";
"getInstruction()" -> "readInstruction32()" [label="Bytes"];
"readInstruction32()" -> "getInstruction()" [label="insn"];
}
subgraph clusterCpu0Dis2 {
label = "Cpu0Disassembler.cpp\n These functions specified in Cpu0InstrInfo.td";
"DecodeMem()";
"DecodeBranch16Target()";
"DecodeBranch24Target()";
"DecodeJumpTarget()";
"DecodeJumpFR()";
"DecodeSimm16()";
}
subgraph clusterInc {
label = "Cpu0GenDisassemblerTables.inc";
"fieldFromInstruction()";
"checkDecoderPredicate()";
"decodeToMCInst()";
"decodeInstruction()";
}
subgraph clusterCpu0InstPrinter {
label = "Cpu0InstPrinter";
"printInst()";
}
// label = "Figure: The flow of disassembly";
}

Fig. 51 The flow of disassembly.

After getInstruction() of Cpu0Disassembler.cpp returns, disassembleObject() of llvm-objdump.cpp calls printInst() of Cpu0InstPrinter.cpp, through PrettyPrinter::printInst(), to print each instruction in the form "address: bytes assembly", for example “4: 09 30 00 00 addiu $3, $zero, 0”.
- printInst() of Cpu0InstPrinter.cpp: see Fig. 24.
- Bytes: the 4 bytes (32 bits) of one Cpu0 instruction.
- insn: the 32-bit unsigned integer obtained by assembling Bytes in big-endian or little-endian order.
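The conversion from Bytes to insn can be sketched in a few lines of standalone C++. This is not the code of readInstruction32(); assembleInsn32() and kAddiuBytes are hypothetical names used only for illustration, and the example bytes are those of “addiu $3, $zero, 0” from the listing above.

```cpp
#include <cstdint>

// Hypothetical helper (not the Cpu0 source): assemble four bytes into one
// 32-bit instruction word in the requested byte order.
constexpr uint32_t assembleInsn32(const uint8_t (&bytes)[4], bool isBigEndian) {
  return isBigEndian
             ? (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
                   (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3])
             : (uint32_t(bytes[3]) << 24) | (uint32_t(bytes[2]) << 16) |
                   (uint32_t(bytes[1]) << 8) | uint32_t(bytes[0]);
}

// Bytes of "addiu $3, $zero, 0" as printed by llvm-objdump: 09 30 00 00.
constexpr uint8_t kAddiuBytes[4] = {0x09, 0x30, 0x00, 0x00};

// Big endian: the first byte in memory becomes the most significant byte,
// so the opcode 0x09 ends up in bits 31..24.
static_assert(assembleInsn32(kAddiuBytes, true) == 0x09300000u, "big endian");
// Little endian: the first byte in memory becomes the least significant byte.
static_assert(assembleInsn32(kAddiuBytes, false) == 0x00003009u,
              "little endian");
```

Only the big-endian reading places the opcode 0x09 in Inst{31-24}, which is where the decoder table expects to find it for a big-endian cpu0 object file.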

DecoderTableCpu032 and decodeInstruction() are listed below:

build/lib/Target/Cpu0/Cpu0GenDisassemblerTables.inc

static const uint8_t DecoderTableCpu032[] = {
/* 0 */       MCD::OPC_ExtractField, 24, 8,  // Inst{31-24} ...
/* 3 */       MCD::OPC_FilterValue, 0, 11, 0, 0, // Skip to: 19
/* 8 */       MCD::OPC_CheckField, 0, 24, 0, 149, 4, 0, // Skip to: 1188
/* 15 */      MCD::OPC_Decode, 178, 2, 0, // Opcode: NOP
/* 19 */      MCD::OPC_FilterValue, 1, 4, 0, 0, // Skip to: 28
/* 24 */      MCD::OPC_Decode, 161, 2, 1, // Opcode: LD
/* 28 */      MCD::OPC_FilterValue, 2, 4, 0, 0, // Skip to: 37
/* 33 */      MCD::OPC_Decode, 201, 2, 1, // Opcode: ST
/* 37 */      MCD::OPC_FilterValue, 3, 9, 0, 0, // Skip to: 51
/* 42 */      MCD::OPC_CheckPredicate, 0, 117, 4, 0, // Skip to: 1188
/* 47 */      MCD::OPC_Decode, 159, 2, 1, // Opcode: LB
/* 51 */      MCD::OPC_FilterValue, 4, 9, 0, 0, // Skip to: 65
/* 56 */      MCD::OPC_CheckPredicate, 0, 103, 4, 0, // Skip to: 1188
/* 61 */      MCD::OPC_Decode, 160, 2, 1, // Opcode: LBu
/* 65 */      MCD::OPC_FilterValue, 5, 9, 0, 0, // Skip to: 79
/* 70 */      MCD::OPC_CheckPredicate, 0, 89, 4, 0, // Skip to: 1188
/* 75 */      MCD::OPC_Decode, 187, 2, 1, // Opcode: SB
/* 79 */      MCD::OPC_FilterValue, 6, 9, 0, 0, // Skip to: 93
/* 84 */      MCD::OPC_CheckPredicate, 0, 75, 4, 0, // Skip to: 1188
/* 89 */      MCD::OPC_Decode, 163, 2, 1, // Opcode: LH
/* 93 */      MCD::OPC_FilterValue, 7, 9, 0, 0, // Skip to: 107
/* 98 */      MCD::OPC_CheckPredicate, 0, 61, 4, 0, // Skip to: 1188
/* 103 */     MCD::OPC_Decode, 164, 2, 1, // Opcode: LHu
/* 107 */     MCD::OPC_FilterValue, 8, 9, 0, 0, // Skip to: 121
/* 112 */     MCD::OPC_CheckPredicate, 0, 47, 4, 0, // Skip to: 1188
...

template <typename InsnType>
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      InsnType insn, uint64_t Address,
                                      const void *DisAsm,
                                      const MCSubtargetInfo &STI) {
  const FeatureBitset &Bits = STI.getFeatureBits();

  const uint8_t *Ptr = DecodeTable;
  InsnType CurFieldValue = 0;
  DecodeStatus S = MCDisassembler::Success;
  while (true) {
    ptrdiff_t Loc = Ptr - DecodeTable;
    switch (*Ptr) {
    default:
      errs() << Loc << ": Unexpected decode table opcode!\n";
      return MCDisassembler::Fail;
    case MCD::OPC_ExtractField: {
      unsigned Start = *++Ptr;
      unsigned Len = *++Ptr;
      ++Ptr;
      CurFieldValue = fieldFromInstruction(insn, Start, Len);
      LLVM_DEBUG(dbgs() << Loc << ": OPC_ExtractField(" << Start << ", "
                   << Len << "): " << CurFieldValue << "\n");
      break;
    }
    case MCD::OPC_FilterValue: {
      // Decode the field value.
      unsigned Len;
      InsnType Val = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // Perform the filter operation.
      if (Val != CurFieldValue)
        Ptr += NumToSkip;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_FilterValue(" << Val << ", " << NumToSkip
                   << "): " << ((Val != CurFieldValue) ? "FAIL:" : "PASS:")
                   << " continuing at " << (Ptr - DecodeTable) << "\n");

      break;
    }
    case MCD::OPC_CheckField: {
      unsigned Start = *++Ptr;
      unsigned Len = *++Ptr;
      InsnType FieldValue = fieldFromInstruction(insn, Start, Len);
      // Decode the field value.
      InsnType ExpectedValue = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // If the actual and expected values don't match, skip.
      if (ExpectedValue != FieldValue)
        Ptr += NumToSkip;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckField(" << Start << ", "
                   << Len << ", " << ExpectedValue << ", " << NumToSkip
                   << "): FieldValue = " << FieldValue << ", ExpectedValue = "
                   << ExpectedValue << ": "
                   << ((ExpectedValue == FieldValue) ? "PASS\n" : "FAIL\n"));
      break;
    }
    case MCD::OPC_CheckPredicate: {
      unsigned Len;
      // Decode the Predicate Index value.
      unsigned PIdx = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;
      // Check the predicate.
      bool Pred;
      if (!(Pred = checkDecoderPredicate(PIdx, Bits)))
        Ptr += NumToSkip;
      (void)Pred;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckPredicate(" << PIdx << "): "
            << (Pred ? "PASS\n" : "FAIL\n"));

      break;
    }
    case MCD::OPC_Decode: {
      unsigned Len;
      // Decode the Opcode value.
      unsigned Opc = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
      Ptr += Len;

      MI.clear();
      MI.setOpcode(Opc);
      bool DecodeComplete;
      S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, DecodeComplete);
      assert(DecodeComplete);

      LLVM_DEBUG(dbgs() << Loc << ": OPC_Decode: opcode " << Opc
                   << ", using decoder " << DecodeIdx << ": "
                   << (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
      return S;
    }
    case MCD::OPC_TryDecode: {
      unsigned Len;
      // Decode the Opcode value.
      unsigned Opc = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // Perform the decode operation.
      MCInst TmpMI;
      TmpMI.setOpcode(Opc);
      bool DecodeComplete;
      S = decodeToMCInst(S, DecodeIdx, insn, TmpMI, Address, DisAsm, DecodeComplete);
      LLVM_DEBUG(dbgs() << Loc << ": OPC_TryDecode: opcode " << Opc
                   << ", using decoder " << DecodeIdx << ": ");

      if (DecodeComplete) {
        // Decoding complete.
        LLVM_DEBUG(dbgs() << (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
        MI = TmpMI;
        return S;
      } else {
        assert(S == MCDisassembler::Fail);
        // If the decoding was incomplete, skip.
        Ptr += NumToSkip;
        LLVM_DEBUG(dbgs() << "FAIL: continuing at " << (Ptr - DecodeTable) << "\n");
        // Reset decode status. This also drops a SoftFail status that could be
        // set before the decode attempt.
        S = MCDisassembler::Success;
      }
      break;
    }
    case MCD::OPC_SoftFail: {
      // Decode the mask values.
      unsigned Len;
      InsnType PositiveMask = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      InsnType NegativeMask = decodeULEB128(Ptr, &Len);
      Ptr += Len;
      bool Fail = (insn & PositiveMask) || (~insn & NegativeMask);
      if (Fail)
        S = MCDisassembler::SoftFail;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_SoftFail: " << (Fail ? "FAIL\n" : "PASS\n"));
      break;
    }
    case MCD::OPC_Fail: {
      LLVM_DEBUG(dbgs() << Loc << ": OPC_Fail\n");
      return MCDisassembler::Fail;
    }
    }
  }
  llvm_unreachable("bogosity detected in disassembler state machine!");
}
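The table walk performed by decodeInstruction() can be tried out on a much smaller scale. The sketch below is a toy, not the generated code: the OP_* values, kToyTable, skip24(), readULEB128() and decodeToy() are all hypothetical names, and only the ExtractField, FilterValue, Decode and Fail states are modelled. The table entries for LD, ST and ADDiu are reconstructed for this sketch from the opcode and decoder numbers printed in this chapter's trace (289, 329 and 264); they are not copied from Cpu0GenDisassemblerTables.inc.

```cpp
#include <cstdint>

// Hypothetical state numbers for this sketch; the real ones live in llvm::MCD.
enum : uint8_t { OP_ExtractField = 1, OP_FilterValue, OP_Decode, OP_Fail };

// NumToSkip is stored as a plain 24-bit little-endian integer.
constexpr uint32_t skip24(uint8_t b0, uint8_t b1, uint8_t b2) {
  return uint32_t(b0) | (uint32_t(b1) << 8) | (uint32_t(b2) << 16);
}
// Entry "/* 3 */ OPC_FilterValue, 0, 11, 0, 0" ends at offset 8: 8 + 11 = 19.
static_assert(8 + skip24(11, 0, 0) == 19, "Skip to: 19");
// Entry "/* 8 */ OPC_CheckField, 0, 24, 0, 149, 4, 0" ends at offset 15.
static_assert(15 + skip24(149, 4, 0) == 1188, "Skip to: 1188");

// Minimal ULEB128 reader: 7 payload bits per byte, high bit means "more".
uint32_t readULEB128(const uint8_t *p, unsigned *len) {
  uint32_t value = 0;
  unsigned shift = 0, n = 0;
  uint8_t byte;
  do {
    byte = p[n++];
    value |= uint32_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  *len = n;
  return value;
}

// Toy table: dispatch on Inst{31-24}; layout follows DecoderTableCpu032.
const uint8_t kToyTable[] = {
    /* 0 */  OP_ExtractField, 24, 8,
    /* 3 */  OP_FilterValue, 1, 4, 0, 0, // skip to 12
    /* 8 */  OP_Decode, 161, 2, 1,       // opcode 289 (LD), decoder 1
    /* 12 */ OP_FilterValue, 2, 4, 0, 0, // skip to 21
    /* 17 */ OP_Decode, 201, 2, 1,       // opcode 329 (ST), decoder 1
    /* 21 */ OP_FilterValue, 9, 4, 0, 0, // skip to 30
    /* 26 */ OP_Decode, 136, 2, 2,       // opcode 264 (ADDiu), decoder 2
    /* 30 */ OP_Fail};

// Walks the table and returns the decoded opcode number, or -1 on failure.
int decodeToy(const uint8_t *table, uint32_t insn) {
  const uint8_t *ptr = table;
  uint32_t cur = 0; // plays the role of CurFieldValue
  while (true) {
    unsigned len = 0;
    switch (*ptr) {
    case OP_ExtractField: {
      unsigned start = ptr[1], width = ptr[2];
      ptr += 3;
      cur = (insn >> start) & ((1u << width) - 1u); // width < 32 in this table
      break;
    }
    case OP_FilterValue: {
      uint32_t val = readULEB128(ptr + 1, &len);
      ptr += 1 + len;
      uint32_t skip = skip24(ptr[0], ptr[1], ptr[2]);
      ptr += 3;
      if (val != cur) // no match: jump over the body of this entry
        ptr += skip;
      break;
    }
    case OP_Decode:
      return static_cast<int>(readULEB128(ptr + 1, &len));
    case OP_Fail:
    default:
      return -1;
    }
  }
}
```

With this table, decodeToy(kToyTable, 0x09ddfff8) extracts 9 from Inst{31-24}, fails the filters for 1 and 2, passes the filter for 9 and returns 264, which is the same sequence of states that the real decoder reports for “addiu $sp, $sp, -8”. An instruction word whose opcode byte is not in the toy table, such as 0x11cd0000, falls through to OP_Fail and yields -1.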

The following trace of decodeInstruction() was produced by enabling the “#if 1” block in Cpu0Disassembler.cpp and running llvm-objdump:

lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp

#if 1
#undef LLVM_DEBUG
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"

input % ~/llvm/debug/build/bin/clang -target mips-unknown-linux-gnu -c ch3.cpp -emit-llvm -o ch3.bc
input % ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch3.bc -o ch3.cpu0.o
input % ~/llvm/test/build/bin/llvm-objdump -d ch3.cpu0.o 

ch3.cpu0.o:	file format elf32-cpu0


Disassembly of section .text:

00000000 /* main:*/
OPC_ExtractField(24, 8): 9
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): PASS: continuing at 126
OPC_Decode: opcode 264, using decoder 2: PASS
       0: 09 dd ff f8  	addiu	$sp, $sp, -8
OPC_ExtractField(24, 8): 2
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): PASS: continuing at 33
OPC_Decode: opcode 329, using decoder 1: PASS
       4: 02 cd 00 04  	st	$fp, 4($sp)
OPC_ExtractField(24, 8): 17
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): FAIL: continuing at 130
OPC_FilterValue(10, 16): FAIL: continuing at 151
OPC_FilterValue(11, 16): FAIL: continuing at 172
OPC_FilterValue(12, 9): FAIL: continuing at 186
OPC_FilterValue(13, 9): FAIL: continuing at 200
OPC_FilterValue(14, 9): FAIL: continuing at 214
OPC_FilterValue(15, 16): FAIL: continuing at 235
OPC_FilterValue(17, 11): PASS: continuing at 240
OPC_CheckField(0, 1, 0, 941): FieldValue = 0, ExpectedValue = 0: PASS
OPC_Decode: opcode 265, using decoder 6: PASS
       8: 11 cd 00 00  	move	$fp, $sp
OPC_ExtractField(24, 8): 9
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): PASS: continuing at 126
OPC_Decode: opcode 264, using decoder 2: PASS
       c: 09 20 00 00  	addiu	$2, $zero, 0
OPC_ExtractField(24, 8): 2
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): PASS: continuing at 33
OPC_Decode: opcode 329, using decoder 1: PASS
      10: 02 2c 00 00  	st	$2, 0($fp)
OPC_ExtractField(24, 8): 17
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): FAIL: continuing at 130
OPC_FilterValue(10, 16): FAIL: continuing at 151
OPC_FilterValue(11, 16): FAIL: continuing at 172
OPC_FilterValue(12, 9): FAIL: continuing at 186
OPC_FilterValue(13, 9): FAIL: continuing at 200
OPC_FilterValue(14, 9): FAIL: continuing at 214
OPC_FilterValue(15, 16): FAIL: continuing at 235
OPC_FilterValue(17, 11): PASS: continuing at 240
OPC_CheckField(0, 1, 0, 941): FieldValue = 0, ExpectedValue = 0: PASS
OPC_Decode: opcode 265, using decoder 6: PASS
      14: 11 dc 00 00  	move	$sp, $fp
OPC_ExtractField(24, 8): 1
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): PASS: continuing at 24
OPC_Decode: opcode 289, using decoder 1: PASS
      18: 01 cd 00 04  	ld	$fp, 4($sp)
OPC_ExtractField(24, 8): 9
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): PASS: continuing at 126
OPC_Decode: opcode 264, using decoder 2: PASS
      1c: 09 dd 00 08  	addiu	$sp, $sp, 8
OPC_ExtractField(24, 8): 60
OPC_FilterValue(0, 11): FAIL: continuing at 19
OPC_FilterValue(1, 4): FAIL: continuing at 28
OPC_FilterValue(2, 4): FAIL: continuing at 37
OPC_FilterValue(3, 9): FAIL: continuing at 51
OPC_FilterValue(4, 9): FAIL: continuing at 65
OPC_FilterValue(5, 9): FAIL: continuing at 79
OPC_FilterValue(6, 9): FAIL: continuing at 93
OPC_FilterValue(7, 9): FAIL: continuing at 107
OPC_FilterValue(8, 9): FAIL: continuing at 121
OPC_FilterValue(9, 4): FAIL: continuing at 130
OPC_FilterValue(10, 16): FAIL: continuing at 151
OPC_FilterValue(11, 16): FAIL: continuing at 172
OPC_FilterValue(12, 9): FAIL: continuing at 186
OPC_FilterValue(13, 9): FAIL: continuing at 200
OPC_FilterValue(14, 9): FAIL: continuing at 214
OPC_FilterValue(15, 16): FAIL: continuing at 235
OPC_FilterValue(17, 11): FAIL: continuing at 251
OPC_FilterValue(18, 11): FAIL: continuing at 267
OPC_FilterValue(19, 11): FAIL: continuing at 283
OPC_FilterValue(20, 11): FAIL: continuing at 299
OPC_FilterValue(21, 16): FAIL: continuing at 320
OPC_FilterValue(22, 16): FAIL: continuing at 341
OPC_FilterValue(23, 16): FAIL: continuing at 362
OPC_FilterValue(24, 16): FAIL: continuing at 383
OPC_FilterValue(25, 16): FAIL: continuing at 404
OPC_FilterValue(26, 16): FAIL: continuing at 425
OPC_FilterValue(27, 16): FAIL: continuing at 446
OPC_FilterValue(28, 16): FAIL: continuing at 467
OPC_FilterValue(29, 16): FAIL: continuing at 488
OPC_FilterValue(30, 16): FAIL: continuing at 509
OPC_FilterValue(31, 16): FAIL: continuing at 530
OPC_FilterValue(32, 16): FAIL: continuing at 551
OPC_FilterValue(33, 16): FAIL: continuing at 572
OPC_FilterValue(34, 16): FAIL: continuing at 593
OPC_FilterValue(35, 16): FAIL: continuing at 614
OPC_FilterValue(36, 16): FAIL: continuing at 635
OPC_FilterValue(37, 16): FAIL: continuing at 656
OPC_FilterValue(38, 4): FAIL: continuing at 665
OPC_FilterValue(39, 4): FAIL: continuing at 674
OPC_FilterValue(40, 11): FAIL: continuing at 690
OPC_FilterValue(41, 11): FAIL: continuing at 706
OPC_FilterValue(42, 11): FAIL: continuing at 722
OPC_FilterValue(43, 11): FAIL: continuing at 738
OPC_FilterValue(48, 4): FAIL: continuing at 747
OPC_FilterValue(49, 4): FAIL: continuing at 756
OPC_FilterValue(50, 4): FAIL: continuing at 765
OPC_FilterValue(51, 4): FAIL: continuing at 774
OPC_FilterValue(52, 4): FAIL: continuing at 783
OPC_FilterValue(53, 4): FAIL: continuing at 792
OPC_FilterValue(54, 9): FAIL: continuing at 806
OPC_FilterValue(55, 4): FAIL: continuing at 815
OPC_FilterValue(56, 4): FAIL: continuing at 824
OPC_FilterValue(57, 23): FAIL: continuing at 852
OPC_FilterValue(58, 4): FAIL: continuing at 861
OPC_FilterValue(59, 9): FAIL: continuing at 875
OPC_FilterValue(60, 11): PASS: continuing at 880
OPC_CheckField(0, 1, 0, 301): FieldValue = 0, ExpectedValue = 0: PASS
OPC_Decode: opcode 285, using decoder 16: PASS
      20: 3c e0 00 00  	ret	$lr
OPC_ExtractField(24, 8): 0
OPC_FilterValue(0, 11): PASS: continuing at 8
OPC_CheckField(0, 1, 0, 1173): FieldValue = 0, ExpectedValue = 0: PASS
OPC_Decode: opcode 306, using decoder 0: PASS
      24: 00 00 00 00  	nop

Based on the debug log above, pick the example “addiu $sp, $sp, -8”, which has an opcode of 9, to explain decodeInstruction() as shown in the table and explanation below:

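Table 37 The state transformation of decodeInstruction() for “addiu $sp, $sp, -8”
state	result
OPC_ExtractField	CurFieldValue <- 9 (Inst{31-24}, the opcode field)
OPC_FilterValue	Compare the value of each table entry with CurFieldValue, skipping ahead until the entry for 9 matches
OPC_Decode	MI.setOpcode(264), the LLVM opcode number reported in the trace, then decode the operands by calling decodeToMCInst() with decoder index 2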
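The fields that these states work on can be checked by hand for the instruction word 0x09ddfff8 (bytes 09 dd ff f8). The following standalone sketch does so at compile time. field() is a hypothetical stand-in for fieldFromInstruction(), and signExtend16() for SignExtend32<16>(). The positions of the second register (bits 19..16) and of the 16-bit immediate (bits 15..0) are assumptions about the Cpu0 L-type layout that are consistent with the disassembly “addiu $sp, $sp, -8”; register number 13 for $sp is read off the encoding.

```cpp
#include <cstdint>

// Stand-in for fieldFromInstruction(): `len` bits starting at bit `start`
// (bit 0 is the least significant bit). Assumes 0 < len < 32.
constexpr uint32_t field(uint32_t insn, unsigned start, unsigned len) {
  return (insn >> start) & ((1u << len) - 1u);
}

// Sign-extend the low 16 bits of `value` to a signed 32-bit integer.
constexpr int32_t signExtend16(uint32_t value) {
  return static_cast<int32_t>((value & 0xffffu) ^ 0x8000u) - 0x8000;
}

// "addiu $sp, $sp, -8" as one big-endian instruction word.
constexpr uint32_t kAddiu = 0x09ddfff8u;

static_assert(field(kAddiu, 24, 8) == 9, "opcode field seen by ExtractField");
static_assert(field(kAddiu, 20, 4) == 13, "first register: $sp");
static_assert(field(kAddiu, 16, 4) == 13, "second register: $sp");
static_assert(signExtend16(field(kAddiu, 0, 16)) == -8, "immediate: -8");
```

Only the first of these fields is examined by the state machine itself; the register and immediate fields are turned into operands later, inside decodeToMCInst() and the decoder methods it calls.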

“move $fp, $sp” and “ret $lr” pass through the OPC_CheckField state before OPC_Decode because they use the R-type Cpu0 instruction format, and “let shamt = 0;” is set in “class ArithLogic” of Cpu0InstrInfo.td.
- For “move $fp, $sp”, fieldFromInstruction(0x11cd0000, 0, 12) = (0x11cd0000 & 0x00000fff) = 0, so the check that the low 12 bits, Inst{11-0}, are 0 passes.
- In the trace, the second number printed by OPC_CheckField is 1 rather than the field length, because Len has already been overwritten by decodeULEB128() when the debug message is emitted.
DecodeBranch16Target() and DecodeBranch24Target() decode the immediate field and add it to the MCInst as an immediate operand, whose value may be positive or negative. An MCInst operand can be either an immediate or a register.
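How a branch field becomes a signed immediate operand can be illustrated with a small standalone sketch. It is not the code of DecodeBranch16Target() or DecodeBranch24Target(): Operand, signExtend() and decodeBranchOffset() are hypothetical names, the sketch assumes that the offset occupies the low 16 or 24 bits of the instruction word, and the opcode byte in the example words is arbitrary.

```cpp
#include <cstdint>

// Hypothetical operand model: an operand is either a register or an immediate.
struct Operand {
  enum Kind { Register, Immediate } kind;
  int64_t value; // register number, or signed immediate value
};

// Sign-extend the low `bits` bits of `value`. Assumes 0 < bits < 32.
constexpr int32_t signExtend(uint32_t value, unsigned bits) {
  return static_cast<int32_t>((value & ((1u << bits) - 1u)) ^
                              (1u << (bits - 1))) -
         static_cast<int32_t>(1u << (bits - 1));
}

// Sketch of a branch-target decoder: take the low `bits` bits of the
// instruction word and wrap the signed result as an immediate operand.
constexpr Operand decodeBranchOffset(uint32_t insn, unsigned bits) {
  return Operand{Operand::Immediate, signExtend(insn, bits)};
}

// A 24-bit field of 0xfffffc is a backward offset of -4.
static_assert(signExtend(0x00fffffcu, 24) == -4, "negative 24-bit offset");
// A 16-bit field of 0x0010 is a forward offset of 16.
static_assert(signExtend(0x0010u, 16) == 16, "positive 16-bit offset");
```

The same field width therefore covers both forward and backward branches, and the printer later receives the offset as an ordinary signed immediate.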