ELF Support

Cpu0 backend generated the ELF format of obj. The ELF (Executable and Linkable Format) is a common standard file format for executables, object code, shared libraries and core dumps. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unixsystems. In 1999 it was chosen as the standard binary file format for Unix and Unix-like systems on x86 by the x86open project. Please reference [1].

The binary encode of Cpu0 instruction set in obj has been checked in the previous chapters. But we didn’t dig into the ELF file format like elf header and relocation record at that time. You will learn the llvm-objdump, llvm-readelf, …, tools and understand the ELF file format itself through using these tools to analyze the cpu0 generated obj in this chapter.

This chapter introduces the tool to readers since we think it is a valuable knowledge in this popular ELF format and the ELF binutils analysis tool. An LLVM compiler engineer has the responsibility to make sure that his backend generate a correct obj. With this tool, you can verify your generated ELF format.

The cpu0 author has published a “System Software” book which introduces the topics of assembler, linker, loader, compiler and OS in concept, and at same time demonstrates how to use binutils and gcc to analysis ELF through the example code in his book. It’s a Chinese book of “System Software” in concept and practice. This book does the real analysis through binutils. The “System Software” [2] written by Beck is a famous book in concept telling readers what the compiler output about, what the linker output about, what the loader output about, and how they work together in concept. You can reference it to understand how the “Relocation Record” works if you need to refresh or learning this knowledge for this chapter.

[3], [4], [5] are the Chinese documents available from the cpu0 author on web site.

ELF format

ELF is a format used in both obj and executable file. So, there are two views in it as Fig. 50.

_images/12.png

Fig. 50 ELF file format overview

As Fig. 50, the “Section header table” include sections .text, .rodata, …, .data which are sections layout for code, read only data, …, and read/write data, respectively. “Program header table” include segments for run time code and data. The definition of segments is the run time layout for code and data while sections is the link time layout for code and data.

ELF header and Section header table

Let’s run Chapter9_3/ with ch6_1.cpp, and dump ELF header information by llvm-readelf -h to see what information the ELF header contains.

input$ ~/llvm/test/build/bin/llc -march=cpu0
-relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o

input$ llvm-readelf -h ch6_1.cpu0.o
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           <unknown>: 0xc9
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          176 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         8
  Section header string table index: 5
input$

input$ ~/llvm/test/build/bin/llc
-march=mips -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.mips.o

input$ llvm-readelf -h ch6_1.mips.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          200 (bytes into file)
  Flags:                             0x50001007, noreorder, pic, cpic, o32, mips32
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         9
  Section header string table index: 6
input$

As above ELF header display, it contains information of magic number, version, ABI, …, . The Machine field of cpu0 is unknown while mips is known as MIPSR3000. It is unknown because cpu0 is not a popular CPU recognized by utility llvm-readelf. Let’s check ELF segments information as follows,

input$ llvm-readelf -l ch6_1.cpu0.o

There are no program headers in this file.
input$

The result is in expectation because cpu0 obj is for link only, not for execution. So, the segments is empty. Check ELF sections information as follows. Every section contains offset and size information.

input$ llvm-readelf -S ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000034 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000310 000018 08      8   1  4
  [ 3] .data             PROGBITS        00000000 000068 000004 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00006c 000000 00  WA  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000 00006c 000028 00   A  0   0  4
  [ 6] .rel.eh_frame     REL             00000000 000328 000008 08      8   5  4
  [ 7] .shstrtab         STRTAB          00000000 000094 00003e 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 000264 000090 10      9   6  4
  [ 9] .strtab           STRTAB          00000000 0002f4 00001b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
input$

Relocation Record

Cpu0 backend translate global variable as follows,

input$ clang -target mips-unknown-linux-gnu -c ch6_1.cpp
-emit-llvm -o ch6_1.bc
input$ ~/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=asm ch6_1.bc -o ch6_1.cpu0.s
input$ cat ch6_1.cpu0.s
  .section .mdebug.abi32
  .previous
  .file "ch6_1.bc"
  .text
  ...
  .cfi_startproc
  .frame  $sp,8,$lr
  .mask   0x00000000,0
  .set  noreorder
  .cpload $t9
  ...
  lui $2, %got_hi(gI)
  addu $2, $2, $gp
  ld $2, %got_lo(gI)($2)
  ...
  .type gI,@object              # @gI
  .data
  .globl  gI
  .align  2
gI:
  .4byte  100                     # 0x64
  .size gI, 4


input$ ~/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
input$ llvm-objdump -s ch6_1.cpu0.o

ch6_1.cpu0.o:     file format elf32-big

Contents of section .text:
// .cpload machine instruction
 0000 0fa00000 0daa0000 13aa6000 ........  ..............`.
 ...
 0020 002a0000 00220000 012d0000 0ddd0008  .*..."...-......
 ...
input$

input$ llvm-readelf -tr ch6_1.cpu0.o
There are 8 section headers, starting at offset 0xb0:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000044 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002a8 000020 08   6   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 000078 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000080 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .shstrtab
       STRTAB          00000000 000080 000030 00   0   0  1
       [00000000]:
  [ 6] .symtab
       SYMTAB          00000000 0001f0 000090 10   7   5  4
       [00000000]:
  [ 7] .strtab
       STRTAB          00000000 000280 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2a8 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000805 unrecognized: 5       00000000   _gp_disp
00000004  00000806 unrecognized: 6       00000000   _gp_disp
00000020  00000616 unrecognized: 16      00000004   gI
00000028  00000617 unrecognized: 17      00000004   gI


input$ llvm-readelf -tr ch6_1.mips.o
There are 9 section headers, starting at offset 0xc8:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000038 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002f8 000018 08   7   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 00006c 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000074 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .reginfo
       MIPS_REGINFO    00000000 000074 000018 00   0   0  1
       [00000002]: ALLOC
  [ 6] .shstrtab
       STRTAB          00000000 00008c 000039 00   0   0  1
       [00000000]:
  [ 7] .symtab
       SYMTAB          00000000 000230 0000a0 10   8   6  4
       [00000000]:
  [ 8] .strtab
       STRTAB          00000000 0002d0 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2f8 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000905 R_MIPS_HI16       00000000   _gp_disp
00000004  00000906 R_MIPS_LO16       00000000   _gp_disp
0000001c  00000709 R_MIPS_GOT16      00000004   gI

As depicted in section Handle $gp register in PIC addressing mode, it translates “.cpload %reg” into the following.

// Lower ".cpload $reg" to
//  "lui   $gp, %hi(_gp_disp)"
//  "ori $gp, $gp, %lo(_gp_disp)"
//  "addu  $gp, $gp, $t9"

The _gp_disp value is determined by loader. So, it’s undefined in obj. You can find both the Relocation Records for offset 0 and 4 of .text section refer to _gp_disp value. The offset 0 and 4 of .text section are instructions “lui $gp, %hi(_gp_disp)” and “ori $gp, $gp, %lo(_gp_disp)” which their corresponding obj encode are 0fa00000 and 0daa0000, respectively. The obj translates the %hi(_gp_disp) and %lo(_gp_disp) into 0 since when loader loads this obj into memory, loader will know the _gp_disp value at run time and will update these two offset relocation records to the correct offset value. You can check if the cpu0 of %hi(_gp_disp) and %lo(_gp_disp) are correct by above mips Relocation Records of R_MIPS_HI(_gp_disp) and R_MIPS_LO(_gp_disp) even though the cpu0 is not a CPU recognized by llvm-readelf utilitly. The instruction “ld $2, %got(gI)($gp)” is same since we don’t know what the address of .data section variable will load to. So, Cpu0 translate the address to 0 and made a relocation record on 0x00000020 of .text section. Linker or Loader will change this address when this program is linked or loaded depends on the program is static link or dynamic link.

llvm-objdump

llvm-objdump -t -r

llvm-objdump -tr can display the information of relocation records like llvm-readelf -tr. Let’s run llvm-objdump with and without Cpu0 backend commands as follows to see the differences.

118-165-83-12:input Jonathan$ clang -target mips-unknown-linux-gnu -c
ch9_3.cpp -emit-llvm -o ch9_3.bc
118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch9_3.bc -o
ch9_3.cpu0.o

118-165-78-12:input Jonathan$ objdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o:     file format elf32-big

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss 00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp


RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000084 UNKNOWN           _gp_disp
00000088 UNKNOWN           _gp_disp
000000e0 UNKNOWN           _Z5sum_iiz


118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-objdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o: file format ELF32-CPU0

RELOCATION RECORDS FOR [.text]:
132 R_CPU0_HI16 _gp_disp
136 R_CPU0_LO16 _gp_disp
224 R_CPU0_CALL16 _Z5sum_iiz

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss 00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp

The llvm-objdump can display the file format and relocation records information well while the objdump cannot since we add the relocation records information in ELF.h as follows,

include/llvm/support/ELF.h

// Machine architectures
enum {
  ...
  EM_CPU0          = 998, // Document LLVM Backend Tutorial Cpu0
  EM_CPU0_LE       = 999  // EM_CPU0_LE: little endian; EM_CPU0: big endian
}

lib/object/ELF.cpp

...

StringRef getELFRelocationTypeName(uint32_t Machine, uint32_t Type) {
  switch (Machine) {
  ...
  case ELF::EM_CPU0:
    switch (Type) {
#include "llvm/Support/ELFRelocs/Cpu0.def"
    default:
      break;
    }
    break;
  ...
  }

include/llvm/Support/ELFRelocs/Cpu0.def


#ifndef ELF_RELOC
#error "ELF_RELOC must be defined"
#endif

ELF_RELOC(R_CPU0_NONE,                0)
ELF_RELOC(R_CPU0_32,                  2)
ELF_RELOC(R_CPU0_HI16,                5)
ELF_RELOC(R_CPU0_LO16,                6)
ELF_RELOC(R_CPU0_GPREL16,             7)
ELF_RELOC(R_CPU0_LITERAL,             8)
ELF_RELOC(R_CPU0_GOT16,               9)
ELF_RELOC(R_CPU0_PC16,               10)
ELF_RELOC(R_CPU0_CALL16,             11)
ELF_RELOC(R_CPU0_GPREL32,            12)
ELF_RELOC(R_CPU0_PC24,               13)
ELF_RELOC(R_CPU0_GOT_HI16,           22)
ELF_RELOC(R_CPU0_GOT_LO16,           23)
ELF_RELOC(R_CPU0_RELGOT,             36)
ELF_RELOC(R_CPU0_TLS_GD,             42)
ELF_RELOC(R_CPU0_TLS_LDM,            43)
ELF_RELOC(R_CPU0_TLS_DTP_HI16,       44)
ELF_RELOC(R_CPU0_TLS_DTP_LO16,       45)
ELF_RELOC(R_CPU0_TLS_GOTTPREL,       46)
ELF_RELOC(R_CPU0_TLS_TPREL32,        47)
ELF_RELOC(R_CPU0_TLS_TP_HI16,        49)
ELF_RELOC(R_CPU0_TLS_TP_LO16,        50)
ELF_RELOC(R_CPU0_GLOB_DAT,           51)
ELF_RELOC(R_CPU0_JUMP_SLOT,          127)

include/llvm/Object/ELFObjectFile.h

template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
            ::getRelocationValueString(DataRefImpl Rel,
                      SmallVectorImpl<char> &Result) const {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
  res = symname;
  break;
  ...
}

template<support::endianness target_endianness, bool is64Bits>
StringRef ELFObjectFile<target_endianness, is64Bits>
             ::getFileFormatName() const {
  switch(Header->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    return "ELF32-CPU0";
  ...
}

template<support::endianness target_endianness, bool is64Bits>
unsigned ELFObjectFile<target_endianness, is64Bits>::getArch() const {
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
  return (target_endianness == support::little) ?
       Triple::cpu0el : Triple::cpu0;
  ...
}

In addition to llvm-objdump -t -r, the llvm-readobj -h can display the Cpu0 elf header information with EM_CPU0 defined above.

llvm-objdump -d

Run the last Chapter example code with command llvm-objdump -d for dumping file from elf to hex as follows,

JonathantekiiMac:input Jonathan$ clang -target mips-unknown-linux-gnu -c
ch8_1_1.cpp -emit-llvm -o ch8_1_1.bc
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_1_1.bc
-o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o: file format ELF32-unknown

Disassembly of section .text:error: no disassembler for target cpu0-unknown-
unknown

To support llvm-objdump, the following code added to Chapter10_1/ (the DecoderMethod for brtarget24 has been added in previous chapter).

lbdex/chapters/Chapter10_1/CMakeLists.txt

tablegen(LLVM Cpu0GenDisassemblerTables.inc -gen-disassembler)
  Cpu0Disassembler
add_subdirectory(Disassembler)

lbdex/chapters/Chapter10_1/Cpu0InstrInfo.td

let isBranch=1, isTerminator=1, isBarrier=1, imm16=0, hasDelaySlot = 1,
    isIndirectBranch = 1 in
class JumpFR<bits<8> op, string instr_asm, RegisterClass RC>:
  FL<op, (outs), (ins RC:$ra),
     !strconcat(instr_asm, "\t$ra"), [(brind RC:$ra)], IIBranch> {
  let rb = 0;
  let imm16 = 0;
//#if CH >= CH10_1 1.5
  let DecoderMethod = "DecodeJumpFR";
//#endif
}
  class JumpLink<bits<8> op, string instr_asm>:
    FJ<op, (outs), (ins calltarget:$target, variable_ops),
       !strconcat(instr_asm, "\t$target"), [(Cpu0JmpLink imm:$target)],
       IIBranch> {
//#if CH >= CH10_1 2
       let DecoderMethod = "DecodeJumpTarget";
//#endif
       }

lbdex/chapters/Chapter10_1/Disassembler/CMakeLists.txt

add_llvm_component_library(LLVMCpu0Disassembler
  Cpu0Disassembler.cpp

  LINK_COMPONENTS
  MCDisassembler
  Cpu0Info
  Support

  ADD_TO_COMPONENT
  Cpu0
  )

lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp

//===- Cpu0Disassembler.cpp - Disassembler for Cpu0 -------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is part of the Cpu0 Disassembler.
//
//===----------------------------------------------------------------------===//

#include "Cpu0.h"

#include "Cpu0RegisterInfo.h"
#include "Cpu0Subtarget.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCFixedLenDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/TargetRegistry.h"

using namespace llvm;

#define DEBUG_TYPE "cpu0-disassembler"

typedef MCDisassembler::DecodeStatus DecodeStatus;

namespace {

/// Cpu0DisassemblerBase - a disasembler class for Cpu0.
class Cpu0DisassemblerBase : public MCDisassembler {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0DisassemblerBase(const MCSubtargetInfo &STI, MCContext &Ctx,
                       bool bigEndian) :
    MCDisassembler(STI, Ctx),
    IsBigEndian(bigEndian) {}

  virtual ~Cpu0DisassemblerBase() {}

protected:
  bool IsBigEndian;
};

/// Cpu0Disassembler - a disasembler class for Cpu032.
class Cpu0Disassembler : public Cpu0DisassemblerBase {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0Disassembler(const MCSubtargetInfo &STI, MCContext &Ctx, bool bigEndian)
      : Cpu0DisassemblerBase(STI, Ctx, bigEndian) {
  }

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
                              ArrayRef<uint8_t> Bytes, uint64_t Address,
                              raw_ostream &CStream) const override;
};

} // end anonymous namespace

// Decoder tables for GPR register
static const unsigned CPURegsTable[] = {
  Cpu0::ZERO, Cpu0::AT, Cpu0::V0, Cpu0::V1,
  Cpu0::A0, Cpu0::A1, Cpu0::T9, Cpu0::T0, 
  Cpu0::T1, Cpu0::S0, Cpu0::S1, Cpu0::GP, 
  Cpu0::FP, Cpu0::SP, Cpu0::LR, Cpu0::SW
};

// Decoder tables for co-processor 0 register
static const unsigned C0RegsTable[] = {
  Cpu0::PC, Cpu0::EPC
};

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder);
static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder);
static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder);
static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

namespace llvm {
extern Target TheCpu0elTarget, TheCpu0Target, TheCpu064Target,
              TheCpu064elTarget;
}

static MCDisassembler *createCpu0Disassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, true);
}

static MCDisassembler *createCpu0elDisassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, false);
}

extern "C" void LLVMInitializeCpu0Disassembler() {
  // Register the disassembler.
  TargetRegistry::RegisterMCDisassembler(TheCpu0Target,
                                         createCpu0Disassembler);
  TargetRegistry::RegisterMCDisassembler(TheCpu0elTarget,
                                         createCpu0elDisassembler);
}

#if 0
#undef LLVM_DEBUG
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"

/// Read four bytes from the ArrayRef and return 32 bit word sorted
/// according to the given endianess
static DecodeStatus readInstruction32(ArrayRef<uint8_t> Bytes, uint64_t Address,
                                      uint64_t &Size, uint32_t &Insn,
                                      bool IsBigEndian) {
  // We want to read exactly 4 Bytes of data.
  if (Bytes.size() < 4) {
    Size = 0;
    return MCDisassembler::Fail;
  }

  if (IsBigEndian) {
    // Encoded as a big-endian 32-bit word in the stream.
    Insn = (Bytes[3] <<  0) |
           (Bytes[2] <<  8) |
           (Bytes[1] << 16) |
           (Bytes[0] << 24);
  }
  else {
    // Encoded as a small-endian 32-bit word in the stream.
    Insn = (Bytes[0] <<  0) |
           (Bytes[1] <<  8) |
           (Bytes[2] << 16) |
           (Bytes[3] << 24);
  }

  return MCDisassembler::Success;
}

DecodeStatus
Cpu0Disassembler::getInstruction(MCInst &Instr, uint64_t &Size,
                                              ArrayRef<uint8_t> Bytes,
                                              uint64_t Address,
                                              raw_ostream &CStream) const {
  uint32_t Insn;

  DecodeStatus Result;

  Result = readInstruction32(Bytes, Address, Size, Insn, IsBigEndian);

  if (Result == MCDisassembler::Fail)
    return MCDisassembler::Fail;

  // Calling the auto-generated decoder function.
  Result = decodeInstruction(DecoderTableCpu032, Instr, Insn, Address,
                             this, STI);
  if (Result != MCDisassembler::Fail) {
    Size = 4;
    return Result;
  }

  return MCDisassembler::Fail;
}

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  if (RegNo > 15)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(CPURegsTable[RegNo]));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder) {
  if (RegNo > 1)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(C0RegsTable[RegNo]));
  return MCDisassembler::Success;
}

//@DecodeMem {
static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder) {
//@DecodeMem body {
  int Offset = SignExtend32<16>(Insn & 0xffff);
  int Reg = (int)fieldFromInstruction(Insn, 20, 4);
  int Base = (int)fieldFromInstruction(Insn, 16, 4);

  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg]));
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Base]));
  Inst.addOperand(MCOperand::createImm(Offset));

  return MCDisassembler::Success;
}

static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  int BranchOffset = fieldFromInstruction(Insn, 0, 16);
  if (BranchOffset > 0x8fff)
  	BranchOffset = -1*(0x10000 - BranchOffset);
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

/* CBranch instruction define $ra and then imm24; The printOperand() print 
operand 1 (operand 0 is $ra and operand 1 is imm24), so we Create register 
operand first and create imm24 next, as follows,

// Cpu0InstrInfo.td
class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
                   list<Register> UseRegs>:
  FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
             !strconcat(instr_asm, "\t$addr"),
             [(brcond RC:$ra, bb:$addr)], IIBranch> {

// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
  case 3:
    // CMP, JEQ, JGE, JGT, JLE, JLT, JNE
    printOperand(MI, 1, O); 
    break;
*/
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  int BranchOffset = fieldFromInstruction(Insn, 0, 24);
  if (BranchOffset > 0x8fffff)
  	BranchOffset = -1*(0x1000000 - BranchOffset);
  Inst.addOperand(MCOperand::createReg(Cpu0::SW));
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {

  unsigned JumpOffset = fieldFromInstruction(Insn, 0, 24);
  Inst.addOperand(MCOperand::createImm(JumpOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {
  int Reg_a = (int)fieldFromInstruction(Insn, 20, 4);
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg_a]));
// exapin in http://jonathan2251.github.io/lbd/llvmstructure.html#jr-note
  if (CPURegsTable[Reg_a] == Cpu0::LR)
    Inst.setOpcode(Cpu0::RET);
  else
    Inst.setOpcode(Cpu0::JR);
  return MCDisassembler::Success;
}

static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder) {
  Inst.addOperand(MCOperand::createImm(SignExtend32<16>(Insn)));
  return MCDisassembler::Success;
}

As above code, it adds directory Disassembler to handle the reverse translation from obj to assembly. So, add Disassembler/Cpu0Disassembler.cpp and modify the CMakeList.txt to build directory Disassembler, and enable the disassembler table generated by “has_disassembler = 1”. Most of code is handled by the table defined in *.td files. Not every instruction in *.td can be disassembled without trouble even though they can be translated into assembly and obj successfully. For those cannot be disassembled, LLVM supply the “let DecoderMethod” keyword to allow programmers implement their decode function. For example in Cpu0, we define functions DecodeBranch24Target(), DecodeJumpTarget() and DecodeJumpFR() in Cpu0Disassembler.cpp and tell the llvm-tblgen by writing “let DecoderMethod = …” in the corresponding instruction definitions or ISD node of Cpu0InstrInfo.td. LLVM will call these DecodeMethod when user uses Disassembler tools, such as llvm-objdump -d.

Finally cpu032II includes all cpu032I instruction set and adds some instrucitons. When llvm-objdump -d is invoked, function selectCpu0ArchFeature() as the following will be called through createCpu0MCSubtargetInfo(). The llvm-objdump cannot set cpu option like llc as llc -mcpu=cpu032I, so the varaible CPU in selectCpu0ArchFeature() is empty when invoked by llvm-objdump -d. Set Cpu0ArchFeature to “+cpu032II” than it can disassemble all instructions (cpu032II include all cpu032I instructions and add some new instructions).

lbdex/chapters/Chapter10_1/MCTargetDesc/Cpu0MCTargetDesc.cpp

/// Select the Cpu0 Architecture Feature for the given triple and cpu name.
/// The function will be called at command 'llvm-objdump -d' for Cpu0 elf input.
static std::string selectCpu0ArchFeature(const Triple &TT, StringRef CPU) {
  std::string Cpu0ArchFeature;
  if (CPU.empty() || CPU == "generic") {
    if (TT.getArch() == Triple::cpu0 || TT.getArch() == Triple::cpu0el) {
      if (CPU.empty() || CPU == "cpu032II") {
        Cpu0ArchFeature = "+cpu032II";
      }
      else {
        if (CPU == "cpu032I") {
          Cpu0ArchFeature = "+cpu032I";
        }
      }
    }
  }
  return Cpu0ArchFeature;
}

Now, run Chapter10_1/ with command llvm-objdump -d ch8_1_1.cpu0.o will get the following result.

JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj
ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o:       file format ELF32-CPU0

Disassembly of section .text:
_Z13test_control1v:
       0: 09 dd ff d8                                   addiu $sp, $sp, -40
       4: 09 30 00 00                                   addiu $3, $zero, 0
       8: 02 3d 00 24                                   st  $3, 36($sp)
       c: 09 20 00 01                                   addiu $2, $zero, 1
      10: 02 2d 00 20                                   st  $2, 32($sp)
      14: 09 40 00 02                                   addiu $4, $zero, 2
      18: 02 4d 00 1c                                   st  $4, 28($sp)
      ...

Disassembler structure

The flow of disassembly as Fig. 51.

digraph G {
  rankdir=TD;
  "disassembleObject()" -> "getInstruction()" [label="1. [AsmPrinter::llvm-objdump -d]\nMCInst,Bytes"];
  "disassembleObject()" -> "PrettyPrinter::printInst()" [label="2. MCInst,Address"];
  "PrettyPrinter::printInst()" -> "printInst()" [label="MCInst,Address"];
  "getInstruction()" -> "decodeInstruction()" [label="DecoderTableCpu032,MCInst,insn,Address"];
  "decodeInstruction()" -> "fieldFromInstruction()";
  "decodeInstruction()" -> "checkDecoderPredicate()";
  "decodeInstruction()" -> "decodeToMCInst()";
  "decodeToMCInst()" -> "DecodeMem()";
  "decodeToMCInst()" -> "DecodeBranch16Target()";
  "decodeToMCInst()" -> "DecodeBranch24Target()";
  "decodeToMCInst()" -> "DecodeJumpTarget()";
  "decodeToMCInst()" -> "DecodeJumpFR()";
  "decodeToMCInst()" -> "DecodeSimm16()";
  subgraph clusterObjdump {
    label = "llvm-objdump.cpp";
    "disassembleObject()";
    "PrettyPrinter::printInst()";
  }
  subgraph clusterCpu0Dis1 {
    label = "Cpu0Disassembler.cpp";
    "getInstruction()";
    "readInstruction32()";
    "getInstruction()" -> "readInstruction32()" [label="Bytes"];
    "readInstruction32()" -> "getInstruction()" [label="insn"];
  }
  subgraph clusterCpu0Dis2 {
    label = "Cpu0Disassembler.cpp\n These functions specified in Cpu0InstrInfo.td";
    "DecodeMem()";
    "DecodeBranch16Target()";
    "DecodeBranch24Target()";
    "DecodeJumpTarget()";
    "DecodeJumpFR()";
    "DecodeSimm16()";
  }
  subgraph clusterInc {
    label = "Cpu0GenDisassemblerTables.inc";
    "fieldFromInstruction()";
    "checkDecoderPredicate()";
    "decodeToMCInst()";
    "decodeInstruction()";
  }
  subgraph clusterCpu0InstPrinter {
    label = "Cpu0InstPrinter";
    "printInst()";
  }
//  label = "Figure: The flow of disassembly";
}

Fig. 51 The flow of disassembly.

  • After getInstruction() of Cpu0Disassembler.cpp, disassembleObject() of llvm-objdump.cpp call printInst() of Cpu0InstPrinter.cpp.

    • printInst() of Cpu0InstPrinter.cpp: reference Fig. 24.

  • Bytes: 4-byte (32-bits) for Cpu0. insn: Convert Bytes to big or little endian of 32-bit (unsigned int) binary instruction.

List DecoderTableCpu032 and decodeInstruction() as follows,

build/lib/Target/Cpu0/Cpu0GenDisassemblerTables.inc

static const uint8_t DecoderTableCpu032[] = {
/* 0 */       MCD::OPC_ExtractField, 24, 8,  // Inst{31-24} ...
/* 3 */       MCD::OPC_FilterValue, 0, 11, 0, 0, // Skip to: 19
/* 8 */       MCD::OPC_CheckField, 0, 24, 0, 149, 4, 0, // Skip to: 1188
/* 15 */      MCD::OPC_Decode, 178, 2, 0, // Opcode: NOP
/* 19 */      MCD::OPC_FilterValue, 1, 4, 0, 0, // Skip to: 28
/* 24 */      MCD::OPC_Decode, 161, 2, 1, // Opcode: LD
/* 28 */      MCD::OPC_FilterValue, 2, 4, 0, 0, // Skip to: 37
/* 33 */      MCD::OPC_Decode, 201, 2, 1, // Opcode: ST
/* 37 */      MCD::OPC_FilterValue, 3, 9, 0, 0, // Skip to: 51
/* 42 */      MCD::OPC_CheckPredicate, 0, 117, 4, 0, // Skip to: 1188
/* 47 */      MCD::OPC_Decode, 159, 2, 1, // Opcode: LB
/* 51 */      MCD::OPC_FilterValue, 4, 9, 0, 0, // Skip to: 65
/* 56 */      MCD::OPC_CheckPredicate, 0, 103, 4, 0, // Skip to: 1188
/* 61 */      MCD::OPC_Decode, 160, 2, 1, // Opcode: LBu
/* 65 */      MCD::OPC_FilterValue, 5, 9, 0, 0, // Skip to: 79
/* 70 */      MCD::OPC_CheckPredicate, 0, 89, 4, 0, // Skip to: 1188
/* 75 */      MCD::OPC_Decode, 187, 2, 1, // Opcode: SB
/* 79 */      MCD::OPC_FilterValue, 6, 9, 0, 0, // Skip to: 93
/* 84 */      MCD::OPC_CheckPredicate, 0, 75, 4, 0, // Skip to: 1188
/* 89 */      MCD::OPC_Decode, 163, 2, 1, // Opcode: LH
/* 93 */      MCD::OPC_FilterValue, 7, 9, 0, 0, // Skip to: 107
/* 98 */      MCD::OPC_CheckPredicate, 0, 61, 4, 0, // Skip to: 1188
/* 103 */     MCD::OPC_Decode, 164, 2, 1, // Opcode: LHu
/* 107 */     MCD::OPC_FilterValue, 8, 9, 0, 0, // Skip to: 121
/* 112 */     MCD::OPC_CheckPredicate, 0, 47, 4, 0, // Skip to: 1188
...
template <typename InsnType>
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
                                      InsnType insn, uint64_t Address,
                                      const void *DisAsm,
                                      const MCSubtargetInfo &STI) {
  const FeatureBitset &Bits = STI.getFeatureBits();

  const uint8_t *Ptr = DecodeTable;
  InsnType CurFieldValue = 0;
  DecodeStatus S = MCDisassembler::Success;
  while (true) {
    ptrdiff_t Loc = Ptr - DecodeTable;
    switch (*Ptr) {
    default:
      errs() << Loc << ": Unexpected decode table opcode!\n";
      return MCDisassembler::Fail;
    case MCD::OPC_ExtractField: {
      unsigned Start = *++Ptr;
      unsigned Len = *++Ptr;
      ++Ptr;
      CurFieldValue = fieldFromInstruction(insn, Start, Len);
      LLVM_DEBUG(dbgs() << Loc << ": OPC_ExtractField(" << Start << ", "
                   << Len << "): " << CurFieldValue << "\n");
      break;
    }
    case MCD::OPC_FilterValue: {
      // Decode the field value.
      unsigned Len;
      InsnType Val = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // Perform the filter operation.
      if (Val != CurFieldValue)
        Ptr += NumToSkip;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_FilterValue(" << Val << ", " << NumToSkip
                   << "): " << ((Val != CurFieldValue) ? "FAIL:" : "PASS:")
                   << " continuing at " << (Ptr - DecodeTable) << "\n");

      break;
    }
    case MCD::OPC_CheckField: {
      unsigned Start = *++Ptr;
      unsigned Len = *++Ptr;
      InsnType FieldValue = fieldFromInstruction(insn, Start, Len);
      // Decode the field value.
      InsnType ExpectedValue = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // If the actual and expected values don't match, skip.
      if (ExpectedValue != FieldValue)
        Ptr += NumToSkip;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckField(" << Start << ", "
                   << Len << ", " << ExpectedValue << ", " << NumToSkip
                   << "): FieldValue = " << FieldValue << ", ExpectedValue = "
                   << ExpectedValue << ": "
                   << ((ExpectedValue == FieldValue) ? "PASS\n" : "FAIL\n"));
      break;
    }
    case MCD::OPC_CheckPredicate: {
      unsigned Len;
      // Decode the Predicate Index value.
      unsigned PIdx = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;
      // Check the predicate.
      bool Pred;
      if (!(Pred = checkDecoderPredicate(PIdx, Bits)))
        Ptr += NumToSkip;
      (void)Pred;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckPredicate(" << PIdx << "): "
            << (Pred ? "PASS\n" : "FAIL\n"));

      break;
    }
    case MCD::OPC_Decode: {
      unsigned Len;
      // Decode the Opcode value.
      unsigned Opc = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
      Ptr += Len;

      MI.clear();
      MI.setOpcode(Opc);
      bool DecodeComplete;
      S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, DecodeComplete);
      assert(DecodeComplete);

      LLVM_DEBUG(dbgs() << Loc << ": OPC_Decode: opcode " << Opc
                   << ", using decoder " << DecodeIdx << ": "
                   << (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
      return S;
    }
    case MCD::OPC_TryDecode: {
      unsigned Len;
      // Decode the Opcode value.
      unsigned Opc = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
      Ptr += Len;
      // NumToSkip is a plain 24-bit integer.
      unsigned NumToSkip = *Ptr++;
      NumToSkip |= (*Ptr++) << 8;
      NumToSkip |= (*Ptr++) << 16;

      // Perform the decode operation.
      MCInst TmpMI;
      TmpMI.setOpcode(Opc);
      bool DecodeComplete;
      S = decodeToMCInst(S, DecodeIdx, insn, TmpMI, Address, DisAsm, DecodeComplete);
      LLVM_DEBUG(dbgs() << Loc << ": OPC_TryDecode: opcode " << Opc
                   << ", using decoder " << DecodeIdx << ": ");

      if (DecodeComplete) {
        // Decoding complete.
        LLVM_DEBUG(dbgs() << (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
        MI = TmpMI;
        return S;
      } else {
        assert(S == MCDisassembler::Fail);
        // If the decoding was incomplete, skip.
        Ptr += NumToSkip;
        LLVM_DEBUG(dbgs() << "FAIL: continuing at " << (Ptr - DecodeTable) << "\n");
        // Reset decode status. This also drops a SoftFail status that could be
        // set before the decode attempt.
        S = MCDisassembler::Success;
      }
      break;
    }
    case MCD::OPC_SoftFail: {
      // Decode the mask values.
      unsigned Len;
      InsnType PositiveMask = decodeULEB128(++Ptr, &Len);
      Ptr += Len;
      InsnType NegativeMask = decodeULEB128(Ptr, &Len);
      Ptr += Len;
      bool Fail = (insn & PositiveMask) || (~insn & NegativeMask);
      if (Fail)
        S = MCDisassembler::SoftFail;
      LLVM_DEBUG(dbgs() << Loc << ": OPC_SoftFail: " << (Fail ? "FAIL\n" : "PASS\n"));
      break;
    }
    case MCD::OPC_Fail: {
      LLVM_DEBUG(dbgs() << Loc << ": OPC_Fail\n");
      return MCDisassembler::Fail;
    }
    }
  }
  llvm_unreachable("bogosity detected in disassembler state machine!");
}


List the tracing of decodeInstruction() by enabling “#if 1” of Cpu0Disassembler.cpp and run “llvm-objdump” as follows,

lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp

#if 1
#undef LLVM_DEBUG(X)
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"

Based on the debug log above, pick example “addiu $sp, $sp, -8” which opcode is 9 to explain decodeInstruction() as table and explanation follows,

Table 37 The state transformation of decodeInstruction() for “addiu $sp, $sp, -8”

state

result

OPC_ExtractField

CurFieldValue <- Opcode:9

OPC_FilterValue

Match entries of DecodeTable == CurFieldValue

OPC_Decode

setOpcode(9) and decode operands by calling decodeToMCInst()

  • For “move $fp, $sp” and “ret $lr”, they have state OPC_CheckField before OPC_Decode since they are R type of Cpu0 instruction format and “let shamt = 0;” is set in “class ArithLogic” of Cpu0InstrInfo.td.

    • For “move $fp, $sp”, fieldFromInstruction(0x11cd0000, 0, 12) = (0x11cd0000 & 0x00000fff). Check bits(20..31) is 0.

  • DecodeBranch16Target() and DecodeBranch24Target(): decode immediate value to MCInst.operand and set the type of MCInst.operand to immediate type and value to positive or negative. Operand of MCInst can be immediate or register type.