ELF Support

The Cpu0 backend generates obj files in the ELF format. ELF (Executable and Linkable Format) is a common standard file format for executables, object code, shared libraries and core dumps. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999 it was chosen as the standard binary file format for Unix and Unix-like systems on x86 by the x86open project. See [1] for details.

The binary encoding of the Cpu0 instruction set in the obj was checked in the previous chapters, but we did not dig into the ELF file format itself, such as the ELF header and relocation records. This chapter uses the binutils installed in the sub-section “Install other tools on iMac” of Appendix A: “Installing LLVM” [2] to examine the generated cpu0 ELF file. You will learn tools such as objdump and readelf, and come to understand the ELF file format by using them to analyze the obj that cpu0 generates. LLVM has its own llvm-objdump tool, which works like objdump, and later in this chapter we make cpu0 support it. Binutils is a cross-compiler tool chain that can dump ELF files for a number of CPUs. Linux already ships with binutils, so nothing further needs to be installed there. We use the Linux binutils in this chapter only because the iMac version displays its messages in Chinese on my machine. The iMac binutils work fine otherwise, except that they add a g prefix to the command names and display text in your locale's language instead of plain English. For example, gobjdump (the iMac name for objdump) prints Chinese text on my iMac.

The binutils tools we use are not part of the LLVM tools, but they are powerful for ELF analysis. This chapter introduces them because we think knowledge of the popular ELF format and of the binutils analysis tools is valuable. An LLVM compiler engineer is responsible for making sure the backend generates a correct obj, since the obj will later be handled by the linker or loader. With these tools, you can verify the ELF output that your backend generates.

The cpu0 author has published a “System Software” book, written in Chinese, which introduces the concepts of the assembler, linker, loader, compiler and OS, and at the same time demonstrates how to use binutils and gcc to analyze ELF files through its example code. That book covers both concept and practice and does real analysis with binutils. The “System Software” [3] written by Beck is a famous book that explains what the compiler, the linker and the loader each output, and how they work together, but it covers concepts only. You can refer to it to understand how the “Relocation Record” works if you need to learn or refresh this knowledge for this chapter.

[4], [5] and [6] are Chinese-language documents available on the cpu0 author's web site.

ELF format

ELF is a format used for both obj and executable files, so it has two views, as shown in Fig. 32.

Fig. 32 ELF file format overview

As Fig. 32 shows, the “Section header table” describes sections such as .text, .rodata, ..., .data, which lay out code, read-only data, ..., and read/write data, respectively. The “Program header table” describes the segments that hold run-time code and data. In short, segments are the run-time layout of code and data, while sections are the link-time layout.

ELF header and Section header table

Let’s run Chapter9_3/ with ch6_1.cpp and dump the ELF header with readelf -h to see what information it contains.

[Gamma@localhost input]$ ~/llvm/test/cmake_debug_build/bin/llc -march=cpu0
-relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o

[Gamma@localhost input]$ readelf -h ch6_1.cpu0.o
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           <unknown>: 0xc9
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          176 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         8
  Section header string table index: 5
[Gamma@localhost input]$

[Gamma@localhost input]$ ~/llvm/test/cmake_debug_build/bin/llc
-march=mips -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.mips.o

[Gamma@localhost input]$ readelf -h ch6_1.mips.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          200 (bytes into file)
  Flags:                             0x50001007, noreorder, pic, cpic, o32, mips32
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         9
  Section header string table index: 6
[Gamma@localhost input]$

As the ELF header dumps above show, the header contains the magic number, version, ABI, and so on. The Machine field of cpu0 is unknown while that of mips is reported as MIPS R3000. It is unknown because cpu0 is not a popular CPU that the readelf utility recognizes. Let’s check the ELF segment information as follows,

[Gamma@localhost input]$ readelf -l ch6_1.cpu0.o

There are no program headers in this file.
[Gamma@localhost input]$

This result is expected, because the cpu0 obj is for linking only, not for execution, so it has no segments. Check the ELF section information as follows. Every section entry contains offset and size information.

[Gamma@localhost input]$ readelf -S ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000034 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000310 000018 08      8   1  4
  [ 3] .data             PROGBITS        00000000 000068 000004 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00006c 000000 00  WA  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000 00006c 000028 00   A  0   0  4
  [ 6] .rel.eh_frame     REL             00000000 000328 000008 08      8   5  4
  [ 7] .shstrtab         STRTAB          00000000 000094 00003e 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 000264 000090 10      9   6  4
  [ 9] .strtab           STRTAB          00000000 0002f4 00001b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
[Gamma@localhost input]$
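
The Magic line printed by readelf -h earlier is the 16-byte e_ident array at the start of the file, and the Class, Data, Version and OS/ABI lines are derived from it. The following is a minimal sketch of that decoding, not the readelf implementation. The names IdentInfo, decodeIdent and kCpu0Ident are made up for this illustration; the byte positions (class at index 4, data encoding at 5, version at 6, OS/ABI at 7) come from the ELF specification.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper type for this sketch; not part of LLVM or binutils.
struct IdentInfo {
  bool isElf;            // bytes 0..3 are 0x7f 'E' 'L' 'F'
  bool is32Bit;          // e_ident[4] (EI_CLASS) == 1, i.e. ELFCLASS32
  bool isBigEndian;      // e_ident[5] (EI_DATA) == 2, i.e. ELFDATA2MSB
  std::uint8_t version;  // e_ident[6] (EI_VERSION)
  std::uint8_t osAbi;    // e_ident[7] (EI_OSABI)
};

// Decode the identification bytes of an ELF file.
constexpr IdentInfo decodeIdent(const std::array<std::uint8_t, 16> &id) {
  return IdentInfo{
      id[0] == 0x7f && id[1] == 'E' && id[2] == 'L' && id[3] == 'F',
      id[4] == 1, id[5] == 2, id[6], id[7]};
}

// The Magic bytes shown by readelf -h for ch6_1.cpu0.o.
constexpr std::array<std::uint8_t, 16> kCpu0Ident = {
    0x7f, 0x45, 0x4c, 0x46, 0x01, 0x02, 0x01, 0x03,
    0,    0,    0,    0,    0,    0,    0,    0};

// These match the "ELF32" and "big endian" lines of the dump.
static_assert(decodeIdent(kCpu0Ident).is32Bit, "Class: ELF32");
static_assert(decodeIdent(kCpu0Ident).isBigEndian, "Data: big endian");
```

Calling decodeIdent(kCpu0Ident) also yields version 1 and OS/ABI value 3, which readelf prints as “1 (current)” and “UNIX - GNU”.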

Relocation Record

The Cpu0 backend translates a global variable as follows,

[Gamma@localhost input]$ clang -target mips-unknown-linux-gnu -c ch6_1.cpp
-emit-llvm -o ch6_1.bc
[Gamma@localhost input]$ ~/llvm/test/cmake_debug_build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=asm ch6_1.bc -o ch6_1.cpu0.s
[Gamma@localhost input]$ cat ch6_1.cpu0.s
  .section .mdebug.abi32
  .previous
  .file "ch6_1.bc"
  .text
  ...
  .cfi_startproc
  .frame  $sp,8,$lr
  .mask   0x00000000,0
  .set  noreorder
  .cpload $t9
  ...
  lui $2, %got_hi(gI)
  addu $2, $2, $gp
  ld $2, %got_lo(gI)($2)
  ...
  .type gI,@object              # @gI
  .data
  .globl  gI
  .align  2
gI:
  .4byte  100                     # 0x64
  .size gI, 4


[Gamma@localhost input]$ ~/llvm/test/cmake_debug_build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
[Gamma@localhost input]$ objdump -s ch6_1.cpu0.o

ch6_1.cpu0.o:     file format elf32-big

Contents of section .text:
// .cpload machine instruction
 0000 0fa00000 0daa0000 13aa6000 ........  ..............`.
 ...
 0020 002a0000 00220000 012d0000 0ddd0008  .*..."...-......
 ...
[Gamma@localhost input]$

[Gamma@localhost input]$ readelf -tr ch6_1.cpu0.o
There are 8 section headers, starting at offset 0xb0:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000044 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002a8 000020 08   6   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 000078 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000080 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .shstrtab
       STRTAB          00000000 000080 000030 00   0   0  1
       [00000000]:
  [ 6] .symtab
       SYMTAB          00000000 0001f0 000090 10   7   5  4
       [00000000]:
  [ 7] .strtab
       STRTAB          00000000 000280 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2a8 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000805 unrecognized: 5       00000000   _gp_disp
00000004  00000806 unrecognized: 6       00000000   _gp_disp
00000020  00000616 unrecognized: 16      00000004   gI
00000028  00000617 unrecognized: 17      00000004   gI


[Gamma@localhost input]$ readelf -tr ch6_1.mips.o
There are 9 section headers, starting at offset 0xc8:

Section Headers:
  [Nr] Name
       Type            Addr     Off    Size   ES   Lk Inf Al
       Flags
  [ 0]
       NULL            00000000 000000 000000 00   0   0  0
       [00000000]:
  [ 1] .text
       PROGBITS        00000000 000034 000038 00   0   0  4
       [00000006]: ALLOC, EXEC
  [ 2] .rel.text
       REL             00000000 0002f8 000018 08   7   1  4
       [00000000]:
  [ 3] .data
       PROGBITS        00000000 00006c 000008 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 4] .bss
       NOBITS          00000000 000074 000000 00   0   0  4
       [00000003]: WRITE, ALLOC
  [ 5] .reginfo
       MIPS_REGINFO    00000000 000074 000018 00   0   0  1
       [00000002]: ALLOC
  [ 6] .shstrtab
       STRTAB          00000000 00008c 000039 00   0   0  1
       [00000000]:
  [ 7] .symtab
       SYMTAB          00000000 000230 0000a0 10   8   6  4
       [00000000]:
  [ 8] .strtab
       STRTAB          00000000 0002d0 000025 00   0   0  1
       [00000000]:

Relocation section '.rel.text' at offset 0x2f8 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000905 R_MIPS_HI16       00000000   _gp_disp
00000004  00000906 R_MIPS_LO16       00000000   _gp_disp
0000001c  00000709 R_MIPS_GOT16      00000004   gI
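
The Size and ES (entry size) columns in the two readelf -tr dumps above are enough to work out how many records each table holds, and the ELF header fields give the extent of the section header table. The sketch below only redoes that arithmetic with numbers copied from the dumps; entryCount and tableEnd are names invented for this illustration.

```cpp
#include <cstdint>

// Number of fixed-size records in a section, from the Size and ES columns.
// Returns 0 when the section has no fixed entry size (ES == 0).
constexpr std::uint32_t entryCount(std::uint32_t size, std::uint32_t entSize) {
  return entSize == 0 ? 0 : size / entSize;
}

// File offset one past the end of a table of `count` entries of `entSize`
// bytes that starts at `offset`.
constexpr std::uint32_t tableEnd(std::uint32_t offset, std::uint32_t count,
                                 std::uint32_t entSize) {
  return offset + count * entSize;
}

// cpu0 .rel.text: Size 0x20, ES 8 -> the 4 entries readelf reports.
static_assert(entryCount(0x20, 0x08) == 4, "cpu0 .rel.text");
// mips .rel.text: Size 0x18, ES 8 -> the 3 entries readelf reports.
static_assert(entryCount(0x18, 0x08) == 3, "mips .rel.text");
// cpu0 section header table: 8 headers of 40 bytes starting at 0xb0 end at
// 0x1f0, which is exactly where .symtab starts in the cpu0 dump.
static_assert(tableEnd(0xb0, 8, 40) == 0x1f0, "cpu0 section header table");
```

The same division applied to the .symtab rows gives 9 symbols for cpu0 (0x90 / 0x10) and 10 for mips (0xa0 / 0x10).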

As described in the section “Handle $gp register in PIC addressing mode”, the backend translates “.cpload $reg” into the following.

// Lower ".cpload $reg" to
//  "lui   $gp, %hi(_gp_disp)"
//  "ori $gp, $gp, %lo(_gp_disp)"
//  "addu  $gp, $gp, $t9"

The _gp_disp value is determined by the loader, so it is undefined in the obj. Both relocation records for offsets 0 and 4 of the .text section refer to _gp_disp. These offsets hold the instructions “lui $gp, %hi(_gp_disp)” and “ori $gp, $gp, %lo(_gp_disp)”, whose obj encodings are 0fa00000 and 0daa0000, respectively. The obj encodes %hi(_gp_disp) and %lo(_gp_disp) as 0 because the _gp_disp value only becomes known when the loader loads this obj into memory; at that point the loader patches the two locations named by the relocation records with the correct values. Even though cpu0 is not a CPU recognized by the readelf utility, you can check that the cpu0 records for %hi(_gp_disp) and %lo(_gp_disp) are correct by comparing them with the mips relocation records R_MIPS_HI16 and R_MIPS_LO16 for _gp_disp above: both use type numbers 5 and 6. The access to gI is handled the same way, since we don’t know where the .data section variable will be loaded. In the listing above, Cpu0 encodes the %got_hi(gI) and %got_lo(gI) operands as 0 and creates relocation records at offsets 0x20 and 0x28 of the .text section. The linker or loader fills in the address at link time or load time, depending on whether the program is statically or dynamically linked.
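
The Info column in the relocation dumps above packs two values into one 32-bit word. In ELF32 the upper 24 bits are the symbol table index and the low 8 bits are the relocation type. The following sketch splits a few Info values copied from the dumps; relocSym and relocType are illustrative names that mirror the ELF32_R_SYM and ELF32_R_TYPE macros of the ELF specification.

```cpp
#include <cstdint>

// Upper 24 bits of an ELF32 r_info word: index into the symbol table.
constexpr std::uint32_t relocSym(std::uint32_t info) { return info >> 8; }

// Low 8 bits of an ELF32 r_info word: processor-specific relocation type.
constexpr std::uint32_t relocType(std::uint32_t info) { return info & 0xff; }

// cpu0 record at offset 0: Info 00000805 -> symbol 8 (_gp_disp), type 5.
static_assert(relocSym(0x00000805) == 8, "symbol index");
static_assert(relocType(0x00000805) == 5, "same number as R_MIPS_HI16");

// cpu0 record at offset 0x20: Info 00000616 -> symbol 6 (gI), type 0x16.
// readelf prints the unrecognized type in hex, so "16" there is 22 decimal.
static_assert(relocSym(0x00000616) == 6, "symbol index");
static_assert(relocType(0x00000616) == 22, "type 0x16");
```

Types 22 and 23 correspond to the R_CPU0_GOT_HI16 and R_CPU0_GOT_LO16 entries of Cpu0.def listed later in this chapter.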

llvm-objdump

llvm-objdump -t -r

On iMac, gobjdump -tr displays relocation record information like readelf -tr does. The LLVM tool llvm-objdump plays the same role as objdump. Let’s run the gobjdump and llvm-objdump commands as follows to see the differences.

118-165-83-12:input Jonathan$ clang -target mips-unknown-linux-gnu -c
ch9_3.cpp -emit-llvm -o ch9_3.bc
118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/
Debug/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch9_3.bc -o
ch9_3.cpu0.o

118-165-78-12:input Jonathan$ gobjdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o:     file format elf32-big

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss 00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp


RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000084 UNKNOWN           _gp_disp
00000088 UNKNOWN           _gp_disp
000000e0 UNKNOWN           _Z5sum_iiz


118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/
Debug/bin/llvm-objdump -t -r ch9_3.cpu0.o

ch9_3.cpu0.o: file format ELF32-CPU0

RELOCATION RECORDS FOR [.text]:
132 R_CPU0_HI16 _gp_disp
136 R_CPU0_LO16 _gp_disp
224 R_CPU0_CALL16 _Z5sum_iiz

SYMBOL TABLE:
00000000 l    df *ABS*        00000000 ch9_3.bc
00000000 l    d  .text        00000000 .text
00000000 l    d  .data        00000000 .data
00000000 l    d  .bss 00000000 .bss
00000000 g     F .text        00000084 _Z5sum_iiz
00000084 g     F .text        00000080 main
00000000         *UND*        00000000 _gp_disp

llvm-objdump displays the file format and the relocation record types properly while objdump cannot, because we added the Cpu0 machine and relocation record information to ELF.h and the related files as follows,

include/llvm/Support/ELF.h

// Machine architectures
enum {
  ...
  EM_CPU0          = 998, // Document LLVM Backend Tutorial Cpu0
  EM_CPU0_LE       = 999  // EM_CPU0_LE: little endian; EM_CPU0: big endian
};

lib/Object/ELF.cpp

...

StringRef getELFRelocationTypeName(uint32_t Machine, uint32_t Type) {
  switch (Machine) {
  ...
  case ELF::EM_CPU0:
    switch (Type) {
#include "llvm/Support/ELFRelocs/Cpu0.def"
    default:
      break;
    }
    break;
  ...
  }

include/llvm/Support/ELFRelocs/Cpu0.def


#ifndef ELF_RELOC
#error "ELF_RELOC must be defined"
#endif

ELF_RELOC(R_CPU0_NONE,                0)
ELF_RELOC(R_CPU0_32,                  2)
ELF_RELOC(R_CPU0_HI16,                5)
ELF_RELOC(R_CPU0_LO16,                6)
ELF_RELOC(R_CPU0_GPREL16,             7)
ELF_RELOC(R_CPU0_LITERAL,             8)
ELF_RELOC(R_CPU0_GOT16,               9)
ELF_RELOC(R_CPU0_PC16,               10)
ELF_RELOC(R_CPU0_CALL16,             11)
ELF_RELOC(R_CPU0_GPREL32,            12)
ELF_RELOC(R_CPU0_PC24,               13)
ELF_RELOC(R_CPU0_GOT_HI16,           22)
ELF_RELOC(R_CPU0_GOT_LO16,           23)
ELF_RELOC(R_CPU0_RELGOT,             36)
ELF_RELOC(R_CPU0_TLS_GD,             42)
ELF_RELOC(R_CPU0_TLS_LDM,            43)
ELF_RELOC(R_CPU0_TLS_DTP_HI16,       44)
ELF_RELOC(R_CPU0_TLS_DTP_LO16,       45)
ELF_RELOC(R_CPU0_TLS_GOTTPREL,       46)
ELF_RELOC(R_CPU0_TLS_TPREL32,        47)
ELF_RELOC(R_CPU0_TLS_TP_HI16,        49)
ELF_RELOC(R_CPU0_TLS_TP_LO16,        50)
ELF_RELOC(R_CPU0_GLOB_DAT,           51)
ELF_RELOC(R_CPU0_JUMP_SLOT,          127)
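
A .def file of ELF_RELOC entries like the one above is meant to be included more than once, each time with a different definition of ELF_RELOC, so one list yields both the enumerators and the name lookup. The sketch below shows this pattern in a self-contained form. It replaces the #include with a list macro (CPU0_RELOC_SAMPLE, a name made up here) that holds only a few entries copied from Cpu0.def, and relocName is a hypothetical stand-in for the switch shown in getELFRelocationTypeName().

```cpp
#include <string_view>

// Stand-in for the contents of Cpu0.def (a few entries only). The real code
// includes the .def file instead of using a list macro like this one.
#define CPU0_RELOC_SAMPLE(ELF_RELOC) \
  ELF_RELOC(R_CPU0_NONE, 0) \
  ELF_RELOC(R_CPU0_HI16, 5) \
  ELF_RELOC(R_CPU0_LO16, 6) \
  ELF_RELOC(R_CPU0_CALL16, 11) \
  ELF_RELOC(R_CPU0_GOT_HI16, 22) \
  ELF_RELOC(R_CPU0_GOT_LO16, 23)

// Expansion 1: each entry becomes an enumerator.
enum Cpu0Reloc : unsigned {
#define AS_ENUM(name, value) name = value,
  CPU0_RELOC_SAMPLE(AS_ENUM)
#undef AS_ENUM
};

// Expansion 2: each entry becomes a case label returning its own name.
constexpr std::string_view relocName(unsigned type) {
  switch (type) {
#define AS_CASE(name, value) case value: return #name;
    CPU0_RELOC_SAMPLE(AS_CASE)
#undef AS_CASE
  default:
    return "Unknown";
  }
}

static_assert(R_CPU0_CALL16 == 11, "enumerator generated from the list");
static_assert(relocName(11) == "R_CPU0_CALL16", "name generated from the list");
```

With this pattern a new relocation type is added in one place only, and the printed names, such as the R_CPU0_HI16 and R_CPU0_CALL16 shown by llvm-objdump -t -r above, stay in sync with the numeric values.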

include/llvm/Object/ELFObjectFile.h

template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
            ::getRelocationValueString(DataRefImpl Rel,
                      SmallVectorImpl<char> &Result) const {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    res = symname;
    break;
  ...
}

template<support::endianness target_endianness, bool is64Bits>
StringRef ELFObjectFile<target_endianness, is64Bits>
             ::getFileFormatName() const {
  switch(Header->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    return "ELF32-CPU0";
  ...
}

template<support::endianness target_endianness, bool is64Bits>
unsigned ELFObjectFile<target_endianness, is64Bits>::getArch() const {
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    return (target_endianness == support::little) ?
           Triple::cpu0el : Triple::cpu0;
  ...
}

In addition to llvm-objdump -t -r, llvm-readobj -h can display the Cpu0 ELF header information once the above EM_CPU0 is defined.

llvm-objdump -d

Run the example code of the last chapter with the command llvm-objdump -d to disassemble the ELF file, as follows,

JonathantekiiMac:input Jonathan$ clang -target mips-unknown-linux-gnu -c
ch8_1_1.cpp -emit-llvm -o ch8_1_1.bc
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_
build/Debug/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_1_1.bc
-o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_
build/Debug/bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o: file format ELF32-unknown

Disassembly of section .text:error: no disassembler for target cpu0-unknown-unknown

To support llvm-objdump -d, the following code is added to Chapter10_1/ (the DecoderMethod for brtarget24 was already added in a previous chapter).

lbdex/chapters/Chapter10_1/CMakeLists.txt

tablegen(LLVM Cpu0GenDisassemblerTables.inc -gen-disassembler)
add_subdirectory(Disassembler)

lbdex/chapters/Chapter10_1/LLVMBuild.txt

subdirectories = 
  Disassembler 
has_disassembler = 1

lbdex/chapters/Chapter10_1/Cpu0InstrInfo.td

let isBranch=1, isTerminator=1, isBarrier=1, imm16=0, hasDelaySlot = 1,
    isIndirectBranch = 1 in
class JumpFR<bits<8> op, string instr_asm, RegisterClass RC>:
  FL<op, (outs), (ins RC:$ra),
     !strconcat(instr_asm, "\t$ra"), [(brind RC:$ra)], IIBranch> {
  let rb = 0;
  let imm16 = 0;
//#if CH >= CH10_1 1.5
  let DecoderMethod = "DecodeJumpFR";
//#endif
}
  class JumpLink<bits<8> op, string instr_asm>:
    FJ<op, (outs), (ins calltarget:$target, variable_ops),
       !strconcat(instr_asm, "\t$target"), [(Cpu0JmpLink imm:$target)],
       IIBranch> {
//#if CH >= CH10_1 2
       let DecoderMethod = "DecodeJumpTarget";
//#endif
       }

lbdex/chapters/Chapter10_1/Disassembler/CMakeLists.txt

add_llvm_library(LLVMCpu0Disassembler
  Cpu0Disassembler.cpp
  )

lbdex/chapters/Chapter10_1/Disassembler/LLVMBuild.txt

;===- ./lib/Target/Cpu0/Disassembler/LLVMBuild.txt --------------*- Conf -*--===;
;
;                     The LLVM Compiler Infrastructure
;
; This file is distributed under the University of Illinois Open Source
; License. See LICENSE.TXT for details.
;
;===------------------------------------------------------------------------===;
;
; This is an LLVMBuild description file for the components in this subdirectory.
;
; For more information on the LLVMBuild system, please see:
;
;   http://llvm.org/docs/LLVMBuild.html
;
;===------------------------------------------------------------------------===;

[component_0]
type = Library
name = Cpu0Disassembler
parent = Cpu0
required_libraries = MCDisassembler Support Cpu0Info
add_to_library_groups = Cpu0

lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp

//===- Cpu0Disassembler.cpp - Disassembler for Cpu0 -------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is part of the Cpu0 Disassembler.
//
//===----------------------------------------------------------------------===//

#include "Cpu0.h"

#include "Cpu0RegisterInfo.h"
#include "Cpu0Subtarget.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCFixedLenDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/TargetRegistry.h"

using namespace llvm;

#define DEBUG_TYPE "cpu0-disassembler"

typedef MCDisassembler::DecodeStatus DecodeStatus;

namespace {

/// Cpu0DisassemblerBase - a disassembler class for Cpu0.
class Cpu0DisassemblerBase : public MCDisassembler {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0DisassemblerBase(const MCSubtargetInfo &STI, MCContext &Ctx,
                       bool bigEndian) :
    MCDisassembler(STI, Ctx),
    IsBigEndian(bigEndian) {}

  virtual ~Cpu0DisassemblerBase() {}

protected:
  bool IsBigEndian;
};

/// Cpu0Disassembler - a disassembler class for Cpu032.
class Cpu0Disassembler : public Cpu0DisassemblerBase {
public:
  /// Constructor     - Initializes the disassembler.
  ///
  Cpu0Disassembler(const MCSubtargetInfo &STI, MCContext &Ctx, bool bigEndian)
      : Cpu0DisassemblerBase(STI, Ctx, bigEndian) {
  }

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
                              ArrayRef<uint8_t> Bytes, uint64_t Address,
                              raw_ostream &VStream,
                              raw_ostream &CStream) const override;
};

} // end anonymous namespace

// Decoder tables for GPR register
static const unsigned CPURegsTable[] = {
  Cpu0::ZERO, Cpu0::AT, Cpu0::V0, Cpu0::V1,
  Cpu0::A0, Cpu0::A1, Cpu0::T9, Cpu0::T0, 
  Cpu0::T1, Cpu0::S0, Cpu0::S1, Cpu0::GP, 
  Cpu0::FP, Cpu0::SP, Cpu0::LR, Cpu0::SW
};

// Decoder tables for co-processor 0 register
static const unsigned C0RegsTable[] = {
  Cpu0::PC, Cpu0::EPC
};

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder);
static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder);
static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder);
static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

namespace llvm {
extern Target TheCpu0elTarget, TheCpu0Target, TheCpu064Target,
              TheCpu064elTarget;
}

static MCDisassembler *createCpu0Disassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, true);
}

static MCDisassembler *createCpu0elDisassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI,
                       MCContext &Ctx) {
  return new Cpu0Disassembler(STI, Ctx, false);
}

extern "C" void LLVMInitializeCpu0Disassembler() {
  // Register the disassembler.
  TargetRegistry::RegisterMCDisassembler(TheCpu0Target,
                                         createCpu0Disassembler);
  TargetRegistry::RegisterMCDisassembler(TheCpu0elTarget,
                                         createCpu0elDisassembler);
}

#include "Cpu0GenDisassemblerTables.inc"

/// Read four bytes from the ArrayRef and return a 32-bit word assembled
/// according to the given endianness.
static DecodeStatus readInstruction32(ArrayRef<uint8_t> Bytes, uint64_t Address,
                                      uint64_t &Size, uint32_t &Insn,
                                      bool IsBigEndian) {
  // We want to read exactly 4 Bytes of data.
  if (Bytes.size() < 4) {
    Size = 0;
    return MCDisassembler::Fail;
  }

  if (IsBigEndian) {
    // Encoded as a big-endian 32-bit word in the stream.
    Insn = (Bytes[3] <<  0) |
           (Bytes[2] <<  8) |
           (Bytes[1] << 16) |
           (Bytes[0] << 24);
  }
  else {
    // Encoded as a little-endian 32-bit word in the stream.
    Insn = (Bytes[0] <<  0) |
           (Bytes[1] <<  8) |
           (Bytes[2] << 16) |
           (Bytes[3] << 24);
  }

  return MCDisassembler::Success;
}

DecodeStatus
Cpu0Disassembler::getInstruction(MCInst &Instr, uint64_t &Size,
                                              ArrayRef<uint8_t> Bytes,
                                              uint64_t Address,
                                              raw_ostream &VStream,
                                              raw_ostream &CStream) const {
  uint32_t Insn;

  DecodeStatus Result;

  Result = readInstruction32(Bytes, Address, Size, Insn, IsBigEndian);

  if (Result == MCDisassembler::Fail)
    return MCDisassembler::Fail;

  // Calling the auto-generated decoder function.
  Result = decodeInstruction(DecoderTableCpu032, Instr, Insn, Address,
                             this, STI);
  if (Result != MCDisassembler::Fail) {
    Size = 4;
    return Result;
  }

  return MCDisassembler::Fail;
}

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  if (RegNo > 15)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(CPURegsTable[RegNo]));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}

static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
                                              unsigned RegNo,
                                              uint64_t Address,
                                              const void *Decoder) {
  if (RegNo > 1)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::createReg(C0RegsTable[RegNo]));
  return MCDisassembler::Success;
}

//@DecodeMem {
static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder) {
//@DecodeMem body {
  int Offset = SignExtend32<16>(Insn & 0xffff);
  int Reg = (int)fieldFromInstruction(Insn, 20, 4);
  int Base = (int)fieldFromInstruction(Insn, 16, 4);

  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg]));
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Base]));
  Inst.addOperand(MCOperand::createImm(Offset));

  return MCDisassembler::Success;
}

static DecodeStatus DecodeBranch16Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  int BranchOffset = fieldFromInstruction(Insn, 0, 16);
  // Values above 0x7fff have bit 15 set, so they encode negative offsets.
  if (BranchOffset > 0x7fff)
    BranchOffset = -1*(0x10000 - BranchOffset);
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

/* CBranch instruction define $ra and then imm24; The printOperand() print 
operand 1 (operand 0 is $ra and operand 1 is imm24), so we Create register 
operand first and create imm24 next, as follows,

// Cpu0InstrInfo.td
class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
                   list<Register> UseRegs>:
  FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
             !strconcat(instr_asm, "\t$addr"),
             [(brcond RC:$ra, bb:$addr)], IIBranch> {

// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
  case 3:
    // CMP, JEQ, JGE, JGT, JLE, JLT, JNE
    printOperand(MI, 1, O); 
    break;
*/
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  int BranchOffset = fieldFromInstruction(Insn, 0, 24);
  // Values above 0x7fffff have bit 23 set, so they encode negative offsets.
  if (BranchOffset > 0x7fffff)
    BranchOffset = -1*(0x1000000 - BranchOffset);
  Inst.addOperand(MCOperand::createReg(Cpu0::SW));
  Inst.addOperand(MCOperand::createImm(BranchOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpTarget(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {

  unsigned JumpOffset = fieldFromInstruction(Insn, 0, 24);
  Inst.addOperand(MCOperand::createImm(JumpOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpFR(MCInst &Inst,
                                     unsigned Insn,
                                     uint64_t Address,
                                     const void *Decoder) {
  int Reg_a = (int)fieldFromInstruction(Insn, 20, 4);
  Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg_a]));
// explained in http://jonathan2251.github.io/lbd/llvmstructure.html#jr-note
  if (CPURegsTable[Reg_a] == Cpu0::LR)
    Inst.setOpcode(Cpu0::RET);
  else
    Inst.setOpcode(Cpu0::JR);
  return MCDisassembler::Success;
}

static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder) {
  Inst.addOperand(MCOperand::createImm(SignExtend32<16>(Insn)));
  return MCDisassembler::Success;
}

As the code above shows, a Disassembler directory is added to handle the reverse translation from obj to assembly. To build it, add Disassembler/Cpu0Disassembler.cpp, modify CMakeLists.txt and LLVMBuild.txt to include the Disassembler directory, and enable generation of the disassembler table with “has_disassembler = 1”. Most of the work is handled by tables generated from the *.td files. However, not every instruction in the *.td files can be disassembled automatically, even though it can be translated into assembly and obj successfully. For those that cannot, LLVM supplies the “let DecoderMethod” keyword, which lets programmers implement their own decode function. For example, Cpu0 defines the functions DecodeBranch24Target(), DecodeJumpTarget() and DecodeJumpFR() in Cpu0Disassembler.cpp and tells llvm-tblgen about them by writing “let DecoderMethod = ...” in the corresponding instruction definitions or ISD nodes in Cpu0InstrInfo.td. LLVM calls these decoder methods when the user runs a disassembler tool such as llvm-objdump -d.

Finally, cpu032II includes the whole cpu032I instruction set and adds some new instructions. When llvm-objdump -d is invoked, the function selectCpu0ArchFeature() shown below is called through createCpu0MCSubtargetInfo(). Unlike llc, which accepts an option such as llc -mcpu=cpu032I, llvm-objdump cannot set the CPU, so the variable CPU in selectCpu0ArchFeature() is empty when the function is invoked by llvm-objdump -d. Setting Cpu0ArchFeature to “+cpu032II” in that case lets llvm-objdump disassemble all instructions.

lbdex/chapters/Chapter10_1/MCTargetDesc/Cpu0MCTargetDesc.cpp

/// Select the Cpu0 Architecture Feature for the given triple and cpu name.
/// This function is called when 'llvm-objdump -d' is run on a Cpu0 ELF input.
static StringRef selectCpu0ArchFeature(const Triple &TT, StringRef CPU) {
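  // Hold a StringRef to a string literal: a StringRef returned from a local
  // std::string would dangle once this function returns.
  StringRef Cpu0ArchFeature;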
  if (CPU.empty() || CPU == "generic") {
    if (TT.getArch() == Triple::cpu0 || TT.getArch() == Triple::cpu0el) {
      if (CPU.empty() || CPU == "cpu032II") {
        Cpu0ArchFeature = "+cpu032II";
      }
      else {
        if (CPU == "cpu032I") {
          Cpu0ArchFeature = "+cpu032I";
        }
      }
    }
  }
  return Cpu0ArchFeature;
}

Now, running Chapter10_1/ with the command llvm-objdump -d ch8_1_1.cpu0.o gives the following result.

JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_
build/Debug/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj
ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_
build/Debug/bin/llvm-objdump -d ch8_1_1.cpu0.o

ch8_1_1.cpu0.o:       file format ELF32-CPU0

Disassembly of section .text:
_Z13test_control1v:
       0: 09 dd ff d8                                   addiu $sp, $sp, -40
       4: 09 30 00 00                                   addiu $3, $zero, 0
       8: 02 3d 00 24                                   st  $3, 36($sp)
       c: 09 20 00 01                                   addiu $2, $zero, 1
      10: 02 2d 00 20                                   st  $2, 32($sp)
      14: 09 40 00 02                                   addiu $4, $zero, 2
      18: 02 4d 00 1c                                   st  $4, 28($sp)
      ...
[1] http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[2] http://jonathan2251.github.io/lbd/install.html#install-other-tools-on-imac
[3] Leland Beck, System Software: An Introduction to Systems Programming.
[4] http://ccckmit.wikidot.com/lk:aout
[5] http://ccckmit.wikidot.com/lk:objfile
[6] http://ccckmit.wikidot.com/lk:elf