ELF Support¶
The Cpu0 backend generates object files in the ELF format.
ELF (Executable and Linkable Format) is a common standard file format for executables, object code, shared libraries, and core dumps. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999 it was chosen as the standard binary file format for Unix and Unix-like systems on x86 by the 86open project. See [1].
The binary encoding of the Cpu0 instruction set in object files has been verified in previous chapters. However, we did not delve into the ELF file format, such as the ELF header and relocation records, at that time.
In this chapter, you will learn how to use tools such as llvm-objdump, llvm-readelf, and others to analyze ELF files generated by Cpu0. Through these tools, you will also understand the ELF file format itself.
This chapter introduces these tools to readers because understanding the popular ELF format and analysis tools is valuable. An LLVM compiler engineer is responsible for ensuring that their backend generates correct object files.
With these tools, you can verify the correctness of the generated ELF format.
The Cpu0 author has published a Chinese-language book titled “System Software,” which introduces topics such as assemblers, linkers, loaders, compilers, and operating systems in both concept and practice. It demonstrates how to analyze ELF files using binutils and gcc, and includes example code.
The book “System Software” [2] written by Beck is a well-known resource for explaining what the compiler, linker, and loader produce, and how they work together conceptually. You may refer to it to understand how Relocation Records work if you need a refresher or are learning this topic for the first time.
[3], [4], [5] are Chinese documents about this topic, available on the Cpu0 author’s website.
ELF format¶
ELF is a format used in both object and executable files. Therefore, there are two views of it, as shown in Fig. 50.

Fig. 50 ELF file format overview¶
As shown in Fig. 50, the “Section header table” includes sections .text, .rodata, …, .data, which are used for code, read-only data, and read/write data, respectively. The “Program header table” includes segments used at run time for code and data.
The definition of segments describes the run-time layout of code and data, while sections describe the link-time layout.
ELF header and Section header table¶
Let’s run Chapter9_3/ with ch6_1.cpp, and dump ELF header information using llvm-readelf -h to see what the ELF header contains.
input$ ~/llvm/test/build/bin/llc -march=cpu0
-relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
input$ llvm-readelf -h ch6_1.cpu0.o
Magic: 7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, big endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: REL (Relocatable file)
Machine: <unknown>: 0xc9
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 176 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 8
Section header string table index: 5
input$
input$ ~/llvm/test/build/bin/llc
-march=mips -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.mips.o
input$ llvm-readelf -h ch6_1.mips.o
ELF Header:
Magic: 7f 45 4c 46 01 02 01 03 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, big endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: REL (Relocatable file)
Machine: MIPS R3000
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 200 (bytes into file)
Flags: 0x50001007, noreorder, pic, cpic, o32, mips32
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 9
Section header string table index: 6
input$
As shown in the ELF header above, it contains information such as the magic number, version, ABI, and more. The Machine field for Cpu0 is listed as unknown, whereas MIPS is recognized as MIPS R3000.
This happens because Cpu0 is not a CPU that the llvm-readelf utility recognizes.
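The Magic line in the dumps above can be decoded by hand. The following is a minimal sketch, assuming the e_ident layout from the ELF specification (class at byte 4, data encoding at byte 5, version at byte 6, OS/ABI at byte 7). The names IdentInfo, describeIdent, and Cpu0Ident are hypothetical helpers introduced for illustration and are not part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Indices into e_ident, as defined by the ELF specification.
constexpr int EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6, EI_OSABI = 7;

// Hypothetical summary of the identification bytes.
struct IdentInfo {
  bool isElf;        // first four bytes are 0x7f 'E' 'L' 'F'
  std::string klass; // "ELF32", "ELF64" or "invalid"
  std::string data;  // "little endian", "big endian" or "invalid"
  unsigned version;  // 1 means "current"
  unsigned osabi;    // 3 is the value readelf prints as "UNIX - GNU"
};

// describeIdent is a hypothetical helper, not an LLVM API.
IdentInfo describeIdent(const std::array<uint8_t, 16> &id) {
  IdentInfo info;
  info.isElf = id[0] == 0x7f && id[1] == 'E' && id[2] == 'L' && id[3] == 'F';
  info.klass = id[EI_CLASS] == 1   ? "ELF32"
               : id[EI_CLASS] == 2 ? "ELF64"
                                   : "invalid";
  info.data = id[EI_DATA] == 1   ? "little endian"
              : id[EI_DATA] == 2 ? "big endian"
                                 : "invalid";
  info.version = id[EI_VERSION];
  info.osabi = id[EI_OSABI];
  return info;
}

// The Magic line printed for ch6_1.cpu0.o.
const std::array<uint8_t, 16> Cpu0Ident = {0x7f, 0x45, 0x4c, 0x46, 0x01, 0x02,
                                           0x01, 0x03, 0,    0,    0,    0,
                                           0,    0,    0,    0};
```

Calling describeIdent(Cpu0Ident) should reproduce the Class, Data, Version, and OS/ABI lines of the dump. The Cpu0 and MIPS objects share the same identification bytes, so they differ only in the fields that follow, such as Machine and Flags.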
Let’s check the ELF segments information with the following command:
input$ llvm-readelf -l ch6_1.cpu0.o
There are no program headers in this file.
input$
This result is expected: the Cpu0 object file is a relocatable file meant for linking, not for execution, and we have not implemented a linker at this point. Therefore, the segment table (program header table) is empty.
Next, let’s check the ELF sections. Each section includes offset and size information.
input$ llvm-readelf -S ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 000034 00 AX 0 0 4
[ 2] .rel.text REL 00000000 000310 000018 08 8 1 4
[ 3] .data PROGBITS 00000000 000068 000004 00 WA 0 0 4
[ 4] .bss NOBITS 00000000 00006c 000000 00 WA 0 0 4
[ 5] .eh_frame PROGBITS 00000000 00006c 000028 00 A 0 0 4
[ 6] .rel.eh_frame REL 00000000 000328 000008 08 8 5 4
[ 7] .shstrtab STRTAB 00000000 000094 00003e 00 0 0 1
[ 8] .symtab SYMTAB 00000000 000264 000090 10 9 6 4
[ 9] .strtab STRTAB 00000000 0002f4 00001b 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
input$
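The Off, Size, and Al columns of the section dump above are enough to check how the file is laid out. The sketch below is only an illustration: Region, alignUp, cpu0Layout, and isPacked are hypothetical names, the numbers are copied from the dump, the 40-byte section header entry size comes from the ELF header dump, and the 4-byte alignment of the section header table is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One contiguous piece of the object file.
struct Region {
  const char *name;
  uint32_t off;   // "Off" column
  uint32_t size;  // "Size" column
  uint32_t align; // "Al" column (a power of two)
};

// Round value up to the next multiple of align.
constexpr uint32_t alignUp(uint32_t value, uint32_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Regions of ch6_1.cpu0.o sorted by file offset. The ELF header and the
// section header table are added as pseudo regions; the alignment of the
// section header table (4) is an assumption.
std::vector<Region> cpu0Layout() {
  return {
      {"ELF header", 0x0, 52, 1},
      {".text", 0x34, 0x34, 4},
      {".data", 0x68, 0x04, 4},
      {".bss", 0x6c, 0x00, 4}, // NOBITS: occupies no space in the file
      {".eh_frame", 0x6c, 0x28, 4},
      {".shstrtab", 0x94, 0x3e, 1},
      {"section headers", 0xd4, 10 * 40, 4},
      {".symtab", 0x264, 0x90, 4},
      {".strtab", 0x2f4, 0x1b, 1},
      {".rel.text", 0x310, 0x18, 4},
      {".rel.eh_frame", 0x328, 0x08, 4},
  };
}

// True when every region starts where the previous one ends, after rounding
// the end up to the region's own alignment.
bool isPacked(const std::vector<Region> &regions) {
  for (std::size_t i = 1; i < regions.size(); ++i) {
    uint32_t prevEnd = regions[i - 1].off + regions[i - 1].size;
    if (alignUp(prevEnd, regions[i].align) != regions[i].off)
      return false;
  }
  return true;
}
```

With these numbers isPacked(cpu0Layout()) holds: .text begins right after the 52-byte ELF header, and the only padding appears where a 1-byte-aligned string table is followed by a 4-byte-aligned region.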
Relocation Record¶
The Cpu0 backend translates global variables as follows:
input$ clang -target mips-unknown-linux-gnu -c ch6_1.cpp
-emit-llvm -o ch6_1.bc
input$ ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=asm ch6_1.bc -o ch6_1.cpu0.s
input$ cat ch6_1.cpu0.s
.section .mdebug.abi32
.previous
.file "ch6_1.bc"
.text
...
.cfi_startproc
.frame $sp,8,$lr
.mask 0x00000000,0
.set noreorder
.cpload $t9
...
lui $2, %got_hi(gI)
addu $2, $2, $gp
ld $2, %got_lo(gI)($2)
...
.type gI,@object # @gI
.data
.globl gI
.align 2
gI:
.4byte 100 # 0x64
.size gI, 4
input$ ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
input$ llvm-objdump -s ch6_1.cpu0.o
ch6_1.cpu0.o: file format elf32-big
Contents of section .text:
// .cpload machine instruction
0000 0fa00000 0daa0000 13aa6000 ........ ..............`.
...
0020 002a0000 00220000 012d0000 0ddd0008 .*..."...-......
...
input$
input$ llvm-readelf -tr ch6_1.cpu0.o
There are 8 section headers, starting at offset 0xb0:
Section Headers:
[Nr] Name
Type Addr Off Size ES Lk Inf Al
Flags
[ 0]
NULL 00000000 000000 000000 00 0 0 0
[00000000]:
[ 1] .text
PROGBITS 00000000 000034 000044 00 0 0 4
[00000006]: ALLOC, EXEC
[ 2] .rel.text
REL 00000000 0002a8 000020 08 6 1 4
[00000000]:
[ 3] .data
PROGBITS 00000000 000078 000008 00 0 0 4
[00000003]: WRITE, ALLOC
[ 4] .bss
NOBITS 00000000 000080 000000 00 0 0 4
[00000003]: WRITE, ALLOC
[ 5] .shstrtab
STRTAB 00000000 000080 000030 00 0 0 1
[00000000]:
[ 6] .symtab
SYMTAB 00000000 0001f0 000090 10 7 5 4
[00000000]:
[ 7] .strtab
STRTAB 00000000 000280 000025 00 0 0 1
[00000000]:
Relocation section '.rel.text' at offset 0x2a8 contains 4 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000805 unrecognized: 5 00000000 _gp_disp
00000004 00000806 unrecognized: 6 00000000 _gp_disp
00000020 00000616 unrecognized: 16 00000004 gI
00000028 00000617 unrecognized: 17 00000004 gI
input$ llvm-readelf -tr ch6_1.mips.o
There are 9 section headers, starting at offset 0xc8:
Section Headers:
[Nr] Name
Type Addr Off Size ES Lk Inf Al
Flags
[ 0]
NULL 00000000 000000 000000 00 0 0 0
[00000000]:
[ 1] .text
PROGBITS 00000000 000034 000038 00 0 0 4
[00000006]: ALLOC, EXEC
[ 2] .rel.text
REL 00000000 0002f8 000018 08 7 1 4
[00000000]:
[ 3] .data
PROGBITS 00000000 00006c 000008 00 0 0 4
[00000003]: WRITE, ALLOC
[ 4] .bss
NOBITS 00000000 000074 000000 00 0 0 4
[00000003]: WRITE, ALLOC
[ 5] .reginfo
MIPS_REGINFO 00000000 000074 000018 00 0 0 1
[00000002]: ALLOC
[ 6] .shstrtab
STRTAB 00000000 00008c 000039 00 0 0 1
[00000000]:
[ 7] .symtab
SYMTAB 00000000 000230 0000a0 10 8 6 4
[00000000]:
[ 8] .strtab
STRTAB 00000000 0002d0 000025 00 0 0 1
[00000000]:
Relocation section '.rel.text' at offset 0x2f8 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000905 R_MIPS_HI16 00000000 _gp_disp
00000004 00000906 R_MIPS_LO16 00000000 _gp_disp
0000001c 00000709 R_MIPS_GOT16 00000004 gI
As described in the section “Handle $gp register in PIC addressing mode”, the backend lowers “.cpload $reg” into the following instructions.
// Lower ".cpload $reg" to
// "lui $gp, %hi(_gp_disp)"
// "ori $gp, $gp, %lo(_gp_disp)"
// "addu $gp, $gp, $t9"
The _gp_disp value is determined by the loader, so it is undefined in the object file. The relocation records for offsets 0 and 4 of the .text section both refer to the _gp_disp symbol.
Offsets 0 and 4 of the .text section correspond to the instructions lui $gp, %hi(_gp_disp) and ori $gp, $gp, %lo(_gp_disp), whose encoded object representations are 0fa00000 and 0daa0000, respectively.
The object file sets the %hi(_gp_disp) and %lo(_gp_disp) fields to zero, since the loader will determine the actual _gp_disp value at runtime and patch these two relocation entries accordingly.
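To make the patching step concrete, here is a minimal sketch of how a 32-bit value could be split into the two 16-bit immediates and written into the zeroed instruction words. It assumes that ori zero-extends its immediate (as on MIPS), so no carry adjustment is needed for the high half. The helper names and the displacement value 0x00018abc are made up for illustration; the real value is only known once _gp_disp is resolved.

```cpp
#include <cassert>
#include <cstdint>

// Upper and lower 16 bits of a 32-bit value.
constexpr uint32_t hi16(uint32_t v) { return (v >> 16) & 0xffffu; }
constexpr uint32_t lo16(uint32_t v) { return v & 0xffffu; }

// Write a 16-bit immediate into the low half of an instruction word,
// leaving the opcode and register fields untouched.
constexpr uint32_t patchImm16(uint32_t insn, uint32_t imm) {
  return (insn & 0xffff0000u) | (imm & 0xffffu);
}

// Register contents after "lui rd, hi" followed by "ori rd, rd, lo",
// assuming ori zero-extends its immediate.
constexpr uint32_t luiOri(uint32_t hi, uint32_t lo) { return (hi << 16) | lo; }

// Made-up displacement, for illustration only.
constexpr uint32_t GpDisp = 0x00018abcu;

// 0fa00000 and 0daa0000 are the unpatched words from the object dump.
static_assert(patchImm16(0x0fa00000u, hi16(GpDisp)) == 0x0fa00001u,
              "lui receives the upper half");
static_assert(patchImm16(0x0daa0000u, lo16(GpDisp)) == 0x0daa8abcu,
              "ori receives the lower half");
static_assert(luiOri(hi16(GpDisp), lo16(GpDisp)) == GpDisp,
              "the pair rebuilds the full value");
```

The static_assert lines act as a worked example that the compiler checks. A target whose low-half instruction sign-extends its immediate would need a different hi16, which is why the zero-extension assumption is stated explicitly.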
You can verify the correctness of Cpu0’s handling of %hi(_gp_disp) and %lo(_gp_disp) by comparing its relocation records with the MIPS records R_MIPS_HI16 and R_MIPS_LO16 for _gp_disp: both targets use relocation types 5 and 6 at offsets 0 and 4, even though llvm-readelf does not recognize Cpu0 and prints its types as unrecognized.
The instruction ld $2, %got(gI)($gp) behaves similarly. Because the actual address of the .data section variable gI is unknown at compile time, Cpu0 sets its address to 0 and creates a relocation record at offset 0x00000020 of the .text section.
The linker or loader will patch this address at link time (for static linking) or load time (for dynamic linking), depending on how the program is built.
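The Info column in the relocation dumps packs two things into one word. In ELF32 the upper 24 bits are the symbol table index and the low 8 bits are the relocation type, which is also why llvm-readelf prints the unrecognized Cpu0 types in hexadecimal. The sketch below decodes the four Cpu0 entries. relSym, relType, and cpu0RelocName are hypothetical helpers, and the names are a subset of the Cpu0 relocation numbering listed later in this chapter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ELF32 r_info layout: symbol index in the upper 24 bits, type in the low 8.
constexpr uint32_t relSym(uint32_t info) { return info >> 8; }
constexpr uint32_t relType(uint32_t info) { return info & 0xffu; }

// Names for the type numbers that occur in the Cpu0 dump (subset only).
std::string cpu0RelocName(uint32_t type) {
  switch (type) {
  case 5:
    return "R_CPU0_HI16";
  case 6:
    return "R_CPU0_LO16";
  case 22: // printed by llvm-readelf as "unrecognized: 16" (hexadecimal)
    return "R_CPU0_GOT_HI16";
  case 23: // printed by llvm-readelf as "unrecognized: 17" (hexadecimal)
    return "R_CPU0_GOT_LO16";
  default:
    return "unknown";
  }
}

// Info values from the .rel.text dump of ch6_1.cpu0.o.
static_assert(relSym(0x00000805u) == 8 && relType(0x00000805u) == 5,
              "_gp_disp, high half");
static_assert(relSym(0x00000617u) == 6 && relType(0x00000617u) == 23,
              "gI, low half of the GOT offset");
```

Symbol index 8 is _gp_disp and index 6 is gI, matching the Sym. Name column. The two gI entries therefore correspond to the %got_hi(gI) and %got_lo(gI) operands in the assembly listing.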
llvm-objdump¶
llvm-objdump -t -r¶
The llvm-objdump -tr command displays symbol table and relocation record information, similar to the output of llvm-readelf -tr.
To examine the differences, compare the output of GNU objdump, which does not know Cpu0, with that of an llvm-objdump built with the Cpu0 backend, as shown in the following example:
118-165-83-12:input Jonathan$ clang -target mips-unknown-linux-gnu -c ch9_3.cpp -emit-llvm -o ch9_3.bc
118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch9_3.bc -o ch9_3.cpu0.o
118-165-78-12:input Jonathan$ objdump -t -r ch9_3.cpu0.o
ch9_3.cpu0.o: file format elf32-big
SYMBOL TABLE:
00000000 l df *ABS* 00000000 ch9_3.bc
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 g F .text 00000084 _Z5sum_iiz
00000084 g F .text 00000080 main
00000000 *UND* 00000000 _gp_disp
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000084 UNKNOWN _gp_disp
00000088 UNKNOWN _gp_disp
000000e0 UNKNOWN _Z5sum_iiz
118-165-83-10:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llvm-objdump -t -r ch9_3.cpu0.o
ch9_3.cpu0.o: file format ELF32-CPU0
RELOCATION RECORDS FOR [.text]:
132 R_CPU0_HI16 _gp_disp
136 R_CPU0_LO16 _gp_disp
224 R_CPU0_CALL16 _Z5sum_iiz
SYMBOL TABLE:
00000000 l df *ABS* 00000000 ch9_3.bc
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 g F .text 00000084 _Z5sum_iiz
00000084 g F .text 00000080 main
00000000 *UND* 00000000 _gp_disp
The llvm-objdump tool can correctly display the file format and relocation record information, whereas the GNU objdump cannot. This is because the Cpu0-specific relocation record definitions have been added to ELF.h within LLVM’s source code, enabling llvm-objdump to recognize and interpret them properly.
include/llvm/Support/ELF.h
// Machine architectures
enum {
...
EM_CPU0 = 998, // Document LLVM Backend Tutorial Cpu0
EM_CPU0_LE = 999 // EM_CPU0_LE: little endian; EM_CPU0: big endian
};
lib/Object/ELF.cpp
...
StringRef getELFRelocationTypeName(uint32_t Machine, uint32_t Type) {
switch (Machine) {
...
case ELF::EM_CPU0:
switch (Type) {
#include "llvm/Support/ELFRelocs/Cpu0.def"
default:
break;
}
break;
...
}
include/llvm/Support/ELFRelocs/Cpu0.def
#ifndef ELF_RELOC
#error "ELF_RELOC must be defined"
#endif
ELF_RELOC(R_CPU0_NONE, 0)
ELF_RELOC(R_CPU0_32, 2)
ELF_RELOC(R_CPU0_HI16, 5)
ELF_RELOC(R_CPU0_LO16, 6)
ELF_RELOC(R_CPU0_GPREL16, 7)
ELF_RELOC(R_CPU0_LITERAL, 8)
ELF_RELOC(R_CPU0_GOT16, 9)
ELF_RELOC(R_CPU0_PC16, 10)
ELF_RELOC(R_CPU0_CALL16, 11)
ELF_RELOC(R_CPU0_GPREL32, 12)
ELF_RELOC(R_CPU0_PC24, 13)
ELF_RELOC(R_CPU0_GOT_HI16, 22)
ELF_RELOC(R_CPU0_GOT_LO16, 23)
ELF_RELOC(R_CPU0_RELGOT, 36)
ELF_RELOC(R_CPU0_TLS_GD, 42)
ELF_RELOC(R_CPU0_TLS_LDM, 43)
ELF_RELOC(R_CPU0_TLS_DTP_HI16, 44)
ELF_RELOC(R_CPU0_TLS_DTP_LO16, 45)
ELF_RELOC(R_CPU0_TLS_GOTTPREL, 46)
ELF_RELOC(R_CPU0_TLS_TPREL32, 47)
ELF_RELOC(R_CPU0_TLS_TP_HI16, 49)
ELF_RELOC(R_CPU0_TLS_TP_LO16, 50)
ELF_RELOC(R_CPU0_GLOB_DAT, 51)
ELF_RELOC(R_CPU0_JUMP_SLOT, 127)
include/llvm/Object/ELFObjectFile.h
template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
::getRelocationValueString(DataRefImpl Rel,
SmallVectorImpl<char> &Result) const {
...
case ELF::EM_CPU0: // llvm-objdump -t -r
res = symname;
break;
...
}
template<support::endianness target_endianness, bool is64Bits>
StringRef ELFObjectFile<target_endianness, is64Bits>
::getFileFormatName() const {
switch(Header->e_ident[ELF::EI_CLASS]) {
case ELF::ELFCLASS32:
switch(Header->e_machine) {
...
case ELF::EM_CPU0: // llvm-objdump -t -r
return "ELF32-CPU0";
...
}
template<support::endianness target_endianness, bool is64Bits>
unsigned ELFObjectFile<target_endianness, is64Bits>::getArch() const {
switch(Header->e_machine) {
...
case ELF::EM_CPU0: // llvm-objdump -t -r
return (target_endianness == support::little) ?
Triple::cpu0el : Triple::cpu0;
...
}
In addition to llvm-objdump -t -r, the llvm-readobj -h command can be used to display the Cpu0 ELF header information, thanks to the EM_CPU0 definition added earlier.
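The way Cpu0.def is included inside a switch statement relies on the "X-macro" technique: the same list of ELF_RELOC(name, value) entries is expanded several times with different definitions of the macro. The following is a self-contained sketch of the idea. Because a single code block cannot include a separate .def file, the list is held in a hypothetical CPU0_RELOCS macro with only a few entries, and AS_ENUM, AS_CASE, and relocName are illustrative names rather than LLVM's.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for Cpu0.def (a few entries only). In LLVM the list lives in its
// own file, which is #included wherever ELF_RELOC has been defined.
#define CPU0_RELOCS(ELF_RELOC)                                                 \
  ELF_RELOC(R_CPU0_NONE, 0)                                                    \
  ELF_RELOC(R_CPU0_32, 2)                                                      \
  ELF_RELOC(R_CPU0_HI16, 5)                                                    \
  ELF_RELOC(R_CPU0_LO16, 6)                                                    \
  ELF_RELOC(R_CPU0_CALL16, 11)

// First expansion: every entry becomes an enumerator.
#define AS_ENUM(name, value) name = value,
enum Cpu0Reloc : uint32_t { CPU0_RELOCS(AS_ENUM) };
#undef AS_ENUM

// Second expansion: every entry becomes a case that returns its own name.
#define AS_CASE(name, value)                                                   \
  case name:                                                                   \
    return #name;
std::string relocName(uint32_t type) {
  switch (type) {
    CPU0_RELOCS(AS_CASE)
  default:
    return "Unknown";
  }
}
#undef AS_CASE
```

With this pattern a new relocation type is added in exactly one place, and both the enumerator and its printable name follow automatically. For instance, relocName(11) yields the R_CPU0_CALL16 string that llvm-objdump prints for the call to _Z5sum_iiz.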
llvm-objdump -d¶
Run the example code from the previous chapter using the command llvm-objdump -d to disassemble the ELF file and view its contents in hexadecimal format as shown below:
JonathantekiiMac:input Jonathan$ clang -target mips-unknown-linux-gnu -c ch8_1_1.cpp -emit-llvm -o ch8_1_1.bc
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llvm-objdump -d ch8_1_1.cpu0.o
ch8_1_1.cpu0.o: file format ELF32-unknown
Disassembly of section .text:error: no disassembler for target cpu0-unknown-unknown
To support llvm-objdump, the following code is added in Chapter10_1/. (Note: The DecoderMethod for brtarget24 was added in a previous chapter.)
lbdex/chapters/Chapter10_1/CMakeLists.txt
tablegen(LLVM Cpu0GenDisassemblerTables.inc -gen-disassembler)
Cpu0Disassembler
add_subdirectory(Disassembler)
lbdex/chapters/Chapter10_1/Cpu0InstrInfo.td
let isBranch=1, isTerminator=1, isBarrier=1, imm16=0, hasDelaySlot = 1,
isIndirectBranch = 1 in
class JumpFR<bits<8> op, string instr_asm, RegisterClass RC>:
FL<op, (outs), (ins RC:$ra),
!strconcat(instr_asm, "\t$ra"), [(brind RC:$ra)], IIBranch> {
let rb = 0;
let imm16 = 0;
//#if CH >= CH10_1 1.5
let DecoderMethod = "DecodeJumpFR";
//#endif
}
class JumpLink<bits<8> op, string instr_asm>:
FJ<op, (outs), (ins calltarget:$target, variable_ops),
!strconcat(instr_asm, "\t$target"), [(Cpu0JmpLink imm:$target)],
IIBranch> {
//#if CH >= CH10_1 2
let DecoderMethod = "DecodeJumpTarget";
//#endif
}
lbdex/chapters/Chapter10_1/Disassembler/CMakeLists.txt
add_llvm_component_library(LLVMCpu0Disassembler
Cpu0Disassembler.cpp
LINK_COMPONENTS
MCDisassembler
Cpu0Info
Support
ADD_TO_COMPONENT
Cpu0
)
lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp
//===- Cpu0Disassembler.cpp - Disassembler for Cpu0 -------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is part of the Cpu0 Disassembler.
//
//===----------------------------------------------------------------------===//
#include "Cpu0.h"
#include "Cpu0RegisterInfo.h"
#include "Cpu0Subtarget.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCFixedLenDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/TargetRegistry.h"
using namespace llvm;
#define DEBUG_TYPE "cpu0-disassembler"
typedef MCDisassembler::DecodeStatus DecodeStatus;
namespace {
/// Cpu0DisassemblerBase - a disassembler class for Cpu0.
class Cpu0DisassemblerBase : public MCDisassembler {
public:
/// Constructor - Initializes the disassembler.
///
Cpu0DisassemblerBase(const MCSubtargetInfo &STI, MCContext &Ctx,
bool bigEndian) :
MCDisassembler(STI, Ctx),
IsBigEndian(bigEndian) {}
virtual ~Cpu0DisassemblerBase() {}
protected:
bool IsBigEndian;
};
/// Cpu0Disassembler - a disassembler class for Cpu032.
class Cpu0Disassembler : public Cpu0DisassemblerBase {
public:
/// Constructor - Initializes the disassembler.
///
Cpu0Disassembler(const MCSubtargetInfo &STI, MCContext &Ctx, bool bigEndian)
: Cpu0DisassemblerBase(STI, Ctx, bigEndian) {
}
/// getInstruction - See MCDisassembler.
DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
ArrayRef<uint8_t> Bytes, uint64_t Address,
raw_ostream &CStream) const override;
};
} // end anonymous namespace
// Decoder tables for GPR register
static const unsigned CPURegsTable[] = {
Cpu0::ZERO, Cpu0::AT, Cpu0::V0, Cpu0::V1,
Cpu0::A0, Cpu0::A1, Cpu0::T9, Cpu0::T0,
Cpu0::T1, Cpu0::S0, Cpu0::S1, Cpu0::GP,
Cpu0::FP, Cpu0::SP, Cpu0::LR, Cpu0::SW
};
// Decoder tables for co-processor 0 register
static const unsigned C0RegsTable[] = {
Cpu0::PC, Cpu0::EPC
};
static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeBranch16Target(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeJumpTarget(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeJumpFR(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeMem(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
static DecodeStatus DecodeSimm16(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder);
namespace llvm {
extern Target TheCpu0elTarget, TheCpu0Target, TheCpu064Target,
TheCpu064elTarget;
}
static MCDisassembler *createCpu0Disassembler(
const Target &T,
const MCSubtargetInfo &STI,
MCContext &Ctx) {
return new Cpu0Disassembler(STI, Ctx, true);
}
static MCDisassembler *createCpu0elDisassembler(
const Target &T,
const MCSubtargetInfo &STI,
MCContext &Ctx) {
return new Cpu0Disassembler(STI, Ctx, false);
}
extern "C" void LLVMInitializeCpu0Disassembler() {
// Register the disassembler.
TargetRegistry::RegisterMCDisassembler(TheCpu0Target,
createCpu0Disassembler);
TargetRegistry::RegisterMCDisassembler(TheCpu0elTarget,
createCpu0elDisassembler);
}
#if 0
#undef LLVM_DEBUG
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"
/// Read four bytes from the ArrayRef and return 32 bit word sorted
/// according to the given endianness
static DecodeStatus readInstruction32(ArrayRef<uint8_t> Bytes, uint64_t Address,
uint64_t &Size, uint32_t &Insn,
bool IsBigEndian) {
// We want to read exactly 4 Bytes of data.
if (Bytes.size() < 4) {
Size = 0;
return MCDisassembler::Fail;
}
if (IsBigEndian) {
// Encoded as a big-endian 32-bit word in the stream.
Insn = (Bytes[3] << 0) |
(Bytes[2] << 8) |
(Bytes[1] << 16) |
(Bytes[0] << 24);
}
else {
// Encoded as a little-endian 32-bit word in the stream.
Insn = (Bytes[0] << 0) |
(Bytes[1] << 8) |
(Bytes[2] << 16) |
(Bytes[3] << 24);
}
return MCDisassembler::Success;
}
DecodeStatus
Cpu0Disassembler::getInstruction(MCInst &Instr, uint64_t &Size,
ArrayRef<uint8_t> Bytes,
uint64_t Address,
raw_ostream &CStream) const {
uint32_t Insn;
DecodeStatus Result;
Result = readInstruction32(Bytes, Address, Size, Insn, IsBigEndian);
if (Result == MCDisassembler::Fail)
return MCDisassembler::Fail;
// Calling the auto-generated decoder function.
Result = decodeInstruction(DecoderTableCpu032, Instr, Insn, Address,
this, STI);
if (Result != MCDisassembler::Fail) {
Size = 4;
return Result;
}
return MCDisassembler::Fail;
}
static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder) {
if (RegNo > 15)
return MCDisassembler::Fail;
Inst.addOperand(MCOperand::createReg(CPURegsTable[RegNo]));
return MCDisassembler::Success;
}
static DecodeStatus DecodeGPROutRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder) {
return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}
static DecodeStatus DecodeSRRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder) {
return DecodeCPURegsRegisterClass(Inst, RegNo, Address, Decoder);
}
static DecodeStatus DecodeC0RegsRegisterClass(MCInst &Inst,
unsigned RegNo,
uint64_t Address,
const void *Decoder) {
if (RegNo > 1)
return MCDisassembler::Fail;
Inst.addOperand(MCOperand::createReg(C0RegsTable[RegNo]));
return MCDisassembler::Success;
}
//@DecodeMem {
static DecodeStatus DecodeMem(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
//@DecodeMem body {
int Offset = SignExtend32<16>(Insn & 0xffff);
int Reg = (int)fieldFromInstruction(Insn, 20, 4);
int Base = (int)fieldFromInstruction(Insn, 16, 4);
Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg]));
Inst.addOperand(MCOperand::createReg(CPURegsTable[Base]));
Inst.addOperand(MCOperand::createImm(Offset));
return MCDisassembler::Success;
}
static DecodeStatus DecodeBranch16Target(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
int BranchOffset = fieldFromInstruction(Insn, 0, 16);
if (BranchOffset > 0x7fff) // bit 15 set: the 16-bit field holds a negative offset
BranchOffset = -1*(0x10000 - BranchOffset);
Inst.addOperand(MCOperand::createImm(BranchOffset));
return MCDisassembler::Success;
}
/* The CBranch instruction defines $ra first and then imm24. printOperand() prints
operand 1 (operand 0 is $ra and operand 1 is imm24), so we create the register
operand first and the imm24 operand next, as follows:
// Cpu0InstrInfo.td
class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
list<Register> UseRegs>:
FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
!strconcat(instr_asm, "\t$addr"),
[(brcond RC:$ra, bb:$addr)], IIBranch> {
// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
case 3:
// CMP, JEQ, JGE, JGT, JLE, JLT, JNE
printOperand(MI, 1, O);
break;
*/
static DecodeStatus DecodeBranch24Target(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
int BranchOffset = fieldFromInstruction(Insn, 0, 24);
if (BranchOffset > 0x7fffff) // bit 23 set: the 24-bit field holds a negative offset
BranchOffset = -1*(0x1000000 - BranchOffset);
Inst.addOperand(MCOperand::createReg(Cpu0::SW));
Inst.addOperand(MCOperand::createImm(BranchOffset));
return MCDisassembler::Success;
}
static DecodeStatus DecodeJumpTarget(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
unsigned JumpOffset = fieldFromInstruction(Insn, 0, 24);
Inst.addOperand(MCOperand::createImm(JumpOffset));
return MCDisassembler::Success;
}
static DecodeStatus DecodeJumpFR(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
int Reg_a = (int)fieldFromInstruction(Insn, 20, 4);
Inst.addOperand(MCOperand::createReg(CPURegsTable[Reg_a]));
// explained in http://jonathan2251.github.io/lbd/llvmstructure.html#jr-note
if (CPURegsTable[Reg_a] == Cpu0::LR)
Inst.setOpcode(Cpu0::RET);
else
Inst.setOpcode(Cpu0::JR);
return MCDisassembler::Success;
}
static DecodeStatus DecodeSimm16(MCInst &Inst,
unsigned Insn,
uint64_t Address,
const void *Decoder) {
Inst.addOperand(MCOperand::createImm(SignExtend32<16>(Insn)));
return MCDisassembler::Success;
}
As shown in the code above, a Disassembler directory is added to handle the reverse translation from object code to assembly. Disassembler/Cpu0Disassembler.cpp is added, and CMakeLists.txt is modified to build the Disassembler directory and to enable generation of the disassembler table by setting has_disassembler = 1.
Most of the decoding is driven by the table generated from the *.td files. Not every instruction in the *.td files can be disassembled without trouble, even though all of them can be translated into assembly and object code. For those that cannot be disassembled automatically, LLVM provides the "let DecoderMethod" keyword, which allows programmers to implement their own decode functions.
For example, in Cpu0, we define functions such as DecodeBranch24Target(), DecodeJumpTarget(), and DecodeJumpFR() in Cpu0Disassembler.cpp. We then inform llvm-tblgen by writing "let DecoderMethod = ..." in the corresponding instruction definitions or ISD nodes of Cpu0InstrInfo.td. LLVM will call these DecoderMethods when the user uses disassembler tools, such as llvm-objdump -d.
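A custom decode function for a branch typically has to turn an unsigned bit field into a signed offset. As a minimal sketch of that step, the hypothetical helper signExtend below interprets the low bits of a value as a two's-complement number. It is written from scratch for illustration and assumes 1 <= bits <= 31; inside LLVM, SignExtend32<B>() from MathExtras.h serves this purpose.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low `bits` bits of value as a two's-complement number.
// Hypothetical helper; valid for 1 <= bits <= 31.
constexpr int32_t signExtend(uint32_t value, unsigned bits) {
  const uint32_t mask = (1u << bits) - 1;  // selects the field
  const uint32_t sign = 1u << (bits - 1);  // top bit of the field
  const uint32_t field = value & mask;
  // When the sign bit is set, the field stands for field - 2^bits.
  return (field & sign)
             ? static_cast<int32_t>(field) - static_cast<int32_t>(mask) - 1
             : static_cast<int32_t>(field);
}

// 16-bit fields, as used by branch16 targets and simm16 immediates.
static_assert(signExtend(0xffd8u, 16) == -40, "small negative offset");
static_assert(signExtend(0x8000u, 16) == -32768, "most negative 16-bit value");
static_assert(signExtend(0x7fffu, 16) == 32767, "most positive 16-bit value");

// 24-bit fields, as used by branch24 targets.
static_assert(signExtend(0xffffffu, 24) == -1, "all ones is -1");
static_assert(signExtend(0x800000u, 24) == -8388608, "most negative 24-bit");
```

The boundary cases are the ones worth checking in any hand-written decoder: the smallest value with the sign bit set (0x8000 for 16 bits, 0x800000 for 24 bits) must already decode as negative.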
Finally, cpu032II includes all instructions from cpu032I and adds some new instructions. When llvm-objdump -d is invoked, the function selectCpu0ArchFeature() will be called through createCpu0MCSubtargetInfo(). Since llvm-objdump cannot set CPU options like llc -mcpu=cpu032I, the variable CPU in selectCpu0ArchFeature() is empty when invoked by llvm-objdump -d. To ensure that all instructions are disassembled, we set Cpu0ArchFeature to "+cpu032II" so that it can disassemble all instructions from cpu032II (which includes all instructions from cpu032I and adds new ones).
lbdex/chapters/Chapter10_1/MCTargetDesc/Cpu0MCTargetDesc.cpp
/// Select the Cpu0 Architecture Feature for the given triple and cpu name.
/// The function will be called at command 'llvm-objdump -d' for Cpu0 elf input.
static std::string selectCpu0ArchFeature(const Triple &TT, StringRef CPU) {
std::string Cpu0ArchFeature;
if (CPU.empty() || CPU == "generic") {
if (TT.getArch() == Triple::cpu0 || TT.getArch() == Triple::cpu0el) {
if (CPU.empty() || CPU == "cpu032II") {
Cpu0ArchFeature = "+cpu032II";
}
else {
if (CPU == "cpu032I") {
Cpu0ArchFeature = "+cpu032I";
}
}
}
}
return Cpu0ArchFeature;
}
Now, running Chapter10_1/ with the command llvm-objdump -d ch8_1_1.cpu0.o produces the following result.
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_1_1.bc -o ch8_1_1.cpu0.o
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/bin/llvm-objdump -d ch8_1_1.cpu0.o
ch8_1_1.cpu0.o: file format ELF32-CPU0
Disassembly of section .text:
_Z13test_control1v:
0: 09 dd ff d8 addiu $sp, $sp, -40
4: 09 30 00 00 addiu $3, $zero, 0
8: 02 3d 00 24 st $3, 36($sp)
c: 09 20 00 01 addiu $2, $zero, 1
10: 02 2d 00 20 st $2, 32($sp)
14: 09 40 00 02 addiu $4, $zero, 2
18: 02 4d 00 1c st $4, 28($sp)
...
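The first line of the disassembly, 09 dd ff d8, can be decoded by hand to see what the disassembler has to do. The sketch below assembles the bytes in big-endian order, as readInstruction32() does, and then splits the word into the fields that DecodeMem() reads (register in bits 23-20, base in bits 19-16, 16-bit immediate in the low half). readBigEndian32, bitField, Fields, and splitFields are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Combine four bytes into a 32-bit word, most significant byte first.
constexpr uint32_t readBigEndian32(uint8_t b0, uint8_t b1, uint8_t b2,
                                   uint8_t b3) {
  return (uint32_t(b0) << 24) | (uint32_t(b1) << 16) | (uint32_t(b2) << 8) |
         uint32_t(b3);
}

// Extract len bits starting at bit start (bit 0 is the least significant).
// Hypothetical stand-in for fieldFromInstruction(); valid for len < 32.
constexpr uint32_t bitField(uint32_t insn, unsigned start, unsigned len) {
  return (insn >> start) & ((1u << len) - 1);
}

struct Fields {
  uint32_t opcode; // Inst{31-24}
  uint32_t ra;     // Inst{23-20}
  uint32_t rb;     // Inst{19-16}
  int32_t imm16;   // Inst{15-0}, sign-extended
};

constexpr Fields splitFields(uint32_t insn) {
  return Fields{bitField(insn, 24, 8), bitField(insn, 20, 4),
                bitField(insn, 16, 4),
                bitField(insn, 0, 16) >= 0x8000u
                    ? static_cast<int32_t>(bitField(insn, 0, 16)) - 0x10000
                    : static_cast<int32_t>(bitField(insn, 0, 16))};
}

// "0: 09 dd ff d8  addiu $sp, $sp, -40" from the listing above.
constexpr uint32_t Word = readBigEndian32(0x09, 0xdd, 0xff, 0xd8);
static_assert(Word == 0x09ddffd8u, "bytes are assembled MSB first");
static_assert(splitFields(Word).opcode == 0x09, "opcode byte of addiu");
static_assert(splitFields(Word).ra == 13 && splitFields(Word).rb == 13,
              "index 13 in CPURegsTable is SP");
static_assert(splitFields(Word).imm16 == -40, "0xffd8 sign-extends to -40");
```

Register index 13 maps to Cpu0::SP in CPURegsTable, so the fields read back as addiu $sp, $sp, -40. The same split applied to 02 3d 00 24 gives register 3, base 13, and offset 36, matching st $3, 36($sp).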
Disassembler Structure¶
The flow of disassembly is shown in Fig. 51.
![digraph G {
rankdir=TD;
"disassembleObject()" -> "getInstruction()" [label="1. [AsmPrinter::llvm-objdump -d]\nBytes"];
"disassembleObject()" -> "PrettyPrinter::printInst()" [label="2. MCInst,Address"];
"getInstruction()" -> "disassembleObject()" [label="MCInst"];
"PrettyPrinter::printInst()" -> "printInst()" [label="MCInst,Address"];
"getInstruction()" -> "decodeInstruction()" [label="(DecoderTableCpu032,insn,Address)"];
"decodeInstruction()" -> "getInstruction()" [label="MCInst"];
"decodeInstruction()" -> "fieldFromInstruction()";
"decodeInstruction()" -> "checkDecoderPredicate()";
"decodeInstruction()" -> "decodeToMCInst()";
"decodeToMCInst()" -> "DecodeMem()";
"decodeToMCInst()" -> "DecodeBranch16Target()";
"decodeToMCInst()" -> "DecodeBranch24Target()";
"decodeToMCInst()" -> "DecodeJumpTarget()";
"decodeToMCInst()" -> "DecodeJumpFR()";
"decodeToMCInst()" -> "DecodeSimm16()";
subgraph clusterObjdump {
label = "llvm-objdump.cpp";
"disassembleObject()";
"PrettyPrinter::printInst()";
}
subgraph clusterCpu0Dis1 {
label = "Cpu0Disassembler.cpp";
"getInstruction()";
"readInstruction32()";
"getInstruction()" -> "readInstruction32()" [label="Bytes"];
"readInstruction32()" -> "getInstruction()" [label="insn"];
}
subgraph clusterCpu0Dis2 {
label = "Cpu0Disassembler.cpp\n These functions specified in Cpu0InstrInfo.td";
"DecodeMem()";
"DecodeBranch16Target()";
"DecodeBranch24Target()";
"DecodeJumpTarget()";
"DecodeJumpFR()";
"DecodeSimm16()";
}
subgraph clusterInc {
label = "Cpu0GenDisassemblerTables.inc";
"fieldFromInstruction()";
"checkDecoderPredicate()";
"decodeToMCInst()";
"decodeInstruction()";
}
subgraph clusterCpu0InstPrinter {
label = "Cpu0InstPrinter";
"printInst()";
}
// label = "Figure: The flow of disassembly";
}](_images/graphviz-34c1cf1d12b4b9b3ad02a3cd788246d9155a4c9d.png)
Fig. 51 The flow of disassembly.¶
After getInstruction() in Cpu0Disassembler.cpp returns, disassembleObject() in llvm-objdump.cpp calls printInst() of Cpu0InstPrinter.cpp to print the address, the binary encoding, and the assembly for the instruction, for example “4: 09 30 00 00 addiu $3, $zero, 0”.
For printInst() of Cpu0InstPrinter.cpp, see Fig. 24.
Bytes: the 4 bytes (32 bits) of one Cpu0 instruction. insn: Bytes converted, according to big or little endianness, into a 32-bit (unsigned int) binary instruction.
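The table-driven decoding performed by decodeInstruction() can be illustrated with a much smaller model. The sketch below is a toy interpreter, not LLVM's implementation: its opcodes only mimic the roles of OPC_ExtractField, OPC_FilterValue, and OPC_Decode, every operand is a single byte (the generated table uses ULEB128 values and 24-bit skip counts), and the instruction ids are made up. The opcode values 0x01, 0x02, and 0x09 for LD, ST, and ADDiu are taken from the generated table and the disassembly shown in this chapter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy table opcodes (single-byte operands, unlike LLVM's real encoding).
constexpr uint8_t OpExtractField = 1; // operands: start bit, length
constexpr uint8_t OpFilterValue = 2;  // operands: expected value, skip count
constexpr uint8_t OpDecode = 3;       // operand: instruction id
constexpr uint8_t OpFail = 4;         // no operands

// Hypothetical instruction ids.
constexpr int IdNone = -1;
constexpr uint8_t IdLD = 0, IdST = 1, IdADDiu = 2;

const std::vector<uint8_t> ToyTable = {
    /*  0 */ OpExtractField, 24, 8,  // current field = Inst{31-24}
    /*  3 */ OpFilterValue, 0x01, 2, // not 0x01: skip to 8
    /*  6 */ OpDecode, IdLD,
    /*  8 */ OpFilterValue, 0x02, 2, // not 0x02: skip to 13
    /* 11 */ OpDecode, IdST,
    /* 13 */ OpFilterValue, 0x09, 2, // not 0x09: skip to 18
    /* 16 */ OpDecode, IdADDiu,
    /* 18 */ OpFail,
};

// Walk the table until an instruction id is found or decoding fails.
// Assumes a well-formed table whose field lengths are below 32.
int toyDecode(const std::vector<uint8_t> &table, uint32_t insn) {
  std::size_t ptr = 0;
  uint32_t cur = 0; // value of the most recently extracted field
  while (ptr < table.size()) {
    switch (table[ptr]) {
    case OpExtractField: {
      assert(ptr + 2 < table.size());
      unsigned start = table[ptr + 1], len = table[ptr + 2];
      cur = (insn >> start) & ((1u << len) - 1);
      ptr += 3;
      break;
    }
    case OpFilterValue: {
      assert(ptr + 2 < table.size());
      uint8_t val = table[ptr + 1], skip = table[ptr + 2];
      ptr += 3;
      if (val != cur) // mismatch: jump over the entries for this value
        ptr += skip;
      break;
    }
    case OpDecode:
      assert(ptr + 1 < table.size());
      return table[ptr + 1];
    default: // OpFail or an unknown table opcode
      return IdNone;
    }
  }
  return IdNone;
}
```

For the word 0x09ddffd8 the interpreter extracts the opcode byte 0x09, fails the first two filters, passes the third, and returns IdADDiu. The real decoder follows the same extract, filter, and decode pattern, with additional checks such as OPC_CheckField and OPC_CheckPredicate.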
DecoderTableCpu032 and decodeInstruction() are listed below:
build/lib/Target/Cpu0/Cpu0GenDisassemblerTables.inc
static const uint8_t DecoderTableCpu032[] = {
/* 0 */ MCD::OPC_ExtractField, 24, 8, // Inst{31-24} ...
/* 3 */ MCD::OPC_FilterValue, 0, 11, 0, 0, // Skip to: 19
/* 8 */ MCD::OPC_CheckField, 0, 24, 0, 149, 4, 0, // Skip to: 1188
/* 15 */ MCD::OPC_Decode, 178, 2, 0, // Opcode: NOP
/* 19 */ MCD::OPC_FilterValue, 1, 4, 0, 0, // Skip to: 28
/* 24 */ MCD::OPC_Decode, 161, 2, 1, // Opcode: LD
/* 28 */ MCD::OPC_FilterValue, 2, 4, 0, 0, // Skip to: 37
/* 33 */ MCD::OPC_Decode, 201, 2, 1, // Opcode: ST
/* 37 */ MCD::OPC_FilterValue, 3, 9, 0, 0, // Skip to: 51
/* 42 */ MCD::OPC_CheckPredicate, 0, 117, 4, 0, // Skip to: 1188
/* 47 */ MCD::OPC_Decode, 159, 2, 1, // Opcode: LB
/* 51 */ MCD::OPC_FilterValue, 4, 9, 0, 0, // Skip to: 65
/* 56 */ MCD::OPC_CheckPredicate, 0, 103, 4, 0, // Skip to: 1188
/* 61 */ MCD::OPC_Decode, 160, 2, 1, // Opcode: LBu
/* 65 */ MCD::OPC_FilterValue, 5, 9, 0, 0, // Skip to: 79
/* 70 */ MCD::OPC_CheckPredicate, 0, 89, 4, 0, // Skip to: 1188
/* 75 */ MCD::OPC_Decode, 187, 2, 1, // Opcode: SB
/* 79 */ MCD::OPC_FilterValue, 6, 9, 0, 0, // Skip to: 93
/* 84 */ MCD::OPC_CheckPredicate, 0, 75, 4, 0, // Skip to: 1188
/* 89 */ MCD::OPC_Decode, 163, 2, 1, // Opcode: LH
/* 93 */ MCD::OPC_FilterValue, 7, 9, 0, 0, // Skip to: 107
/* 98 */ MCD::OPC_CheckPredicate, 0, 61, 4, 0, // Skip to: 1188
/* 103 */ MCD::OPC_Decode, 164, 2, 1, // Opcode: LHu
/* 107 */ MCD::OPC_FilterValue, 8, 9, 0, 0, // Skip to: 121
/* 112 */ MCD::OPC_CheckPredicate, 0, 47, 4, 0, // Skip to: 1188
...
template <typename InsnType>
static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], MCInst &MI,
InsnType insn, uint64_t Address,
const void *DisAsm,
const MCSubtargetInfo &STI) {
const FeatureBitset &Bits = STI.getFeatureBits();
const uint8_t *Ptr = DecodeTable;
InsnType CurFieldValue = 0;
DecodeStatus S = MCDisassembler::Success;
while (true) {
ptrdiff_t Loc = Ptr - DecodeTable;
switch (*Ptr) {
default:
errs() << Loc << ": Unexpected decode table opcode!\n";
return MCDisassembler::Fail;
case MCD::OPC_ExtractField: {
unsigned Start = *++Ptr;
unsigned Len = *++Ptr;
++Ptr;
CurFieldValue = fieldFromInstruction(insn, Start, Len);
LLVM_DEBUG(dbgs() << Loc << ": OPC_ExtractField(" << Start << ", "
<< Len << "): " << CurFieldValue << "\n");
break;
}
case MCD::OPC_FilterValue: {
// Decode the field value.
unsigned Len;
InsnType Val = decodeULEB128(++Ptr, &Len);
Ptr += Len;
// NumToSkip is a plain 24-bit integer.
unsigned NumToSkip = *Ptr++;
NumToSkip |= (*Ptr++) << 8;
NumToSkip |= (*Ptr++) << 16;
// Perform the filter operation.
if (Val != CurFieldValue)
Ptr += NumToSkip;
LLVM_DEBUG(dbgs() << Loc << ": OPC_FilterValue(" << Val << ", " << NumToSkip
<< "): " << ((Val != CurFieldValue) ? "FAIL:" : "PASS:")
<< " continuing at " << (Ptr - DecodeTable) << "\n");
break;
}
case MCD::OPC_CheckField: {
unsigned Start = *++Ptr;
unsigned Len = *++Ptr;
InsnType FieldValue = fieldFromInstruction(insn, Start, Len);
// Decode the field value.
InsnType ExpectedValue = decodeULEB128(++Ptr, &Len);
Ptr += Len;
// NumToSkip is a plain 24-bit integer.
unsigned NumToSkip = *Ptr++;
NumToSkip |= (*Ptr++) << 8;
NumToSkip |= (*Ptr++) << 16;
// If the actual and expected values don't match, skip.
if (ExpectedValue != FieldValue)
Ptr += NumToSkip;
LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckField(" << Start << ", "
<< Len << ", " << ExpectedValue << ", " << NumToSkip
<< "): FieldValue = " << FieldValue << ", ExpectedValue = "
<< ExpectedValue << ": "
<< ((ExpectedValue == FieldValue) ? "PASS\n" : "FAIL\n"));
break;
}
case MCD::OPC_CheckPredicate: {
unsigned Len;
// Decode the Predicate Index value.
unsigned PIdx = decodeULEB128(++Ptr, &Len);
Ptr += Len;
// NumToSkip is a plain 24-bit integer.
unsigned NumToSkip = *Ptr++;
NumToSkip |= (*Ptr++) << 8;
NumToSkip |= (*Ptr++) << 16;
// Check the predicate.
bool Pred;
if (!(Pred = checkDecoderPredicate(PIdx, Bits)))
Ptr += NumToSkip;
(void)Pred;
LLVM_DEBUG(dbgs() << Loc << ": OPC_CheckPredicate(" << PIdx << "): "
<< (Pred ? "PASS\n" : "FAIL\n"));
break;
}
case MCD::OPC_Decode: {
unsigned Len;
// Decode the Opcode value.
unsigned Opc = decodeULEB128(++Ptr, &Len);
Ptr += Len;
unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
Ptr += Len;
MI.clear();
MI.setOpcode(Opc);
bool DecodeComplete;
S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, DecodeComplete);
assert(DecodeComplete);
LLVM_DEBUG(dbgs() << Loc << ": OPC_Decode: opcode " << Opc
<< ", using decoder " << DecodeIdx << ": "
<< (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
return S;
}
case MCD::OPC_TryDecode: {
unsigned Len;
// Decode the Opcode value.
unsigned Opc = decodeULEB128(++Ptr, &Len);
Ptr += Len;
unsigned DecodeIdx = decodeULEB128(Ptr, &Len);
Ptr += Len;
// NumToSkip is a plain 24-bit integer.
unsigned NumToSkip = *Ptr++;
NumToSkip |= (*Ptr++) << 8;
NumToSkip |= (*Ptr++) << 16;
// Perform the decode operation.
MCInst TmpMI;
TmpMI.setOpcode(Opc);
bool DecodeComplete;
S = decodeToMCInst(S, DecodeIdx, insn, TmpMI, Address, DisAsm, DecodeComplete);
LLVM_DEBUG(dbgs() << Loc << ": OPC_TryDecode: opcode " << Opc
<< ", using decoder " << DecodeIdx << ": ");
if (DecodeComplete) {
// Decoding complete.
LLVM_DEBUG(dbgs() << (S != MCDisassembler::Fail ? "PASS" : "FAIL") << "\n");
MI = TmpMI;
return S;
} else {
assert(S == MCDisassembler::Fail);
// If the decoding was incomplete, skip.
Ptr += NumToSkip;
LLVM_DEBUG(dbgs() << "FAIL: continuing at " << (Ptr - DecodeTable) << "\n");
// Reset decode status. This also drops a SoftFail status that could be
// set before the decode attempt.
S = MCDisassembler::Success;
}
break;
}
case MCD::OPC_SoftFail: {
// Decode the mask values.
unsigned Len;
InsnType PositiveMask = decodeULEB128(++Ptr, &Len);
Ptr += Len;
InsnType NegativeMask = decodeULEB128(Ptr, &Len);
Ptr += Len;
bool Fail = (insn & PositiveMask) || (~insn & NegativeMask);
if (Fail)
S = MCDisassembler::SoftFail;
LLVM_DEBUG(dbgs() << Loc << ": OPC_SoftFail: " << (Fail ? "FAIL\n" : "PASS\n"));
break;
}
case MCD::OPC_Fail: {
LLVM_DEBUG(dbgs() << Loc << ": OPC_Fail\n");
return MCDisassembler::Fail;
}
}
}
llvm_unreachable("bogosity detected in disassembler state machine!");
}
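The “Skip to:” comments in DecoderTableCpu032 can be checked against the NumToSkip arithmetic in decodeInstruction(). The sketch below is an illustration only; readNumToSkip and skipTarget are hypothetical names, and Pos is assumed to be the table offset of the first NumToSkip byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read the 24-bit little-endian NumToSkip stored at Table[Pos..Pos+2].
inline uint32_t readNumToSkip(const uint8_t *Table, size_t Pos) {
  return uint32_t(Table[Pos]) | (uint32_t(Table[Pos + 1]) << 8) |
         (uint32_t(Table[Pos + 2]) << 16);
}

// Offset at which decoding continues when the check fails: the byte just
// past the NumToSkip field, plus NumToSkip itself.
inline size_t skipTarget(const uint8_t *Table, size_t TableSize, size_t Pos) {
  assert(Pos + 3 <= TableSize && "NumToSkip field must lie inside the table");
  (void)TableSize; // silence the unused-parameter warning when asserts are off
  return Pos + 3 + readNumToSkip(Table, Pos);
}
```

For the OPC_FilterValue entry at offset 3, the NumToSkip bytes 11, 0, 0 start at offset 5, giving 5 + 3 + 11 = 19. For the OPC_CheckField entry at offset 8, the bytes 149, 4, 0 start at offset 12, giving 12 + 3 + (149 + 4 * 256) = 1188. Both results agree with the comments in the table.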
To trace decodeInstruction(), enable “#if 1” in Cpu0Disassembler.cpp as follows and then run llvm-objdump:
lbdex/chapters/Chapter10_1/Disassembler/Cpu0Disassembler.cpp
#if 1
#undef LLVM_DEBUG
#define LLVM_DEBUG(X) X
#endif
#include "Cpu0GenDisassemblerTables.inc"
Based on the debug log, take “addiu $sp, $sp, -8”, whose opcode is 9, as an example. The table and explanation below show how decodeInstruction() handles it:
| state | result |
|---|---|
| OPC_ExtractField | CurFieldValue <- Opcode:9 |
| OPC_FilterValue | Match entries of DecodeTable == CurFieldValue |
| OPC_Decode | setOpcode(9) and decode operands by calling decodeToMCInst() |
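The three states can be sketched as a small table walker. This is a simplified illustration, not the generated decoder: the Op* values are hypothetical stand-ins for the MCD::OPC_* enumerators, values and NumToSkip are assumed to fit in one byte each (the real table uses ULEB128 and 24-bit fields), and no operands are decoded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for MCD::OPC_*; the real values come from LLVM.
enum : uint8_t { OpExtractField = 1, OpFilterValue = 2, OpDecode = 3, OpFail = 4 };

// Walk a simplified decoder table and return the id stored in the matching
// OpDecode entry, or -1 if no entry matches.
inline int matchOpcode(const uint8_t *Table, uint32_t Insn) {
  const uint8_t *Ptr = Table;
  uint32_t CurFieldValue = 0;
  while (true) {
    switch (*Ptr) {
    case OpExtractField: { // operands: Start, Len
      unsigned Start = Ptr[1], Len = Ptr[2];
      assert(Len > 0 && Len < 32 && "sketch handles fields of 1..31 bits");
      CurFieldValue = (Insn >> Start) & ((1u << Len) - 1u);
      Ptr += 3;
      break;
    }
    case OpFilterValue: { // operands: Val, NumToSkip (one byte each here)
      uint8_t Val = Ptr[1], NumToSkip = Ptr[2];
      Ptr += 3;
      if (Val != CurFieldValue)
        Ptr += NumToSkip; // mismatch: jump to the next candidate entry
      break;
    }
    case OpDecode: // operand: id of the matched instruction
      return Ptr[1];
    default: // OpFail or anything unexpected
      return -1;
    }
  }
}
```

With a table holding entries for the values 1 and 9, an instruction word whose top byte is 9 fails the first filter, passes the second, and reaches the OpDecode entry for 9. In the real table, the number stored after OPC_Decode is LLVM's internal opcode number for the instruction (for example 178 for NOP in the listing above), not the 8-bit encoding value.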
“move $fp, $sp” and “ret $lr” pass through the OPC_CheckField state before OPC_Decode because they use the R-type Cpu0 instruction format and “let shamt = 0;” is set in “class ArithLogic” of Cpu0InstrInfo.td.
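For “move $fp, $sp”, fieldFromInstruction(0x11cd0000, 0, 12) = (0x11cd0000 & 0x00000fff) = 0. This checks that the low 12 bits of the instruction (bits 11..0 counting from the least significant bit, or bits 20..31 counting from the most significant bit) are 0.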
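The field extraction used by OPC_ExtractField and OPC_CheckField amounts to a shift and a mask. The function below is a minimal sketch under that assumption; fieldFromInsn is a hypothetical name chosen so as not to be confused with LLVM's own fieldFromInstruction().

```cpp
#include <cassert>
#include <cstdint>

// Return Len bits of Insn starting at bit Start, where bit 0 is the least
// significant bit. Illustrative only; handles fields of 1..31 bits.
inline uint32_t fieldFromInsn(uint32_t Insn, unsigned Start, unsigned Len) {
  assert(Len > 0 && Len < 32 && Start + Len <= 32 && "field out of range");
  uint32_t Mask = (1u << Len) - 1u; // Len low bits set
  return (Insn >> Start) & Mask;
}
```

For 0x11cd0000, fieldFromInsn(0x11cd0000, 0, 12) yields 0, so the OPC_CheckField test passes, and fieldFromInsn(0x11cd0000, 24, 8) yields 0x11, the value that OPC_ExtractField places in CurFieldValue.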
DecodeBranch16Target() and DecodeBranch24Target() decode the immediate value, which may be positive or negative, and add it to the MCInst as an operand of immediate type. An MCInst operand can be either an immediate or a register.
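Because the branch offset can be negative, the 16-bit or 24-bit field has to be sign-extended before it becomes an immediate operand. The following is a generic sketch of that step and does not reproduce the Cpu0 functions; signExtend is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low Bits bits of Value as a two's-complement number.
// Illustrative only; supports widths of 1..31 bits.
inline int32_t signExtend(uint32_t Value, unsigned Bits) {
  assert(Bits > 0 && Bits < 32 && "sketch handles widths of 1..31 bits");
  uint32_t SignBit = 1u << (Bits - 1);
  uint32_t Field = Value & ((SignBit << 1) - 1u); // keep the low Bits bits
  // Flipping the sign bit and subtracting it maps the field to a signed value.
  return static_cast<int32_t>(Field ^ SignBit) - static_cast<int32_t>(SignBit);
}
```

A 16-bit field of 0xfff8 becomes -8, while 0x0010 stays 16. The result could then be attached to the instruction with MCOperand::createImm().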