The previous chapters covered assembly code generation only.
This chapter adds ELF object file support and verifies the generated object
files with the objdump utility. With the support LLVM provides, the Cpu0
backend can generate both big endian and little endian object files with only
a small amount of additional code.
The target registration mechanism and its structure are also introduced in
this chapter.
So far, the backend only supports translating LLVM IR into assembly code.
If you run the Chapter4_2/ build to translate IR into object code, you get the
following error message:
    [Gamma@localhost 3]$ ~/llvm/test/build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch4_1_math_math.bc -o ch4_1_math.cpu0.o
    ~/llvm/test/build/bin/llc: target does not support generation of this file type!
Chapter5_1/ supports object file generation.
It produces object files for both big endian and little endian with the
commands llc -march=cpu0 and llc -march=cpu0el, respectively.
Running these commands produces the object files as follows,
The first instruction is “addiu $sp, -56” and its corresponding encoding is
0x09ddffc8.
The opcode of addiu is 0x09 (8 bits); the $sp register number is 13 (0xd,
4 bits), which fills both 4-bit register fields of this encoding; and the
16-bit immediate is -56 (0xffc8), so the encoding is correct.
The third instruction is “st $2, 52($fp)” and its corresponding encoding
is 0x022b0034. The st opcode is 0x02, $2 is 0x2, $fp is 0xb, and the
immediate is 52 (0x0034).
In the Cpu0 instruction format, the sizes of the opcode, the register operands
and the offset (immediate value) are all multiples of 4 bits.
Because every field lines up with whole hex digits, the object format is easy
to check by eye.
In big endian, (B0, B1, B2, B3) = (09, dd, ff, c8), so objdump from B0 to B3
shows 0x09ddffc8. In little endian, (B3, B2, B1, B0) = (09, dd, ff, c8), so
objdump from B0 to B3 shows 0xc8ffdd09.
    //===-- Cpu0ELFObjectWriter.cpp - Cpu0 ELF Writer -------------------------===//
    //
    //                     The LLVM Compiler Infrastructure
    //
    // This file is distributed under the University of Illinois Open Source
    // License. See LICENSE.TXT for details.
    //
    //===----------------------------------------------------------------------===//

    #include "Cpu0Config.h"

    #include "MCTargetDesc/Cpu0BaseInfo.h"
    #include "MCTargetDesc/Cpu0FixupKinds.h"
    #include "MCTargetDesc/Cpu0MCTargetDesc.h"
    #include "llvm/MC/MCAssembler.h"
    #include "llvm/MC/MCELFObjectWriter.h"
    #include "llvm/MC/MCExpr.h"
    #include "llvm/MC/MCSection.h"
    #include "llvm/MC/MCValue.h"
    #include "llvm/Support/ErrorHandling.h"
    #include <list>

    using namespace llvm;

    namespace {
      class Cpu0ELFObjectWriter : public MCELFObjectTargetWriter {
      public:
        Cpu0ELFObjectWriter(uint8_t OSABI, bool HasRelocationAddend, bool Is64);

        ~Cpu0ELFObjectWriter() = default;

        unsigned getRelocType(MCContext &Ctx, const MCValue &Target,
                              const MCFixup &Fixup, bool IsPCRel) const override;
        bool needsRelocateWithSymbol(const MCSymbol &Sym,
                                     unsigned Type) const override;
      };
    }

    Cpu0ELFObjectWriter::Cpu0ELFObjectWriter(uint8_t OSABI,
                                             bool HasRelocationAddend, bool Is64)
      : MCELFObjectTargetWriter(/*Is64Bit_=false*/ Is64, OSABI, ELF::EM_CPU0,
                                /*HasRelocationAddend_ = false*/ HasRelocationAddend) {}

    //@GetRelocType {
    unsigned Cpu0ELFObjectWriter::getRelocType(MCContext &Ctx,
                                               const MCValue &Target,
                                               const MCFixup &Fixup,
                                               bool IsPCRel) const {
      // determine the type of the relocation
      unsigned Type = (unsigned)ELF::R_CPU0_NONE;
      unsigned Kind = (unsigned)Fixup.getKind();

      switch (Kind) {
      default:
        llvm_unreachable("invalid fixup kind!");
      case FK_Data_4:
        Type = ELF::R_CPU0_32;
        break;
      case Cpu0::fixup_Cpu0_32:
        Type = ELF::R_CPU0_32;
        break;
      case Cpu0::fixup_Cpu0_GPREL16:
        Type = ELF::R_CPU0_GPREL16;
        break;
      case Cpu0::fixup_Cpu0_GOT:
        Type = ELF::R_CPU0_GOT16;
        break;
      case Cpu0::fixup_Cpu0_HI16:
        Type = ELF::R_CPU0_HI16;
        break;
      case Cpu0::fixup_Cpu0_LO16:
        Type = ELF::R_CPU0_LO16;
        break;
      case Cpu0::fixup_Cpu0_GOT_HI16:
        Type = ELF::R_CPU0_GOT_HI16;
        break;
      case Cpu0::fixup_Cpu0_GOT_LO16:
        Type = ELF::R_CPU0_GOT_LO16;
        break;
      }

      return Type;
    }
    //@GetRelocType }

    bool
    Cpu0ELFObjectWriter::needsRelocateWithSymbol(const MCSymbol &Sym,
                                                 unsigned Type) const {
      // FIXME: This is extremely conservative. This really needs to use a
      // whitelist with a clear explanation for why each relocation needs to
      // point to the symbol, not to the section.
      switch (Type) {
      default:
        return true;

      case ELF::R_CPU0_GOT16:
      // For Cpu0 pic mode, I think it's OK to return true but I didn't confirm.
      //  llvm_unreachable("Should have been handled already");
        return true;

      // These relocations might be paired with another relocation. The pairing is
      // done by the static linker by matching the symbol. Since we only see one
      // relocation at a time, we have to force them to relocate with a symbol to
      // avoid ending up with a pair where one points to a section and another
      // points to a symbol.
      case ELF::R_CPU0_HI16:
      case ELF::R_CPU0_LO16:
      // R_CPU0_32 should be a relocation record, I don't know why Mips set it to
      // false.
      case ELF::R_CPU0_32:
        return true;

      case ELF::R_CPU0_GPREL16:
        return false;
      }
    }

    std::unique_ptr<MCObjectTargetWriter>
    llvm::createCpu0ELFObjectWriter(const Triple &TT) {
      uint8_t OSABI = MCELFObjectTargetWriter::getOSABI(TT.getOS());
      bool IsN64 = false;
      bool HasRelocationAddend = TT.isArch64Bit();
      return std::make_unique<Cpu0ELFObjectWriter>(OSABI, HasRelocationAddend,
                                                   IsN64);
    }
    //===-- Cpu0MCCodeEmitter.h - Convert Cpu0 Code to Machine Code -----------===//
    //
    //                     The LLVM Compiler Infrastructure
    //
    // This file is distributed under the University of Illinois Open Source
    // License. See LICENSE.TXT for details.
    //
    //===----------------------------------------------------------------------===//
    //
    // This file defines the Cpu0MCCodeEmitter class.
    //
    //===----------------------------------------------------------------------===//
    //

    #ifndef LLVM_LIB_TARGET_CPU0_MCTARGETDESC_CPU0MCCODEEMITTER_H
    #define LLVM_LIB_TARGET_CPU0_MCTARGETDESC_CPU0MCCODEEMITTER_H

    #include "Cpu0Config.h"

    #include "llvm/MC/MCCodeEmitter.h"
    #include "llvm/Support/DataTypes.h"

    using namespace llvm;

    namespace llvm {
    class MCContext;
    class MCExpr;
    class MCInst;
    class MCInstrInfo;
    class MCFixup;
    class MCOperand;
    class MCSubtargetInfo;
    class raw_ostream;

    class Cpu0MCCodeEmitter : public MCCodeEmitter {
      Cpu0MCCodeEmitter(const Cpu0MCCodeEmitter &) = delete;
      void operator=(const Cpu0MCCodeEmitter &) = delete;
      const MCInstrInfo &MCII;
      MCContext &Ctx;
      bool IsLittleEndian;

    public:
      Cpu0MCCodeEmitter(const MCInstrInfo &mcii, MCContext &Ctx_, bool IsLittle)
          : MCII(mcii), Ctx(Ctx_), IsLittleEndian(IsLittle) {}

      ~Cpu0MCCodeEmitter() override {}

      void EmitByte(unsigned char C, raw_ostream &OS) const;

      void EmitInstruction(uint64_t Val, unsigned Size, raw_ostream &OS) const;

      void encodeInstruction(const MCInst &MI, raw_ostream &OS,
                             SmallVectorImpl<MCFixup> &Fixups,
                             const MCSubtargetInfo &STI) const override;

      // getBinaryCodeForInstr - TableGen'erated function for getting the
      // binary encoding for an instruction.
      uint64_t getBinaryCodeForInstr(const MCInst &MI,
                                     SmallVectorImpl<MCFixup> &Fixups,
                                     const MCSubtargetInfo &STI) const;

      // getBranch16TargetOpValue - Return binary encoding of the branch
      // target operand, such as BEQ, BNE. If the machine operand
      // requires relocation, record the relocation and return zero.
      unsigned getBranch16TargetOpValue(const MCInst &MI, unsigned OpNo,
                                        SmallVectorImpl<MCFixup> &Fixups,
                                        const MCSubtargetInfo &STI) const;

      // getBranch24TargetOpValue - Return binary encoding of the branch
      // target operand, such as JMP #BB01, JEQ, JSUB. If the machine operand
      // requires relocation, record the relocation and return zero.
      unsigned getBranch24TargetOpValue(const MCInst &MI, unsigned OpNo,
                                        SmallVectorImpl<MCFixup> &Fixups,
                                        const MCSubtargetInfo &STI) const;

      // getJumpTargetOpValue - Return binary encoding of the jump
      // target operand, such as JSUB #function_addr.
      // If the machine operand requires relocation,
      // record the relocation and return zero.
      unsigned getJumpTargetOpValue(const MCInst &MI, unsigned OpNo,
                                    SmallVectorImpl<MCFixup> &Fixups,
                                    const MCSubtargetInfo &STI) const;

      // getMachineOpValue - Return binary encoding of operand. If the machine
      // operand requires relocation, record the relocation and return zero.
      unsigned getMachineOpValue(const MCInst &MI, const MCOperand &MO,
                                 SmallVectorImpl<MCFixup> &Fixups,
                                 const MCSubtargetInfo &STI) const;

      unsigned getMemEncoding(const MCInst &MI, unsigned OpNo,
                              SmallVectorImpl<MCFixup> &Fixups,
                              const MCSubtargetInfo &STI) const;

      unsigned getExprOpValue(const MCExpr *Expr, SmallVectorImpl<MCFixup> &Fixups,
                              const MCSubtargetInfo &STI) const;
    }; // class Cpu0MCCodeEmitter
    } // namespace llvm.

    #endif
The figure above shows the calling sequence of the ELF encoder functions.
The llc driver sets AsmPrinter::OutStreamer to an MCObjectStreamer when the
user runs llc -filetype=obj.
The encoder obtains the instruction operand information as shown in the figure
above. The steps are as follows,
1. Function encodeInstruction() passes MI.Opcode to getBinaryCodeForInstr().
2. getBinaryCodeForInstr() passes MI.Operand[n] to getMachineOpValue(), which
   returns the register number for that operand.
3. getBinaryCodeForInstr() returns the encoding of MI, with all register
   numbers filled in, to encodeInstruction().
The MI.Opcode is set in the instruction selection stage.
The TableGen-generated function getBinaryCodeForInstr() gets all the operand
information from the .td files written by the programmer, as the following
figure shows.
For instance, the Cpu0 backend generates “addu $v0, $at, $v1” for the IR
“%0 = add %1, %2” once LLVM has allocated registers $v0, $at and $v1 to the
operands %0, %1 and %2, respectively. The MCOperand structure in
MI.Operands[] holds the register number set by the register allocation pass,
and getMachineOpValue() can read it from there.
The getEncodingValue(Reg) call in getMachineOpValue(), shown below, gets the
encoded RegNo from a register name such as AT, V0 or V1 by using the TableGen
information from Cpu0RegisterInfo.td. My comments follow “///”.
    void InitMCRegisterInfo(...,
                            const uint16_t *RET) {
      ...
      RegEncodingTable = RET;
    }

    /// \brief Returns the encoding for RegNo
    uint16_t getEncodingValue(unsigned RegNo) const {
      assert(RegNo < NumRegs &&
             "Attempting to get encoding for invalid register number!");
      return RegEncodingTable[RegNo];
    }
The applyFixup() function of Cpu0AsmBackend.cpp fixes up the addresses in
instructions such as jeq and jsub, which implement the “address control flow
statements” and “function call statements” used in later chapters.
Whether needsRelocateWithSymbol() of Cpu0ELFObjectWriter.cpp returns true or
false for each relocation record depends on whether that relocation record is
needed to adjust an address value during linking.
If it returns true, the linker has the chance to adjust the address value with
the correct information. If it returns false, the linker does not have the
correct information to adjust this relocation record. Relocation records are
introduced in the later chapter ELF Support.
When instructions are emitted in ELF object format, encodeInstruction() of
Cpu0MCCodeEmitter.cpp is called, since it overrides the function of the same
name in the parent class MCCodeEmitter.
The “let EncoderMethod = “getMemEncoding”;” setting in Cpu0InstrInfo.td, as
above, makes LLVM call getMemEncoding() whenever an ld or st instruction is
emitted in ELF object format, since these two instructions use the mem
operand.
The other functions in Cpu0MCCodeEmitter.cpp are called by these two functions.
After encoding, the following code writes the encoded instructions to the buffer.
Now, let’s examine Cpu0MCTargetDesc.cpp.
Cpu0MCTargetDesc.cpp does the target registration as mentioned in
the previous chapter here [1], and the assembly
output has been explained here [2].
The registration functions for ELF object output are listed as follows,
Register function of elf streamer
    // Register the elf streamer.
    TargetRegistry::RegisterELFStreamer(*T, createMCStreamer);

    static MCStreamer *createMCStreamer(const Triple &TT, MCContext &Context,
                                        MCAsmBackend &MAB, raw_pwrite_stream &OS,
                                        MCCodeEmitter *Emitter, bool RelaxAll) {
      return createELFStreamer(Context, MAB, OS, Emitter, RelaxAll);
    }

    // MCELFStreamer.cpp
    MCStreamer *llvm::createELFStreamer(MCContext &Context, MCAsmBackend &MAB,
                                        raw_pwrite_stream &OS, MCCodeEmitter *CE,
                                        bool RelaxAll) {
      MCELFStreamer *S = new MCELFStreamer(Context, MAB, OS, CE);
      if (RelaxAll)
        S->getAssembler().setRelaxAll(true);
      return S;
    }
The createELFStreamer() function above takes care of the ELF object streamer.
Fig. 22 below shows the MCELFStreamer inheritance tree.
You can find many operations in that inheritance tree.
    // Register the asm target streamer.
    TargetRegistry::RegisterAsmTargetStreamer(*T, createCpu0AsmTargetStreamer);

    static MCTargetStreamer *createCpu0AsmTargetStreamer(MCStreamer &S,
                                                         formatted_raw_ostream &OS,
                                                         MCInstPrinter *InstPrint,
                                                         bool isVerboseAsm) {
      return new Cpu0TargetAsmStreamer(S, OS);
    }

    // Cpu0TargetStreamer.h
    class Cpu0TargetStreamer : public MCTargetStreamer {
    public:
      Cpu0TargetStreamer(MCStreamer &S);
    };

    // This part is for ascii assembly output
    class Cpu0TargetAsmStreamer : public Cpu0TargetStreamer {
      formatted_raw_ostream &OS;

    public:
      Cpu0TargetAsmStreamer(MCStreamer &S, formatted_raw_ostream &OS);
    };
The code above creates the MCTargetStreamer instance.
Register function of MC Code Emitter
    // Register the MC Code Emitter
    TargetRegistry::RegisterMCCodeEmitter(TheCpu0Target,
                                          createCpu0MCCodeEmitterEB);
    TargetRegistry::RegisterMCCodeEmitter(TheCpu0elTarget,
                                          createCpu0MCCodeEmitterEL);

    // Cpu0MCCodeEmitter.cpp
    MCCodeEmitter *llvm::createCpu0MCCodeEmitterEB(const MCInstrInfo &MCII,
                                                   const MCRegisterInfo &MRI,
                                                   MCContext &Ctx) {
      return new Cpu0MCCodeEmitter(MCII, Ctx, false);
    }

    MCCodeEmitter *llvm::createCpu0MCCodeEmitterEL(const MCInstrInfo &MCII,
                                                   const MCRegisterInfo &MRI,
                                                   MCContext &Ctx) {
      return new Cpu0MCCodeEmitter(MCII, Ctx, true);
    }
The code above creates two Cpu0MCCodeEmitter objects, one for
big endian and the other for little endian.
They take care of the generated object format, while RegisterELFStreamer()
reuses the ELF streamer class.
Readers may ask: “What are the actual arguments in
createCpu0MCCodeEmitterEB(const MCInstrInfo &MCII, const MCRegisterInfo &MRI,
MCContext &Ctx)?” and “When are they assigned?”
Indeed, we do not assign them at this point. We only register the createXXX()
function through a function pointer (in C terms,
TargetRegistry::RegisterXXX(TheCpu0Target, createXXX), where createXXX is a
function pointer).
LLVM keeps the function pointer to createXXX() when we call RegisterXXX()
during the target registration process, and calls createXXX() back at the
proper time, supplying the actual arguments at that point.
Register function of asm backend
    // Register the asm backend.
    TargetRegistry::RegisterMCAsmBackend(TheCpu0Target,
                                         createCpu0AsmBackendEB32);
    TargetRegistry::RegisterMCAsmBackend(TheCpu0elTarget,
                                         createCpu0AsmBackendEL32);

    // Cpu0AsmBackend.cpp
    MCAsmBackend *llvm::createCpu0AsmBackendEL32(const Target &T,
                                                 const MCRegisterInfo &MRI,
                                                 const Triple &TT, StringRef CPU) {
      return new Cpu0AsmBackend(T, TT.getOS(), /*IsLittle*/true);
    }

    MCAsmBackend *llvm::createCpu0AsmBackendEB32(const Target &T,
                                                 const MCRegisterInfo &MRI,
                                                 const Triple &TT, StringRef CPU) {
      return new Cpu0AsmBackend(T, TT.getOS(), /*IsLittle*/false);
    }

    // Cpu0AsmBackend.h
    class Cpu0AsmBackend : public MCAsmBackend {
      ...
    }
The Cpu0AsmBackend class above is the bridge from asm to obj.
Two objects take care of big endian and little endian, respectively.
The class is derived from MCAsmBackend.
Most of the code for object file generation is implemented by MCELFStreamer
and by MCAsmBackend, the parent class of Cpu0AsmBackend.