Library

The theory of Floating point implementation

A fixed-point representation can be used to implement floating-point arithmetic. Fig. 6 shows the layout, and the calculation below illustrates a multiplication.

_images/fixed-point.png

Fig. 6 Fixed point representation

Assume Sign part: 1-bit (0:+, 1:-), Integer part: 2-bit, Fraction part: 2-bit.

  • 3.0 * 0.5 = {0 11 00} * {0 00 10} = {(0 xor 0) (11 00 * 00 10) >> 2} = {0 01 10} = 1.5
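The fixed-point multiplication above can be sketched in C. This is only an illustration of the 1-2-2 bit layout assumed in this section. The names fx_mul and fx_to_double and the mask macros are hypothetical and do not come from Cpu0 or compiler-rt, and overflow beyond the 2-bit integer part is simply truncated.

```c
#include <stdint.h>

/* Hypothetical 5-bit sign-magnitude format from the text:
   bit 4 = sign, bits 3..2 = integer part, bits 1..0 = fraction part. */
#define FX_FRAC_BITS 2
#define FX_MAG_MASK  0x0Fu
#define FX_SIGN_MASK 0x10u

/* Multiply two fixed-point values: xor the signs, multiply the magnitudes,
   then shift right because the raw product has 2 * FX_FRAC_BITS fraction bits. */
uint8_t fx_mul(uint8_t a, uint8_t b) {
  uint8_t sign = (uint8_t)((a ^ b) & FX_SIGN_MASK);
  unsigned product = ((a & FX_MAG_MASK) * (b & FX_MAG_MASK)) >> FX_FRAC_BITS;
  /* Bits that do not fit into the 4 magnitude bits are dropped (no saturation). */
  return (uint8_t)(sign | (product & FX_MAG_MASK));
}

/* Convert a fixed-point value to double so the result can be inspected. */
double fx_to_double(uint8_t x) {
  double mag = (double)(x & FX_MAG_MASK) / (1 << FX_FRAC_BITS);
  return (x & FX_SIGN_MASK) ? -mag : mag;
}
```

With these definitions, fx_mul(0x0C, 0x02) returns 0x06, that is {0 01 10} = 1.5, matching the example above.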

The layout of IEEE 754 half-precision floating point is shown in Fig. 7.

_images/floating-point-half.png

Fig. 7 IEEE 754 half precision of Floating point representation [2]

Floating point can be implemented in either software or hardware.

A 16-bit a*b can be calculated by converting both a and b to the fixed-point form above in wider memory or registers, calculating in fixed point, and converting the result back to floating point, as described on this website [1].

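The following example shows multiplication with base-2 exponents.

  • Precondition: a and b are normalized IEEE half-precision (16-bit) values [2]. The exponent bias is 15: 15 -> 0, 1 -> -14, 30 -> 15, and 31 -> infinity or NaN. For simplicity, the examples below keep the leading 1 explicitly as the top bit of the 10-bit significand, so a significand of 1000000000 means 0.1 (binary); real IEEE 754 half precision leaves the leading 1 implicit.

  • Transformation for a*b:

      1. {sign-bit(a) xor sign-bit(b)} {exponent-bits(a)+exponent-bits(b)-15} {significand-bits(a)*significand-bits(b) >> 10}

      2. Normalize: shift the significand left until its top bit is 1, decrementing the exponent for each shift.

  • ex.

    • a = 0.01 (binary) = {0 01110 1000000000}; b = 1.1 (binary) = {0 10000 1100000000}

      1. a*b = {0 xor 0} {01110+10000-01111=01111} {1000000000*1100000000 >> 10 = 0110000000}

      2. Normalize: {0 01111 0110000000} -> {0 01110 1100000000} = 0.011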
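The multiplication steps above can be sketched in C as follows. This is a minimal sketch under the simplified convention of the example, where the 10-bit significand stores the leading 1 explicitly (binary 0.1xxxxxxxxx). The type half_fields and the function hf_mul are hypothetical names, and rounding, exponent overflow/underflow, infinity and NaN are deliberately ignored.

```c
/* Unpacked fields of a 16-bit value in the simplified convention used in
   the example: the significand keeps its leading 1 as the top bit. */
typedef struct {
  unsigned sign;     /* 1 bit */
  unsigned exponent; /* 5 bits, bias 15 */
  unsigned sig;      /* 10 bits, 0x200..0x3FF when normalized */
} half_fields;

#define HF_BIAS     15u
#define HF_SIG_BITS 10u
#define HF_TOP_BIT  (1u << (HF_SIG_BITS - 1))

half_fields hf_mul(half_fields a, half_fields b) {
  half_fields r;
  r.sign = a.sign ^ b.sign;
  r.exponent = a.exponent + b.exponent - HF_BIAS;
  /* The 20-bit product is truncated back to 10 fraction bits. */
  r.sig = (a.sig * b.sig) >> HF_SIG_BITS;
  /* Normalize: shift left until the top bit is set, adjusting the exponent. */
  while (r.sig != 0 && (r.sig & HF_TOP_BIT) == 0) {
    r.sig <<= 1;
    r.exponent -= 1;
  }
  return r;
}
```

For a = {0, 14, 0x200} and b = {0, 16, 0x300}, hf_mul returns {0, 14, 0x300}, which is {0 01110 1100000000} = 0.011 as in the example.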

The following is for the division operation.

  • Transformation for a/b:

      1. {sign-bit(a) xor sign-bit(b)} {exponent-bits(a)-exponent-bits(b)+15} {(significand-bits(a) << 10) / significand-bits(b)}

      2. Normalize: if the quotient needs 11 bits, shift it right by one bit and increment the exponent.

  • ex.

    • a = 0.01 (binary) = {0 01110 1000000000}; b = 1 (binary) = {0 10000 1000000000}

      1. a/b = {0 xor 0} {01110-10000+01111=01101} {(1000000000 << 10) / 1000000000 = 10000000000}

      2. Normalize: {0 01101 10000000000} -> {0 01110 1000000000} = 0.01
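Division can be sketched in the same style. The sketch below assumes the same simplified convention (leading 1 stored as the top bit of the 10-bit significand), pre-shifts the dividend by 10 bits so that the quotient keeps 10 fraction bits, and shifts right once when the quotient reaches 1.0. The names half_fields and hf_div are hypothetical, and a zero divisor, rounding, infinity and NaN are not handled.

```c
/* Unpacked fields of a 16-bit value in the simplified convention used in
   the example: the significand keeps its leading 1 as the top bit. */
typedef struct {
  unsigned sign;     /* 1 bit */
  unsigned exponent; /* 5 bits, bias 15 */
  unsigned sig;      /* 10 bits, 0x200..0x3FF when normalized */
} half_fields;

#define HF_BIAS     15u
#define HF_SIG_BITS 10u

/* Precondition: a and b are normalized and b.sig != 0. */
half_fields hf_div(half_fields a, half_fields b) {
  half_fields r;
  r.sign = a.sign ^ b.sign;
  r.exponent = a.exponent - b.exponent + HF_BIAS;
  /* Pre-shift the dividend so the integer quotient has 10 fraction bits. */
  r.sig = (a.sig << HF_SIG_BITS) / b.sig;
  /* Both significands are in [0.5, 1), so the quotient is in (0.5, 2).
     If it is >= 1.0 (bit 10 set), shift right once and bump the exponent. */
  if (r.sig >> HF_SIG_BITS) {
    r.sig >>= 1;
    r.exponent += 1;
  }
  return r;
}
```

For a = {0, 14, 0x200} (0.01) and b = {0, 16, 0x200} (1), hf_div returns {0, 14, 0x200}, that is {0 01110 1000000000} = 0.01, as expected for a division by 1.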

The IEEE 754 floating-point standard also defines NaN (Not a Number), produced for example by 0/0, and \(\infty\), as shown in Fig. 8.

_images/exp-enc.png

Fig. 8 Encoding of exponent for IEEE 754 half precision [2]
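The exponent encoding can be made concrete with a small classifier. The sketch below follows the standard IEEE 754 half-precision rules (exponent field 0 means zero or subnormal, exponent field 31 means infinity or NaN); the enum half_class and the function classify_half are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef enum {
  HALF_ZERO,
  HALF_SUBNORMAL,
  HALF_NORMAL,
  HALF_INFINITY,
  HALF_NAN
} half_class;

/* Classify a raw IEEE 754 half-precision bit pattern:
   bit 15 = sign, bits 14..10 = exponent, bits 9..0 = fraction. */
half_class classify_half(uint16_t bits) {
  unsigned exponent = (bits >> 10) & 0x1Fu;
  unsigned fraction = bits & 0x3FFu;
  if (exponent == 0)
    return fraction == 0 ? HALF_ZERO : HALF_SUBNORMAL;
  if (exponent == 31)
    return fraction == 0 ? HALF_INFINITY : HALF_NAN;
  return HALF_NORMAL;
}
```

For example, 0x7C00 is classified as infinity, 0x7E00 as NaN and 0x3C00 (1.0) as a normal number.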

Since normalization is the critical code in floating-point operations, the Cpu0 hardware provides the clz and clo instructions to speed it up.

Compiler-rt implements floating-point multiplication in the same way as described above, additionally handling NaN and \(\infty\), and uses clz/clo to speed up normalization, as follows.

~/llvm/debug/compiler-rt/lib/builtins/fp_lib.h

#if defined SINGLE_PRECISION
static __inline int rep_clz(rep_t a) { return __builtin_clz(a); }
...
#endif
...
static __inline rep_t toRep(fp_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.f = x};
  return rep.i;
}
...
static __inline int normalize(rep_t *significand) {
  const int shift = rep_clz(*significand) - rep_clz(implicitBit);
  *significand <<= shift;
  return 1 - shift;
}

~/llvm/debug/compiler-rt/lib/builtins/fp_mul_impl.inc

#include "fp_lib.h"
...
static __inline fp_t __mulXf3__(fp_t a, fp_t b) {
  const unsigned int aExponent = toRep(a) >> significandBits & maxExponent;
  const unsigned int bExponent = toRep(b) >> significandBits & maxExponent;
  ...
  int productExponent = aExponent + bExponent - exponentBias + scale;
  ...
    productHi |= (rep_t)productExponent << significandBits;
  ...
  return fromRep(productHi);
}
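To see what normalize() in fp_lib.h does, the following standalone sketch repeats its logic for single precision. It assumes rep_t is a 32-bit unsigned integer with 23 significand bits, as in the SINGLE_PRECISION case, and replaces __builtin_clz with a portable loop so that it compiles anywhere. The names clz32 and normalize_sketch are hypothetical and are not part of compiler-rt.

```c
#include <stdint.h>

typedef uint32_t rep_t;
#define SIGNIFICAND_BITS 23
#define IMPLICIT_BIT ((rep_t)1 << SIGNIFICAND_BITS)

/* Portable count of leading zero bits; a clz instruction does this in one step. */
static int clz32(rep_t x) {
  int n = 0;
  while (n < 32 && (x & 0x80000000u) == 0) {
    x <<= 1;
    n++;
  }
  return n;
}

/* Shift a non-zero denormal significand left until its top set bit sits at
   the implicit-bit position; return the exponent that corresponds to it. */
int normalize_sketch(rep_t *significand) {
  const int shift = clz32(*significand) - clz32(IMPLICIT_BIT);
  *significand <<= shift;
  return 1 - shift;
}
```

For the smallest denormal significand 1, the shift is 31 - 8 = 23, so the significand becomes 0x800000 and the returned exponent is -22. Without a clz instruction the loop above may run up to 31 iterations, which is why hardware support helps here.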

The dependence for Cpu0 based on Compiler-rt’s builtin

Since Cpu0 has no hardware floating-point instructions, it needs a software floating-point library to perform floating-point operations. The LLVM compiler-rt project provides a software floating-point library (Fig. 9), so I chose it as the implementation.

Since compiler-rt assumes a Unix/Linux rootfs structure, we fill the gap by adding a few empty include files in exlbt/include.

// dot -Tpng lib.gv -o lib.png
digraph G {
  rankdir=LR;

  node [shape="",style=filled,fillcolor=lightyellow]; lib [label="lib (libm/soft-float/\nscanf/printf)"];
  node [shape="",style=solid,color=black];
  "User program" -> "clang/llc" [ label = "c/c++" ];
  lib -> "clang/llc" [ label = "c" ];
  "clang/llc" -> lld [ label = "obj" ];
}

Fig. 9 compiler-rt/lib/builtins’ software float library

The dependences of compiler-rt on libm are shown in Fig. 10.

// dot -Tpng compiler-rt-dep-short.gv -o compiler-rt-dep-short.png
digraph G {
  rankdir=LR;

  compound=true;
  node [shape=record];

  subgraph cluster_compiler_rt {
    label = "compiler-rt";
    utb [label="test/builtins/Unit"];
    subgraph cluster_builtins {
      label = "lib/builtins";
      builtins [label="<fdt> float and double types | <ct> complex type"];
    }
  }

  node [label = "sanitizer_printf(%lld)"]; sanitizer_printf;
  node [label = "Cpu0 backend of llvm"]; cpu0;

  subgraph cluster_libm {
    label = "libm";
    libm [label="<c> common | <ma> math"];
  }

  builtins:ct -> libm:c;
  builtins:ct:se -> libm:ma;
  builtins:fdt -> cpu0;
  utb -> sanitizer_printf;
}

Fig. 10 Dependences for compiler-rt on libm

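Table 2 lldb dependences

=========  =================================
functions  depend on
=========  =================================
scanf      newlib/libc
printf     sanitizer_printf.c of compiler-rt
=========  =================================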

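Table 3 sanitizer_printf.c of compiler-rt dependences

==================  =======================
functions           depend on
==================  =======================
sanitizer_printf.c  builtins of compiler-rt
==================  =======================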

C Library (Newlib)

Since the complex type of compiler-rt depends on libm, I port Newlib in this section.

Newlib is a C library for bare-metal platforms. It contains two libraries, libc and libm. Libc supports I/O, file and string functions, while libm provides mathematical functions. The Newlib website is here [3] and newlib/libm is here [4]. Since compiler-rt/builtins, covered in the next section, depends on libm, please run the following bash script to install and build Newlib for Cpu0.

lbt/exlbt/newlib-cpu0.sh

#!/usr/bin/env bash

# change this dir for newlib-cygwin
NEWLIB_PARENT_DIR=$HOME/git

NEWLIB_DIR=$NEWLIB_PARENT_DIR/newlib-cygwin
CURR_DIR=`pwd`
CC=$HOME/llvm/test/build/bin/clang
CFLAGS="-target cpu0el-unknown-linux-gnu -static -fintegrated-as -Wno-error=implicit-function-declaration"
AS="$HOME/llvm/test/build/bin/clang -static -fintegrated-as -c"
AR="$HOME/llvm/test/build/bin/llvm-ar"
RANLIB="$HOME/llvm/test/build/bin/llvm-ranlib"
READELF="$HOME/llvm/test/build/bin/llvm-readelf"

install_newlib() {
  pushd $NEWLIB_PARENT_DIR
  git clone git://sourceware.org/git/newlib-cygwin.git
  cd newlib-cygwin
  git checkout dcb25665be227fb5a05497b7178a3d5df88050ec
  cp $CURR_DIR/newlib.patch .
  git apply newlib.patch
  cp -rf $CURR_DIR/newlib-cygwin/newlib/libc/machine/cpu0 newlib/libc/machine/. 
  cp -rf $CURR_DIR/newlib-cygwin/libgloss/cpu0 libgloss/. 
  popd
}

build_cpu0() {
  rm -rf build-$CPU-$ENDIAN
  mkdir build-$CPU-$ENDIAN
  cd build-$CPU-$ENDIAN
  CFLAGS="-target cpu0$ENDIAN-unknown-linux-gnu -mcpu=$CPU -static -fintegrated-as -Wno-error=implicit-function-declaration"
  CC=$CC CFLAGS=$CFLAGS AS=$AS AR=$AR RANLIB=$RANLIB READELF=$READELF ../newlib/configure --host=cpu0
  make
  cd ..
}

build_newlib() {
  pushd $NEWLIB_DIR
  CPU=cpu032I
  ENDIAN=eb
  build_cpu0;
  CPU=cpu032I
  ENDIAN=el
  build_cpu0;
  CPU=cpu032II
  ENDIAN=eb
  build_cpu0;
  CPU=cpu032II
  ENDIAN=el
  build_cpu0;
  popd
}

install_newlib;
build_newlib;

Note

To add the Cpu0 backend to Newlib, the following changes are made (see lbt/exlbt/newlib.patch):

  • lbt/exlbt/newlib-cygwin/newlib/libc/machine/cpu0/setjmp.S is added;

  • newlib-cygwin/config.sub, newlib-cygwin/newlib/configure.host, newlib-cygwin/newlib/libc/include/machine/ieeefp.h, newlib-cygwin/newlib/libc/include/sys/unistd.h and newlib-cygwin/newlib/libc/machine/configure are modified for adding cpu0.

lbt/exlbt/newlib.patch

diff --git a/config.sub b/config.sub
index 63c1f1c8b..575e8d9d7 100755
--- a/config.sub
+++ b/config.sub
@@ -1177,6 +1177,7 @@ case $cpu-$vendor in
 			| d10v | d30v | dlx | dsp16xx \
 			| e2k | elxsi | epiphany \
 			| f30[01] | f700 | fido | fr30 | frv | ft32 | fx80 \
+                        | cpu0 \
 			| h8300 | h8500 \
 			| hppa | hppa1.[01] | hppa2.0 | hppa2.0[nw] | hppa64 \
 			| hexagon \
diff --git a/newlib/configure.host b/newlib/configure.host
index ca6b46f03..7bbf46f25 100644
--- a/newlib/configure.host
+++ b/newlib/configure.host
@@ -176,6 +176,10 @@ case "${host_cpu}" in
   fr30)
 	machine_dir=fr30
 	;;
+  cpu0)
+	machine_dir=cpu0
+	newlib_cflags="${newlib_cflags} -DCOMPACT_CTYPE"
+	;;
   frv)
 	machine_dir=frv
         ;;
@@ -751,6 +755,9 @@ newlib_cflags="${newlib_cflags} -DCLOCK_PROVIDED -DMALLOC_PROVIDED -DEXIT_PROVID
   fr30-*-*)
 	syscall_dir=syscalls
 	;;
+  cpu0-*)
+	syscall_dir=syscalls
+	;;
   frv-*-*)
         syscall_dir=syscalls
 	default_newlib_io_long_long="yes"
diff --git a/newlib/libc/include/machine/ieeefp.h b/newlib/libc/include/machine/ieeefp.h
index 3c1f41e03..1e79a6b26 100644
--- a/newlib/libc/include/machine/ieeefp.h
+++ b/newlib/libc/include/machine/ieeefp.h
@@ -249,6 +249,16 @@
 #define __IEEE_BIG_ENDIAN
 #endif
 
+// pre-defined compiler macro (from llc -march=cpu0${ENDIAN} or 
+// clang -target cpu0${ENDIAN}-unknown-linux-gnu 
+// http://beefchunk.com/documentation/lang/c/pre-defined-c/prearch.html 
+#ifdef __CPU0EL__
+#define __IEEE_LITTLE_ENDIAN
+#endif
+#ifdef __CPU0EB__
+#define __IEEE_BIG_ENDIAN
+#endif
+
 #ifdef __MMIX__
 #define __IEEE_BIG_ENDIAN
 #endif
@@ -507,4 +517,3 @@
 
 #endif /* not __IEEE_LITTLE_ENDIAN */
 #endif /* not __IEEE_BIG_ENDIAN */
-
diff --git a/newlib/libc/include/sys/unistd.h b/newlib/libc/include/sys/unistd.h
index 3cc313080..605929173 100644
--- a/newlib/libc/include/sys/unistd.h
+++ b/newlib/libc/include/sys/unistd.h
@@ -50,7 +50,7 @@ int     dup3 (int __fildes, int __fildes2, int flags);
 int	eaccess (const char *__path, int __mode);
 #endif
 #if __XSI_VISIBLE
-void	encrypt (char *__block, int __edflag);
+void	encrypt (char *__libc_block, int __edflag);
 #endif
 #if __BSD_VISIBLE || (__XSI_VISIBLE && __XSI_VISIBLE < 500)
 void	endusershell (void);
diff --git a/newlib/libc/machine/configure b/newlib/libc/machine/configure
index 62064cdfd..5ef5eec08 100755
--- a/newlib/libc/machine/configure
+++ b/newlib/libc/machine/configure
@@ -823,6 +823,7 @@ csky
 d10v
 d30v
 epiphany
+cpu0
 fr30
 frv
 ft32
@@ -12007,6 +12008,8 @@ subdirs="$subdirs a29k"
 	d30v) subdirs="$subdirs d30v"
  ;;
 	epiphany) subdirs="$subdirs epiphany"
+ ;;
+	cpu0) subdirs="$subdirs cpu0"
  ;;
 	fr30) subdirs="$subdirs fr30"
  ;;

lbt/exlbt/newlib-cygwin/newlib/libc/machine/cpu0/setjmp.S

# setjmp/longjmp for cpu0.  The jmpbuf looks like this:
#	
# Register	jmpbuf offset
# $9		0x00
# $10		0x04
# $11		0x08
# $12		0x0c
# $13		0x10
# $14		0x14
# $15		0x18
	
.macro save reg
	st	\reg,@r4
	add	#4,r4
.endm
	
.macro restore reg
	ld	@r4,\reg
	add	#4,r4
.endm


	.text
	.global	setjmp
setjmp:
	st $9,  0($a0)
	st $10, 4($a0)
	st $11, 8($a0)
	st $12, 12($a0)
	st $13, 16($a0)
	st $14, 20($a0)
	st $15, 24($a0)
# Return 0 to caller in $2 (the return-value register used by longjmp below).
	addiu $2, $zero, 0x0
	ret $lr

	.global	longjmp
longjmp:
	ld $9,  0($a0)
	ld $10, 4($a0)
	ld $11, 8($a0)
	ld $12, 12($a0)
	ld $13, 16($a0)
	ld $14, 20($a0)
	ld $15, 24($a0)

# If caller attempted to return 0, return 1 instead.
        cmp     $sw, $5,$0
        jne     $sw, $BB1
        addiu   $5,$0,1
$BB1:
        addu    $2,$0,$5
        ret	$lr
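The last part of longjmp above implements a rule from the C standard: if longjmp is called with the value 0, setjmp must appear to return 1 instead. The following host-side sketch shows that behaviour using the standard <setjmp.h> interface only; the function name longjmp_zero_becomes_one is hypothetical and the code does not depend on the Cpu0 port.

```c
#include <setjmp.h>

static jmp_buf env;

/* Returns 1 if setjmp() reports 1 after longjmp(env, 0), 0 if it reported 0
   a second time (which the C standard forbids), and -1 for any other value. */
int longjmp_zero_becomes_one(void) {
  static volatile int jumped;
  jumped = 0;
  switch (setjmp(env)) {
  case 0:
    if (!jumped) {
      jumped = 1;
      longjmp(env, 0); /* ask for 0; setjmp must return 1 instead */
    }
    return 0;
  case 1:
    return 1;
  default:
    return -1;
  }
}
```

On a conforming C library longjmp_zero_becomes_one() returns 1, which is the behaviour the cmp/jne/addiu sequence provides for Cpu0.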
cschen@cschendeiMac exlbt % bash newlib-cpu0.sh

libm.a depends on the errno variable of libc, which is declared in sys/errno.h.

  • libgloss is the BSP (board support package) [5]

Compiler-rt’s builtins

Compiler-rt is a project that implements runtime libraries [6]. Compiler-rt/lib/builtins provides functions for basic operations such as +, -, *, /, … on the float and double types, and for conversions between float and integer or other types wider than 32 bits, such as long long. The compiler-rt/lib/builtins/README.txt [7] includes the dependent functions that the whole of builtins calls. These dependent functions are a small part of libm and are listed in compiler-rt/lib/builtins/int_math.h [8].

~/git/newlib-cygwin/build-cpu032I-eb/Makefile

MATHDIR = math

# The newlib hardware floating-point routines have been disabled due to
# inaccuracy.  If you wish to work on them, you will need to edit the
# configure.in file to re-enable the configuration option.  By default,
# the NEWLIB_HW_FP variable will always be false.
#MATHDIR = mathfp

As the Makefile above shows, Newlib uses libm/math. The dependences of the compiler-rt builtin functions on libm are shown in Fig. 11.

// dot -Tpng compiler-rt-dep.gv -o compiler-rt-dep.png
digraph G {
  rankdir=LR;

  compound=true;
  node [shape=record];

  subgraph cluster_compiler_rt {
    label = "compiler-rt";
    utb [label="test/builtins/Unit"];
    subgraph cluster_builtins {
      label = "lib/builtins";
      builtins [label="<fdt> float and double types | <ct> complex type"];
    }
  }

  node [label = "sanitizer_printf(%lld)"]; sanitizer_printf;
  node [label = "Cpu0 backend of llvm"]; cpu0;

  subgraph cluster_libm {
    label = "libm";
    libm [label="<c> common | <ma> math"];
  }

  builtins:ct -> libm:c [label = "the __builtin functions of isinf, isnan, fabsl, \n fmax, fmaxf, fmaxl, log, logf, logl, scalbn, scalbnf, \n scalbnl, copysign, copysignf, copysignl, fabsl" ];
  builtins:ct:se -> libm:ma [label = "the __builtin functions of fabs, fabsf" ];
  builtins:fdt -> cpu0 [label = "__builtin_clz(), __builtin_clo() and abort()" ];
  utb -> sanitizer_printf [label = "sanitizer_printf.cpp and sanitizer_internal_defs.h \n of compiler-rt/test/builtins/Unit" ];
}

Fig. 11 Dependences for builtin functions of compiler-rt on libm

In this section, I copy the test cases for verifying software floating point from compiler-rt/test/builtins/Unit to compiler-rt-test/builtins/Unit/.

Since lbt/exlbt/input/printf-stdarg.c does not support %lld (long long integer, 64-bit), and the test cases in compiler-rt/test/builtins/Unit need it to verify their software floating-point results, I ported sanitizer_printf.cpp and sanitizer_internal_defs.h from compiler-rt/lib/sanitizer_common to lbt/exlbt/input.

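Table 4 compiler-rt builtins dependences on newlib/libm (open source libc for bare metal)

=========  ==================================  ================================
function   file                                directory of libm
=========  ==================================  ================================
abort      lbt/exlbt/compiler-rt/cpu0/abort.c
isinf      s_isinf.c                           newlib-cygwin/newlib/libm/common
isnan      s_isnan.c                           newlib-cygwin/newlib/libm/common
fabsl      fabsl.c                             newlib-cygwin/newlib/libm/common
fmax       s_fmax.c                            newlib-cygwin/newlib/libm/common
fmaxf      sf_fmax.c                           newlib-cygwin/newlib/libm/common
fmaxl      fmaxl.c                             newlib-cygwin/newlib/libm/common
log        log.c                               newlib-cygwin/newlib/libm/common
logf       sf_log.c                            newlib-cygwin/newlib/libm/common
logl       logl.c                              newlib-cygwin/newlib/libm/common
scalbn     s_scalbn.c                          newlib-cygwin/newlib/libm/common
scalbnf    sf_scalbn.c                         newlib-cygwin/newlib/libm/common
scalbnl    scalblnl.c                          newlib-cygwin/newlib/libm/common
copysign   s_copysign.c                        newlib-cygwin/newlib/libm/common
copysignf  sf_copysign.c                       newlib-cygwin/newlib/libm/common
copysignl  copysignl.c                         newlib-cygwin/newlib/libm/common
fabsl      fabsl.c                             newlib-cygwin/newlib/libm/common
fabs       s_fabs.c                            newlib-cygwin/newlib/libm/math
fabsf      sf_fabs.c                           newlib-cygwin/newlib/libm/math
=========  ==================================  ================================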

  • Libm does not depend on any other library.

  • Only the complex type in compiler-rt/lib/builtins needs the functions above; the others (float and double) depend only on __builtin_clz(), __builtin_clo() and abort(). I have ported abort() in lbt/exlbt/compiler-rt/cpu0/abort.c.

  • All test cases in compiler-rt/test/builtins/Unit depend on printf(%lld or %llX, …), which I ported from compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp to lbt/exlbt/input/sanitizer_printf.cpp.

  • The functions that the complex type depends on have been ported from newlib/libm.

  • Apart from builtins, the other three components of compiler-rt (sanitizer runtimes, profile and BlocksRuntime) are not needed for my embedded Cpu0.

The libgcc integer and soft-float libraries [9] [10] [11] provide functions equivalent to compiler-rt’s builtins.

The dependences between files in compiler-rt/lib/builtins are shown in Table 5.

Table 5 dependence between files for compiler-rt/lib/builtins

=========  =========
functions  depend on
=========  =========
\*.c       \*.inc
\*.inc     \*.h
=========  =========

Though the ‘rt’ stands for runtime libraries, most functions in the builtins library are written in target-independent C and can be compiled and statically linked into the target. When you compile the following C code, llc generates jsub __addsf3 to call the compiler-rt float function for Cpu0. This is because Cpu0 has no hardware floating-point instructions, so the Cpu0 backend does not handle the DAG node for floating-point add. As a result, LLVM lowers the floating-point add to a call to the function __addsf3.

lbt/exlbt/input/ch_call_compilerrt_func.c

// clang -target mips-unknown-linux-gnu -S ch_call_compilerrt_func.c -emit-llvm -o ch_call_compilerrt_func.ll
// ~/llvm/test/build/bin/llc -march=cpu0 -mcpu=cpu032II -relocation-model=static -filetype=asm ch_call_compilerrt_func.ll -o -

/// start
float ch_call_compilerrt_func()
{
  float a = 3.1;
  float b = 2.2;
  float c = a + b;

  return c;
}

chungshu@ChungShudeMacBook-Air input % clang -target mips-unknown-linux-gnu -S
ch_call_compilerrt_func.c -emit-llvm
chungshu@ChungShudeMacBook-Air input % cat ch_call_compilerrt_func.ll
  ...
  %4 = load float, float* %1, align 4
  %5 = load float, float* %2, align 4
  %6 = fadd float %4, %5

chungshu@ChungShudeMacBook-Air input % ~/llvm/test/build/bin/llc -march=cpu0
-mcpu=cpu032II -relocation-model=static -filetype=asm ch_call_compilerrt_func.ll -o -
      ...
      ld      $4, 20($fp)
      ld      $5, 16($fp)
      jsub    __addsf3
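A soft-float routine such as __addsf3 works on the raw 32-bit patterns of its two arguments, which are the values loaded into $4 and $5 in the listing above. The sketch below, meant to run on a host machine, shows how such a pattern splits into sign, exponent and fraction. It assumes float is an IEEE 754 single-precision type; the struct float_fields and the function unpack_float are hypothetical names, not compiler-rt APIs.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
  unsigned sign;     /* 1 bit */
  unsigned exponent; /* 8 bits, bias 127 */
  uint32_t fraction; /* 23 bits, leading 1 is implicit */
} float_fields;

/* Split a float into its fields, assuming IEEE 754 single precision. */
float_fields unpack_float(float x) {
  uint32_t bits;
  float_fields f;
  memcpy(&bits, &x, sizeof bits); /* same idea as toRep() in fp_lib.h */
  f.sign = bits >> 31;
  f.exponent = (bits >> 23) & 0xFFu;
  f.fraction = bits & 0x7FFFFFu;
  return f;
}
```

For a = 3.1f this gives sign 0, exponent 128 and fraction 0x466666, and for b = 2.2f it gives exponent 128 and fraction 0x0CCCCD. Since both exponents are equal, an adder can add the two significands directly before normalizing the sum.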

For some bare-metal or embedded applications, the C code does not need the file and high-level I/O functions in libc. Libm provides many functions that support software floating point beyond the basic operations [12]. Libc provides file and high-level I/O functions and basic float functions [13].

Cpu0 uses compiler-rt/lib/builtins and compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp at this point. The compiler-rt/lib/builtins library is a target-independent C implementation of a software floating-point library. To support it, Cpu0 only needs to implement compiler-rt-12.x/cpu0/abort.c at this point.

Note

Why are these libm functions called builtins in compiler-rt/lib/builtins?

Though these compiler-rt builtin functions are written in C, a CPU can provide floating-point instructions or higher-level instructions so that these libm function calls are compiled into specific hardware instructions to speed them up.

To speed up these libm functions, many CPUs provide floating-point instructions for them. In the implementation, clang compiles these floating-point operations in C into LLVM IR, then a backend such as Mips compiles them into its hardware instructions. For example:

  • float a, b, c; a=b*c; -> (clang) -> %add = fmul float %0, %1 [15]

Mips backend compiles fmul into HW instructions as follows,

  • %add = fmul float %0, %1 -> (llvm-mips) -> mul.s [15] [14]

Cpu0 backend compiles fmul into a call to the compiler-rt builtin function __mulsf3 as follows,

  • %add = fmul float %0, %1 -> (llvm-cpu0) -> jsub __mulsf3 [15]

For high-level math functions, clang compiles these floating-point operations in C into LLVM intrinsic functions, then the LLVM backends of these CPUs compile them into their hardware instructions. For example, clang compiles pow() into @llvm.pow.f32 as follows,

  • %pow = call float @llvm.pow.f32(float %x, float %y) [16]

AMDGPU compiles @llvm.pow.f32 into a few instructions as follows:

  • %pow = call float @llvm.pow.f32(float %x, float %y) -> (llvm-AMDGPU) -> … + v_exp_f32_e32 v0, v0 + … [16]

Mips compiles @llvm.pow.f32 into a call to the libm function powf as follows:

  • %pow = call float @llvm.pow.f32(float %x, float %y) -> (llvm-mips) -> jal powf [16]

Clang treats these libm functions as builtins and compiles them into LLVM IR or intrinsics; each backend can then choose to compile them into specific instructions or into calls to the builtin functions in libm. The following is a comment from Clang’s source [17].

RValue CodeGenFunction::EmitBuiltinExpr(...)
  ...
  // There are LLVM math intrinsics/instructions corresponding to math library
  // functions except the LLVM op will never set errno while the math library
  // might. Also, math builtins have the same semantics as their math library
  // twins. Thus, we can transform math library and builtin calls to their
  // LLVM counterparts if the call is marked 'const' (known to never set errno).

Verification

The following sanitizer_printf.cpp, extended from compiler-rt, supports printf(“%lld”). Its implementation calls some library functions in compiler-rt/lib/builtins.

exlbt/include/math.h

#ifndef _MATH_H_
#define	_MATH_H_

//#ifdef HAS_COMPLEX
 #ifndef HUGE_VALF
  #define HUGE_VALF (1.0e999999999F)
 #endif

 #if !defined(INFINITY)
  #define INFINITY (HUGE_VALF)
 #endif

 #if !defined(NAN)
  #define NAN (0.0F/0.0F)
 #endif

 float cabsf(float complex) ;
//#endif
#endif

exlbt/include/stdio.h

#ifndef _STDIO_H_
#define	_STDIO_H_

#define stdin   0
#define stdout  1
#define stderr  2

#define size_t unsigned int

#endif

exlbt/include/stdlib.h

#ifndef _STDLIB_H_
#define	_STDLIB_H_

#ifdef __cplusplus
extern "C" {
#endif

void abort();

#ifdef __cplusplus
}
#endif

#endif

exlbt/include/string.h

#ifndef _STRING_H_
#define	_STRING_H_


#endif

exlbt/compiler-rt/cpu0/abort.c

void abort() {
  // cpu0.v: ABORT at mem 0x04
  asm("addiu $lr, $ZERO, 4");
  asm("ret $lr"); 
}

exlbt/input/sanitizer_internal_defs.h

//===-- sanitizer_internal_defs.h -------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file is shared between AddressSanitizer and ThreadSanitizer.
// It contains macro used in run-time libraries code.
//===----------------------------------------------------------------------===//
#ifndef SANITIZER_DEFS_H
#define SANITIZER_DEFS_H

// For portability reasons we do not include stddef.h, stdint.h or any other
// system header, but we do need some basic types that are not defined
// in a portable way by the language itself.
namespace __sanitizer {

#if defined(_WIN64)
// 64-bit Windows uses LLP64 data model.
typedef unsigned long long uptr;
typedef signed long long sptr;
#else
typedef unsigned long uptr;
typedef signed long sptr;
#endif  // defined(_WIN64)
#if defined(__x86_64__)
// Since x32 uses ILP32 data model in 64-bit hardware mode, we must use
// 64-bit pointer to unwind stack frame.
typedef unsigned long long uhwptr;
#else
typedef uptr uhwptr;
#endif

typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned int u32;
typedef unsigned long long u64;
typedef signed char s8;
typedef signed short s16;
typedef signed int s32;
typedef signed long long s64;

// Check macro
#define RAW_CHECK_MSG(expr, msg) 

#define RAW_CHECK(expr) RAW_CHECK_MSG(expr, #expr)

#define CHECK_IMPL(c1, op, c2)

#define CHECK(a)       CHECK_IMPL((a), !=, 0)
#define CHECK_EQ(a, b) CHECK_IMPL((a), ==, (b))
#define CHECK_NE(a, b) CHECK_IMPL((a), !=, (b))
#define CHECK_LT(a, b) CHECK_IMPL((a), <,  (b))
#define CHECK_LE(a, b) CHECK_IMPL((a), <=, (b))
#define CHECK_GT(a, b) CHECK_IMPL((a), >,  (b))
#define CHECK_GE(a, b) CHECK_IMPL((a), >=, (b))

}  // namespace __sanitizer

#endif

exlbt/input/sanitizer_printf.cpp

//===-- sanitizer_printf.cpp ----------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file is shared between AddressSanitizer and ThreadSanitizer.
//
// Internal printf function, used inside run-time libraries.
// We can't use libc printf because we intercept some of the functions used
// inside it.
//===----------------------------------------------------------------------===//

#include "sanitizer_internal_defs.h"

#include <stdio.h>
#include <stdarg.h>

#include "debug.h"

extern "C" int putchar(int c);

extern void* internal_memset(void* b, int c, size_t len);

#if SANITIZER_WINDOWS && defined(_MSC_VER) && _MSC_VER < 1800 &&               \
      !defined(va_copy)
# define va_copy(dst, src) ((dst) = (src))
#endif

namespace __sanitizer {

static int strlen(const char* s) {
  int len = 0;
  for (const char* p = s; *p != '\0'; p++) {
    len++;
  }
  return len;
}

static int AppendChar(char **buff, const char *buff_end, char c) {
  if (*buff < buff_end) {
    **buff = c;
    (*buff)++;
  }
  return 1;
}

// Appends number in a given base to buffer. If its length is less than
// |minimal_num_length|, it is padded with leading zeroes or spaces, depending
// on the value of |pad_with_zero|.
static int AppendNumber(char **buff, const char *buff_end, u64 absolute_value,
                        u8 base, u8 minimal_num_length, bool pad_with_zero,
                        bool negative, bool uppercase, bool left_justified) {
  uptr const kMaxLen = 30;
  RAW_CHECK(base == 10 || base == 16);
  RAW_CHECK(base == 10 || !negative);
  RAW_CHECK(absolute_value || !negative);
  RAW_CHECK(minimal_num_length < kMaxLen);
  int result = 0;
  if (negative && minimal_num_length)
    --minimal_num_length;
  if (negative && pad_with_zero)
    result += AppendChar(buff, buff_end, '-');
  uptr num_buffer[kMaxLen];
  int num_pads = 0;
  int pos = 0;
  do {
    RAW_CHECK_MSG((uptr)pos < kMaxLen, "AppendNumber buffer overflow");
    num_buffer[pos++] = absolute_value % base;
    absolute_value /= base;
  } while (absolute_value > 0);
  if (pos < minimal_num_length) {
    // Make sure compiler doesn't insert call to memset here.
    internal_memset(&num_buffer[pos], 0,
                    sizeof(num_buffer[0]) * (minimal_num_length - pos));
    num_pads = minimal_num_length - pos;
    pos = minimal_num_length;
  }
  RAW_CHECK(pos > 0);
  pos--;
  for (; pos >= 0 && num_buffer[pos] == 0; pos--) {
    char c = (pad_with_zero || pos == 0) ? '0' : ' ';
    if (!left_justified)
      result += AppendChar(buff, buff_end, c);
  }
  if (negative && !pad_with_zero) result += AppendChar(buff, buff_end, '-');
  for (; pos >= 0; pos--) {
    char digit = static_cast<char>(num_buffer[pos]);
    digit = (digit < 10) ? '0' + digit : (uppercase ? 'A' : 'a') + digit - 10;
    result += AppendChar(buff, buff_end, digit);
  }
  if (left_justified) {
    for (int i = 0; i < num_pads; i++)
      result += AppendChar(buff, buff_end, ' ');
  }
  return result;
}

static int AppendUnsigned(char **buff, const char *buff_end, u64 num, u8 base,
                          u8 minimal_num_length, bool pad_with_zero,
                          bool uppercase, bool left_justified) {
  return AppendNumber(buff, buff_end, num, base, minimal_num_length,
                      pad_with_zero, false /* negative */, uppercase, 
                      left_justified);
}

static int AppendSignedDecimal(char **buff, const char *buff_end, s64 num,
                               u8 minimal_num_length, bool pad_with_zero,
                               bool left_justified) {
  bool negative = (num < 0);
  return AppendNumber(buff, buff_end, (u64)(negative ? -num : num), 10,
                      minimal_num_length, pad_with_zero, negative,
                      false /* uppercase */, left_justified);
}


// Use the fact that explicitly requesting 0 width (%0s) results in UB and
// interpret width == 0 as "no width requested":
// width == 0 - no width requested
// width  < 0 - left-justify s within and pad it to -width chars, if necessary
// width  > 0 - right-justify s, implement for cpu0
static int AppendString(char **buff, const char *buff_end, int width,
                        int max_chars, const char *s, bool left_justified) {
  if (!s)
    s = "<null>";
  int result = 0;
  if (!left_justified) {
    int s_len = strlen(s);
    while (result < width - s_len)
      result += AppendChar(buff, buff_end, ' ');
  }
  for (; *s; s++) {
    if (max_chars >= 0 && result >= max_chars)
      break;
    result += AppendChar(buff, buff_end, *s);
  }
  if (left_justified) {
    while (width < -result)
      result += AppendChar(buff, buff_end, ' ');
  }
  return result;
}

static int AppendPointer(char **buff, const char *buff_end, u64 ptr_value,
                         bool left_justified) {
  int result = 0;
  result += AppendString(buff, buff_end, 0, -1, "0x", left_justified);
  result += AppendUnsigned(buff, buff_end, ptr_value, 16,
// By running clang -E, can get the macro value for SANITIZER_POINTER_FORMAT_LENGTH is (12)
//                           SANITIZER_POINTER_FORMAT_LENGTH,
                           (12),
                           true /* pad_with_zero */, false /* uppercase */,
                           left_justified);
  return result;
}

int VSNPrintf(char *buff, int buff_length,
              const char *format, va_list args) {
  static const char *kPrintfFormatsHelp =
      "Supported Printf formats: %([0-9]*)?(z|ll)?{d,u,x,X}; %p; "
      "%[-]([0-9]*)?(\\.\\*)?s; %c\n";
  RAW_CHECK(format);
  RAW_CHECK(buff_length > 0);
  const char *buff_end = &buff[buff_length - 1];
  const char *cur = format;
  int result = 0;
  for (; *cur; cur++) {
    if (*cur != '%') {
      result += AppendChar(&buff, buff_end, *cur);
      continue;
    }
    cur++;
    bool left_justified = *cur == '-';
    if (left_justified)
      cur++;
    bool have_width = (*cur >= '0' && *cur <= '9');
    bool pad_with_zero = (*cur == '0');
    int width = 0;
    if (have_width) {
      while (*cur >= '0' && *cur <= '9') {
        width = width * 10 + *cur++ - '0';
      }
    }
    bool have_precision = (cur[0] == '.' && cur[1] == '*');
    int precision = -1;
    if (have_precision) {
      cur += 2;
      precision = va_arg(args, int);
    }
    bool have_z = (*cur == 'z');
    cur += have_z;
    bool have_ll = !have_z && (cur[0] == 'l' && cur[1] == 'l');
    cur += have_ll * 2;
    s64 dval;
    u64 uval;
    const bool have_length = have_z || have_ll;
    const bool have_flags = have_width || have_length;
    // At the moment only %s supports precision and left-justification.
    CHECK(!((precision >= 0 || left_justified) && *cur != 's'));
    switch (*cur) {
      case 'd': {
        dval = have_ll ? va_arg(args, s64)
             : have_z ? va_arg(args, sptr)
             : va_arg(args, int);
        result += AppendSignedDecimal(&buff, buff_end, dval, width,
                                      pad_with_zero, left_justified);
        break;
      }
      case 'u':
      case 'x':
      case 'X': {
        uval = have_ll ? va_arg(args, u64)
             : have_z ? va_arg(args, uptr)
             : va_arg(args, unsigned);
        bool uppercase = (*cur == 'X');
        result += AppendUnsigned(&buff, buff_end, uval, (*cur == 'u') ? 10 : 16,
                                 width, pad_with_zero, uppercase, left_justified);
        break;
      }
      case 'p': {
        RAW_CHECK_MSG(!have_flags, kPrintfFormatsHelp);
        result += AppendPointer(&buff, buff_end, va_arg(args, uptr),
                                left_justified);
        break;
      }
      case 's': {
        RAW_CHECK_MSG(!have_length, kPrintfFormatsHelp);
        CHECK(!have_width || left_justified);
        result += AppendString(&buff, buff_end, left_justified ? -width : width,
                               precision, va_arg(args, char*), left_justified);
        break;
      }
      case 'c': {
        RAW_CHECK_MSG(!have_flags, kPrintfFormatsHelp);
        result += AppendChar(&buff, buff_end, va_arg(args, int));
        break;
      }
      case '%' : {
        RAW_CHECK_MSG(!have_flags, kPrintfFormatsHelp);
        result += AppendChar(&buff, buff_end, '%');
        break;
      }
      default: {
        RAW_CHECK_MSG(false, kPrintfFormatsHelp);
      }
    }
  }
  RAW_CHECK(buff <= buff_end);
  AppendChar(&buff, buff_end + 1, '\0');
  return result;
}

} // namespace __sanitizer

int prints(const char *string)
{
  int pc = 0, padchar = ' ';

  for ( ; *string ; ++string) {
    putchar (*string);
    ++pc;
  }

  return pc;
}

extern "C" int sprintf(char *buffer, const char *format, ...) {
  // sprintf() has no size parameter, so the output is capped at 1000 bytes;
  // the caller's buffer must be large enough for the formatted text.
  int length = 1000;
  va_list args;
  va_start(args, format);
  int needed_length = __sanitizer::VSNPrintf(buffer, length, format, args);
  va_end(args);
  return needed_length;  // number of characters the formatted text needs
}

extern "C" int printf(const char *format, ...) {
  int length = 1000;
  char buffer[1000];
  va_list args;
  va_start(args, format);
  __sanitizer::VSNPrintf(buffer, length, format, args);
  va_end(args);
  return prints(buffer);  // number of characters actually printed
}

extern "C" int san_printf(const char *format, ...) {
  int length = 1000;
  char buffer[1000];
  va_list args;
  va_start(args, format);
  __sanitizer::VSNPrintf(buffer, length, format, args);
  va_end(args);
  return prints(buffer);  // number of characters actually printed
}

The two sanitizer_*.* files above are ported from compiler-rt. I added code to support left-justification for numbers and right-justification for strings in printf. The following ch_float.cpp tests the floating-point library.
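
The justification rule can be illustrated with a small host-side sketch. It is not the code in sanitizer_printf.cpp: pad_field and pad_field_selfcheck are hypothetical names, and the sketch assumes padding with spaces only (zero padding such as %04d is left out). The expected strings follow the "justif" cases exercised by ch_float.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of sanitizer_printf.cpp: pads `text` with
// spaces up to `width` columns. left_justified == true mimics "%-Ns"/"%-Nd",
// left_justified == false mimics "%Ns"/"%Nd".
std::string pad_field(const std::string &text, std::size_t width,
                      bool left_justified) {
  if (text.size() >= width)
    return text;  // already wide enough: no padding and no truncation
  const std::string padding(width - text.size(), ' ');
  return left_justified ? text + padding : padding + text;
}

// Checks the helper against the justification cases used in ch_float.cpp.
void pad_field_selfcheck() {
  assert(pad_field("left", 10, true) == "left      ");    // like "%-10s"
  assert(pad_field("right", 10, false) == "     right");  // like "%10s"
  assert(pad_field("-3", 4, true) == "-3  ");             // like "%-4d"
  assert(pad_field("-3", 4, false) == "  -3");            // like "%4d"
}
```

Calling pad_field_selfcheck() from a host program's main() is enough to exercise the sketch; it is meant only to make the padding rule concrete, not to replace AppendString or AppendUnsigned.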

lbt/exlbt/compiler-rt-12.x/builtins/Makefile

# Thanks to Job Vranish (https://spin.atomicobject.com/2016/08/26/makefile-c-projects/)

# CPU and endian passed from command line, such as "make CPU=cpu032II ENDIAN=el"

TARGET_LIB := libbuiltins.a
BUILD_DIR := ./build-$(CPU)-$(ENDIAN)
TARGET := $(BUILD_DIR)/$(TARGET_LIB)

SRC_DIR := $(HOME)/llvm/llvm-project/compiler-rt/lib/builtins

PWD := $(shell pwd)

TOOLDIR := ~/llvm/test/build/bin
CC := $(TOOLDIR)/clang
AR := $(TOOLDIR)/llvm-ar

# copy GENERIC_SOURCES from compiler-rt/lib/builtins/CMakeLists.txt
GENERIC_SOURCES := \
  absvdi2.c \
  absvsi2.c \
  absvti2.c \
  adddf3.c \
  addsf3.c \
  addvdi3.c \
  addvsi3.c \
  addvti3.c \
  apple_versioning.c \
  ashldi3.c \
  ashlti3.c \
  ashrdi3.c \
  ashrti3.c \
  bswapdi2.c \
  bswapsi2.c \
  clzdi2.c \
  clzsi2.c \
  clzti2.c \
  cmpdi2.c \
  cmpti2.c \
  comparedf2.c \
  comparesf2.c \
  ctzdi2.c \
  ctzsi2.c \
  ctzti2.c \
  divdc3.c \
  divdf3.c \
  divdi3.c \
  divtc3.c \
  divmoddi4.c \
  divmodsi4.c \
  divmodti4.c \
  divsc3.c \
  divsf3.c \
  divsi3.c \
  divti3.c \
  divxc3.c \
  extendsfdf2.c \
  extendhfsf2.c \
  ffsdi2.c \
  ffssi2.c \
  ffsti2.c \
  fixdfdi.c \
  fixdfsi.c \
  fixdfti.c \
  fixsfdi.c \
  fixsfsi.c \
  fixsfti.c \
  fixunsdfdi.c \
  fixunsdfsi.c \
  fixunsdfti.c \
  fixunssfdi.c \
  fixunssfsi.c \
  fixunssfti.c \
  floatdidf.c \
  floatdisf.c \
  floatsidf.c \
  floatsisf.c \
  floattidf.c \
  floattisf.c \
  floatundidf.c \
  floatundisf.c \
  floatunsidf.c \
  floatunsisf.c \
  floatuntidf.c \
  floatuntisf.c \
  fp_mode.c \
  int_util.c \
  lshrdi3.c \
  lshrti3.c \
  moddi3.c \
  modsi3.c \
  modti3.c \
  muldc3.c \
  muldf3.c \
  muldi3.c \
  mulodi4.c \
  mulosi4.c \
  muloti4.c \
  mulsc3.c \
  mulsf3.c \
  multi3.c \
  mulvdi3.c \
  mulvsi3.c \
  mulvti3.c \
  mulxc3.c \
  negdf2.c \
  negdi2.c \
  negsf2.c \
  negti2.c \
  negvdi2.c \
  negvsi2.c \
  negvti2.c \
  os_version_check.c \
  paritydi2.c \
  paritysi2.c \
  parityti2.c \
  popcountdi2.c \
  popcountsi2.c \
  popcountti2.c \
  powidf2.c \
  powisf2.c \
  subdf3.c \
  subsf3.c \
  subvdi3.c \
  subvsi3.c \
  subvti3.c \
  trampoline_setup.c \
  truncdfhf2.c \
  truncdfsf2.c \
  truncsfhf2.c \
  ucmpdi2.c \
  ucmpti2.c \
  udivdi3.c \
  udivmoddi4.c \
  udivmodsi4.c \
  udivmodti4.c \
  udivsi3.c \
  udivti3.c \
  umoddi3.c \
  umodsi3.c \
  umodti3.c

SRCS := $(GENERIC_SOURCES)

# String substitution for every C file.
# As an example, absvdi2.c turns into $(SRC_DIR)/absvdi2.c
SRCS := $(SRCS:%=$(SRC_DIR)/%) $(PWD)/../cpu0/abort.c

# String substitution for every C/C++ file.
# As an example, $(SRC_DIR)/absvdi2.c turns into
# $(BUILD_DIR)/$(SRC_DIR)/absvdi2.c.o
OBJS := $(SRCS:%=$(BUILD_DIR)/%.o)

# String substitution (suffix version without %).
# As an example, absvdi2.c.o turns into absvdi2.c.d in the same directory
DEPS := $(OBJS:.o=.d)

# Every folder in $(SRC_DIR) needs to be passed to the compiler so that it can
# find header files. stdlib.h and similar headers are in ../../include
INC_DIRS := $(shell find $(SRC_DIR) -type d)  ../../include
# Add a prefix to INC_DIRS. So moduleA would become -ImoduleA. The compiler understands this -I flag
INC_FLAGS := $(addprefix -I,$(INC_DIRS))

# The -MMD and -MP flags together generate Makefiles for us!
# These files will have .d instead of .o as the output.
CPPFLAGS := -MMD -MP -target cpu0$(ENDIAN)-unknown-linux-gnu -static \
  -fintegrated-as $(INC_FLAGS) -mcpu=$(CPU) -mllvm -has-lld=true

# The final build step.
$(TARGET): $(OBJS)
	$(AR) -rcs $@ $(OBJS)

# Build step for C source
$(BUILD_DIR)/%.c.o: %.c
	mkdir -p $(dir $@)
	$(CC) $(CPPFLAGS) $(CFLAGS) -c $< -o $@


.PHONY: clean
clean:
	rm -rf $(BUILD_DIR)

# Include the .d makefiles. The - at the front suppresses the errors of missing
# Makefiles. Initially, all the .d files will be missing, and we don't want those
# errors to show up.
-include $(DEPS)

exlbt/input/ch_float.cpp

//#include "debug.h"

extern "C" int printf(const char *format, ...);
extern "C" int sprintf(char *out, const char *format, ...);

#include "ch9_3_longlongshift.cpp"

void test_printf()
{
  char buf[80];
  long long a = 0x100000007fffffff;
  printf("a: %llX, %llx, %lld\n", a, a, a);
  int b = 0x10000000;
  printf("b: %x, %d\n", b, b);
  sprintf(buf, "b: %x, %d\n", b, b); printf("%s", buf);

  // The original sanitizer_printf.cpp supports right-justify for numbers only
  // and left-justify for strings only. The Cpu0 port adds left-justify for
  // numbers and right-justify for strings.
  char ptr[] = "Hello world!";
  char *np = 0;
  int i = 5;
  unsigned int bs = sizeof(int)*8;
  int mi;

  mi = (1 << (bs-1)) + 1;
  printf("%s\n", ptr);
  printf("printf test\n");
  printf("%s is null pointer\n", np);
  printf("%d = 5\n", i);
  printf("%d = - max int\n", mi);
  printf("char %c = 'a'\n", 'a');
  printf("hex %x = ff\n", 0xff);
  printf("hex %02x = 00\n", 0);
  printf("signed %d = unsigned %u = hex %x\n", -3, -3, -3);
  printf("%d %s(s)", 0, "message");
  printf("\n");
  printf("%d %s(s) with %%\n", 0, "message");
  sprintf(buf, "justif: \"%-10s\"\n", "left"); printf("%s", buf);
  sprintf(buf, "justif: \"%10s\"\n", "right"); printf("%s", buf);
  sprintf(buf, " 3: %04d zero padded\n", 3); printf("%s", buf);
  sprintf(buf, " 3: %-4d left justif.\n", 3); printf("%s", buf);
  sprintf(buf, " 3: %4d right justif.\n", 3); printf("%s", buf);
  sprintf(buf, "-3: %04d zero padded\n", -3); printf("%s", buf);
  sprintf(buf, "-3: %-4d left justif.\n", -3); printf("%s", buf);
  sprintf(buf, "-3: %4d right justif.\n", -3); printf("%s", buf);
}

template <class T>
T test_shift_left(T a, T b) {
  return (a << b);
}

template <class T>
T test_shift_right(T a, T b) {
  return (a >> b);
}

template <class T1, class T2, class T3>
T1 test_add(T2 a, T3 b) {
  T1 c = a + b;
  return c;
}

template <class T1, class T2, class T3>
T1 test_mul(T2 a, T3 b) {
  T1 c = a * b;
  return c;
}

template <class T1, class T2, class T3>
T1 test_div(T2 a, T3 b) {
  T1 c = a / b;
  return c;
}

bool check_result(const char* fn, long long res, long long expected) {
  printf("%s = %lld\n", fn, res);
  if (res != expected) {
    printf("\terror: result %lld, expected %lld\n", res, expected);
  }
  return (res == expected);
}

bool check_result(const char* fn, unsigned long long res, unsigned long long expected) {
  printf("%s = %llu\n", fn, res);
  if (res != expected) {
    printf("\terror: result %llu, expected %llu\n", res, expected);
  }
  return (res == expected);
}

bool check_result(const char* fn, int res, int expected) {
  printf("%s = %d\n", fn, res);
  if (res != expected) {
    printf("\terror: result %d, expected %d\n", res, expected);
  }
  return (res == expected);
}

int main() {
  long long a;
  unsigned long long b;
  int c;

  test_printf();

  a = test_longlong_shift1();
  check_result("test_longlong_shift1()", a, 289LL);

  a = test_longlong_shift2();
  check_result("test_longlong_shift2()", a, 22LL);

// call __ashldi3
  a = test_shift_left<long long>(0x12LL, 4LL); // 0x120 = 288
  check_result("test_shift_left<long long>(0x12LL, 4LL)", a, 288LL);
  
// call __ashrdi3
  a = test_shift_right<long long>(0x001666660000000a, 48LL); // 0x16 = 22
  check_result("test_shift_right<long long>(0x001666660000000a, 48LL)", a, 22LL);
  
// call __lshrdi3
  b = test_shift_right<unsigned long long>(0x001666660000000a, 48LLu); // 0x16 = 22
  check_result("test_shift_right<unsigned long long>(0x001666660000000a, 48LLu)", b, 22LLu);
  
// call __addsf3, __fixsfsi
  c = (int)test_add<float, float, float>(-2.2, 3.3); // (int)1.1 = 1
  check_result("(int)test_add<float, float, float>(-2.2, 3.3)", c, 1);
  
// call __mulsf3, __fixsfsi
  c = (int)test_mul<float, float, float>(-2.2, 3.3); // (int)-7.26 = -7
  check_result("(int)test_mul<float, float, float>(-2.2, 3.3)", c, -7);
  
// call __divsf3, __fixsfsi
  c = (int)test_div<float, float, float>(-1.8, 0.5); // (int)-3.6 = -3
  check_result("(int)test_div<float, float, float>(-1.8, 0.5)", c, -3);
  
// call __extendsfdf2, __adddf3, __fixdfsi
  c = (int)test_add<double, double, float>(-2.2, 3.3); // (int)1.1 = 1
  check_result("(int)test_add<double, double, float>(-2.2, 3.3)", c, 1);
  
// call __extendsfdf2, __adddf3, __fixdfsi
  c = (int)test_add<double, float, double>(-2.2, 3.3); // (int)1.1 = 1
  check_result("(int)test_add<double, float, double>(-2.2, 3.3)", c, 1);
  
// call __extendsfdf2, __adddf3, __fixdfsi
  c = (int)test_add<float, float, double>(-2.2, 3.3); // (int)1.1 = 1
  check_result("(int)test_add<float, float, double>(-2.2, 3.3)", c, 1);
  
// call __extendsfdf2, __muldf3, __fixdfsi
  c = (int)test_mul<double, float, double>(-2.2, 3.3); // (int)-7.26 = -7
  check_result("(int)test_mul<double, float, double>(-2.2, 3.3)", c, -7);
  
// call __extendsfdf2, __muldf3, __truncdfsf2, __fixdfsi
// ! __truncdfsf2 in truncdfsf2.c does not work for Cpu0
  c = (int)test_mul<float, float, double>(-2.2, 3.3); // (int)-7.26 = -7
  check_result("(int)test_mul<float, float, double>(-2.2, 3.3)", c, -7);
  
// call __divdf3, __fixdfsi
  c = (int)test_div<double, double, double>(-1.8, 0.5); // (int)-3.6 = -3
  check_result("(int)test_div<double, double, double>(-1.8, 0.5)", c, -3);

#if 0 // these three do call builtins  
  c = (int)test_mul<int, int, int>(-2, 3); // -6
  check_result("(int)test_mul<int, int, int>(-2, 3)", c, -6);
  
  c = (int)test_div<int, int, int>(-10, 4); // -2: -10 = -2*4 + (-2), quotient -2, remainder -2 (|remainder| < 4, the divisor)
  check_result("(int)test_div<int, int, int>(-10, 4)", c, -2);
  
  a = test_mul<long long, long long, long long>(-2LL, 3LL); // -6LL
  check_result("test_mul<long long, long long, long long>(-2LL, 3LL)", a, -6LL);
#endif

// call __divdi3
  a = test_div<long long, long long, long long>(-10LL, 4LL); // -2
  check_result("test_div<long long, long long, long long>(-10LL, 4LL)", a, -2LL);
  
  return 0;
}

exlbt/input/Makefile.float


SRCS := start.cpp debug.cpp sanitizer_printf.cpp printf-stdarg-def.c \
        cpu0-builtins.cpp ch_float.cpp lib_cpu0.c
LIBBUILTINS_DIR := ../compiler-rt/builtins
INC_DIRS := ../ $(NEWLIB_DIR)/newlib/libc/include $(LBDEX_DIR)/input
LIBS := $(LIBBUILTINS_DIR)/build-$(CPU)-$(ENDIAN)/libbuiltins.a

include Common.mk

chungshu@ChungShudeMacBook-Air input % bash make.sh cpu032II eb Makefile.float
...
endian =  BigEndian
ISR address:00020614
0   /* 0: big endian, 1: little endian */

chungshu@ChungShudeMacBook-Air verilog % iverilog -o cpu0IIs cpu0IIs.v
chungshu@ChungShudeMacBook-Air verilog % ./cpu0IIs
...

a: 100000007FFFFFFF, 100000007fffffff, 1152921506754330623
b: 10000000, 268435456
b: 10000000, 268435456
Hello world!
printf test
<null> is null pointer
5 = 5
-2147483647 = - max int
char a = 'a'
hex ff = ff
hex 00 = 00
signed -3 = unsigned 4294967293 = hex fffffffd
0 message(s)
0 message(s) with %
justif: "left      "
justif: "     right"
 3: 0003 zero padded
 3: 3    left justif.
 3:    3 right justif.
-3: -003 zero padded
-3: -3   left justif.
-3:   -3 right justif.
test_longlong_shift1() = 289
test_longlong_shift2() = 22
test_shift_left<long long>(0x12, 4LL) = 288
test_shift_right<long long>(0x001666660000000a, 48LL) = 22
test_shift_right<unsigned long long>(0x001666660000000a, 48LLu) = 22
(int)test_add<float, float, float>(-2.2, 3.3) = 1
(int)test_mul<float, float, float>(-2.2, 3.3) = -7
(int)test_div<float, float, float>(-1.8, 0.5) = -3
(int)test_add<double, double, float>(-2.2, 3.3) = 1
(int)test_add<double, float, double>(-2.2, 3.3) = 1
(int)test_add<float, float, double>(-2.2, 3.3) = 1
(int)test_mul<double, float, double>(-2.2, 3.3) = -7
(int)test_mul<float, float, double>(-2.2, 3.3) = -7
(int)test_div<double, double, double>(-1.8, 0.5) = -3
test_div<long long, long long, long long>(-10LL, 4LL) = -2
...
RET to PC < 0, finished!
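
The expected values in ch_float.cpp rely on two C++ rules: converting a floating-point value to int discards the fraction (truncation toward zero), and integer division also truncates toward zero. The following host-side sketch restates them. The helper names to_int, quotient, remainder_of and truncation_selfcheck are hypothetical and do not appear in the Cpu0 sources, and the sketch uses double constants instead of calling the soft-float builtins.

```cpp
#include <cassert>

// Hypothetical helpers mirroring the casts and divisions used in ch_float.cpp.
int to_int(double v) { return static_cast<int>(v); }  // drops the fraction
long long quotient(long long a, long long b) { return a / b; }
long long remainder_of(long long a, long long b) { return a % b; }

// Restates the expectations that appear as comments in ch_float.cpp.
void truncation_selfcheck() {
  assert(to_int(1.1) == 1);            // (int)1.1
  assert(to_int(-7.26) == -7);         // (int)-7.26, not -8
  assert(to_int(-3.6) == -3);          // (int)-3.6, not -4
  assert(quotient(-10, 4) == -2);      // -10 / 4 truncates toward zero
  assert(remainder_of(-10, 4) == -2);  // -10 == -2 * 4 + (-2)
}
```

This is why test_div<long long, long long, long long>(-10LL, 4LL) is checked against -2 rather than -3.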

The directory exlbt/input/compiler-rt-test/builtins/Unit is copied from compiler-rt/test/builtins/Unit. The tests in it are driven by the following files.

exlbt/input/ch_builtins.cpp

#include "debug.h"
#include <stdlib.h>

extern "C" int printf(const char *format, ...);
extern "C" int sprintf(char *out, const char *format, ...);

extern "C" int absvdi2_test();
extern "C" int absvsi2_test();
extern "C" int absvti2_test();
extern "C" int adddf3vfp_test();
extern "C" int addsf3vfp_test();
extern "C" int addvdi3_test();
extern "C" int addvsi3_test();
extern "C" int addvti3_test();
extern "C" int ashldi3_test();
extern "C" int ashlti3_test();
extern "C" int ashrdi3_test();
extern "C" int ashrti3_test();

// atomic.c needs memcmp(...)
//extern "C" int atomic_test();
extern "C" int bswapdi2_test();
extern "C" int bswapsi2_test();

extern "C" int clzdi2_test();
extern "C" int clzsi2_test();
extern "C" int clzti2_test();
extern "C" int cmpdi2_test();
extern "C" int cmpti2_test();
extern "C" int comparedf2_test();
extern "C" int comparesf2_test();

// Needless to compare compiler_rt_logb() with logb() of libm
//extern "C" int compiler_rt_logb_test();
//extern "C" int compiler_rt_logbf_test();
//extern "C" int compiler_rt_logbl_test();

extern "C" int cpu_model_test();
extern "C" int ctzdi2_test();
extern "C" int ctzsi2_test();
extern "C" int ctzti2_test();

// Division for complex types needs libm (fabs, isinf, ...), so skip it for now
#ifdef HAS_COMPLEX
extern "C" int divdc3_test();
#endif
extern "C" int divdf3_test();
extern "C" int divdf3vfp_test();
extern "C" int divdi3_test();
extern "C" int divmodsi4_test();
extern "C" int divmodti4_test();
#ifdef HAS_COMPLEX
extern "C" int divsc3_test();
#endif
extern "C" int divsf3_test();
extern "C" int divsf3vfp_test();
extern "C" int divsi3_test();
#ifdef HAS_COMPLEX
extern "C" int divtc3_test();
#endif
extern "C" int divtf3_test();
extern "C" int divti3_test();
#ifdef HAS_COMPLEX
extern "C" int divxc3_test();
#endif
extern "C" int enable_execute_stack_test();
extern "C" int eqdf2vfp_test();
extern "C" int eqsf2vfp_test();
extern "C" int eqtf2_test();
extern "C" int extenddftf2_test();
extern "C" int extendhfsf2_test();
extern "C" int extendhftf2_test();
extern "C" int extendsfdf2vfp_test();
extern "C" int extendsftf2_test();
#if 0
extern "C" int gcc_personality_test();
#endif
extern "C" int gedf2vfp_test();
extern "C" int gesf2vfp_test();
extern "C" int getf2_test();
extern "C" int gtdf2vfp_test();
extern "C" int gtsf2vfp_test();
extern "C" int gttf2_test();
extern "C" int ledf2vfp_test();
extern "C" int lesf2vfp_test();
extern "C" int letf2_test();
extern "C" int lshrdi3_test();
extern "C" int lshrti3_test();
extern "C" int ltdf2vfp_test();
extern "C" int ltsf2vfp_test();
extern "C" int lttf2_test();
extern "C" int moddi3_test();
extern "C" int modsi3_test();
extern "C" int modst3_test();
extern "C" int modti3_test();
#ifdef HAS_COMPLEX
extern "C" int muldc3_test();
#endif
extern "C" int muldf3vfp_test();
extern "C" int muldi3_test();
extern "C" int mulodi4_test();
extern "C" int mulosi4_test();
extern "C" int muloti4_test();
#ifdef HAS_COMPLEX
extern "C" int mulsc3_test();
#endif
extern "C" int mulsf3vfp_test();
//extern "C" int mulsi3_test(); // there is no mulsi3.c
#ifdef HAS_COMPLEX
extern "C" int multc3_test();
#endif
extern "C" int multf3_test();
extern "C" int multi3_test();
extern "C" int mulvdi3_test();
extern "C" int mulvsi3_test();
extern "C" int mulvti3_test();
#ifdef HAS_COMPLEX
extern "C" int mulxc3_test();
#endif
extern "C" int nedf2vfp_test();
extern "C" int negdf2vfp_test();
extern "C" int negdi2_test();
extern "C" int negsf2vfp_test();
extern "C" int negti2_test();
extern "C" int negvdi2_test();
extern "C" int negvsi2_test();
extern "C" int negvti2_test();
extern "C" int nesf2vfp_test();
extern "C" int netf2_test();
/* need rand, signbit, ...
extern "C" int paritydi2_test();
extern "C" int paritysi2_test();
extern "C" int parityti2_test();
extern "C" int popcountdi2_test();
extern "C" int popcountsi2_test();
extern "C" int popcountti2_test();
extern "C" int powidf2_test();
extern "C" int powisf2_test();
extern "C" int powitf2_test();
extern "C" int powixf2_test();
*/
extern "C" int subdf3vfp_test();
extern "C" int subsf3vfp_test();
extern "C" int subtf3_test();
extern "C" int subvdi3_test();
extern "C" int subvsi3_test();
extern "C" int subvti3_test();
extern "C" int trampoline_setup_test();
extern "C" int truncdfhf2_test();
extern "C" int truncdfsf2_test();
extern "C" int truncdfsf2vfp_test();
extern "C" int truncsfhf2_test();
extern "C" int trunctfdf2_test();
extern "C" int trunctfhf2_test();
extern "C" int trunctfsf2_test();
extern "C" int ucmpdi2_test();
extern "C" int ucmpti2_test();
extern "C" int udivdi3_test();
extern "C" int udivmoddi4_test();
extern "C" int udivmodsi4_test();
extern "C" int udivmodti4_test();
extern "C" int udivsi3_test();
extern "C" int udivti3_test();
extern "C" int umoddi3_test();
extern "C" int umodsi3_test();
extern "C" int umodti3_test();
extern "C" int unorddf2vfp_test();
extern "C" int unordsf2vfp_test();
extern "C" int unordtf2_test();

void show_result(const char *fn, int res) {
  if (res == 1)
    printf("%s: FAIL!\n", fn);
  else if (res == 0)
    printf("%s: PASS!\n", fn);
  else if (res == -1)
    printf("%s: SKIPPED!\n", fn);
  else {
    printf("FIXME!");
    abort();
  }
}

int main() {
  int res = 0;

// pre-defined compiler macro (from llc -march=cpu0${ENDIAN} or
// clang -target cpu0${ENDIAN}-unknown-linux-gnu)
#ifdef __CPU0EB__
  printf("__CPU0EB__\n");
#endif
#ifdef __CPU0EL__
  printf("__CPU0EL__\n");
#endif

  res = absvdi2_test();
  show_result("absvdi2_test()", res);

  res = absvsi2_test();
  show_result("absvsi2_test()", res);

  res = absvti2_test();
  show_result("absvti2_test()", res);

  res = adddf3vfp_test();
  show_result("adddf3vfp_test()", res);

  res = addsf3vfp_test();
  show_result("addsf3vfp_test()", res);

  res = addvdi3_test();
  show_result("addvdi3_test()", res);

  res = addvsi3_test();
  show_result("addvsi3_test()", res);

  res = addvti3_test();
  show_result("addvti3_test()", res);

  res = ashldi3_test();
  show_result("ashldi3_test()", res);

  res = ashlti3_test();
  show_result("ashlti3_test()", res);

  res = ashrdi3_test();
  show_result("ashrdi3_test()", res);

  res = ashrti3_test();
  show_result("ashrti3_test()", res);

#if 0 // atomic.c needs memcmp(...)
  res = atomic_test();
  show_result("atomic_test()", res);
#endif

  res = bswapdi2_test();
  show_result("bswapdi2_test()", res);

  res = bswapsi2_test();
  show_result("bswapsi2_test()", res);

  res = clzdi2_test();
  show_result("clzdi2_test()", res);

  res = clzsi2_test();
  show_result("clzsi2_test()", res);

  res = clzti2_test();
  show_result("clzti2_test()", res);

  res = cmpdi2_test();
  show_result("cmpdi2_test()", res);

  res = cmpti2_test();
  show_result("cmpti2_test()", res);

  res = comparedf2_test();
  show_result("comparedf2_test()", res);

  res = comparesf2_test();
  show_result("comparesf2_test()", res);

//  res = compiler_rt_logb_test();
//  show_result("compiler_rt_logb_test()", res);

//  res = compiler_rt_logbf_test();
//  show_result("compiler_rt_logbf_test()", res);

//  res = compiler_rt_logbl_test();
//  show_result("compiler_rt_logbl_test()", res);

  res = cpu_model_test();
  show_result("cpu_model_test()", res);

  res = ctzdi2_test();
  show_result("ctzdi2_test()", res);

  res = ctzsi2_test();
  show_result("ctzsi2_test()", res);

  res = ctzti2_test();
  show_result("ctzti2_test()", res);

#ifdef HAS_COMPLEX
  res = divdc3_test();
  show_result("divdc3_test()", res);
#endif

  res = divdf3_test();
  show_result("divdf3_test()", res);

  res = divdf3vfp_test();
  show_result("divdf3vfp_test()", res);

  res = divdi3_test();
  show_result("divdi3_test()", res);

  res = divmodsi4_test();
  show_result("divmodsi4_test()", res);

  res = divmodti4_test();
  show_result("divmodti4_test()", res);

#ifdef HAS_COMPLEX
  res = divsc3_test();
  show_result("divsc3_test()", res);
#endif

  res = divsf3_test();
  show_result("divsf3_test()", res);

  res = divsf3vfp_test();
  show_result("divsf3vfp_test()", res);

  res = divsi3_test();
  show_result("divsi3_test()", res);

#ifdef HAS_COMPLEX
  res = divtc3_test();
  show_result("divtc3_test()", res);
#endif

  res = divtf3_test();
  show_result("divtf3_test()", res);

  res = divti3_test();
  show_result("divti3_test()", res);

#ifdef HAS_COMPLEX
  res = divxc3_test();
  show_result("divxc3_test()", res);
#endif

#if 0
  res = enable_execute_stack_test();
  show_result("enable_execute_stack_test()", res);
#endif

  res = eqdf2vfp_test();
  show_result("eqdf2vfp_test()", res);

  res = eqsf2vfp_test();
  show_result("eqsf2vfp_test()", res);

  res = eqtf2_test();
  show_result("eqtf2_test()", res);

  res = extenddftf2_test();
  show_result("extenddftf2_test()", res);

  res = extendhfsf2_test();
  show_result("extendhfsf2_test()", res);

  res = extendhftf2_test();
  show_result("extendhftf2_test()", res);

  res = extendsfdf2vfp_test();
  show_result("extendsfdf2vfp_test()", res);

  res = extendsftf2_test();
  show_result("extendsftf2_test()", res);

#if 0
  res = gcc_personality_test();
  show_result("gcc_personality_test()", res);
#endif

  res = gedf2vfp_test();
  show_result("gedf2vfp_test()", res);

  res = gesf2vfp_test();
  show_result("gesf2vfp_test()", res);

  res = getf2_test();
  show_result("getf2_test()", res);

  res = gtdf2vfp_test();
  show_result("gtdf2vfp_test()", res);

  res = gtsf2vfp_test();
  show_result("gtsf2vfp_test()", res);

  res = gttf2_test();
  show_result("gttf2_test()", res);

  res = ledf2vfp_test();
  show_result("ledf2vfp_test()", res);

  res = lesf2vfp_test();
  show_result("lesf2vfp_test()", res);

  res = letf2_test();
  show_result("letf2_test()", res);

  res = lshrdi3_test();
  show_result("lshrdi3_test()", res);

  res = lshrti3_test();
  show_result("lshrti3_test()", res);

  res = ltdf2vfp_test();
  show_result("ltdf2vfp_test()", res);

  res = ltsf2vfp_test();
  show_result("ltsf2vfp_test()", res);

  res = lttf2_test();
  show_result("lttf2_test()", res);

  res = moddi3_test();
  show_result("moddi3_test()", res);

  res = modsi3_test();
  show_result("modsi3_test()", res);

  res = modti3_test();
  show_result("modti3_test()", res);

#ifdef HAS_COMPLEX
  res = muldc3_test();
  show_result("muldc3_test()", res);
#endif

  res = muldf3vfp_test();
  show_result("muldf3vfp_test()", res);

  res = muldi3_test();
  show_result("muldi3_test()", res);

  res = mulodi4_test();
  show_result("mulodi4_test()", res);

  res = mulosi4_test();
  show_result("mulosi4_test()", res);

  res = muloti4_test();
  show_result("muloti4_test()", res);

#ifdef HAS_COMPLEX
  res = mulsc3_test();
  show_result("mulsc3_test()", res);
#endif

  res = mulsf3vfp_test();
  show_result("mulsf3vfp_test()", res);

// no mulsi3.c
//  res = mulsi3_test();
//  show_result("mulsi3_test()", res);

#ifdef HAS_COMPLEX
  res = multc3_test();
  show_result("multc3_test()", res);
#endif

  res = multf3_test();
  show_result("multf3_test()", res);

  res = multi3_test();
  show_result("multi3_test()", res);

  res = mulvdi3_test();
  show_result("mulvdi3_test()", res);

  res = mulvsi3_test();
  show_result("mulvsi3_test()", res);

  res = mulvti3_test();
  show_result("mulvti3_test()", res);

#ifdef HAS_COMPLEX
  res = mulxc3_test();
  show_result("mulxc3_test()", res);
#endif

  res = nedf2vfp_test();
  show_result("nedf2vfp_test()", res);

  res = negdf2vfp_test();
  show_result("negdf2vfp_test()", res);

  res = negdi2_test();
  show_result("negdi2_test()", res);

  res = negsf2vfp_test();
  show_result("negsf2vfp_test()", res);

  res = negti2_test();
  show_result("negti2_test()", res);

  res = negvdi2_test();
  show_result("negvdi2_test()", res);

  res = negvsi2_test();
  show_result("negvsi2_test()", res);

  res = negvti2_test();
  show_result("negvti2_test()", res);

  res = nesf2vfp_test();
  show_result("nesf2vfp_test()", res);

  res = netf2_test();
  show_result("netf2_test()", res);

/* need rand, signbit, ...
  res = paritydi2_test();
  show_result("paritydi2_test()", res);

  res = paritysi2_test();
  show_result("paritysi2_test()", res);

  res = parityti2_test();
  show_result("parityti2_test()", res);

  res = popcountdi2_test();
  show_result("popcountdi2_test()", res);

  res = popcountsi2_test();
  show_result("popcountsi2_test()", res);

  res = popcountti2_test();
  show_result("popcountti2_test()", res);

  res = powidf2_test();
  show_result("powidf2_test()", res);

  res = powisf2_test();
  show_result("powisf2_test()", res);

  res = powitf2_test();
  show_result("powitf2_test()", res);

  res = powixf2_test();
  show_result("powixf2_test()", res);
*/

  res = subdf3vfp_test();
  show_result("subdf3vfp_test()", res);

  res = subsf3vfp_test();
  show_result("subsf3vfp_test()", res);

  res = subtf3_test();
  show_result("subtf3_test()", res);

  res = subvdi3_test();
  show_result("subvdi3_test()", res);

  res = subvsi3_test();
  show_result("subvsi3_test()", res);

  res = subvti3_test();
  show_result("subvti3_test()", res);

  res = trampoline_setup_test();
  show_result("trampoline_setup_test()", res);

  res = truncdfhf2_test();
  show_result("truncdfhf2_test()", res);

  res = truncdfsf2_test();
  show_result("truncdfsf2_test()", res);

  res = truncdfsf2vfp_test();
  show_result("truncdfsf2vfp_test()", res);

  res = truncsfhf2_test();
  show_result("truncsfhf2_test()", res);

  res = trunctfdf2_test();
  show_result("trunctfdf2_test()", res);

  res = trunctfhf2_test();
  show_result("trunctfhf2_test()", res);

  res = trunctfsf2_test();
  show_result("trunctfsf2_test()", res);

  res = ucmpdi2_test();
  show_result("ucmpdi2_test()", res);

  res = ucmpti2_test();
  show_result("ucmpti2_test()", res);

  res = udivdi3_test();
  show_result("udivdi3_test()", res);

  res = udivmoddi4_test();
  show_result("udivmoddi4_test()", res);

  res = udivmodsi4_test();
  show_result("udivmodsi4_test()", res);

  res = udivmodti4_test();
  show_result("udivmodti4_test()", res);

  res = udivsi3_test();
  show_result("udivsi3_test()", res);

  res = udivti3_test();
  show_result("udivti3_test()", res);

  res = umoddi3_test();
  show_result("umoddi3_test()", res);

  res = umodsi3_test();
  show_result("umodsi3_test()", res);

  res = umodti3_test();
  show_result("umodti3_test()", res);

  res = unorddf2vfp_test();
  show_result("unorddf2vfp_test()", res);

  res = unordsf2vfp_test();
  show_result("unordsf2vfp_test()", res);

  res = unordtf2_test();
  show_result("unordtf2_test()", res);

  return 0;
}
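
The long sequence of res = ..._test(); show_result(...); pairs above could also be written as a table. The following is only a sketch of that idea, not code from the repository: BuiltinTest, result_text, run_all, run_demo_table and the three always_* stand-in tests are hypothetical names, and it assumes a host with <cstdio> instead of the Cpu0 printf port. It keeps the result convention of show_result() (0 = PASS, 1 = FAIL, -1 = SKIPPED), but unlike show_result() it does not call abort() on an unexpected value; it prints FIXME! and counts the entry as a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// One table entry per test: the label to print and the function to call.
struct BuiltinTest {
  const char *name;  // text printed in front of the result
  int (*run)();      // returns 0 (pass), 1 (fail) or -1 (skipped)
};

// Maps a result code to the text printed by show_result().
const char *result_text(int res) {
  switch (res) {
    case 0:  return "PASS!";
    case 1:  return "FAIL!";
    case -1: return "SKIPPED!";
    default: return "FIXME!";  // unexpected code (show_result() aborts here)
  }
}

// Runs every entry, prints one line per test, returns the number of failures.
int run_all(const BuiltinTest *tests, std::size_t count) {
  int failures = 0;
  for (std::size_t i = 0; i < count; ++i) {
    const int res = tests[i].run();
    std::printf("%s: %s\n", tests[i].name, result_text(res));
    if (res != 0 && res != -1)
      ++failures;
  }
  return failures;
}

// Hypothetical stand-ins for the real *_test() functions.
static int always_pass() { return 0; }
static int always_fail() { return 1; }
static int always_skip() { return -1; }

// Runs the three stand-ins; exactly one of them is expected to fail.
int run_demo_table() {
  static const BuiltinTest demo[] = {
      {"always_pass()", always_pass},
      {"always_fail()", always_fail},
      {"always_skip()", always_skip},
  };
  const int failures = run_all(demo, sizeof(demo) / sizeof(demo[0]));
  assert(failures == 1);
  return failures;
}
```

With such a table, adding or disabling a test means editing one line, and main() could return the failure count instead of always returning 0.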

exlbt/input/Makefile.builtins

# CPU and endian passed from command line, such as
#   "make -f Makefile.builtins CPU=cpu032II ENDIAN=eb" or
#   "make -f Makefile.builtins CPU=cpu032I ENDIAN=el"

# start.cpp must be listed first
SRCS :=  start.cpp debug.cpp syscalls.c sanitizer_printf.cpp printf-stdarg-def.c \
  compiler-rt-test/builtins/Unit/absvdi2_test.c \
  compiler-rt-test/builtins/Unit/absvsi2_test.c \
  compiler-rt-test/builtins/Unit/absvti2_test.c \
  compiler-rt-test/builtins/Unit/adddf3vfp_test.c \
  compiler-rt-test/builtins/Unit/addsf3vfp_test.c \
  compiler-rt-test/builtins/Unit/addvdi3_test.c \
  compiler-rt-test/builtins/Unit/addvsi3_test.c \
  compiler-rt-test/builtins/Unit/addvti3_test.c \
  compiler-rt-test/builtins/Unit/ashldi3_test.c \
  compiler-rt-test/builtins/Unit/ashlti3_test.c \
  compiler-rt-test/builtins/Unit/ashrdi3_test.c \
  compiler-rt-test/builtins/Unit/ashrti3_test.c \
  compiler-rt-test/builtins/Unit/bswapdi2_test.c \
  compiler-rt-test/builtins/Unit/bswapsi2_test.c \
  compiler-rt-test/builtins/Unit/clzdi2_test.c \
  compiler-rt-test/builtins/Unit/clzsi2_test.c \
  compiler-rt-test/builtins/Unit/clzti2_test.c \
  compiler-rt-test/builtins/Unit/cmpdi2_test.c \
  compiler-rt-test/builtins/Unit/cmpti2_test.c \
  compiler-rt-test/builtins/Unit/comparedf2_test.c \
  compiler-rt-test/builtins/Unit/comparesf2_test.c \
  compiler-rt-test/builtins/Unit/cpu_model_test.c \
  compiler-rt-test/builtins/Unit/ctzdi2_test.c \
  compiler-rt-test/builtins/Unit/ctzsi2_test.c \
  compiler-rt-test/builtins/Unit/ctzti2_test.c \
  compiler-rt-test/builtins/Unit/divdc3_test.c \
  compiler-rt-test/builtins/Unit/divdf3_test.c \
  compiler-rt-test/builtins/Unit/divdf3vfp_test.c \
  compiler-rt-test/builtins/Unit/divdi3_test.c \
  compiler-rt-test/builtins/Unit/divmodsi4_test.c \
  compiler-rt-test/builtins/Unit/divmodti4_test.c \
  compiler-rt-test/builtins/Unit/divsc3_test.c \
  compiler-rt-test/builtins/Unit/divsf3_test.c \
  compiler-rt-test/builtins/Unit/divsf3vfp_test.c \
  compiler-rt-test/builtins/Unit/divsi3_test.c \
  compiler-rt-test/builtins/Unit/divtc3_test.c \
  compiler-rt-test/builtins/Unit/divtf3_test.c \
  compiler-rt-test/builtins/Unit/divti3_test.c \
  compiler-rt-test/builtins/Unit/divxc3_test.c \
  compiler-rt-test/builtins/Unit/enable_execute_stack_test.c \
  compiler-rt-test/builtins/Unit/eqdf2vfp_test.c \
  compiler-rt-test/builtins/Unit/eqsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/eqtf2_test.c \
  compiler-rt-test/builtins/Unit/extenddftf2_test.c \
  compiler-rt-test/builtins/Unit/extendhfsf2_test.c \
  compiler-rt-test/builtins/Unit/extendhftf2_test.c \
  compiler-rt-test/builtins/Unit/extendsfdf2vfp_test.c \
  compiler-rt-test/builtins/Unit/extendsftf2_test.c \
  compiler-rt-test/builtins/Unit/gedf2vfp_test.c \
  compiler-rt-test/builtins/Unit/gesf2vfp_test.c \
  compiler-rt-test/builtins/Unit/getf2_test.c \
  compiler-rt-test/builtins/Unit/gtdf2vfp_test.c \
  compiler-rt-test/builtins/Unit/gtsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/gttf2_test.c \
  compiler-rt-test/builtins/Unit/ledf2vfp_test.c \
  compiler-rt-test/builtins/Unit/lesf2vfp_test.c \
  compiler-rt-test/builtins/Unit/letf2_test.c \
  compiler-rt-test/builtins/Unit/lshrdi3_test.c \
  compiler-rt-test/builtins/Unit/lshrti3_test.c \
  compiler-rt-test/builtins/Unit/ltdf2vfp_test.c \
  compiler-rt-test/builtins/Unit/ltsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/lttf2_test.c \
  compiler-rt-test/builtins/Unit/moddi3_test.c \
  compiler-rt-test/builtins/Unit/modsi3_test.c \
  compiler-rt-test/builtins/Unit/modti3_test.c \
  compiler-rt-test/builtins/Unit/muldc3_test.c \
  compiler-rt-test/builtins/Unit/muldf3vfp_test.c \
  compiler-rt-test/builtins/Unit/muldi3_test.c \
  compiler-rt-test/builtins/Unit/mulodi4_test.c \
  compiler-rt-test/builtins/Unit/mulosi4_test.c \
  compiler-rt-test/builtins/Unit/muloti4_test.c \
  compiler-rt-test/builtins/Unit/mulsc3_test.c \
  compiler-rt-test/builtins/Unit/mulsf3vfp_test.c \
  compiler-rt-test/builtins/Unit/mulsi3_test.c \
  compiler-rt-test/builtins/Unit/multc3_test.c \
  compiler-rt-test/builtins/Unit/multf3_test.c \
  compiler-rt-test/builtins/Unit/multi3_test.c \
  compiler-rt-test/builtins/Unit/mulvdi3_test.c \
  compiler-rt-test/builtins/Unit/mulvsi3_test.c \
  compiler-rt-test/builtins/Unit/mulvti3_test.c \
  compiler-rt-test/builtins/Unit/mulxc3_test.c \
  compiler-rt-test/builtins/Unit/nedf2vfp_test.c \
  compiler-rt-test/builtins/Unit/negdf2vfp_test.c \
  compiler-rt-test/builtins/Unit/negdi2_test.c \
  compiler-rt-test/builtins/Unit/negsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/negti2_test.c \
  compiler-rt-test/builtins/Unit/negvdi2_test.c \
  compiler-rt-test/builtins/Unit/negvsi2_test.c \
  compiler-rt-test/builtins/Unit/negvti2_test.c \
  compiler-rt-test/builtins/Unit/nesf2vfp_test.c \
  compiler-rt-test/builtins/Unit/netf2_test.c \
  compiler-rt-test/builtins/Unit/paritydi2_test.c \
  compiler-rt-test/builtins/Unit/paritysi2_test.c \
  compiler-rt-test/builtins/Unit/parityti2_test.c \
  compiler-rt-test/builtins/Unit/popcountdi2_test.c \
  compiler-rt-test/builtins/Unit/popcountsi2_test.c \
  compiler-rt-test/builtins/Unit/popcountti2_test.c \
  compiler-rt-test/builtins/Unit/powidf2_test.c \
  compiler-rt-test/builtins/Unit/powisf2_test.c \
  compiler-rt-test/builtins/Unit/powitf2_test.c \
  compiler-rt-test/builtins/Unit/powixf2_test.c \
  compiler-rt-test/builtins/Unit/subdf3vfp_test.c \
  compiler-rt-test/builtins/Unit/subsf3vfp_test.c \
  compiler-rt-test/builtins/Unit/subtf3_test.c \
  compiler-rt-test/builtins/Unit/subvdi3_test.c \
  compiler-rt-test/builtins/Unit/subvsi3_test.c \
  compiler-rt-test/builtins/Unit/subvti3_test.c \
  compiler-rt-test/builtins/Unit/trampoline_setup_test.c \
  compiler-rt-test/builtins/Unit/truncdfhf2_test.c \
  compiler-rt-test/builtins/Unit/truncdfsf2_test.c \
  compiler-rt-test/builtins/Unit/truncdfsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/truncsfhf2_test.c \
  compiler-rt-test/builtins/Unit/trunctfdf2_test.c \
  compiler-rt-test/builtins/Unit/trunctfhf2_test.c \
  compiler-rt-test/builtins/Unit/trunctfsf2_test.c \
  compiler-rt-test/builtins/Unit/ucmpdi2_test.c \
  compiler-rt-test/builtins/Unit/ucmpti2_test.c \
  compiler-rt-test/builtins/Unit/udivdi3_test.c \
  compiler-rt-test/builtins/Unit/udivmoddi4_test.c \
  compiler-rt-test/builtins/Unit/udivmodsi4_test.c \
  compiler-rt-test/builtins/Unit/udivmodti4_test.c \
  compiler-rt-test/builtins/Unit/udivsi3_test.c \
  compiler-rt-test/builtins/Unit/udivti3_test.c \
  compiler-rt-test/builtins/Unit/umoddi3_test.c \
  compiler-rt-test/builtins/Unit/umodsi3_test.c \
  compiler-rt-test/builtins/Unit/umodti3_test.c \
  compiler-rt-test/builtins/Unit/unorddf2vfp_test.c \
  compiler-rt-test/builtins/Unit/unordsf2vfp_test.c \
  compiler-rt-test/builtins/Unit/unordtf2_test.c \
  cpu0-builtins.cpp ch_builtins.cpp lib_cpu0.c

INC_DIRS := ./ $(LBDEX_DIR)/input \
            $(HOME)/llvm/llvm-project/compiler-rt/lib/builtins \
            $(NEWLIB_DIR)/newlib/libc/include \
            $(NEWLIB_DIR)/libgloss
LIBBUILTINS_DIR := ../compiler-rt/builtins
LIBS := $(LIBBUILTINS_DIR)/build-$(CPU)-$(ENDIAN)/libbuiltins.a \
        $(NEWLIB_DIR)/build-$(CPU)-$(ENDIAN)/libm.a \
        $(NEWLIB_DIR)/build-$(CPU)-$(ENDIAN)/libc.a

include Common.mk

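Build the tests with Makefile.builtins in the input directory, then run the resulting program with cpu0IIs in the verilog directory: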

chungshu@ChungShudeMacBook-Air input % bash make.sh cpu032II eb Makefile.builtins
...
chungshu@ChungShudeMacBook-Air verilog % ./cpu0IIs
...
absvdi2_test(): PASS!
absvsi2_test(): PASS!
absvti2_test(): SKIPPED!
adddf3vfp_test(): SKIPPED!
addsf3vfp_test(): SKIPPED!
addvdi3_test(): PASS!
addvsi3_test(): PASS!
addvti3_test(): SKIPPED!
ashldi3_test(): PASS!
ashlti3_test(): SKIPPED!
ashrdi3_test(): PASS!
ashrti3_test(): SKIPPED!
bswapdi2_test(): PASS!
bswapsi2_test(): PASS!
clzdi2_test(): PASS!
clzsi2_test(): PASS!
clzti2_test(): SKIPPED!
cmpdi2_test(): PASS!
cmpti2_test(): SKIPPED!
comparedf2_test(): PASS!
comparesf2_test(): PASS!
cpu_model_test(): SKIPPED!
ctzdi2_test(): PASS!
ctzsi2_test(): PASS!
ctzti2_test(): SKIPPED!
divdc3_test(): PASS!
divdf3_test(): PASS!
divdf3vfp_test(): SKIPPED!
divdi3_test(): PASS!
divmodsi4_test(): PASS!
divmodti4_test(): SKIPPED!
divsf3_test(): PASS!
divsf3vfp_test(): SKIPPED!
divsi3_test(): PASS!
divtc3_test(): PASS!
divtf3_test(): SKIPPED!
divti3_test(): SKIPPED!
divxc3_test(): PASS!
eqdf2vfp_test(): SKIPPED!
eqsf2vfp_test(): SKIPPED!
eqtf2_test(): SKIPPED!
extenddftf2_test(): SKIPPED!
extendhfsf2_test(): PASS!
extendhftf2_test(): SKIPPED!
extendsfdf2vfp_test(): SKIPPED!
extendsftf2_test(): SKIPPED!
gedf2vfp_test(): SKIPPED!
gesf2vfp_test(): SKIPPED!
getf2_test(): SKIPPED!
gtdf2vfp_test(): SKIPPED!
gtsf2vfp_test(): SKIPPED!
gttf2_test(): SKIPPED!
ledf2vfp_test(): SKIPPED!
lesf2vfp_test(): SKIPPED!
letf2_test(): SKIPPED!
lshrdi3_test(): PASS!
lshrti3_test(): SKIPPED!
ltdf2vfp_test(): SKIPPED!
ltsf2vfp_test(): SKIPPED!
lttf2_test(): SKIPPED!
moddi3_test(): PASS!
modsi3_test(): PASS!
modti3_test(): SKIPPED!
muldc3_test(): PASS!
muldf3vfp_test(): SKIPPED!
muldi3_test(): PASS!
mulodi4_test(): PASS!
mulosi4_test(): PASS!
muloti4_test(): SKIPPED!
mulsc3_test(): PASS!
mulsf3vfp_test(): SKIPPED!
multc3_test(): SKIPPED!
multf3_test(): SKIPPED!
multi3_test(): SKIPPED!
mulvdi3_test(): PASS!
mulvsi3_test(): PASS!
mulvti3_test(): SKIPPED!
mulxc3_test(): PASS!
nedf2vfp_test(): SKIPPED!
negdf2vfp_test(): SKIPPED!
negdi2_test(): PASS!
negsf2vfp_test(): SKIPPED!
negti2_test(): SKIPPED!
negvdi2_test(): PASS!
negvsi2_test(): PASS!
negvti2_test(): SKIPPED!
nesf2vfp_test(): SKIPPED!
netf2_test(): SKIPPED!
subdf3vfp_test(): SKIPPED!
subsf3vfp_test(): SKIPPED!
subtf3_test(): SKIPPED!
subvdi3_test(): PASS!
subvsi3_test(): PASS!
subvti3_test(): SKIPPED!
trampoline_setup_test(): SKIPPED!
truncdfhf2_test(): PASS!
truncdfsf2_test(): PASS!
truncdfsf2vfp_test(): SKIPPED!
truncsfhf2_test(): PASS!
trunctfdf2_test(): SKIPPED!
trunctfhf2_test(): SKIPPED!
trunctfsf2_test(): SKIPPED!
ucmpdi2_test(): PASS!
ucmpti2_test(): SKIPPED!
udivdi3_test(): PASS!
udivmoddi4_test(): PASS!
udivmodsi4_test(): PASS!
udivmodti4_test(): SKIPPED!
udivsi3_test(): PASS!
udivti3_test(): SKIPPED!
umoddi3_test(): PASS!
umodsi3_test(): PASS!
umodti3_test(): SKIPPED!
unorddf2vfp_test(): SKIPPED!
unordsf2vfp_test(): SKIPPED!
unordtf2_test(): SKIPPED!
...
RET to PC < 0, finished!