Appendix A: RISCV

This chapter shows how to install the RISCV toolchain, including GNU, LLVM, and simulators, on Linux, as illustrated in the figure and table below.

digraph G {

rankdir=LR;
subgraph cluster_0 {
style=filled;
color=lightgrey;
// label = "RISCV toolchain flow";
node [style=filled,color=white]; usercode [label = "user program"];
node [style=filled,color=white]; sflib [label = "lib (libm/libc/libstdc++)"];
node [style=filled,color=white]; linker [label = "lld or ld"];
node [style=filled,color=white]; simulator [label = "qemu or gem5"];
node [style=filled,color=white]; clang, llvm, gdb;
usercode -> clang;
sflib -> clang;
clang -> llvm [ label = "IR" ];
llvm -> linker [ label = "obj" ];
linker -> simulator [ label = "exe" ];
linker -> gdb [ label = "exe" ];
simulator -> gdb;
gdb -> simulator;
}

}

Fig. 12 RISCV toolchain flow

Table 6 RISCV toolchain [1]
Component	Name	GitHub repository
C/C++ Compiler	clang/llvm	llvm-project
LLVM Assembler	llvm integrated assembler	llvm-project
LLVM Linker	ld.lld	llvm-project
Debug tool	lldb	llvm-project
Utils	llvm-ar, llvm-objdump, etc.	llvm-project
GCC Assembler	as	riscv-gnu-toolchain
GCC Linker	ld.bfd, ld.gold	riscv-gnu-toolchain
Runtime	libgcc	riscv-gnu-toolchain
Unwinder	libgcc_s	riscv-gnu-toolchain
C library	libc	riscv-gnu-toolchain
C++ library	libsupc++, libstdc++	riscv-gnu-toolchain
Debug tool	gdb	riscv-gnu-toolchain
Utils	ar, objdump, etc.	riscv-gnu-toolchain
Functional sim	qemu	qemu
Cycle sim	gem5	gem5

ISA

Fig. 13 RISCV ISA (_images/riscv-isa.png)

Fig. 14 RISCV ISA Description (_images/isa-desc.png)

As shown in Fig. 13 and Fig. 14, RISCV has 32-, 64-, and 128-bit variants. The I (integer) instruction set is the base, and the other extensions are optional. G = IMAFD denotes the general set of extensions (I, M, A, F, and D) [2] [3] [4].

Because RISCV has vector instructions that support variable-length data and allows vendors to encode variable-length instructions, little endian is the dominant byte order in the market [5].

Mem

I-cache, D-cache: Size ranges from 4KB to 64KB in Andes N25f.
ILM, DLM: Size ranges from 4KB to 16MB [6].
- For deterministic and efficient program execution
- Flexible size selection to fit diversified needs
DRAM

RISCV compiler toolchain installation

First, install the prerequisite packages listed at https://github.com/riscv-collab/riscv-gnu-toolchain#readme. Next, create the $HOME/riscv and $HOME/riscv/git directories. Then run the following bash script.

exlbt/riscv/riscv-toolchain-setup.sh

#!/usr/bin/env bash

# Verified on Ubuntu 18.04
# Create $HOME/riscv/git before running this bash script.
# The install directories (riscv_newlib, riscv_linux) must not exist yet; see check().
export LLVM_VER=14.x
#export LLVM_VER=13.0.0
export GNU_SRC_DIR=$HOME/riscv/git
export LLVM_SRC_DIR=$HOME/riscv/git/$LLVM_VER

export GNU_NEWLIB_INSTALL_DIR=$HOME/riscv/$LLVM_VER/riscv_newlib
export LLVM_NEWLIB_BUILD_DIR=$LLVM_SRC_DIR/llvm-project/build_riscv_newlib

export GNU_LINUX_INSTALL_DIR=$HOME/riscv/$LLVM_VER/riscv_linux
export LLVM_LINUX_BUILD_DIR=$LLVM_SRC_DIR/llvm-project/build_riscv_linux

riscv_gnu_toolchain_prerequisites() {
  sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev \
  libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool \
  patchutils bc zlib1g-dev libexpat-dev
  if [ ! -f "/usr/bin/python" ]; then
    sudo ln -s /usr/bin/python3 /usr/bin/python
  fi
}

riscv_llvm_prerequisites() {
  sudo apt-get install ninja-build
}

get_llvm() {
  if [ ! -d "$GNU_SRC_DIR" ]; then
    echo "GNU_SRC_DIR: $GNU_SRC_DIR does not exist"
    exit 1
  fi
  # Start from a clean checkout of the release branch.
  rm -rf "$LLVM_SRC_DIR"
  mkdir -p "$LLVM_SRC_DIR"
  pushd "$LLVM_SRC_DIR"
  git clone https://github.com/llvm/llvm-project.git
  cd llvm-project
  git checkout -b "$LLVM_VER" "origin/release/$LLVM_VER"
  popd
}

check() {
  if [ ! -d "$GNU_SRC_DIR" ]; then
    echo "GNU_SRC_DIR: $GNU_SRC_DIR does not exist"
    exit 1
  fi
  if [ -d "$GNU_NEWLIB_INSTALL_DIR" ]; then
    echo "GNU_NEWLIB_INSTALL_DIR: $GNU_NEWLIB_INSTALL_DIR exists. Remove it before running."
    exit 1
  fi
  if [ -d "$GNU_LINUX_INSTALL_DIR" ]; then
    echo "GNU_LINUX_INSTALL_DIR: $GNU_LINUX_INSTALL_DIR exists. Remove it before running."
    exit 1
  fi
}

build_gnu_toolchain() {
  pushd $GNU_SRC_DIR
  git clone https://github.com/riscv/riscv-gnu-toolchain
  cd riscv-gnu-toolchain
#  The branch appears to have moved from origin/rvv-intrinsic to origin/__archive__
#  git checkout -b rvv-intrinsic origin/rvv-intrinsic
#  Commit 409b951ba6621f2f115aebddfb15ce2dd78ec24f of the master branch works for vadd.vv in vadd1.c
  mkdir build_newlib
  cd build_newlib
# NX27V is configurable for 32 or 64 bits and has hardware floating point
  ../configure --prefix=$GNU_NEWLIB_INSTALL_DIR \
  --with-arch=rv64gc --with-abi=lp64d
#  --with-multilib-generator="rv32i-ilp32--;rv32imafd-ilp32--;rv64ima-lp64--"
  make -j4

  cd ..
  mkdir build_linux
  cd build_linux
  ../configure --prefix=$GNU_LINUX_INSTALL_DIR
  make linux -j4
  popd
}

# LLVM_OPTIMIZED_TABLEGEN: Builds a release tablegen that gets used during the LLVM build. This can dramatically speed up debug builds.
# LLVM_INSTALL_TOOLCHAIN_ONLY is already OFF by default. Check CMakeCache.txt.
#   https://llvm.org/docs/BuildingADistribution.html?highlight=llvm_install_toolchain_only#difference-between-install-and-install-distribution
# LLVM_BINUTILS_INCDIR:
#   https://stackoverflow.com/questions/45715423/how-to-enable-cfi-in-llvm
#   For lld. https://llvm.org/docs/GoldPlugin.html
#   For LLVM 13.0.0, -DLLVM_BINUTILS_INCDIR makes ninja fail as follows:
#   /home/jonathanchen/riscv/git/13.0.0/llvm-project/llvm/tools/gold/gold-plugin.cpp:38:10: fatal error: plugin-api.h: No such file or directory
#    #include <plugin-api.h>
#             ^~~~~~~~~~~~~~
# -DLLVM_BINUTILS_INCDIR=$GNU_SRC_DIR/riscv-gnu-toolchain/riscv-binutils/include causes the failure above on 13.x
# DEFAULT_SYSROOT: 
#   https://stackoverflow.com/questions/66357013/compile-clang-with-alternative-sysroot
# LLVM_DEFAULT_TARGET_TRIPLE:  
#   https://clang.llvm.org/docs/CrossCompilation.html#general-cross-compilation-options-in-clang
# LLVM_INSTALL_UTILS:BOOL
#   If enabled, utility binaries like FileCheck and not will be installed to CMAKE_INSTALL_PREFIX.
#   https://llvm.org/docs/CMake.html
# Use "clang --sysroot" if cmake was not given -DDEFAULT_SYSROOT
# $LLVM_NEWLIB_BUILD_DIR/bin/clang++ --gcc-toolchain=$GNU_NEWLIB_INSTALL_DIR test.cpp -march=rv64g -O0 -mabi=lp64d -v
# $LLVM_LINUX_BUILD_DIR/bin/clang++ --gcc-toolchain=$GNU_LINUX_INSTALL_DIR --sysroot=$GNU_LINUX_INSTALL_DIR/sysroot/ --static test.cpp -march=rv64g -O0 -mabi=lp64d -v
build_llvm_toolchain() {
  rm -rf $LLVM_NEWLIB_BUILD_DIR
  mkdir $LLVM_NEWLIB_BUILD_DIR
  pushd $LLVM_NEWLIB_BUILD_DIR
  cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Debug -DLLVM_TARGETS_TO_BUILD="RISCV" \
  -DLLVM_ENABLE_PROJECTS="clang;lld"  \
  -DLLVM_OPTIMIZED_TABLEGEN=On \
  -DCMAKE_INSTALL_PREFIX=$GNU_NEWLIB_INSTALL_DIR -DLLVM_PARALLEL_COMPILE_JOBS=4 \
  -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_DEFAULT_TARGET_TRIPLE=riscv64-unknown-elf \
  -DDEFAULT_SYSROOT=$GNU_NEWLIB_INSTALL_DIR/riscv64-unknown-elf \
  -DLLVM_INSTALL_UTILS=ON ../llvm
  ninja
  ninja install
  popd
  rm -rf $LLVM_LINUX_BUILD_DIR
  mkdir $LLVM_LINUX_BUILD_DIR
  pushd $LLVM_LINUX_BUILD_DIR
  cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Debug -DLLVM_TARGETS_TO_BUILD="RISCV" \
  -DLLVM_ENABLE_PROJECTS="clang;lld"  \
  -DLLVM_OPTIMIZED_TABLEGEN=On \
  -DCMAKE_INSTALL_PREFIX=$GNU_LINUX_INSTALL_DIR -DLLVM_PARALLEL_COMPILE_JOBS=4 \
  -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_DEFAULT_TARGET_TRIPLE=riscv64-unknown-linux-gnu \
  -DDEFAULT_SYSROOT=$GNU_LINUX_INSTALL_DIR/sysroot -DLLVM_INSTALL_UTILS=ON ../llvm
  ninja
  ninja install
  popd
}

riscv_gnu_toolchain_prerequisites
riscv_llvm_prerequisites
# Fail early on stale install directories before cloning anything.
check
get_llvm
build_gnu_toolchain
build_llvm_toolchain

$ pwd
$HOME/git/lbt/exlbt/riscv
$ bash riscv-toolchain-setup.sh

The RISCV toolchain supports both bare-metal (Newlib) and Linux platforms.

$ pwd
$HOME/git/lbt/exlbt/riscv
$ ls $HOME/riscv/riscv_newlib
bin  include  lib  libexec  riscv64-unknown-elf  share

Fig. 15 Linux sysroot (_images/linux-sysroot.png)

The Linux sysroot is shown in Fig. 15 above. You can compare it with the following installed directory.

$ ls $HOME/riscv/riscv_linux/sysroot/
etc  lib  lib64  sbin  usr  var
$ ls $HOME/riscv/riscv_linux/sysroot/usr
bin  include  lib  libexec  sbin  share

Linker Command

Each RISCV hardware platform has its own memory map. As a result, the startup code for that hardware may need to initialize $sp, $pc, $gp, and other registers. The GNU linker command language [15] lets users specify the memory map for their hardware.

The crt0.S and riscv64-virt.ld in lbt/exlbt/riscv are modified from the original as follows:

riscv$ pwd
~/git/lbt/exlbt/riscv
riscv$ diff exlbt/riscv/crt0.S ~/riscv/git/riscv-gnu-toolchain/newlib/libgloss/riscv/crt0.S
22d21
<   la sp, __stack_top
riscv$ ~/riscv/14.x/riscv_newlib/bin/riscv64-unknown-elf-ld --verbose &> riscv64-virt-origin.ld
riscv$ diff riscv64-virt-origin.ld riscv64-virt.ld
1,8d0
< GNU ld (GNU Binutils) 2.39
<   Supported emulations:
<    elf64lriscv
<    elf32lriscv
<    elf64briscv
<    elf32briscv
< using internal linker script:
< ==================================================
16a9,12
> MEMORY
> {
>    RAM (rx)  : ORIGIN = 0x10000, LENGTH = 128M
> }
22a19
>   PROVIDE(__stack_top = ORIGIN(RAM) + LENGTH(RAM));
257,259d253
<
<
< ==================================================
riscv$ ~/riscv/14.x/riscv_newlib/bin/clang hello.c -menable-experimental-extensions \
-march=rv64gcv1p0 -O0 -mabi=lp64d -T riscv64-virt.ld -nostartfiles crt0.S -v

If RAM is declared with (rwx) permission, the linker may issue a warning. This warning can be suppressed by adding -Wl,--no-warn-rwx-segment to the clang options.

QEMU, covered in the next section, can run the program even when crt0.S does not initialize $sp. QEMU probably initializes $sp itself, as the following code suggests.

qemu$ pwd
~/riscv/git/qemu
qemu$ vi linux-user/riscv/cpu_loop.c
void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs)
{
  ...
  env->gpr[xSP] = regs->sp;

The ELF object file format uses program headers, which the system loader reads to know how to load the program into memory. The program headers of an ELF file can be displayed using llvm-objdump -p [16].

The instruction “la sp, __stack_top” is a pseudo-instruction in RISCV [17].
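In non-PIC code the assembler expands la into an auipc/addi pair, the same pair that func_hello_start.s in this appendix writes by hand with %pcrel_hi and %pcrel_lo. As a rough illustration of the arithmetic, the shell sketch below splits a PC-relative offset into the upper part used by auipc and the signed 12-bit lower part used by addi. The name pcrel_split and the sample offsets are made up for this example and are not part of any toolchain.

```shell
#!/bin/sh
# Split a PC-relative byte offset into the two immediates that
# "auipc rd, hi" followed by "addi rd, rd, lo" need to rebuild it.
# pcrel_split is a hypothetical helper, not a toolchain command.
pcrel_split() {
  off=$(( $1 ))
  # addi sign-extends its 12-bit immediate, so add 0x800 before
  # shifting to round hi to the nearest multiple of 4 KiB.
  hi=$(( (off + 0x800) >> 12 ))
  lo=$(( off - (hi << 12) ))
  echo "$hi $lo"
}

pcrel_split 0x1234   # prints "1 564":   (1 << 12) + 564  = 0x1234
pcrel_split 0x1800   # prints "2 -2048": (2 << 12) - 2048 = 0x1800
```

The second call shows why the rounding matters: an offset whose low 12 bits are 0x800 or more needs the next higher auipc value and a negative addi immediate.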

QEMU simulator

Install QEMU following the instructions at: https://gitlab.com/qemu-project/qemu as shown below.

$ pwd
$HOME/riscv/git
$ git clone https://gitlab.com/qemu-project/qemu.git
$ cd qemu
$ git log
commit a28498b1f9591e12dcbfdf06dc8f54e15926760e
...
$ mkdir build
$ cd build
$ ../configure
$ make

Then, you can compile a program and run it on QEMU for bare-metal as follows:

$ pwd
$HOME/git/lbt/exlbt/riscv
$ $HOME/riscv/riscv_newlib/bin/clang -march=rv64g hello.c -fuse-ld=lld \
  -mno-relax -g -mabi=lp64d -o hello_newlib
$ $HOME/riscv/git/qemu/build/qemu-riscv64 hello_newlib
hello world!

Also, compile a program and run it on QEMU for Linux [18] as follows:

$ $HOME/riscv/riscv_linux/bin/clang -march=rv64g hello.c -o hello_linux -static
$ $HOME/riscv/git/qemu/build/qemu-riscv64 hello_linux
hello world!

RISCV requires linking with -lm for math.h functions, as shown below:

exlbt/riscv/pow.cpp

// RISCV does need -lm while X86-64 does not.
// $ ~/riscv/riscv_newlib/bin/clang++ -menable-experimental-extensions pow.cpp -march=rv64gcv0p10 -O0 -fuse-ld=lld -mno-relax -g -mabi=lp64d -lm  -v
// $ ~/riscv/git/qemu/build/qemu-riscv64 -cpu rv64,v=true a.out

#include <stdio.h>
#include <math.h>

double base = 100;
double power = 2;
double test_math()
{
  double res = 0;

  res = pow(base, power);

  return res;
}

int main() {
  double a = test_math();
  printf("a = %lf\n", a);
  return 0;
}

Assembly code of “Hello, World” can be compiled and run in bare-metal mode as follows:

$ $HOME/riscv/riscv_newlib/bin/riscv64-unknown-elf-gcc -c hello_world.s
$ $HOME/riscv/riscv_newlib/bin/riscv64-unknown-elf-ld hello_world.o -o hello_world
$ $HOME/riscv/git/qemu/build/qemu-riscv64 hello_world
Hello World

Linking between assembly code and C for bare-metal can be done as follows:

exlbt/riscv/caller_hello.c

/* ~/git/lbt/exlbt/riscv$ ~/riscv/riscv_newlib/bin/riscv64-unknown-elf-gcc -c func_hello_start.s 
~/git/lbt/exlbt/riscv$ ~/riscv/riscv_newlib/bin/riscv64-unknown-elf-gcc -c caller_hello.c 
~/git/lbt/exlbt/riscv$ ~/riscv/riscv_newlib/bin/riscv64-unknown-elf-ld caller_hello.o func_hello_start.o -o a.out
~/git/lbt/exlbt/riscv$ ~/riscv/git/qemu/build/qemu-riscv64 a.out
Hello World
*/

extern void hello();

int main() {
  hello();
}

exlbt/riscv/func_hello_start.s

.section .text
.globl hello
hello:

    li a0, 1                    # stdout (fd 1)
1:  auipc a1, %pcrel_hi(msg)    # load msg(hi)
    addi a1, a1, %pcrel_lo(1b)  # load msg(lo)
    li a2, 12                   # length
    li a3, 0
    li a7, 64                   # _NR_sys_write
    ecall                       # system call

    li a0, 0
    li a1, 0
    li a2, 0
    li a3, 0
    li a7, 93                   # _NR_sys_exit
    ecall                       # system call

loop:
    j loop

.section .rodata
msg:
    .string "Hello World\n"

# _start is code, so it belongs in .text rather than .rodata
.section .text
.globl _start
_start:
    call main

Gem5 Simulator

Build Gem5 according to the following steps:

https://www.gem5.org/documentation/general_docs/building or http://learning.gem5.org/book/part1/building.html#requirements-for-gem5

If you do not have python3.x-config on Ubuntu 18.04, as shown below:

$ ls /usr/bin/python*
... /usr/bin/python2.7-config

Then install it using pip3 as follows [19]:

$ sudo apt install python3-pip
$ pip3 install scons
$ ls /usr/bin/python*
... /usr/bin/python3-config

After installing all dependencies, clone gem5 and build the RISC-V target as follows:

$ pwd
$HOME/riscv/git
$ sudo apt install -y libglib2.0-dev
$ sudo apt install -y libpixman-1-dev
$ git clone https://gem5.googlesource.com/public/gem5
$ cd gem5
$ git log
commit 846dcf0ba4eff824c295f06550b8673ff3f31314
...
$HOME/riscv/git/gem5$ /usr/bin/env python3 $(which scons) ./build/RISCV/gem5.debug -j4
...
$ pwd
$HOME/git/lbt/exlbt/riscv
$ $HOME/riscv/git/gem5/build/RISCV/gem5.debug \
$HOME/riscv/git/gem5/configs/example/se.py --cmd=./hello_newlib
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:107: info: Entering event queue @ 0.  Starting simulation...
hello world!

Check the simulated time and the number of ticks as follows:

$HOME/git/lbt/exlbt/riscv$ vi m5out/stats.txt
simSeconds          0.000001       # Number of seconds simulated (Second)
simTicks            1229000        # Number of ticks simulated (Tick)
...
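
Instead of opening stats.txt in an editor, the values can be pulled out with a short script. The sketch below is only an illustration: stat_value and ticks_to_ns are hypothetical helpers, and the conversion assumes gem5's default resolution of one tick per picosecond, which is consistent with the listing above (1229000 ticks is about 1.229 microseconds, shown rounded as simSeconds 0.000001).

```shell
#!/bin/sh
# Print the value column of one statistic from a gem5 stats.txt file.
# stat_value is a hypothetical helper; usage: stat_value FILE NAME
stat_value() {
  awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Convert ticks to nanoseconds, assuming 1 tick = 1 ps (1000 ticks per ns).
ticks_to_ns() {
  echo $(( $1 / 1000 ))
}

# Only report when a previous gem5 run left its output in m5out/.
if [ -f m5out/stats.txt ]; then
  ticks=$(stat_value m5out/stats.txt simTicks)
  if [ -n "$ticks" ]; then
    echo "simulated time: $(ticks_to_ns "$ticks") ns"
  fi
fi
```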

$HOME/git/lbt/exlbt/riscv$ $HOME/riscv/git/gem5/build/RISCV/gem5.debug \
$HOME/riscv/git/gem5/configs/example/se.py --cmd=./hello_linux
...
hello world!
...

The configuration examples for gem5 can be found at the following reference: http://learning.gem5.org/book/part1/example_configs.html

GDB

digraph G {

rankdir=LR;
subgraph cluster_0 {
style=filled;
color=lightgrey;
// label = "GDB flow";
node [style=filled,color=white]; user, gdb;
// node [style=filled,color=white]; linker [label = "lld or ld"];
node [style=filled,color=white]; simulator [label = "qemu "];
// linker -> simulator [ label = "exe" ];
// linker -> gdb [ label = "exe" ];
user -> gdb [label = "debug command"];
gdb -> simulator [label = "debug command(print res, step, ...)"];
simulator -> gdb [label = "response(variable-value, ...)"];
}

}

Fig. 16 GDB flow

LLVM 13.x fails to compile RVV C/C++ files with clang -g, while LLVM 14.x works. Run QEMU on terminal A with GDB on terminal B as follows [7]:

// terminal A:
$ pwd
$HOME/git/lbt/exlbt/riscv
$ $HOME/riscv/14.x/riscv_newlib/bin/clang vadd1.c -menable-experimental-extensions \
  -march=rv64gcv1p0 -O0 -g -mabi=lp64d -o a.out  -v
$ $HOME/riscv/git/qemu/build/qemu-riscv64 -cpu rv64,v=true -g 1234 a.out
vector version is not specified, use the default value v1.0

// terminal B:
$ pwd
$HOME/git/lbt/exlbt/riscv
$ $HOME/riscv/14.x/riscv_newlib/bin/riscv64-unknown-elf-gdb a.out
...
Reading symbols from a.out...
(gdb) target remote :1234
Remote debugging using :1234
0x0000000000010150 in _start ()
(gdb) b vadd1.c:95
Breakpoint 1 at 0x10536: file vadd1.c, line 95.
(gdb) c
Continuing.
Breakpoint 1, main () at vadd1.c:95
95      vOp(a, a, a, array_size(a), VMUL);
(gdb) p a[1]
$1 = 2

// terminal A:
...
vl: 32
array_size(a):4096

a[]: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50
52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 100
102 104 106 108 110 112 114 116 118 120 122 124 126 ...
The results of VADD:  PASS

// terminal B:
$ ...
(gdb) c

// terminal A:
...
a[]: 0 4 16 36 64 100 144 196 256 324 400 484 576 676 784 900 1024 1156 1296
1444 1600 1764 1936 2116 2304 2500 2704 2916 3136 3364 3600 3844 4096 4356
4624 4900 5184 5476 5776 6084 6400 6724 7056 7396 7744 8100 8464 8836 9216
9604 10000 10404 10816 11236 11664 12100 12544 12996 13456 13924 14400 14884
15376 15876 ...
The results of VMUL:  PASS
$

Because RVV v1.0 has been accepted and v0.10 was a draft of the v1.0 release, the -march option above changed from rv64gv0p10 in LLVM 13.x to rv64gcv1p0 in LLVM 14.x [8].

In the example above, both QEMU and GDB run in the same host environment.

When you need more flexibility, for example running GDB on a physically separate host, or controlling a standalone system over a serial port or a realtime system over a TCP/IP connection, use GDB's remote debugging [9]. To use a TCP connection, pass an argument of the form host:port. In the example above, “target remote :1234” omits the host, so GDB connects to port 1234 on the same machine, where QEMU is listening [10].
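
To make the host:port form concrete, the following sketch splits such an argument the way the text above describes it, treating an empty host as the local machine. It is not GDB's own parser; parse_target is a hypothetical helper and 192.168.0.2 is a made-up address.

```shell
#!/bin/sh
# Split a GDB-style "host:port" target into host and port.
# parse_target is a hypothetical helper for illustration only.
parse_target() {
  host=${1%:*}    # text before the last ':'
  port=${1##*:}   # text after the last ':'
  # An empty host, as in ":1234", means the local machine.
  echo "${host:-localhost} $port"
}

parse_target :1234              # prints "localhost 1234"
parse_target 192.168.0.2:1234   # prints "192.168.0.2 1234"
```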

RISCV Calling Convention [11]

The RV32 register size is 32 bits. The RV64 register size is 64 bits.

In RV64, 32-bit types, such as int, are stored in integer registers with proper sign extension; that is, bits 63..32 are copies of bit 31.
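
The sign-extension rule can be checked with plain integer arithmetic. The sketch below assumes a shell with 64-bit signed arithmetic (true for bash and for sh on typical 64-bit Linux hosts); sext32 is a hypothetical helper that mimics what an RV64 register holds for a 32-bit int.

```shell
#!/bin/sh
# Sign-extend the low 32 bits of a value to 64 bits, the way RV64
# keeps an int in a register. sext32 is a hypothetical helper and
# relies on the shell's 64-bit signed arithmetic.
sext32() {
  echo $(( (($1 & 0xFFFFFFFF) ^ 0x80000000) - 0x80000000 ))
}

sext32 0x7FFFFFFF   # prints 2147483647  (bit 31 = 0, bits 63..32 all 0)
sext32 0x80000000   # prints -2147483648 (bit 31 = 1, bits 63..32 all 1)
sext32 0xFFFFFFFF   # prints -1
```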

There are two ABI families, ilp32 and lp64, selected with -mabi=ilp32, ilp32f, ilp32d, lp64, lp64f, or lp64d. The variants within a family differ in whether float and double arguments are passed in integer registers or in floating-point registers.

Table 7 ABI, caller passing integer/float/double arguments [12] [13]
name	float	double
ilp32/lp64	a registers	a registers
ilp32f/lp64f	fa registers	a registers
ilp32d/lp64d	fa registers	fa registers

Both ilp32 and lp64 are Soft-Float Calling Conventions.

Soft-Float Calling Convention means floating-point arguments are passed and returned in integer registers, using the same rules as integer arguments of the corresponding size.
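
Table 7 can be written down as a small lookup. The sketch below simply restates the table; fp_arg_regs is a hypothetical helper, not a compiler option or tool.

```shell
#!/bin/sh
# Report which argument registers carry a float or double argument
# under a given -mabi value, following Table 7.
# fp_arg_regs is a hypothetical helper; usage: fp_arg_regs ABI TYPE
fp_arg_regs() {
  case "$1:$2" in
    ilp32:float|ilp32:double|lp64:float|lp64:double) echo "a" ;;
    ilp32f:float|lp64f:float)   echo "fa" ;;
    ilp32f:double|lp64f:double) echo "a" ;;
    ilp32d:float|ilp32d:double|lp64d:float|lp64d:double) echo "fa" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

fp_arg_regs lp64 double    # prints "a"  (soft-float convention)
fp_arg_regs lp64f double   # prints "a"  (only 32-bit floats use fa)
fp_arg_regs lp64d double   # prints "fa"
```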

-mabi=ABI-string

Specify integer and floating-point calling convention. ABI-string contains two parts: the size of integer types and the registers used for floating-point types. For example ‘-march=rv64ifd -mabi=lp64d’ means that ‘long’ and pointers are 64-bit (implicitly defining ‘int’ to be 32-bit), and that floating-point values up to 64 bits wide are passed in F registers. Contrast this with ‘-march=rv64ifd -mabi=lp64f’, which still allows the compiler to generate code that uses the F and D extensions but only allows floating-point values up to 32 bits long to be passed in registers; or ‘-march=rv64ifd -mabi=lp64’, in which no floating-point arguments will be passed in registers.

The default for this argument is system dependent, users who want a specific calling convention should specify one explicitly. The valid calling conventions are: ‘ilp32’, ‘ilp32f’, ‘ilp32d’, ‘lp64’, ‘lp64f’, and ‘lp64d’. Some calling conventions are impossible to implement on some ISAs: for example, ‘-march=rv32if -mabi=ilp32d’ is invalid because the ABI requires 64-bit values be passed in F registers, but F registers are only 32 bits wide. There is also the ‘ilp32e’ ABI that can only be used with the ‘rv32e’ architecture. This ABI is not well specified at present, and is subject to change [14].
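
The compatibility rules quoted above can be sketched as a rough pre-flight check before invoking the compiler. This is a simplification under stated assumptions: abi_ok is a hypothetical helper, it only understands plain strings such as rv64ifd or rv64gc (no version numbers, no multi-letter extensions, no ilp32e), and the compiler remains the authority on what is valid.

```shell
#!/bin/sh
# Rough check that an -march/-mabi pair is consistent.
# abi_ok is a hypothetical helper; usage: abi_ok MARCH ABI
# Returns 0 if plausible, 1 if inconsistent, 2 if not understood.
abi_ok() {
  case "$1" in
    rv32*) xlen=32 ;;
    rv64*) xlen=64 ;;
    *) return 2 ;;
  esac
  exts=${1#rv??}                                    # single-letter extensions
  case "$exts" in *g*) exts="${exts}imafd" ;; esac  # G = IMAFD
  case "$exts" in *d*) exts="${exts}f" ;; esac      # D requires F

  # ilp32* needs a 32-bit ISA, lp64* needs a 64-bit ISA.
  case "$2" in
    ilp32|ilp32f|ilp32d) [ "$xlen" = 32 ] || return 1 ;;
    lp64|lp64f|lp64d)    [ "$xlen" = 64 ] || return 1 ;;
    *) return 2 ;;
  esac
  # A *d ABI passes doubles in F registers, so it needs D; *f needs F.
  case "$2" in
    *d) case "$exts" in *d*) ;; *) return 1 ;; esac ;;
    *f) case "$exts" in *f*) ;; *) return 1 ;; esac ;;
  esac
  return 0
}

abi_ok rv64ifd lp64d && echo "rv64ifd + lp64d: ok"
abi_ok rv32if ilp32d || echo "rv32if + ilp32d: rejected"
```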

RVV

Clang/LLVM provides RVV (RISC-V Vector) intrinsics so that vector code can be written in C rather than inline assembly.

Although inline assembly is enabled by the clang option -target-feature +experimental-v, using C is shorter, more user-friendly, and easier to remember than inline assembly.

Builtins are C functions and are also user-friendly. RVV code can be written and run as follows:

exlbt/riscv/vadd2.c

// pass: 
/* ~/riscv/riscv_newlib/bin/clang++ vadd2.c -menable-experimental-extensions \
 -march=rv64gcv0p10 -O0 -mllvm --riscv-v-vector-bits-min=256 -v */
// ~/riscv/git/qemu/build/qemu-riscv64 -cpu rv64,v=true a.out
// ref. https://jia.je/software/2022/01/25/rvv-1.0-toolchain/
//      https://pages.dogdog.run/toolchain/riscv_vector_extension.html

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <riscv_vector.h>

#define array_size(a) (sizeof(a) / sizeof((a)[0]))

// Vector-vector add
void vadd(uint32_t *a, const uint32_t *b, const uint32_t *c, size_t n) {
  while (n > 0) {
    size_t vl = vsetvl_e32m8(n);
    vuint32m8_t vb = vle32_v_u32m8(b, vl);
    vuint32m8_t vc = vle32_v_u32m8(c, vl);
// The vadd intrinsic below generates:
//   vsetvli zero, a0, e32, m8, ta, mu
//   vadd.vv v8, v8, v16
    vuint32m8_t va = vadd(vb, vc, vl);
    vse32(a, va, vl);
    a += vl;
    b += vl;
    c += vl;
    n -= vl;
  }
}

// Vector-vector add (inline assembly)
void vadd_asm(uint32_t *a, const uint32_t *b, const uint32_t *c, size_t n) {
  while (n > 0) {
    size_t vl;
    vuint32m8_t va, vb, vc;
    //Fail: __asm__ __volatile__ ( "vsetvli %[vl], %[512], e32, m8" : [vl] "=r"(vl) : [512] "r"(512) );
    vl = vsetvl_e32m8(n);
#if (__clang_major__ > 10)
    __asm__ __volatile__ ( "vle32.v %[vb], (%[b])" : [vb] "=vr"(vb) : [b] "r"(b) : "memory" );
    __asm__ __volatile__ ( "vle32.v %[vc], (%[c])" : [vc] "=vr"(vc) : [c] "r"(c) : "memory" );
    __asm__ __volatile__ ( "vadd.vv %[va], %[vb], %[vc]" : [va] "=vr"(va) : [vb] "vr"(vb), [vc] "vr"(vc) );
    // The store reads va, so va is an input operand, not an output.
    __asm__ __volatile__ ( "vse32.v %[va], (%[a])" : : [va] "vr"(va), [a] "r"(a) : "memory" );
#else
    __asm__ __volatile__ ( "vle32.v %[vb], (%[b])" : [vb] "=v8"(vb) : [b] "r"(b) : "memory" );
    __asm__ __volatile__ ( "vle32.v %[vc], (%[c])" : [vc] "=v8"(vc) : [c] "r"(c) : "memory" );
    __asm__ __volatile__ ( "vadd.vv %[va], %[vb], %[vc]" : [va] "=v8"(va) : [vb] "v8"(vb), [vc] "v8"(vc) );
    // The store reads va, so va is an input operand, not an output.
    __asm__ __volatile__ ( "vse32.v %[va], (%[a])" : : [va] "v8"(va), [a] "r"(a) : "memory" );
#endif

    a += vl;
    b += vl;
    c += vl;
    n -= vl;
  }
}

uint32_t a[4096];
uint8_t m[512];

int main(void) {
  printf("array_size(a):%lu\n", array_size(a));
  // init source
  for (size_t i = 0; i < array_size(a); ++i)
    a[i] = i;

  vadd(a, a, a, array_size(a));

  printf("\na[]: ");
  for (size_t i = 0; i < array_size(a); ++i) {
    if (i < 10)
      printf("%d ", a[i]);
    else if (i == 11)
      printf("...");
    assert(a[i] == i * 2);
  }
  printf("\nThe results of vadd:\tPASS\n");

  vadd_asm(a, a, a, array_size(a));

  printf("\na[]: ");
  for (size_t i = 0; i < array_size(a); ++i) {
    if (i < 10)
      printf("%d ", a[i]);
    else if (i == 11)
      printf("...");
    assert(a[i] == i * 4);
  }
  printf("\nThe results of vadd_asm:\tPASS\n");

  return 0;
}

$ pwd
$HOME/git/lbt/exlbt/riscv
$ $HOME/riscv/riscv_newlib/bin/clang vadd2.c -menable-experimental-extensions \
  -march=rv64gcv0p10 -O0 -mllvm --riscv-v-vector-bits-min=256
$ $HOME/riscv/git/qemu/build/qemu-riscv64 -cpu rv64,v=true a.out
vector version is not specified, use the default value v1.0
array_size(a):4096

a[]: 0 2 4 6 8 10 12 14 16 18 ...
The results of vadd:  PASS

a[]: 0 4 8 12 16 20 24 28 32 36 ...
The results of vadd_asm:      PASS

$ $HOME/riscv/riscv_newlib/bin/clang vadd1.c -menable-experimental-extensions \
  -march=rv64gcv0p10 -O0 -mllvm --riscv-v-vector-bits-min=256 -static
$ $HOME/riscv/git/qemu/build/qemu-riscv64 -cpu rv64,v=true a.out
vector version is not specified, use the default value v1.0
1 11 11 11 11 11 11 11 11 1
$ $HOME/riscv/riscv_newlib/bin/riscv64-unknown-elf-objdump -d a.out|grep vadd.vv
 106fc:       03ae0d57                vadd.vv v26,v26,v28

For -march=rv64imfv0p10zfh0p1:

v0p10: vector version 0.10.
zfh0p1: “Zfh” version 0.1 [4].
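
The "p" in these tokens stands for the decimal point. As a small illustration, the sketch below splits one versioned extension token into its name and dotted version; ext_version is a hypothetical helper and handles a single token, not a whole -march string.

```shell
#!/bin/sh
# Split one versioned extension token from a -march string into its
# name and a dotted version, e.g. "v0p10" becomes "v 0.10".
# ext_version is a hypothetical helper for illustration.
ext_version() {
  name=${1%%[0-9]*}    # letters before the first digit
  ver=${1#"$name"}     # remaining "<major>p<minor>"
  echo "$name ${ver%%p*}.${ver#*p}"
}

ext_version v0p10    # prints "v 0.10"
ext_version zfh0p1   # prints "zfh 0.1"
```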

For -mabi, see the section above.

Clang/LLVM provides builtin and intrinsic functions to implement RVV (RISC-V Vectors).

$HOME/riscv/git/llvm-project/clang/include/clang/Basic/riscv_vector.td

#define vsetvl_e8mf8(avl) __builtin_rvv_vsetvli((size_t)(avl), 0, 5)
...
defm vmadd  : RVVIntTerBuiltinSet;
...
defm vfdiv  : RVVFloatingBinBuiltinSet;

$HOME/riscv/git/llvm-project/llvm/include/llvm/IR/IntrinsicsRISCV.td

def int_riscv_vsetvli   : Intrinsic<...
...
defm vmadd : RISCVTernaryAAXA;
...
defm vfdiv : RISCVBinaryAAX;

Refer to Clang/LLVM test cases in the following folders.

$HOME/riscv/git/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics$ ls
... vfdiv.c ... vfmadd.c

$HOME/riscv/git/llvm-project/llvm/test/CodeGen/RISCV/rvv$ ls
... vfdiv-rv64.ll ... vfmadd-rv64.ll ...

How to run qemu-system-riscv64 as baremetal qemu-riscv64?

Creating crt0.S and riscv64-virt.ld by following https://twilco.github.io/riscv-from-scratch/2019/04/27/riscv-from-scratch-2.html#link-it-up does not work. It also fails when crt0.S is left unchanged.

// terminal A:
riscv % ~/riscv/14.x/riscv_newlib/bin/clang hello.c -menable-experimental-extensions \
-march=rv64gcv1p0 -O0 -mabi=lp64d -T riscv64-virt.ld -Wl,--no-warn-rwx-segment -nostartfiles crt0.S -v
riscv % ~/riscv/git/qemu/build/qemu-system-riscv64 -machine virt -m 128M -cpu \
rv64,v=true -gdb tcp::1234 -kernel a.out

// terminal B:
riscv % ~/riscv/14.x/riscv_newlib/bin/riscv64-unknown-elf-gdb a.out
...
Reading symbols from a.out...
(gdb) target remote :1234
Remote debugging using :1234
0x0000000080000408 in ?? ()
(gdb) c
Continuing.

Running qemu-system-riscv64 with the -nographic option also fails.
