Read the articles...
-
"Zig runs ROM FS Filesystem in the Web Browser (thanks to Apache NuttX RTOS)"
-
"TCC RISC-V Compiler runs in the Web Browser (thanks to Zig Compiler)"
TCC is a simple C Compiler for 64-bit RISC-V... Can we run TCC in a Web Browser?
Let's find out! We'll compile TCC to WebAssembly with Zig Compiler.
(We'll emulate the POSIX File Access with some WebAssembly Mockups)
Why are we doing this?
Today we can run Apache NuttX RTOS in a Web Browser (with WebAssembly + Emscripten + 64-bit RISC-V).
What if we could allow NuttX Apps to be compiled and tested in the Web Browser?
-
We type a C Program into a HTML Textbox...
int main(int argc, char *argv[]) { printf("Hello, World!!\n"); return 0; }
-
Run TCC in the Web Browser to compile the C Program into an ELF Executable (64-bit RISC-V)
-
Copy the ELF Executable to the NuttX Filesystem (via WebAssembly)
-
NuttX runs our ELF Executable inside the Web Browser
Let's verify that TCC will generate 64-bit RISC-V code...
We build TCC to support 64-bit RISC-V Target...
## Build TCC for 64-bit RISC-V Target
git clone https://github.com/lupyuen/tcc-riscv32-wasm
cd tcc-riscv32-wasm
./configure
make help
make --trace cross-riscv64
./riscv64-tcc -v
We compile this C program...
## Simple C Program
int main(int argc, char *argv[]) {
printf("Hello, World!!\n");
return 0;
}
Like this...
## Compile C to 64-bit RISC-V
/workspaces/bookworm/tcc-riscv32/riscv64-tcc \
-c \
/workspaces/bookworm/apps/examples/hello/hello_main.c
## Dump the 64-bit RISC-V Disassembly
riscv-none-elf-objdump \
--syms --source --reloc --demangle --line-numbers --wide \
--debugging \
hello_main.o \
>hello_main.S \
2>&1
The RISC-V Disassembly looks valid, very similar to a NuttX App (so it will probably run on NuttX): hello_main.S
hello_main.o: file format elf64-littleriscv
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 /workspaces/bookworm/apps/examples/hello/hello_m
ain.c
0000000000000000 l O .data.ro 0000000000000010 L.0
0000000000000000 g F .text 0000000000000040 main
0000000000000000 F *UND* 0000000000000000 printf
Disassembly of section .text:
0000000000000000 <main>:
main():
0: fe010113 add sp,sp,-32
4: 00113c23 sd ra,24(sp)
8: 00813823 sd s0,16(sp)
c: 02010413 add s0,sp,32
10: 00000013 nop
14: fea43423 sd a0,-24(s0)
18: feb43023 sd a1,-32(s0)
1c: 00000517 auipc a0,0x0 1c: R_RISCV_PCREL_HI20 L.0
20: 00050513 mv a0,a0 20: R_RISCV_PCREL_LO12_I .text
24: 00000097 auipc ra,0x0 24: R_RISCV_CALL_PLT printf
28: 000080e7 jalr ra # 24 <main+0x24>
2c: 0000051b sext.w a0,zero
30: 01813083 ld ra,24(sp)
34: 01013403 ld s0,16(sp)
38: 02010113 add sp,sp,32
3c: 00008067 ret
See the Object File: hello_main.o
We compile TCC Compiler with the Zig Compiler. First we figure out the GCC Options...
## Show the GCC Options
$ make --trace cross-riscv64
gcc \
-o riscv64-tcc.o \
-c tcc.c \
-DTCC_TARGET_RISCV64 \
-DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
-DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
-DTCC_GITHASH="\"main:b3d10a35\"" \
-Wall \
-O2 \
-Wdeclaration-after-statement \
-fno-strict-aliasing \
-Wno-pointer-sign \
-Wno-sign-compare \
-Wno-unused-result \
-Wno-format-truncation \
-Wno-stringop-truncation \
-I.
gcc \
-o riscv64-tcc riscv64-tcc.o \
-lm \
-lpthread \
-ldl \
-s \
## Probably won't need this for now
../riscv64-tcc -c lib-arm64.c -o riscv64-lib-arm64.o -B.. -I..
../riscv64-tcc -c stdatomic.c -o riscv64-stdatomic.o -B.. -I..
../riscv64-tcc -c atomic.S -o riscv64-atomic.o -B.. -I..
../riscv64-tcc -c dsohandle.c -o riscv64-dsohandle.o -B.. -I..
../riscv64-tcc -ar rcs ../riscv64-libtcc1.a riscv64-lib-arm64.o riscv64-stdatomic.o riscv64-atomic.o riscv64-dsohandle.o
We copy the above GCC Options and we compile TCC with Zig Compiler...
## Compile TCC with Zig Compiler
export PATH=/workspaces/bookworm/zig-linux-x86_64-0.12.0-dev.2341+92211135f:$PATH
./configure
make --trace cross-riscv64
zig cc \
tcc.c \
-DTCC_TARGET_RISCV64 \
-DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
-DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
-DTCC_GITHASH="\"main:b3d10a35\"" \
-Wall \
-O2 \
-Wdeclaration-after-statement \
-fno-strict-aliasing \
-Wno-pointer-sign \
-Wno-sign-compare \
-Wno-unused-result \
-Wno-format-truncation \
-Wno-stringop-truncation \
-I.
## Test our TCC compiled with Zig Compiler
/workspaces/bookworm/tcc-riscv32/a.out -v
/workspaces/bookworm/tcc-riscv32/a.out -c \
/workspaces/bookworm/apps/examples/hello/hello_main.c
riscv-none-elf-objdump \
--syms --source --reloc --demangle --line-numbers --wide \
--debugging \
hello_main.o \
>hello_main.S \
2>&1
Yep it works OK!
Now we compile TCC to WebAssembly.
Zig Compiler doesn't like it, so we Patch the longjmp / setjmp. (We probably won't need it unless TCC hits Compiler Errors)
## Compile TCC from C to WebAssembly
./configure
make --trace cross-riscv64
zig cc \
-c \
-target wasm32-freestanding \
-dynamic \
-rdynamic \
-lc \
tcc.c \
-DTCC_TARGET_RISCV64 \
-DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
-DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
-DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
-DTCC_GITHASH="\"main:b3d10a35\"" \
-Wall \
-O2 \
-Wdeclaration-after-statement \
-fno-strict-aliasing \
-Wno-pointer-sign \
-Wno-sign-compare \
-Wno-unused-result \
-Wno-format-truncation \
-Wno-stringop-truncation \
-I.
## Dump our Compiled WebAssembly
sudo apt install wabt
wasm-objdump -h tcc.o
wasm-objdump -x tcc.o >/tmp/tcc.txt
Yep TCC compiles OK to WebAssembly with Zig Compiler!
We check the Compiled WebAssembly. These POSIX Functions are missing from the Compiled WebAssembly...
$ wasm-objdump -x tcc.o >/tmp/tcc.txt
$ cat /tmp/tcc.txt
Import[75]:
- memory[0] pages: initial=2 <- env.__linear_memory
- global[0] i32 mutable=1 <- env.__stack_pointer
- func[0] sig=1 <env.strcmp> <- env.strcmp
- func[1] sig=12 <env.memset> <- env.memset
- func[2] sig=1 <env.getcwd> <- env.getcwd
- func[3] sig=1 <env.strcpy> <- env.strcpy
- func[4] sig=2 <env.unlink> <- env.unlink
- func[5] sig=0 <env.free> <- env.free
- func[6] sig=6 <env.snprintf> <- env.snprintf
- func[7] sig=2 <env.getenv> <- env.getenv
- func[8] sig=2 <env.strlen> <- env.strlen
- func[9] sig=12 <env.sem_init> <- env.sem_init
- func[10] sig=2 <env.sem_wait> <- env.sem_wait
- func[11] sig=1 <env.realloc> <- env.realloc
- func[12] sig=12 <env.memmove> <- env.memmove
- func[13] sig=2 <env.malloc> <- env.malloc
- func[14] sig=12 <env.fprintf> <- env.fprintf
- func[15] sig=2 <env.puts> <- env.puts
- func[16] sig=0 <env.exit> <- env.exit
- func[17] sig=2 <env.sem_post> <- env.sem_post
- func[18] sig=1 <env.strchr> <- env.strchr
- func[19] sig=1 <env.strrchr> <- env.strrchr
- func[20] sig=6 <env.vsnprintf> <- env.vsnprintf
- func[21] sig=1 <env.printf> <- env.printf
- func[22] sig=2 <env.fflush> <- env.fflush
- func[23] sig=12 <env.memcpy> <- env.memcpy
- func[24] sig=12 <env.memcmp> <- env.memcmp
- func[25] sig=12 <env.sscanf> <- env.sscanf
- func[26] sig=1 <env.fputs> <- env.fputs
- func[27] sig=2 <env.close> <- env.close
- func[28] sig=12 <env.open> <- env.open
- func[29] sig=18 <env.lseek> <- env.lseek
- func[30] sig=12 <env.read> <- env.read
- func[31] sig=12 <env.strtol> <- env.strtol
- func[32] sig=2 <env.atoi> <- env.atoi
- func[33] sig=19 <env.strtoull> <- env.strtoull
- func[34] sig=12 <env.strtoul> <- env.strtoul
- func[35] sig=1 <env.strstr> <- env.strstr
- func[36] sig=1 <env.fopen> <- env.fopen
- func[37] sig=12 <env.sprintf> <- env.sprintf
- func[38] sig=2 <env.fclose> <- env.fclose
- func[39] sig=12 <env.fseek> <- env.fseek
- func[40] sig=2 <env.ftell> <- env.ftell
- func[41] sig=6 <env.fread> <- env.fread
- func[42] sig=6 <env.fwrite> <- env.fwrite
- func[43] sig=2 <env.remove> <- env.remove
- func[44] sig=1 <env.gettimeofday> <- env.gettimeofday
- func[45] sig=1 <env.fdopen> <- env.fdopen
- func[46] sig=12 <env.strncpy> <- env.strncpy
- func[47] sig=24 <env.__extendsftf2> <- env.__extendsftf2
- func[48] sig=25 <env.__extenddftf2> <- env.__extenddftf2
- func[49] sig=9 <env.__floatunditf> <- env.__floatunditf
- func[50] sig=3 <env.__floatunsitf> <- env.__floatunsitf
- func[51] sig=26 <env.__trunctfsf2> <- env.__trunctfsf2
- func[52] sig=27 <env.__trunctfdf2> <- env.__trunctfdf2
- func[53] sig=28 <env.__netf2> <- env.__netf2
- func[54] sig=29 <env.__fixunstfdi> <- env.__fixunstfdi
- func[55] sig=30 <env.__subtf3> <- env.__subtf3
- func[56] sig=30 <env.__multf3> <- env.__multf3
- func[57] sig=28 <env.__eqtf2> <- env.__eqtf2
- func[58] sig=30 <env.__divtf3> <- env.__divtf3
- func[59] sig=30 <env.__addtf3> <- env.__addtf3
- func[60] sig=2 <env.strerror> <- env.strerror
- func[61] sig=1 <env.fputc> <- env.fputc
- func[62] sig=1 <env.strcat> <- env.strcat
- func[63] sig=12 <env.strncmp> <- env.strncmp
- func[64] sig=31 <env.ldexp> <- env.ldexp
- func[65] sig=32 <env.strtof> <- env.strtof
- func[66] sig=8 <env.strtold> <- env.strtold
- func[67] sig=33 <env.strtod> <- env.strtod
- func[68] sig=2 <env.time> <- env.time
- func[69] sig=2 <env.localtime> <- env.localtime
- func[70] sig=13 <env.qsort> <- env.qsort
- func[71] sig=19 <env.strtoll> <- env.strtoll
- table[0] type=funcref initial=4 <- env.__indirect_function_table
TODO: How to fix these missing POSIX Functions for WebAssembly (Web Browser)
TODO: Do we need all of them? Maybe we run in a Web Browser and see what crashes? Similar to this
We link our Compiled WebAssembly tcc.o
with our Zig App: zig/tcc-wasm.zig
## Compile our Zig App `tcc-wasm.zig` for WebAssembly
## and link with TCC compiled for WebAssembly
zig build-exe \
-target wasm32-freestanding \
-rdynamic \
-lc \
-fno-entry \
-freference-trace \
--verbose-cimport \
--export=compile_program \
zig/tcc-wasm.zig \
tcc.o
## Dump our Linked WebAssembly
wasm-objdump -h tcc-wasm.wasm
wasm-objdump -x tcc-wasm.wasm >/tmp/tcc-wasm.txt
## Run our Linked WebAssembly
## Shows: ret=123
node zig/test.js
Yep it runs OK and prints 123
, with our NodeJS Script: zig/test.js
const fs = require('fs');
const source = fs.readFileSync("./tcc-wasm.wasm");
const typedArray = new Uint8Array(source);
WebAssembly.instantiate(typedArray, {
env: {
print: (result) => { console.log(`The result is ${result}`); }
}}).then(result => {
const compile_program = result.instance.exports.compile_program;
const ret = compile_program();
console.log(`ret=${ret}`);
});
We test the TCC WebAssembly in a Web Browser with docs/index.html and docs/tcc.js...
## Start the Web Server
cargo install simple-http-server
simple-http-server ./docs &
## Copy the Linked TCC WebAssembly to the Web Server
cp tcc-wasm.wasm docs/
Browse to...
http://localhost:8000/index.html
Open the JavaScript Console. Yep our TCC WebAssembly runs OK in a Web Browser!
main: start
ret=123
main: end
Also published publicly here (see the JavaScript Console): https://lupyuen.github.io/tcc-riscv32-wasm/
When we call main()
in our Zig App: zig/tcc-wasm.zig
We see many many Undefined Symbols...
+ zig build-exe --verbose-cimport -target wasm32-freestanding -rdynamic -lc -fno-entry --export=compile_program zig/tcc-wasm.zig tcc.o
error: wasm-ld: tcc.o: undefined symbol: realloc
error: wasm-ld: tcc.o: undefined symbol: free
error: wasm-ld: tcc.o: undefined symbol: snprintf
[...many many more...]
So we stubbed them in our Zig App: zig/tcc-wasm.zig
/// Fix the Missing Variables
pub export var errno: c_int = 0;
pub export var stdout: c_int = 1;
pub export var stderr: c_int = 2;
/// Fix the Missing Functions
pub export fn atoi(_: c_int) c_int {
@panic("TODO: atoi");
}
pub export fn close(_: c_int) c_int {
@panic("TODO: close");
}
[...many many more...]
Then we...
-
Borrow from foundation-libc and ziglibc
When we run it in a Web Browser: TCC compiles hello.c
and writes to a.out
yay!
+ node zig/test.js
compile_program
open: path=hello.c, oflag=0, return fd=3
sem_init: sem=tcc-wasm.sem_t@107678, pshared=0, value=1
sem_wait: sem=tcc-wasm.sem_t@107678
TODO: setjmp
TODO: sscanf: str=0.9.27, format=%d.%d.%d
TODO: vsnprintf: size=128, format=#define __TINYC__ %d
TODO: vsnprintf: return str=#define __TINYC__ %d
TODO: vsnprintf: size=107, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=94, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=81, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=196, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=183, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=170, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=157, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=144, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=131, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=118, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=105, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=92, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=335, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=322, format=#define __SIZEOF_POINTER__ %d
TODO: vsnprintf: return str=#define __SIZEOF_POINTER__ %d
TODO: vsnprintf: size=292, format=#define __SIZEOF_LONG__ %d
TODO: vsnprintf: return str=#define __SIZEOF_LONG__ %d
TODO: vsnprintf: size=265, format=#define %s%s
TODO: vsnprintf: return str=#define FIX_vsnprintf
TODO: vsnprintf: size=252, format=#define __STDC_VERSION__ %dL
TODO: vsnprintf: return str=#define __STDC_VERSION__ %dL
TODO: vsnprintf: size=497, format=#define __BASE_FILE__ "%s"
TODO: vsnprintf: return str=#define __BASE_FILE__ "%s"
TODO: vsnprintf: size=128, format=In file included from %s:%d:
TODO: vsnprintf: return str=In file included from %s:%d:
TODO: vsnprintf: size=99, format=%s:%d:
TODO: vsnprintf: return str=%s:%d:
TODO: vsnprintf: size=92, format=warning:
TODO: vsnprintf: return str=warning:
TODO: vsnprintf: size=83, format=%s redefined
TODO: vsnprintf: return str=%s redefined
fprintf: stream=tcc-wasm.FILE@2, format=%s
read: fd=3, nbyte=8192
read: return buf=int main(int argc, char *argv[]) {
printf("Hello, World!!\n");
return 0;
}
TODO: vsnprintf: size=128, format=%s:%d:
TODO: vsnprintf: return str=%s:%d:
TODO: vsnprintf: size=121, format=warning:
TODO: vsnprintf: return str=warning:
TODO: vsnprintf: size=112, format=implicit declaration of function '%s'
TODO: vsnprintf: return str=implicit declaration of function '%s'
fprintf: stream=tcc-wasm.FILE@2, format=%s
TODO: sprintf: format=L.%u
TODO: sprintf: return str=L.0
TODO: snprintf: size=256, format=.rela%s
TODO: snprintf: return str=.rela.text
read: fd=3, nbyte=8192
read: return 0
close: fd=3
sem_post: sem=tcc-wasm.sem_t@107678
TODO: snprintf: size=1024, format=%s
TODO: snprintf: return str=%s
unlink: path=a.out
open: path=a.out, oflag=577, return fd=4
fdopen: fd=4, mode=wb, return FILE=5
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............
0016: 01 00 F3 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0032: 00 00 00 00 00 00 00 00 D0 01 00 00 00 00 00 00 ................
0048: 04 00 00 00 40 00 00 00 00 00 40 00 09 00 08 00 ....@.....@.....
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 13 01 01 FE 23 3C 11 00 23 38 81 00 13 04 01 02 ....#<..#8......
0016: 13 00 00 00 23 34 A4 FE 23 30 B4 FE 17 05 00 00 ....#4..#0......
0032: 13 05 05 00 97 00 00 00 E7 80 00 00 1B 05 00 00 ................
0048: 83 30 81 01 03 34 01 01 13 01 01 02 67 80 00 00 .0...4......g...
fwrite: size=1, nmemb=16, stream=tcc-wasm.FILE@5
0000: 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 21 0A 00 Hello, World!!..
fwrite: size=1, nmemb=144, stream=tcc-wasm.FILE@5
0000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 01 00 00 00 04 00 F1 FF ................
0032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 0E 00 00 00 01 00 03 00 00 00 00 00 00 00 00 00 ................
0064: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ................
0080: 1C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0096: 09 00 00 00 12 00 01 00 00 00 00 00 00 00 00 00 ................
0112: 40 00 00 00 00 00 00 00 12 00 00 00 12 00 00 00 @...............
0128: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=25, stream=tcc-wasm.FILE@5
0000: 00 68 65 6C 6C 6F 2E 63 00 6D 61 69 6E 00 4C 2E .hello.c.main.L.
0016: 30 00 70 72 69 6E 74 66 00 0.printf.
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fwrite: size=1, nmemb=72, stream=tcc-wasm.FILE@5
0000: 1C 00 00 00 00 00 00 00 17 00 00 00 02 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
0032: 18 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 24 00 00 00 00 00 00 00 13 00 00 00 05 00 00 00 $...............
0064: 00 00 00 00 00 00 00 00 ........
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fputc: c=0x00, stream=tcc-wasm.FILE@5
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 00 2E 74 65 78 74 00 2E 64 61 74 61 00 2E 64 61 ..text..data..da
0016: 74 61 2E 72 6F 00 2E 62 73 73 00 2E 73 79 6D 74 ta.ro..bss..symt
0032: 61 62 00 2E 73 74 72 74 61 62 00 2E 72 65 6C 61 ab..strtab..rela
0048: 2E 74 65 78 74 00 2E 73 68 73 74 72 74 61 62 00 .text..shstrtab.
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 01 00 00 00 01 00 00 00 06 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ........@.......
0032: 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @...............
0048: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 07 00 00 00 01 00 00 00 03 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 ................
0032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 0D 00 00 00 01 00 00 00 03 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 ................
0032: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 16 00 00 00 08 00 00 00 03 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 90 00 00 00 00 00 00 00 ................
0032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 1B 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
0016: 00 00 00 00 00 00 00 00 90 00 00 00 00 00 00 00 ................
0032: 90 00 00 00 00 00 00 00 06 00 00 00 04 00 00 00 ................
0048: 08 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 23 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 #...............
0016: 00 00 00 00 00 00 00 00 20 01 00 00 00 00 00 00 ........ .......
0032: 19 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0048: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 2B 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 +...............
0016: 00 00 00 00 00 00 00 00 40 01 00 00 00 00 00 00 ........@.......
0032: 48 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 H...............
0048: 08 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00 ................
fwrite: size=1, nmemb=64, stream=tcc-wasm.FILE@5
0000: 36 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 6...............
0016: 00 00 00 00 00 00 00 00 90 01 00 00 00 00 00 00 ................
0032: 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @...............
0048: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
close: stream=tcc-wasm.FILE@5
a.out: 1040 bytes
0000: 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............
0016: 01 00 F3 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0032: 00 00 00 00 00 00 00 00 D0 01 00 00 00 00 00 00 ................
0048: 04 00 00 00 40 00 00 00 00 00 40 00 09 00 08 00 ....@.....@.....
0064: 13 01 01 FE 23 3C 11 00 23 38 81 00 13 04 01 02 ....#<..#8......
0080: 13 00 00 00 23 34 A4 FE 23 30 B4 FE 17 05 00 00 ....#4..#0......
0096: 13 05 05 00 97 00 00 00 E7 80 00 00 1B 05 00 00 ................
0112: 83 30 81 01 03 34 01 01 13 01 01 02 67 80 00 00 .0...4......g...
0128: 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 21 0A 00 Hello, World!!..
0144: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0160: 00 00 00 00 00 00 00 00 01 00 00 00 04 00 F1 FF ................
0176: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0192: 0E 00 00 00 01 00 03 00 00 00 00 00 00 00 00 00 ................
0208: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ................
0224: 1C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0240: 09 00 00 00 12 00 01 00 00 00 00 00 00 00 00 00 ................
0256: 40 00 00 00 00 00 00 00 12 00 00 00 12 00 00 00 @...............
0272: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0288: 00 68 65 6C 6C 6F 2E 63 00 6D 61 69 6E 00 4C 2E .hello.c.main.L.
0304: 30 00 70 72 69 6E 74 66 00 00 00 00 00 00 00 00 0.printf........
0320: 1C 00 00 00 00 00 00 00 17 00 00 00 02 00 00 00 ................
0336: 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
0352: 18 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 ................
0368: 24 00 00 00 00 00 00 00 13 00 00 00 05 00 00 00 $...............
0384: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0400: 00 2E 74 65 78 74 00 2E 64 61 74 61 00 2E 64 61 ..text..data..da
0416: 74 61 2E 72 6F 00 2E 62 73 73 00 2E 73 79 6D 74 ta.ro..bss..symt
0432: 61 62 00 2E 73 74 72 74 61 62 00 2E 72 65 6C 61 ab..strtab..rela
0448: 2E 74 65 78 74 00 2E 73 68 73 74 72 74 61 62 00 .text..shstrtab.
0464: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0496: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0512: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0528: 01 00 00 00 01 00 00 00 06 00 00 00 00 00 00 00 ................
0544: 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ........@.......
0560: 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @...............
0576: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0592: 07 00 00 00 01 00 00 00 03 00 00 00 00 00 00 00 ................
0608: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 ................
0624: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0640: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0656: 0D 00 00 00 01 00 00 00 03 00 00 00 00 00 00 00 ................
0672: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 ................
0688: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0704: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0720: 16 00 00 00 08 00 00 00 03 00 00 00 00 00 00 00 ................
0736: 00 00 00 00 00 00 00 00 90 00 00 00 00 00 00 00 ................
0752: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0768: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0784: 1B 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
0800: 00 00 00 00 00 00 00 00 90 00 00 00 00 00 00 00 ................
0816: 90 00 00 00 00 00 00 00 06 00 00 00 04 00 00 00 ................
0832: 08 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00 ................
0848: 23 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 #...............
0864: 00 00 00 00 00 00 00 00 20 01 00 00 00 00 00 00 ........ .......
0880: 19 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0896: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0912: 2B 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 +...............
0928: 00 00 00 00 00 00 00 00 40 01 00 00 00 00 00 00 ........@.......
0944: 48 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 H...............
0960: 08 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00 ................
0976: 36 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 6...............
0992: 00 00 00 00 00 00 00 00 90 01 00 00 00 00 00 00 ................
1008: 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @...............
1024: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
ret=1040
Also published publicly here (see the JavaScript Console): https://lupyuen.github.io/tcc-riscv32-wasm/
TODO: Check our WebAssembly with Modsurfer
TODO: Need to implement vsnprintf() in C? Or we hardcode the patterns?
TODO: Didn't we pass the TCC Option -c
to generate as Object File? Why is the output a.out
?
Note: invalid macro name
is caused by #define %s%s
. We should mock up a valid name for %s%s
Let's verify the generated a.out
.
We copy the above a.out
Hex Dump into a Text File: a.txt
Then we decompile it...
## Convert a.txt to a.out
cat a.txt \
| cut --bytes=10-58 \
| xxd -revert -plain \
>a.out
## Decompile the a.out to RISC-V Disassembly a.S
riscv-none-elf-objdump \
--syms --source --reloc --demangle --line-numbers --wide \
--debugging \
a.out \
>a.S \
2>&1
And the Decompiled RISC-V Disassembly looks correct! a.S
main():
0: fe010113 add sp,sp,-32
4: 00113c23 sd ra,24(sp)
8: 00813823 sd s0,16(sp)
c: 02010413 add s0,sp,32
10: 00000013 nop
14: fea43423 sd a0,-24(s0)
18: feb43023 sd a1,-32(s0)
1c: 00000517 auipc a0,0x0 1c: R_RISCV_PCREL_HI20 L.0
20: 00050513 mv a0,a0 20: R_RISCV_PCREL_LO12_I .text
24: 00000097 auipc ra,0x0 24: R_RISCV_CALL_PLT printf
28: 000080e7 jalr ra # 24 <main+0x24>
2c: 0000051b sext.w a0,zero
30: 01813083 ld ra,24(sp)
34: 01013403 ld s0,16(sp)
38: 02010113 add sp,sp,32
3c: 00008067 ret
Very similar to hello_main.S
So yes TCC runs correctly in a Web Browser. With some limitations and lots of hacking! Yay!
Why is fprintf particularly problematic?
Here's the fearsome thing about fprintf and friends: sprintf, snprintf, vsnprintf...
-
C Format Strings are difficult to parse
-
Variable Number of Untyped Arguments might create Bad Pointers
Hence we hacked up an implementation of String Formatting that's safer, simpler and so-barebones-you-can-make-soup-tulang.
Soup tulang? Tell me more...
Our Zig Wrapper uses Pattern Matching to match the C Formats and substitute the Zig Equivalent: tcc-wasm.zig
/// Pattern Matching for String Formatting: We will match these patterns when formatting strings
const format_patterns = [_]FormatPattern{
// Format a Single `%d`, like `#define __TINYC__ %d`
FormatPattern{ .c_spec = "%d", .zig_spec = "{}", .type0 = c_int, .type1 = null },
// Format a Single `%u`, like `L.%u`
FormatPattern{ .c_spec = "%u", .zig_spec = "{}", .type0 = c_int, .type1 = null },
// Format a Single `%s`, like `#define __BASE_FILE__ "%s"` or `.rela%s`
FormatPattern{ .c_spec = "%s", .zig_spec = "{s}", .type0 = [*:0]const u8, .type1 = null },
// Format Two `%s`, like `#define %s%s\n`
FormatPattern{ .c_spec = "%s%s", .zig_spec = "{s}{s}", .type0 = [*:0]const u8, .type1 = [*:0]const u8 },
// Format `%s:%d`, like `%s:%d: `
FormatPattern{ .c_spec = "%s:%d", .zig_spec = "{s}:{}", .type0 = [*:0]const u8, .type1 = c_int },
};
We use comptime
functions in Zig to implement the C String Formatting: tcc-wasm.zig
/// CompTime Function to format a string by Pattern Matching.
/// Format a Single Specifier, like `#define __BASE_FILE__ "%s"`
/// If the Spec matches the Format: Return the number of bytes written to `str`, excluding terminating null.
/// Else return 0.
fn format_string1(
ap: *std.builtin.VaList,
str: [*]u8,
size: size_t,
format: []const u8, // Like `#define %s%s\n`
comptime c_spec: []const u8, // Like `%s%s`
comptime zig_spec: []const u8, // Like `{s}{s}`
comptime T0: type, // Like `[*:0]const u8`
) usize {
// Count the Format Specifiers: `%`
const spec_cnt = std.mem.count(u8, c_spec, "%");
const format_cnt = std.mem.count(u8, format, "%");
// Check the Format Specifiers: `%`
if (format_cnt != spec_cnt or // Quit if the number of specifiers are different
!std.mem.containsAtLeast(u8, format, 1, c_spec)) // Or if the specifiers are not found
{
return 0;
}
// Fetch the args
const a = @cVaArg(ap, T0);
if (T0 == c_int) {
debug("format_string1: size={}, format={s}, a={}", .{ size, format, a });
} else {
debug("format_string1: size={}, format={s}, a={s}", .{ size, format, a });
}
// Format the string. TODO: Check for overflow
var buf: [100]u8 = undefined; // Limit to 100 chars
const buf_slice = std.fmt.bufPrint(&buf, zig_spec, .{a}) catch {
wasmlog.Console.log("*** format_string1 error: buf too small", .{});
@panic("*** format_string1 error: buf too small");
};
// Replace the Format Specifier
var buf2 = std.mem.zeroes([100]u8); // Limit to 100 chars
_ = std.mem.replace(u8, format, c_spec, buf_slice, &buf2);
// Return the string
const len = std.mem.indexOfScalar(u8, &buf2, 0).?;
@memcpy(str[0..len], buf2[0..len]);
str[len] = 0;
return len;
}
Previously without comptime
, the implementation of C String Formatting gets very lengthy: tcc-wasm.zig
export fn vsnprintf(str: [*:0]u8, size: size_t, format: [*:0]const u8, ...) c_int {
// Count the Format Specifiers: `%`
const format_slice = std.mem.span(format);
const format_cnt = std.mem.count(u8, format_slice, "%");
// TODO: Catch overflow
if (format_cnt == 0) {
// If no Format Specifiers: Return the Format, like `warning: `
debug("vsnprintf: size={}, format={s}, format_cnt={}", .{ size, format, format_cnt });
_ = memcpy(str, format, strlen(format));
str[strlen(format)] = 0;
} else if (format_cnt == 2 and std.mem.containsAtLeast(u8, format_slice, 1, "%s%s")) {
// Format Two `%s`, like `#define %s%s\n`
var ap = @cVaStart();
defer @cVaEnd(&ap);
const s0 = @cVaArg(&ap, [*:0]const u8);
const s1 = @cVaArg(&ap, [*:0]const u8);
debug("vsnprintf: size={}, format={s}, s0={s}, s1={s}", .{ size, format, s0, s1 });
// Format the string
const format2 = "{s}{s}"; // Equivalent to C: `%s%s`
var buf: [100]u8 = undefined; // Limit to 100 chars
const buf_slice = std.fmt.bufPrint(&buf, format2, .{ s0, s1 }) catch {
wasmlog.Console.log("*** vsnprintf error: buf too small", .{});
@panic("*** vsnprintf error: buf too small");
};
// Replace the Format Specifier
var buf2 = std.mem.zeroes([100]u8); // Limit to 100 chars
_ = std.mem.replace(u8, format_slice, "%s%s", buf_slice, &buf2);
// Return the string
const len = std.mem.indexOfScalar(u8, &buf2, 0).?;
_ = memcpy(str, &buf2, @intCast(len));
str[len] = 0;
} else if (format_cnt == 2 and std.mem.containsAtLeast(u8, format_slice, 1, "%s:%d")) {
// ...
Plus lots lots more of tedious coding! It's a lot simpler now with comptime
Format Patterns.
Now we can handle all String Formatting correctly in TCC...
+ node zig/test.js
compile_program: start
compile_program: options=["-c","hello.c"]
compile_program: code=
int main(int argc, char *argv[]) {
printf("Hello, World!!\n");
return 0;
}
compile_program: options[0]=-c
compile_program: options[1]=hello.c
open: path=hello.c, oflag=0, return fd=3
sem_init: sem=tcc-wasm.sem_t@10cfe8, pshared=0, value=1
sem_wait: sem=tcc-wasm.sem_t@10cfe8
TODO: setjmp
TODO: sscanf: str=0.9.27, format=%d.%d.%d
format_string1: size=128, format=#define __TINYC__ %d, a=1991381505
vsnprintf: return str=#define __TINYC__ 1991381505
format_string2: size=99, format=#define %s%s, a0=__riscv, a1= 1
vsnprintf: return str=#define __riscv 1
format_string2: size=81, format=#define %s%s, a0=__riscv_xlen 64, a1=
vsnprintf: return str=#define __riscv_xlen 64
format_string2: size=185, format=#define %s%s, a0=__riscv_flen 64, a1=
vsnprintf: return str=#define __riscv_flen 64
format_string2: size=161, format=#define %s%s, a0=__riscv_div, a1= 1
vsnprintf: return str=#define __riscv_div 1
format_string2: size=139, format=#define %s%s, a0=__riscv_mul, a1= 1
vsnprintf: return str=#define __riscv_mul 1
format_string2: size=117, format=#define %s%s, a0=__riscv_fdiv, a1= 1
vsnprintf: return str=#define __riscv_fdiv 1
format_string2: size=94, format=#define %s%s, a0=__riscv_fsqrt, a1= 1
vsnprintf: return str=#define __riscv_fsqrt 1
format_string2: size=326, format=#define %s%s, a0=__riscv_float_abi_double, a1= 1
vsnprintf: return str=#define __riscv_float_abi_double 1
format_string2: size=291, format=#define %s%s, a0=__linux__, a1= 1
vsnprintf: return str=#define __linux__ 1
format_string2: size=271, format=#define %s%s, a0=__linux, a1= 1
vsnprintf: return str=#define __linux 1
format_string2: size=253, format=#define %s%s, a0=__unix__, a1= 1
vsnprintf: return str=#define __unix__ 1
format_string2: size=234, format=#define %s%s, a0=__unix, a1= 1
vsnprintf: return str=#define __unix 1
format_string2: size=217, format=#define %s%s, a0=__CHAR_UNSIGNED__, a1= 1
vsnprintf: return str=#define __CHAR_UNSIGNED__ 1
format_string1: size=189, format=#define __SIZEOF_POINTER__ %d, a=8
vsnprintf: return str=#define __SIZEOF_POINTER__ 8
format_string1: size=160, format=#define __SIZEOF_LONG__ %d, a=8
vsnprintf: return str=#define __SIZEOF_LONG__ 8
format_string2: size=134, format=#define %s%s, a0=__STDC__, a1= 1
vsnprintf: return str=#define __STDC__ 1
format_string1: size=115, format=#define __STDC_VERSION__ %dL, a=199901
vsnprintf: return str=#define __STDC_VERSION__ 199901L
format_string1: size=356, format=#define __BASE_FILE__ "%s", a=hello.c
vsnprintf: return str=#define __BASE_FILE__ "hello.c"
read: fd=3, nbyte=8192
read: return buf=
int main(int argc, char *argv[]) {
printf("Hello, World!!\n");
return 0;
}
format_string2: size=128, format=%s:%d: , a0=hello.c, a1=3
vsnprintf: return str=hello.c:3:
format_string0: size=117, format=warning:
vsnprintf: return str=warning:
format_string1: size=108, format=implicit declaration of function '%s', a=printf
vsnprintf: return str=implicit declaration of function 'printf'
format_string1: size=0, format=%s, a=hello.c:3: warning: implicit declaration of function 'printf'
fprintf: stream=tcc-wasm.FILE@2
hello.c:3: warning: implicit declaration of function 'printf'
format_string1: size=0, format=L.%u, a=0
sprintf: return str=L.0
format_string1: size=256, format=.rela%s, a=.text
snprintf: return str=.rela.text
read: fd=3, nbyte=8192
read: return buf=
close: fd=3
sem_post: sem=tcc-wasm.sem_t@10cfe8
format_string1: size=1024, format=%s, a=hello.c
snprintf: return str=hello.c
unlink: path=hello.o
open: path=hello.o, oflag=577, return fd=4
fdopen: fd=4, mode=wb, return FILE=5
...
close: stream=tcc-wasm.FILE@5
a.out: 1040 bytes
TODO: Implement sscanf: str=0.9.27, format=%d.%d.%d
TCC in WebAssembly has compiled our C Program into the ELF Binary a.out
. What happens when we run it on NuttX?
Let's run the TCC Output a.out
on NuttX! We copy a.out
to NuttX Apps Filesystem...
mv ~/Downloads/a.out ~/riscv/apps/bin/
chmod +x ~/riscv/apps/bin/*
file ~/riscv/apps/bin/a.out
ls -l ~/riscv/apps/bin
Which shows...
$ file ~/riscv/apps/bin/a.out
~/riscv/apps/bin/a.out: ELF 64-bit LSB relocatable, UCB RISC-V, version 1 (SYSV), not stripped
$ ls -l ~/riscv/apps/bin
total 4744
-rwxr-xr-x@ 1 1040 Jan 29 09:24 a.out
-rwxr-xr-x 1 200176 Jan 29 09:05 getprime
-rwxr-xr-x 1 119560 Jan 29 09:05 hello
-rwxr-xr-x 1 697368 Jan 29 09:05 init
-rwxr-xr-x 1 703840 Jan 29 09:05 ostest
-rwxr-xr-x 1 694648 Jan 29 09:05 sh
In NuttX we enable Binary Loader Logging: make menuconfig
then select...
- Build Setup > Debug Options > Binary Loader Debug Features > Enable "Binary Loader Error, Warnings and Info"
Then we boot NuttX on QEMU (64-bit RISC-V) and run a.out
on NuttX...
nsh> a.out
load_absmodule: Loading /system/bin/a.out
elf_loadbinary: Loading file: /system/bin/a.out
elf_init: filename: /system/bin/a.out loadinfo: 0x8020afa8
elf_read: Read 64 bytes from offset 0
elf_dumploadinfo: LOAD_INFO:
elf_dumploadinfo: textalloc: 00000000
elf_dumploadinfo: dataalloc: 00000000
elf_dumploadinfo: textsize: 0
elf_dumploadinfo: datasize: 0
elf_dumploadinfo: textalign: 0
elf_dumploadinfo: dataalign: 0
elf_dumploadinfo: filelen: 1040
elf_dumploadinfo: symtabidx: 0
elf_dumploadinfo: strtabidx: 0
elf_dumploadinfo: ELF Header:
elf_dumploadinfo: e_ident: 7f 45 4c 46
elf_dumploadinfo: e_type: 0001
elf_dumploadinfo: e_machine: 00f3
elf_dumploadinfo: e_version: 00000001
elf_dumploadinfo: e_entry: 00000000
elf_dumploadinfo: e_phoff: 0
elf_dumploadinfo: e_shoff: 464
elf_dumploadinfo: e_flags: 00000004
elf_dumploadinfo: e_ehsize: 64
elf_dumploadinfo: e_phentsize: 0
elf_dumploadinfo: e_phnum: 0
elf_dumploadinfo: e_shentsize: 64
elf_dumploadinfo: e_shnum: 9
elf_dumploadinfo: e_shstrndx: 8
elf_load: loadinfo: 0x8020afa8
elf_loadphdrs: No programs(?)
elf_read: Read 576 bytes from offset 464
elf_loadfile: Loaded sections:
elf_read: Read 64 bytes from offset 64
elf_loadfile: 1. 00000000->c0000000
elf_read: Read 0 bytes from offset 128
elf_loadfile: 2. 00000000->c0101000
elf_read: Read 16 bytes from offset 128
elf_loadfile: 3. 00000000->c0101000
elf_loadfile: 4. 00000000->c0101010
elf_dumploadinfo: LOAD_INFO:
elf_dumploadinfo: textalloc: c0000000
elf_dumploadinfo: dataalloc: c0101000
elf_dumploadinfo: textsize: 64
elf_dumploadinfo: datasize: 16
elf_dumploadinfo: textalign: 8
elf_dumploadinfo: dataalign: 8
elf_dumploadinfo: filelen: 1040
elf_dumploadinfo: symtabidx: 0
elf_dumploadinfo: strtabidx: 0
elf_dumploadinfo: ELF Header:
elf_dumploadinfo: e_ident: 7f 45 4c 46
elf_dumploadinfo: e_type: 0001
elf_dumploadinfo: e_machine: 00f3
elf_dumploadinfo: e_version: 00000001
elf_dumploadinfo: e_entry: 00000000
elf_dumploadinfo: e_phoff: 0
elf_dumploadinfo: e_shoff: 464
elf_dumploadinfo: e_flags: 00000004
elf_dumploadinfo: e_ehsize: 64
elf_dumploadinfo: e_phentsize: 0
elf_dumploadinfo: e_phnum: 0
elf_dumploadinfo: e_shentsize: 64
elf_dumploadinfo: e_shnum: 9
elf_dumploadinfo: e_shstrndx: 8
elf_dumploadinfo: Sections 0:
elf_dumploadinfo: sh_name: 00000000
elf_dumploadinfo: sh_type: 00000000
elf_dumploadinfo: sh_flags: 00000000
elf_dumploadinfo: sh_addr: 00000000
elf_dumploadinfo: sh_offset: 0
elf_dumploadinfo: sh_size: 0
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 0
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 1:
elf_dumploadinfo: sh_name: 00000001
elf_dumploadinfo: sh_type: 00000001
elf_dumploadinfo: sh_flags: 00000006
elf_dumploadinfo: sh_addr: c0000000
elf_dumploadinfo: sh_offset: 64
elf_dumploadinfo: sh_size: 64
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 2:
elf_dumploadinfo: sh_name: 00000007
elf_dumploadinfo: sh_type: 00000001
elf_dumploadinfo: sh_flags: 00000003
elf_dumploadinfo: sh_addr: c0101000
elf_dumploadinfo: sh_offset: 128
elf_dumploadinfo: sh_size: 0
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 3:
elf_dumploadinfo: sh_name: 0000000d
elf_dumploadinfo: sh_type: 00000001
elf_dumploadinfo: sh_flags: 00000003
elf_dumploadinfo: sh_addr: c0101000
elf_dumploadinfo: sh_offset: 128
elf_dumploadinfo: sh_size: 16
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 4:
elf_dumploadinfo: sh_name: 00000016
elf_dumploadinfo: sh_type: 00000008
elf_dumploadinfo: sh_flags: 00000003
elf_dumploadinfo: sh_addr: c0101010
elf_dumploadinfo: sh_offset: 144
elf_dumploadinfo: sh_size: 0
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 5:
elf_dumploadinfo: sh_name: 0000001b
elf_dumploadinfo: sh_type: 00000002
elf_dumploadinfo: sh_flags: 00000000
elf_dumploadinfo: sh_addr: 00000000
elf_dumploadinfo: sh_offset: 144
elf_dumploadinfo: sh_size: 144
elf_dumploadinfo: sh_link: 6
elf_dumploadinfo: sh_info: 4
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 24
elf_dumploadinfo: Sections 6:
elf_dumploadinfo: sh_name: 00000023
elf_dumploadinfo: sh_type: 00000003
elf_dumploadinfo: sh_flags: 00000000
elf_dumploadinfo: sh_addr: 00000000
elf_dumploadinfo: sh_offset: 288
elf_dumploadinfo: sh_size: 25
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 1
elf_dumploadinfo: sh_entsize: 0
elf_dumploadinfo: Sections 7:
elf_dumploadinfo: sh_name: 0000002b
elf_dumploadinfo: sh_type: 00000004
elf_dumploadinfo: sh_flags: 00000000
elf_dumploadinfo: sh_addr: 00000000
elf_dumploadinfo: sh_offset: 320
elf_dumploadinfo: sh_size: 72
elf_dumploadinfo: sh_link: 5
elf_dumploadinfo: sh_info: 1
elf_dumploadinfo: sh_addralign: 8
elf_dumploadinfo: sh_entsize: 24
elf_dumploadinfo: Sections 8:
elf_dumploadinfo: sh_name: 00000036
elf_dumploadinfo: sh_type: 00000003
elf_dumploadinfo: sh_flags: 00000000
elf_dumploadinfo: sh_addr: 00000000
elf_dumploadinfo: sh_offset: 400
elf_dumploadinfo: sh_size: 64
elf_dumploadinfo: sh_link: 0
elf_dumploadinfo: sh_info: 0
elf_dumploadinfo: sh_addralign: 1
elf_dumploadinfo: sh_entsize: 0
elf_read: Read 72 bytes from offset 320
elf_read: Read 24 bytes from offset 192
elf_symvalue: Other: 00000000+c0101000=c0101000
up_relocateadd: PCREL_HI20 at c000001c [00000517] to sym=0x80209030 st_value=c0101000
_calc_imm: offset=1052644: hi=257 lo=-28
elf_read: Read 24 bytes from offset 216
elf_symvalue: Other: 0000001c+c0000000=c000001c
up_relocateadd: PCREL_LO12_I at c0000020 [00050513] to sym=0x80209070 st_value=c000001c
_calc_imm: offset=1052644: hi=257 lo=-28
elf_read: Read 24 bytes from offset 264
elf_read: Read 32 bytes from offset 306
elf_symvalue: SHN_UNDEF: Exported symbol "printf" not found
elf_relocateadd: Section 7 reloc 2: Failed to get value of symbol[5]: -2
elf_loadbinary: Failed to bind symbols program binary: -2
exec_internal: ERROR: Failed to load program 'a.out': -2
nsh: a.out: command not found
nsh>
It says printf
is missing. Let's fix it...
For Reference: Here's the log for an ELF that loads properly on NuttX: NuttX ELF Loader Log
printf
is missing from our TCC Output a.out
. How does NuttX Build link a NuttX App?
Let's find out...
cd apps
rm bin/hello
make --trace import
We see the Linker Command that produces the hello
app...
riscv-none-elf-ld \
--oformat elf64-littleriscv \
-e _start \
-Bstatic \
-Tapps/import/scripts/gnu-elf.ld \
-Lapps/import/libs \
-L "xpack-riscv-none-elf-gcc-13.2.0-2/bin/../lib/gcc/riscv-none-elf/13.2.0/rv64imafdc_zicsr/lp64d" apps/import/startup/crt0.o hello_main.c.workspaces.bookworm.apps.examples.hello.o \
--start-group \
-lmm \
-lc \
-lproxies \
-lgcc apps/libapps.a xpack-riscv-none-elf-gcc-13.2.0-2/bin/../lib/gcc/riscv-none-elf/13.2.0/rv64imafdc_zicsr/lp64d/libgcc.a \
--end-group \
-o apps/bin/hello
This says that NuttX Build links NuttX Apps with these libraries...
-
crt0.o
: Start Code_start
-
-lmm
: Mmmmm? -
-lc
: C Library -
-lproxies
: NuttX Proxy Functions for NuttX System Calls -
-lgcc libgcc.a
: GCC Library
Which are located at apps/import/libs
...
$ ls -l apps/import/libs
total 18776
-rwxr-xr-x 1 3132730 Jan 29 02:12 libapps.a
-rw-r--r-- 1 1064 Jan 29 01:18 libarch.a
-rw-r--r-- 1 8946828 Jan 29 01:18 libc.a
-rw-r--r-- 1 1462710 Sep 24 08:10 libgcc.a
-rw-r--r-- 1 1276866 Jan 29 01:18 libm.a
-rw-r--r-- 1 1304366 Jan 29 01:18 libmm.a
-rw-r--r-- 1 3086312 Jan 29 01:18 libproxies.a
Let's run TCC to link a.out
with the above libraries...
We run TCC to link a.out
with the above libraries...
tcc-riscv32-wasm/riscv64-tcc \
-nostdlib \
apps/import/startup/crt0.o \
apps/bin/a.out \
apps/import/libs/libmm.a \
apps/import/libs/libc.a \
apps/import/libs/libproxies.a \
apps/import/libs/libgcc.a
It says...
tcc: error: Unknown relocation type for got: 60
When we remove libproxies.a
, we don't see the Unknown Relocation Type...
$ tcc-riscv32-wasm/riscv64-tcc \
-nostdlib \
apps/import/startup/crt0.o \
apps/bin/a.out \
apps/import/libs/libmm.a \
apps/import/libs/libc.a \
apps/import/libs/libgcc.a
tcc: error: undefined symbol '_exit'
tcc: error: undefined symbol '_assert'
tcc: error: undefined symbol 'nxsem_destroy'
tcc: error: undefined symbol 'gettid'
tcc: error: undefined symbol 'nxsem_wait'
tcc: error: undefined symbol 'nxsem_trywait'
tcc: error: undefined symbol 'clock_gettime'
tcc: error: undefined symbol 'nxsem_clockwait'
tcc: error: undefined symbol 'nxsem_post'
tcc: error: undefined symbol 'write'
tcc: error: undefined symbol 'lseek'
tcc: error: undefined symbol 'nx_pthread_exit'
Why is libproxies.a
using Relocation Type 60? We dump the Proxy Object File...
riscv-none-elf-readelf --wide -all nuttx/syscall/PROXY_write.o
TODO: Check the ELF Dump
Let's link the Proxy Functions ourselves...
tcc-riscv32-wasm/riscv64-tcc \
-nostdlib \
apps/bin/a.out \
nuttx/syscall/PROXY__exit.o \
nuttx/syscall/PROXY__assert.o \
nuttx/syscall/PROXY_nxsem_destroy.o \
nuttx/syscall/PROXY_gettid.o \
nuttx/syscall/PROXY_nxsem_wait.o \
nuttx/syscall/PROXY_nxsem_trywait.o \
nuttx/syscall/PROXY_clock_gettime.o \
nuttx/syscall/PROXY_nxsem_clockwait.o \
nuttx/syscall/PROXY_nxsem_post.o \
nuttx/syscall/PROXY_write.o \
nuttx/syscall/PROXY_lseek.o \
nuttx/syscall/PROXY_nx_pthread_exit.o \
apps/import/startup/crt0.o \
apps/import/libs/libmm.a \
apps/import/libs/libc.a \
apps/import/libs/libgcc.a
Does it work?
Arg nope...
tcc: error: Unknown relocation type for got: 60
Now Unknown Relocation Type is coming from libc.a
. If we remove libc.a
...
$ tcc-riscv32-wasm/riscv64-tcc \
-nostdlib \
apps/import/startup/crt0.o \
apps/bin/a.out \
nuttx/syscall/PROXY__exit.o \
nuttx/syscall/PROXY__assert.o \
nuttx/syscall/PROXY_nxsem_destroy.o \
nuttx/syscall/PROXY_gettid.o \
nuttx/syscall/PROXY_nxsem_wait.o \
nuttx/syscall/PROXY_nxsem_trywait.o \
nuttx/syscall/PROXY_clock_gettime.o \
nuttx/syscall/PROXY_nxsem_clockwait.o \
nuttx/syscall/PROXY_nxsem_post.o \
nuttx/syscall/PROXY_write.o \
nuttx/syscall/PROXY_lseek.o \
nuttx/syscall/PROXY_nx_pthread_exit.o \
apps/import/libs/libmm.a
tcc: error: undefined symbol 'printf'
tcc: error: undefined symbol 'exit'
What if we call write
directly? (Since write
is a Proxy to NuttX System Call) And stub out exit
?
int write(int fildes, const void *buf, int nbyte);
int main(int argc, char *argv[]) {
const char msg[] = "Hello, World!!\\n";
write(1, msg, sizeof(msg));
return 0;
}
void exit(int status) {
const char msg[] = "TODO: exit\\n";
write(1, msg, sizeof(msg));
}
Still the same sigh...
tcc: error: Unknown relocation type for got: 60
Let's skip everything, we link only a.out
. Since the other modules are causing the Unknown Relocation Type.
We discovered that a.out
must be Relocatable Code, otherwise it crashes in NuttX. So we add -r
to TCC Compiler Options in our modified test.js...
// Allocate a String for passing the Compiler Options to Zig
const options = ["-c", "-r", "hello.c"];
const options_ptr = allocateString(JSON.stringify(options));
// Allocate a String for passing Program Code to Zig
const code_ptr = allocateString(`
int main(int argc, char *argv[]) {
return 0;
}
`);
// Call TCC to compile a program
const ptr = wasm.instance.exports
.compile_program(options_ptr, code_ptr);
console.log(`ptr=${ptr}`);
And we run a.out
on NuttX. Now we get an Instruction Page Fault...
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
binfmt_copyargv: args=2 argsize=23
exec_module: Initialize the user heap (heapsize=528384)
riscv_exception: EXCEPTION: Instruction page fault. MCAUSE: 000000000000000c, EPC: 000000008000ad8a, MTVAL: 000000008000ad8a
riscv_exception: PANIC!!! Exception = 000000000000000c
_assert: Current Version: NuttX 12.4.0 f8b0b06b978 Jan 29 2024 01:16:20 risc-v
_assert: Assertion failed panic: at file: common/riscv_exception.c:85 task: /system/bin/init process: /system/bin/init 0xc000001a
up_dump_register: EPC: 000000008000ad8a
Where is the Exception Program Counter 0x8000ad8a?
0x8000ad8a is actually in NuttX Kernel...
up_task_start():
nuttx/arch/risc-v/src/common/riscv_task_start.c:65
void up_task_start(main_t taskentry, int argc, char *argv[]) {
8000ad7a: 1141 add sp,sp,-16
8000ad7c: 86b2 mv a3,a2
nuttx/arch/risc-v/src/common/riscv_task_start.c:68
/* Let sys_call3() do all of the work */
sys_call3(SYS_task_start, (uintptr_t)taskentry, (uintptr_t)argc,
8000ad7e: 862e mv a2,a1
8000ad80: 85aa mv a1,a0
8000ad82: 4511 li a0,4
nuttx/arch/risc-v/src/common/riscv_task_start.c:65
8000ad84: e406 sd ra,8(sp)
nuttx/arch/risc-v/src/common/riscv_task_start.c:68
sys_call3(SYS_task_start, (uintptr_t)taskentry, (uintptr_t)argc,
8000ad86: 875f50ef jal 800005fa <sys_call0>
nuttx/arch/risc-v/src/common/riscv_task_start.c:71
(uintptr_t)argv);
PANIC();
8000ad8a: 0000e617 auipc a2,0xe
Maybe NuttX Kernel crashed because our NuttX App terminated without calling exit()
?
We're guessing: NuttX Apps should NOT simply ret
to the caller. They should call the NuttX System Call __exit
to terminate peacefully.
But is our NuttX App actually started?
Let's tweak our code to loop forever, see whether our app actually gets started...
// Allocate a String for passing Program Code to Zig
const code_ptr = allocateString(`
int main(int argc, char *argv[]) {
for (;;) {}
return 0;
}
`);
// Call TCC to compile a program
const ptr = wasm.instance.exports
.compile_program(options_ptr, code_ptr);
console.log(`ptr=${ptr}`);
Yep NuttX hangs when starting our app! Which means our TCC Compiled App is actually started by NuttX yay!
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
load_absmodule: Successfully loaded module /system/bin/a.out
binfmt_dumpmodule: Module:
binfmt_dumpmodule: entrypt: 0xc0000000
binfmt_dumpmodule: mapped: 0 size=0
binfmt_dumpmodule: alloc: 0 0 0
binfmt_dumpmodule: addrenv: 0x80209b80
binfmt_dumpmodule: stacksize: 2048
binfmt_dumpmodule: unload: 0
exec_module: Executing a.out
binfmt_copyargv: args=1 argsize=6
binfmt_copyargv: args=2 argsize=23
exec_module: Initialize the user heap (heapsize=528384)
< ...NuttX Hangs... >
(NuttX seems to be starting the first thing that appears in a.out
)
TODO: Unknown Relocation Type may be due to Thread Local Storage generated by GCC Compiler?
TCC fails to link our NuttX App because of Unknown Relocation Type for printf(). How else can we print something in our NuttX App?
We can make a NuttX System Call (ECALL) to write(fd, buf, buflen)
.
Directly in our C Code! Like this: test-nuttx.js
int main(int argc, char *argv[])
{
// Make NuttX System Call to write(fd, buf, buflen)
const unsigned int nbr = 61; // SYS_write
const void *parm1 = 1; // File Descriptor (stdout)
const void *parm2 = "Hello, World!!\n"; // Buffer
const void *parm3 = 15; // Buffer Length
// Execute ECALL for System Call to NuttX Kernel
register long r0 asm("a0") = (long)(nbr);
register long r1 asm("a1") = (long)(parm1);
register long r2 asm("a2") = (long)(parm2);
register long r3 asm("a3") = (long)(parm3);
asm volatile
(
// ECALL for System Call to NuttX Kernel
"ecall \n"
// NuttX needs NOP after ECALL
".word 0x0001 \n"
// Input+Output Registers: None
// Input-Only Registers: A0 to A3
// Clobbers the Memory
:
: "r"(r0), "r"(r1), "r"(r2), "r"(r3)
: "memory"
);
// Loop Forever
for(;;) {}
return 0;
}
Why SysCall 61? Because that's the value of SYS_write
System Call according to nuttx.S
(the RISC-V Disassembly of NuttX Kernel).
In NuttX we enable System Call Logging: make menuconfig
then select...
- Build Setup > Debug Options > SYSCALL Debug Features > Enable "SYSCALL Error, Warnings and Info"
Does it work?
Nope we don't see SysCall 61, but we see a SysCall 15 (what?)...
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
riscv_swint: Entry: regs: 0x8020be10 cmd: 15
up_dump_register: EPC: 00000000c000006c
up_dump_register: A0: 000000000000000f A1: 00000000c0202010 A2: 0000000000000001 A3: 00000000c0202010
up_dump_register: A4: 00000000c0000000 A5: 0000000000000000 A6: 0000000000000000 A7: 0000000000000000
up_dump_register: T0: 0000000000000000 T1: 0000000000000000 T2: 0000000000000000 T3: 0000000000000000
up_dump_register: T4: 0000000000000000 T5: 0000000000000000 T6: 0000000000000000
up_dump_register: S0: 00000000c0202800 S1: 0000000000000000 S2: 0000000000000000 S3: 0000000000000000
up_dump_register: S4: 0000000000000000 S5: 0000000000000000 S6: 0000000000000000 S7: 0000000000000000
up_dump_register: S8: 0000000000000000 S9: 0000000000000000 S10: 0000000000000000 S11: 0000000000000000
up_dump_register: SP: 00000000c02027a0 FP: 00000000c0202800 TP: 0000000000000000 RA: 000000008000adee
riscv_swint: SWInt Return: 7
But the registers A0, A1, A2 and A3 don't look right!
Let's hardcode Registers A0, A1, A2 and A3 in Machine Code (because TCC won't assemble the li
instruction): test-nuttx.js
// Load 61 to Register A0 (SYS_write)
"addi a0, zero, 61 \n"
// Load 1 to Register A1 (File Descriptor)
"addi a1, zero, 1 \n"
// Load 0xc0101000 to Register A2 (Buffer)
"lui a2, 0xc0 \n"
"addiw a2, a2, 257 \n"
"slli a2, a2, 0xc \n"
// Load 15 to Register A3 (Buffer Length)
"addi a3, zero, 15 \n"
// ECALL for System Call to NuttX Kernel
"ecall \n"
// NuttX needs NOP after ECALL
".word 0x0001 \n"
(We used this RISC-V Online Assembler to assemble the Machine Code)
When we run this, we see SysCall 61...
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
riscv_swint: Entry: regs: 0x8020be10 cmd: 61
up_dump_register: EPC: 00000000c0000084
up_dump_register: A0: 000000000000003d A1: 0000000000000001 A2: 00000000c0101000 A3: 000000000000000f
up_dump_register: A4: 00000000c0000000 A5: 0000000000000000 A6: 0000000000000000 A7: 0000000000000000
up_dump_register: T0: 0000000000000000 T1: 0000000000000000 T2: 0000000000000000 T3: 0000000000000000
up_dump_register: T4: 0000000000000000 T5: 0000000000000000 T6: 0000000000000000
up_dump_register: S0: 00000000c0202800 S1: 0000000000000000 S2: 0000000000000000 S3: 0000000000000000
up_dump_register: S4: 0000000000000000 S5: 0000000000000000 S6: 0000000000000000 S7: 0000000000000000
up_dump_register: S8: 0000000000000000 S9: 0000000000000000 S10: 0000000000000000 S11: 0000000000000000
up_dump_register: SP: 00000000c02027a0 FP: 00000000c0202800 TP: 0000000000000000 RA: 000000008000adee
riscv_swint: SWInt Return: 35
Hello, World!!
And "Hello, World!!" is printed yay!
How did we figure out that the buffer is at 0xc0101000?
We saw this in the NuttX Log...
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
elf_read: Read 576 bytes from offset 512
elf_loadfile: Loaded sections:
elf_read: Read 154 bytes from offset 64
elf_loadfile: 1. 00000000->c0000000
elf_read: Read 0 bytes from offset 224
elf_loadfile: 2. 00000000->c0101000
elf_read: Read 16 bytes from offset 224
elf_loadfile: 3. 00000000->c0101000
elf_loadfile: 4. 00000000->c0101010
Which says that the NuttX ELF Loader copied 16 bytes from our NuttX App Data Section .data.ro
to 0xc0101000. That's all 15 bytes of "Hello, World!!\n", including the terminating null!
Something odd about the TCC-generated RISC-V Machine Code?
The registers seem to be mushed up in the generated RISC-V Machine Code. That's why it was passing value 15 in Register A0. (Supposed to be Register A3)
// Watch how TCC compiles this C Program to RISC-V Assembly...
// register long a0 asm("a0") = 61; // SYS_write
// register long a1 asm("a1") = 1; // File Descriptor (stdout)
// register long a2 asm("a2") = "Hello, World!!\\n"; // Buffer
// register long a3 asm("a3") = 15; // Buffer Length
// Execute ECALL for System Call to NuttX Kernel
// asm volatile (
// ECALL for System Call to NuttX Kernel
// "ecall \\n"
// ".word 0x0001 \\n"
main():
0: fc010113 add sp,sp,-64
4: 02113c23 sd ra,56(sp)
8: 02813823 sd s0,48(sp)
c: 04010413 add s0,sp,64
10: 00000013 nop
14: fea43423 sd a0,-24(s0)
18: feb43023 sd a1,-32(s0)
// Correct: Load Register A0 with 61 (SYS_write)
1c: 03d0051b addw a0,zero,61
20: fca43c23 sd a0,-40(s0)
// Nope: Load Register A0 with 1?
// Mixed up with Register A1! (Value 1)
24: 0010051b addw a0,zero,1
28: fca43823 sd a0,-48(s0)
// Nope: Load Register A0 with "Hello World"?
// Mixed up with Register A2!
2c: 00000517 auipc a0,0x0 2c: R_RISCV_PCREL_HI20 L.0
30: 00050513 mv a0,a0 30: R_RISCV_PCREL_LO12_I .text
34: fca43423 sd a0,-56(s0)
// Nope: Load Register A0 with 15?
// Mixed up with Register A3! (Value 15)
38: 00f0051b addw a0,zero,15
3c: fca43023 sd a0,-64(s0)
// Execute ECALL with Register A0 set to 15.
// Nope A0 should be 1!
40: 00000073 ecall
44: 0001 nop
// Loop Forever
46: 0000006f j 46 <main+0x46>
4a: 03813083 ld ra,56(sp)
4e: 03013403 ld s0,48(sp)
52: 04010113 add sp,sp,64
56: 00008067 ret
TODO: Is there a workaround? Do we paste the ECALL Machine Code ourselves?
TODO: Call the NuttX System Call __exit
to terminate peacefully
Read the article...
OK so we can compile NuttX Apps in a Web Browser... But can we run them in a Web Browser?
Yep! A NuttX App compiled in the Web Browser... Now runs OK with NuttX Emulator in Web Browser! 🎉
-
Browse to our latest NuttX Emulator in Web Browser...
NuttX Emulator for Ox64 (modified for TCC)
Enter
a.out
and we'll see...NuttShell (NSH) NuttX-12.4.0-RC0 nsh> a.out nsh: a.out: command not found
-
Browse to our TCC Compiler in Web Browser...
Paste this code into the Web Browser...
int main(int argc, char *argv[]) { // Make NuttX System Call to write(fd, buf, buflen) const unsigned int nbr = 61; // SYS_write const void *parm1 = 1; // File Descriptor (stdout) const void *parm2 = "Hello, World!!\n"; // Buffer const void *parm3 = 15; // Buffer Length // Load the Parameters into Registers A0 to A3 // TODO: This doesn't work, so we load again below register long r0 asm("a0") = (long)(nbr); register long r1 asm("a1") = (long)(parm1); register long r2 asm("a2") = (long)(parm2); register long r3 asm("a3") = (long)(parm3); // Execute ECALL for System Call to NuttX Kernel // Load the Parameters into Registers A0 to A3 asm volatile ( // Load 61 to Register A0 (SYS_write) "addi a0, zero, 61 \n" // Load 1 to Register A1 (File Descriptor) "addi a1, zero, 1 \n" // Load 0x80101000 to Register A2 (Buffer) "lui a2, 0x80 \n" "addiw a2, a2, 257 \n" "slli a2, a2, 0xc \n" // Load 15 to Register A3 (Buffer Length) "addi a3, zero, 15 \n" // ECALL for System Call to NuttX Kernel "ecall \n" // NuttX needs NOP after ECALL ".word 0x0001 \n" // Input+Output Registers: None // Input-Only Registers: A0 to A3 // Clobbers the Memory : : "r"(r0), "r"(r1), "r"(r2), "r"(r3) : "memory" ); // Loop Forever for(;;) {} return 0; }
Note that the Buffer Address (Register A2) is now 0x80101000 because we're running on Ox64 Emulator.
(Instead of 0xC0101000 for QEMU Emulator)
-
Click the "Compile" button.
-
Head back to our latest NuttX Emulator in Web Browser...
NuttX Emulator for Ox64 (modified for TCC)
Enter
a.out
and we'll see...NuttShell (NSH) NuttX-12.4.0-RC0 nsh> a.out Hello, World!!
A NuttX App compiled in the Web Browser... Now runs OK with NuttX Emulator in Web Browser! 🎉
-
Try changing "Hello World" to "Hello aaaaa" (or any other message) as long as the Length stays the same.
(Because we hardcoded the length in Register A3)
Recompile, switch back to the Emulator, re-run
a.out
. It changes!
Wow how does it work?
In Chrome Web Browser, click to Menu > Developer Tools > Application Tab > Local Storage > lupyuen.github.io
We'll see that the RISC-V ELF a.out
is stored locally as elf_data
in Local Storage.
That's why NuttX Emulator can pick up the a.out
from our Web Browser!
TCC Compiler saves a.out
to elf_data
in Local Storage: tcc.js
// Call TCC to compile a program
const ptr = wasm.instance.exports
.compile_program(options_ptr, code_ptr);
console.log(`main: ptr=${ptr}`);
...
// Encode the `a.out` data from the rest of the bytes returned
const data = new Uint8Array(memory.buffer, ptr + 4, len);
let encoded_data = "";
for (const i in data) {
const hex = Number(data[i]).toString(16).padStart(2, "0");
encoded_data += `%${hex}`;
}
// Save the ELF Data to Local Storage for loading by NuttX Emulator
localStorage.setItem("elf_data", encoded_data);
console.log({ elf_data: localStorage.getItem("elf_data") });
But NuttX Emulator boots from a fixed NuttX Image, loaded from our Static Web Server. How did a.out
appear inside the NuttX Image?
We used a nifty illusion... a.out
was in the NuttX Image all along!
## Create a Fake a.out that contains a Distinct Pattern:
## 22 05 69 00
## 22 05 69 01
## For 1024 times
rm -f /tmp/pattern.txt
start=$((0x22056900))
for i in {0..1023}
do
printf 0x%x\\n $(($start + $i)) >> /tmp/pattern.txt
done
## Copy the Fake a.out to our NuttX Apps Folder
cat /tmp/pattern.txt \
| xxd -revert -plain \
>~/ox64/apps/bin/a.out
hexdump -C ~/ox64/apps/bin/a.out
During NuttX Build, the Fake a.out
gets bundled into the Initial RAM Disk (initrd).
Which gets appended to the NuttX Image.
But because a.out
doesn't contain a valid ELF File, NuttX says "command not found" because it couldn't load a.out
as an ELF Executable.
(This won't work with QEMU, because NuttX QEMU doesn't append the Initial RAM Disk to NuttX Image. Instead QEMU uses Semihosting to access the NuttX Apps, which won't work in a Web Browsr)
So we patched Fake a.out
in the NuttX Image with the Real a.out
?
Exactly! In the NuttX Emulator JavaScript, we read elf_data
from the Local Storage and pass it to TinyEMU WebAssembly: jslinux.js
function start()
{
//// Patch the ELF Data to a.out in Initial RAM Disk
//// For Testing: localStorage.setItem("elf_data", "%7f%45%4c%46%02%01%01%00%00%00%00%00%00%00%00%00%01%00%f3%00%01%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%02%00%00%00%00%00%00%04%00%00%00%40%00%00%00%00%00%40%00%09%00%08%00%13%01%01%fa%23%3c%11%04%23%38%81%04%13%04%01%06%13%00%00%00%23%34%a4%fe%23%30%b4%fe%1b%05%d0%03%23%2e%a4%fc%1b%05%10%00%23%38%a4%fc%17%05%00%00%13%05%05%00%23%34%a4%fc%1b%05%f0%00%23%30%a4%fc%03%25%c4%fd%13%15%05%02%13%55%05%02%23%3c%a4%fa%03%35%04%fd%23%38%a4%fa%03%35%84%fc%23%34%a4%fa%03%35%04%fc%23%30%a4%fa%13%05%d0%03%93%05%10%00%37%06%08%00%1b%06%16%10%13%16%c6%00%93%06%f0%00%73%00%00%00%01%00%6f%00%00%00%83%30%81%05%03%34%01%05%13%01%01%06%67%80%00%00%00%00%00%00%00%00%48%65%6c%6c%6f%2c%20%57%6f%72%6c%64%21%21%0a%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%01%00%00%00%04%00%f1%ff%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%0e%00%00%00%01%00%03%00%00%00%00%00%00%00%00%00%10%00%00%00%00%00%00%00%00%00%00%00%00%00%01%00%2c%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%09%00%00%00%12%00%01%00%00%00%00%00%00%00%00%00%9a%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%68%65%6c%6c%6f%2e%63%00%6d%61%69%6e%00%4c%2e%30%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%2c%00%00%00%00%00%00%00%17%00%00%00%02%00%00%00%00%00%00%00%00%00%00%00%30%00%00%00%00%00%00%00%18%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%2e%74%65%78%74%00%2e%64%61%74%61%00%2e%64%61%74%61%2e%72%6f%00%2e%62%73%73%00%2e%73%79%6d%74%61%62%00%2e%73%74%72%74%61%62%00%2e%72%65%6c%61%2e%74%65%78%74%00%2e%73%68%73%74%72%74%61%62%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%01%00%00%00%01%00%00%00%06%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%40%00%00%00%00%00%00%00%9a%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%08%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%07%00%00%00%01%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%e0%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%08%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%0d%00%00%00%01%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%e0%00%00%00%00%00%00%00%10%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%08%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%16%00%00%00%08%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%f0%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%08%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%1b%00%00%00%02%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%f0%00%00%00%00%00%00%00%78%00%00%00%00%00%00%00%06%00%00%00%04%00%00%00%08%00%00%00%00%00%00%00%18%00%00%00%00%00%00%00%23%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%70%01%00%00%00%00%00%00%12%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%01%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%2b%00%00%00%04%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%90%01%00%00%00%00%00%00%30%00%00%00%00%00%00%00%05%00%00%00%01%00%00%00%08%00%00%00%00%00%00%00%18%00%00%00%00%00%00%00%36%00%00%00%03%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%c0%01%00%00%00%00%00%00%40%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%01%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00");
let elf_len = 0;
let elf_data = new Uint8Array([]);
const elf_data_encoded = localStorage.getItem("elf_data");
if (elf_data_encoded) {
elf_data = new Uint8Array(
elf_data_encoded
.split("%")
.slice(1)
.map(hex=>Number("0x" + hex))
);
elf_len = elf_data.length;
console.log({elf_len, elf_data});
}
...
// Pass elf_data and elf_len to TinyEMU
Module.ccall(
"vm_start",
null,
["string", "number", "string", "string", "number", "number", "number", "string", "array", "number"],
[url, mem_size, cmdline, pwd, width, height, (net_state != null) | 0, drive_url, elf_data, elf_len]
);
Inside the TinyEMU WebAssembly: We receive the elf_data
and copy it locally, because it will be clobbered (why?): jsemu.c
void vm_start(const char *url, int ram_size, const char *cmdline,
const char *pwd, int width, int height, BOOL has_network,
const char *drive_url, uint8_t *elf_data0, int elf_len0)
{
//// Patch the ELF Data to a.out in Initial RAM Disk
extern uint8_t elf_data[]; // From riscv_machine.c
extern int elf_len;
elf_len = elf_len0;
// Must copy ELF Data to Local Buffer because it will get overwritten
printf("elf_len=%d\n", elf_len);
if (elf_len > 4096) { puts("*** ERROR: elf_len exceeds 4096, increase elf_data and a.out size"); }
memcpy(elf_data, elf_data0, elf_len);
Then we search for our Magic Pattern 22 05 69 00
in our Fake a.out
: riscv_machine.c
// Patch the ELF Data to a.out in Initial RAM Disk
uint64_t elf_addr = 0;
printf("elf_len=%d\n", elf_len);
if (elf_len > 0) {
//// TODO: Fix the Image Size
for (int i = 0; i < 0xD61680; i++) {
const uint8_t pattern[] = { 0x22, 0x05, 0x69, 0x00 };
if (memcmp(&kernel_ptr[i], pattern, sizeof(pattern)) == 0) {
//// TODO: Catch overflow of a.out
memcpy(&kernel_ptr[i], elf_data, elf_len);
elf_addr = RAM_BASE_ADDR + i;
printf("Patched ELF Data to a.out at %p\n", elf_addr);
break;
}
}
if (elf_addr == 0) { puts("*** ERROR: Pattern for ELF Data a.out is missing"); }
}
And we overwrite the Fake a.out
with the Real a.out
from elf_data
.
That's how we compile a NuttX App in the Web Browser, and run it with NuttX Emulator in the Web Browser! 🎉
Read the article...
TCC WebAssembly needs an Embedded Filesystem that will have C Header Files and C Library Files for building apps...
How will we implement this Embedded Filesystem in Zig?
Let's embed the simple ROM FS Filesystem inside our Zig Wrapper...
-
Our TCC JavaScript will fetch the Bundled ROM FS Filesystem over HTTP: romfs.bin
-
Then copy the Bundled ROM FS into Zig Wrapper's WebAssembly Memory
-
Our Zig Wrapper will mount the ROM FS in memory
-
And expose POSIX Functions to TCC that will access the Emulated Filesystem
(Works like the Emscripten Filesystem)
How to bundle our C Header Files and C Library Files into the ROM FS Filesystem?
Like this...
## For Ubuntu: Install genromfs
sudo apt install genromfs
## For macOS: Install genromfs
brew install genromfs
## Bundle the romfs folder into ROM FS Filesystem romfs.bin
## and label with this Volume Name
genromfs \
-f zig/romfs.bin \
-d zig/romfs \
-V "ROMFS"
(See the ROM FS Binary zig/romfs.bin
)
(See the ROM FS Files zig/romfs
)
How to implement the ROM FS in our Zig Wrapper?
We'll borrow the ROM FS Driver from Apache NuttX RTOS. And compile it from C to WebAssembly with Zig Compiler...
(See the Modified Source Files)
This compiles OK with Zig Compiler with a few tweaks, let's test it in Zig...
Read the article...
We borrowed the ROM FS Driver from Apache NuttX RTOS. Zig Compiler compiles it to WebAssembly with a few tweaks...
How do we call the ROM FS Driver to Mount the ROM FS Filesystem?
This is how we mount the ROM FS Filesystem in Zig: tcc-wasm.zig
/// Import the ROM FS
const c = @cImport({
@cInclude("zig_romfs.h");
});
/// Compile a C program to 64-bit RISC-V
pub export fn compile_program(...) [*]const u8 {
// Create the Memory Allocator for malloc
memory_allocator = std.heap.FixedBufferAllocator.init(&memory_buffer);
// Mount the ROM FS Filesystem
const ret = c.romfs_bind( // Bind the ROM FS Filesystem
c.romfs_blkdriver, // blkdriver: ?*struct_inode_6
null, // data: ?*const anyopaque
&c.romfs_mountpt // handle: [*c]?*anyopaque
);
assert(ret >= 0);
Zig won't let us create objects for romfs_blkdriver
and romfs_mountpt
, so we create them in C: fs_romfs.c
struct inode romfs_blkdriver_inode;
struct inode *romfs_blkdriver = &romfs_blkdriver_inode;
void *romfs_mountpt = NULL;
This crashes inside romfs_fsconfigure...
$ node zig/test.js
compile_program: start
Entry
wasm://wasm/0085e9b2:1
RuntimeError: unreachable
at signature_mismatch:mtd_bread (wasm://wasm/0085e9b2:wasm-function[10]:0x842)
at romfs_fsconfigure (wasm://wasm/0085e9b2:wasm-function[22]:0xab3)
at romfs_bind (wasm://wasm/0085e9b2:wasm-function[20]:0x954)
at compile_program (wasm://wasm/0085e9b2:wasm-function[251]:0x4e683)
at /workspaces/bookworm/tcc-riscv32-wasm/zig/test.js:63:6
We need to return the XIP Address so that romfs_fsconfigure will read the RAM directly. (Instead of reading from the device)
From fs_romfsutil.c:
// Implement mid_ioctl() so that BIOC_XIPBASE
// sets the XIP Address in rm_xipbase
ret = MTD_IOCTL(inode->u.i_mtd, BIOC_XIPBASE,
(unsigned long)&rm->rm_xipbase);
We implement mid_ioctl
for BIOC_XIPBASE
: tcc-wasm.zig
export fn mtd_ioctl(_: *mtd_dev_s, cmd: c_int, rm_xipbase: ?*c_int) c_int {
assert(rm_xipbase != null);
if (cmd == c.BIOC_XIPBASE) {
// Return the XIP Base Address
rm_xipbase.?.* = @intCast(@intFromPtr(ROMFS_DATA));
} else if (cmd == c.MTDIOC_GEOMETRY) {
// Return the Storage Device Geometry
const geo: *c.mtd_geometry_s = @ptrCast(rm_xipbase.?);
geo.*.blocksize = 64;
geo.*.erasesize = 64;
geo.*.neraseblocks = 1024; // TODO: Is this needed?
const name = "ZIG_ROMFS";
@memcpy(geo.*.model[0..name.len], name);
geo.*.model[name.len] = 0;
} else {
debug("mtd_ioctl: Unknown command {}", .{cmd});
}
return 0;
}
/// Embed the ROM FS Filesystem.
/// Later our JavaScript shall fetch this over HTTP.
const ROMFS_DATA = @embedFile("romfs.bin");
Also we embed the ROM FS Data inside our Zig Wrapper for now. Later our JavaScript shall fetch romfs.bin
over HTTP.
And the mounting succeeds yay!
$ node zig/test.js
compile_program: start
compile_program: Mounting ROM FS...
Entry
compile_program: ROM FS mounted OK!
The ROM FS Driver verifies the Magic Number when mounting. So we know it's correct: fs_romfsutil.c
int romfs_fsconfigure(FAR struct romfs_mountpt_s *rm) {
...
/* Verify the magic number at that identifies this as a ROMFS filesystem */
#define ROMFS_VHDR_MAGIC "-rom1fs-"
if (memcmp(rm->rm_buffer, ROMFS_VHDR_MAGIC, 8) != 0)
{ return -EINVAL; }
We're sure it's correct?
If we don't embed a proper ROM FS Filesystem, the Magic Number will fail...
## Let's embed some junk:
## const ROMFS_DATA = @embedFile("build.sh");
## The ROM FS Mounting fails...
$ node zig/test.js
compile_program: start
Entry
ERROR: romfs_fsconfigure failed: -22
So yeah we're correct.
Let's open a file from ROM FS...
Read the article...
This is how we open a file from ROM FS in Zig: tcc-wasm.zig
// Create the Mount Inode
const mount_inode = c.create_mount_inode(c.romfs_mountpt);
// Create the File Struct
var filep = std.mem.zeroes(c.struct_file);
filep.f_inode = mount_inode;
// Open the file
const ret2 = c.romfs_open( // Open "hello" for Read-Only. `mode` is used only for creating files.
&filep, // filep: [*c]struct_file
"hello", // relpath: [*c]const u8
c.O_RDONLY, // oflags: c_int
0 // mode: mode_t
);
assert(ret2 >= 0);
Our file has been opened successfully yay!
$ node zig/test-nuttx.js
compile_program: start
compile_program: Mounting ROM FS...
Entry
compile_program: ROM FS mounted OK!
compile_program: Opening ROM FS File `hello`...
Open 'hello'
compile_program: ROM FS File `hello` opened OK!
"/hello" works OK too...
// Open "/hello"
romfs_open(..., "/hello", ...);
What if the file doesn't exist?
ROM FS Driver says that the file doesn't exist...
## Let's try a file that doesn't exist:
## romfs_open(..., "hello2", ...)
compile_program: Opening ROM FS File
Open 'hello2'
ERROR: Failed to find directory directory entry for '%s': %d
So yep our ROM FS Driver is reading the ROM FS Directory correctly!
How did we figure out the Mount Inode?
See the NuttX Code: Create a Mount Inode with inode_reserve
Finally we read a ROM FS file...
Read the article...
This is how we read a ROM FS File in Zig (and close it): tcc-wasm.zig
// Read the file
var buf = std.mem.zeroes([4]u8);
const ret3 = c.romfs_read( // Read the file
&filep, // filep: [*c]struct_file
&buf, // buffer: [*c]u8
buf.len // buflen: usize
);
assert(ret3 >= 0);
hexdump.hexdump(@ptrCast(&buf), @intCast(ret3));
// Close the file
const ret4 = c.romfs_close(&filep);
assert(ret4 >= 0);
And it works yay!
$ node zig/test.js
compile_program: start
compile_program: Mounting ROM FS...
Entry
compile_program: ROM FS mounted OK!
compile_program: Opening ROM FS File `hello`...
Open 'hello'
compile_program: ROM FS File `hello` opened OK!
compile_program: Reading ROM FS File `hello`...
Read %zu bytes from offset %jd
Read sector %jd
sector: %d cached: %d ncached: %d sectorsize: %d XIP base: %p buffer: %p
XIP buffer: %p
Return %d bytes from sector offset %d
compile_program: ROM FS File `hello` read OK!
0000: 7F 45 4C 46 .ELF
compile_program: Closing ROM FS File `hello`...
Closing
compile_program: ROM FS File `hello` closed OK!
This works OK in the Web Browser too!
Let's integrate the ROM FS Driver with TCC...
Read the article...
TCC WebAssembly needs a ROM FS Filesystem that will have C Header Files and C Library Files for building apps...
How will we integrate the NuttX ROM FS Driver in Zig?
At Startup: We call the NuttX ROM FS Driver to mount the ROM FS Filesystem: tcc-wasm.zig
/// Next File Descriptor Number.
/// First File Descriptor is reserved for C Program `hello.c`
var next_fd: c_int = FIRST_FD;
const FIRST_FD = 3;
/// Map a File Descriptor to the ROM FS File
/// Index of romfs_files = File Descriptor Number - FIRST_FD - 1
var romfs_files: std.ArrayList(*c.struct_file) = undefined;
/// Compile a C program to 64-bit RISC-V
pub export fn compile_program(...) [*]const u8 {
// Create the Memory Allocator for malloc
memory_allocator = std.heap.FixedBufferAllocator.init(&memory_buffer);
// Map from File Descriptor to ROM FS File
romfs_files = std.ArrayList(*c.struct_file).init(std.heap.page_allocator);
defer romfs_files.deinit();
// Mount the ROM FS Filesystem
const ret = c.romfs_bind( // Bind the ROM FS Filesystem
c.romfs_blkdriver, // blkdriver: ?*struct_inode_6
null, // data: ?*const anyopaque
&c.romfs_mountpt // handle: [*c]?*anyopaque
);
assert(ret >= 0);
// Create the Mount Inode and test the ROM FS
romfs_inode = c.create_mount_inode(c.romfs_mountpt);
test_romfs();
NuttX ROM FS Driver will call mtd_ioctl
to map the ROM FS Data in memory: tcc-wasm.zig
/// Embed the ROM FS Filesystem.
/// Later our JavaScript shall fetch this over HTTP.
const ROMFS_DATA = @embedFile("romfs.bin");
export fn mtd_ioctl(_: *mtd_dev_s, cmd: c_int, rm_xipbase: ?*c_int) c_int {
assert(rm_xipbase != null);
if (cmd == c.BIOC_XIPBASE) {
// Return the XIP Base Address
rm_xipbase.?.* = @intCast(@intFromPtr(ROMFS_DATA));
} else if (cmd == c.MTDIOC_GEOMETRY) {
// Return the Storage Device Geometry
const geo: *c.mtd_geometry_s = @ptrCast(rm_xipbase.?);
geo.*.blocksize = 64;
geo.*.erasesize = 64;
geo.*.neraseblocks = 1024; // TODO: Is this needed?
const name = "ZIG_ROMFS";
@memcpy(geo.*.model[0..name.len], name);
geo.*.model[name.len] = 0;
} else {
debug("mtd_ioctl: Unknown command {}", .{cmd});
}
return 0;
}
When TCC WebAssembly calls open
to open an Include File, we call the NuttX ROM FS Driver to open the file in ROM FS: tcc-wasm.zig
export fn open(path: [*:0]const u8, oflag: c_uint, ...) c_int {
// If opening the C Program File `hello.c`
// Or creating `hello.o`...
// Just return the File Descriptor
// TODO: This might create a hole in romfs_files if we open a file for reading after writing another file
if (next_fd == FIRST_FD or oflag == 577) {
const fd = next_fd;
next_fd += 1;
return fd;
} else {
// If opening an Include File or Library File...
// Allocate the File Struct
const files = std.heap.page_allocator.alloc(c.struct_file, 1) catch {
debug("open: Failed to allocate file", .{});
@panic("open: Failed to allocate file");
};
const file = &files[0];
file.* = std.mem.zeroes(c.struct_file);
file.*.f_inode = romfs_inode;
// Strip the path from System Include
const sys = "/usr/local/lib/tcc/include/";
const strip_path = if (std.mem.startsWith(u8, std.mem.span(path), sys)) (path + sys.len) else path;
// Open the ROM FS File
const ret = c.romfs_open( // Open for Read-Only. `mode` is used only for creating files.
file, // filep: [*c]struct_file
strip_path, // relpath: [*c]const u8
c.O_RDONLY, // oflags: c_int
0 // mode: mode_t
);
if (ret < 0) { return ret; }
// Remember the ROM FS File
const fd = next_fd;
next_fd += 1;
const f = fd - FIRST_FD - 1;
assert(romfs_files.items.len == f);
romfs_files.append(file) catch {
debug("Unable to allocate file", .{});
@panic("Unable to allocate file");
};
return fd;
}
}
When TCC WebAssembly calls read
to read the Include File, we call ROM FS: tcc-wasm.zig
export fn read(fd: c_int, buf: [*:0]u8, nbyte: size_t) isize {
// If reading the C Program...
if (fd == FIRST_FD) {
// Copy from the Read Buffer
const len = read_buf.len;
assert(len < nbyte);
@memcpy(buf[0..len], read_buf[0..len]);
buf[len] = 0;
read_buf.len = 0;
return @intCast(len);
} else {
// Fetch the ROM FS File
const f = fd - FIRST_FD - 1;
const file = romfs_files.items[@intCast(f)];
// Read from the ROM FS File
const ret = c.romfs_read( // Read the file
file, // filep: [*c]struct_file
buf, // buffer: [*c]u8
nbyte // buflen: usize
);
assert(ret >= 0);
return @intCast(ret);
}
}
And finally we call ROM FS Driver to close the Include File: tcc-wasm.zig
export fn close(fd: c_int) c_int {
// If closing an Include File or Library File...
if (fd > FIRST_FD) {
// Fetch the ROM FS File
const f = fd - FIRST_FD - 1;
if (f >= romfs_files.items.len) {
// Skip the closing of `hello.o`
return 0;
}
const file = romfs_files.items[@intCast(f)];
// Close the ROM FS File. TODO: Deallocate the file
const ret = c.romfs_close(file);
assert(ret >= 0);
}
return 0;
}
We stage the Include Files stdio.h
and stdlib.h
here: zig/romfs
$ ls -l zig/romfs
-rw-r--r-- 1 25 stdio.h
-rw-r--r-- 1 23 stdlib.h
And we bundle them into romfs.bin
...
## For Ubuntu: Install genromfs
sudo apt install genromfs
## For macOS: Install genromfs
brew install genromfs
## Bundle the romfs folder into ROM FS Filesystem romfs.bin
## and label with this Volume Name
genromfs \
-f zig/romfs.bin \
-d zig/romfs \
-V "ROMFS"
(See the ROM FS Binary zig/romfs.bin
)
At last we have a proper POSIX (Read-Only) Filesystem for TCC WebAssembly yay!
open: path=/usr/local/lib/tcc/include/stdio.h, oflag=0, return fd=4
Open 'stdio.h'
read: fd=4, nbyte=8192
XIP buffer: anyopaque@10b672
read: return buf=
int puts(const char *s);
read: fd=4, nbyte=8192
read: return buf=
close: fd=4
Closing
open: path=/usr/local/lib/tcc/include/stdlib.h, oflag=0, return fd=5
Open 'stdlib.h'
read: fd=5, nbyte=8192
XIP buffer: anyopaque@10b6b2
read: return buf=
void exit(int status);
read: fd=5, nbyte=8192
read: return buf=
close: fd=5
Closing
What if we need a Temporary Writeable Filesystem?
Try the NuttX Tmp FS Driver: nuttx/fs/tmpfs
Time to wrap up and run everything in a Web Browser...
Read the article...
Remember we're doing a Decent Demo of Building and Testing a #NuttX App in the Web Browser... puts
and exit
finally work OK yay! 🎉
-
TCC Compiler in WebAssembly compiles
puts
andexit
to proper NuttX System Calls -
By loading
<stdio.h>
and<stdlib.h>
from the ROM FS Filesystem (thanks to the NuttX Driver) -
TCC Compiler generates the 64-bit RISC-V ELF
a.out
-
Which gets automagically copied to NuttX Emulator in WebAssembly
-
And NuttX Emulator executes
puts
andexit
correctly as NuttX System Calls!
Try the new ROM FS Demo here: https://lupyuen.github.io/tcc-riscv32-wasm/romfs/
#include <stdio.h>
#include <stdlib.h>
void main(int argc, char *argv[]) {
puts("Hello, World!!\n");
exit(0);
}
Click "Compile". Then run the a.out
here: https://lupyuen.github.io/nuttx-tinyemu/tcc/
Loading...
TinyEMU Emulator for Ox64 BL808 RISC-V SBC
ABC
NuttShell (NSH) NuttX-12.4.0-RC0
nsh> a.out
Hello, World!!
nsh> a.out
Hello, World!!
nsh> a.out
Hello, World!!
nsh>
Try changing "Hello World" to something else. Recompile and Reload the NuttX Emulator. It works!
Impressive, no? 3 things we fixed...
How did we get <stdio.h> and <stdlib.h> in TCC WebAssembly?
We create a Staging Folder zig/romfs that contains our C Header Files for TCC Compiler...
Then we bundle the Staging Folder into a ROM FS Filesystem...
## For Ubuntu: Install genromfs
sudo apt install genromfs
## For macOS: Install genromfs
brew install genromfs
## Bundle the romfs folder into ROM FS Filesystem romfs.bin
## and label with this Volume Name
genromfs \
-f zig/romfs.bin \
-d zig/romfs \
-V "ROMFS"
Which becomes the ROM FS Data File zig/romfs.bin
Inside our TCC WebAssembly: We mounted the ROM FS Filesystem by calling the NuttX ROM FS Driver. (Which has been integrated into our Zig WebAssembly)
See the earlier sections to find out how we modded the POSIX Filesystem Calls (from TCC WebAssembly) to access the NuttX ROM FS Driver.
In our Demo NuttX App, we implement puts
by calling write
: stdio.h
// Print the string to Standard Output
inline int puts(const char *s) {
return
write(1, s, strlen(s)) +
write(1, "\n", 1);
}
Then we implement write
the exact same way as NuttX, making a System Call: stdio.h
// Caution: This may change
#define SYS_write 61
// Write to the File Descriptor
// https://lupyuen.github.io/articles/app#nuttx-app-calls-nuttx-kernel
inline ssize_t write(int parm1, const void * parm2, size_t parm3) {
return (ssize_t) sys_call3(
(unsigned int) SYS_write, // System Call Number
(uintptr_t) parm1, // File Descriptor (1 = Standard Output)
(uintptr_t) parm2, // Buffer to be written
(uintptr_t) parm3 // Number of bytes to write
);
}
sys_call3
is our hacked implementation of NuttX System Call: stdio.h
// Make a System Call with 3 parameters
// https://github.com/apache/nuttx/blob/master/arch/risc-v/include/syscall.h#L240-L268
inline uintptr_t sys_call3(
unsigned int nbr, // System Call Number
uintptr_t parm1, // First Parameter
uintptr_t parm2, // Second Parameter
uintptr_t parm3 // Third Parameter
) {
// Pass the Function Number and Parameters in
// Registers A0 to A3
register long r3 asm("a0") = (long)(parm3); // Will move to A3
asm volatile ("slli a3, a0, 32"); // Shift 32 bits Left then Right
asm volatile ("srli a3, a3, 32"); // To clear the top 32 bits
register long r2 asm("a0") = (long)(parm2); // Will move to A2
asm volatile ("slli a2, a0, 32"); // Shift 32 bits Left then Right
asm volatile ("srli a2, a2, 32"); // To clear the top 32 bits
register long r1 asm("a0") = (long)(parm1); // Will move to A1
asm volatile ("slli a1, a0, 32"); // Shift 32 bits Left then Right
asm volatile ("srli a1, a1, 32"); // To clear the top 32 bits
register long r0 asm("a0") = (long)(nbr); // Will stay in A0
// `ecall` will jump from RISC-V User Mode
// to RISC-V Supervisor Mode
// to execute the System Call.
// Input + Output Registers: A0 to A3
// Clobbers the Memory
asm volatile
(
// ECALL for System Call to NuttX Kernel
"ecall \n"
// NuttX needs NOP after ECALL
".word 0x0001 \n"
// Input+Output Registers: None
// Input-Only Registers: A0 to A3
// Clobbers the Memory
:
: "r"(r0), "r"(r1), "r"(r2), "r"(r3)
: "memory"
);
// Return the result from Register A0
return r0;
}
Why so complicated?
That's because TCC won't load the RISC-V Registers correctly. Thus we load the registers ourselves.
Why not simply copy A0 to A2?
register long r2 asm("a0") = (long)(parm2); // Will move to A2
asm volatile ("addi a2, a0, 0"); // Copy A0 to A2
Because then Register A2 becomes negative...
riscv_swint: Entry: regs: 0x8020be10
cmd: 61
EPC: 00000000c0000160
A0: 000000000000003d
A1: 0000000000000001
A2: ffffffffc0101000
A3: 000000000000000f
[...Page Fault because A2 is Invalid Address...]
So we Shift away the Negative Sign...
register long r2 asm("a0") = (long)(parm2); // Will move to A2
asm volatile ("slli a2, a0, 32"); // Shift 32 bits Left then Right
asm volatile ("srli a2, a2, 32"); // To clear the top 32 bits
Then Register A2 becomes Positively OK...
riscv_swint: Entry: regs: 0x8020be10
cmd: 61
EPC: 00000000c0000164
A0: 000000000000003d
A1: 0000000000000001
A2: 00000000c0101000
A3: 000000000000000f
Hello, World!!
BTW andi
doesn't work...
register long r2 asm("a0") = (long)(parm2); // Will move to A2
asm volatile ("andi a2, a0, 0xffffffff");
Because 0xffffffff gets assembled to -1. (Bug?)
In our Demo NuttX App, we implement exit
the same way as NuttX, by making a System Call: stdlib.h
// Caution: This may change
#define SYS__exit 8
// Terminate the NuttX Process
// From nuttx/syscall/proxies/PROXY__exit.c
inline void exit(int parm1) {
sys_call1((unsigned int)SYS__exit, (uintptr_t)parm1);
while(1);
}
sys_call1
makes a NuttX System Call, with our hand-crafted RISC-V Assembly (as a workaround): stdlib.h
// Make a System Call with 1 parameters
// https://github.com/apache/nuttx/blob/master/arch/risc-v/include/syscall.h#L188-L213
inline uintptr_t sys_call1(
unsigned int nbr, // System Call Number
uintptr_t parm1 // First Parameter
) {
// Pass the Function Number and Parameters
// Registers A0 to A1
register long r1 asm("a0") = (long)(parm1); // Will move to A1
asm volatile ("slli a1, a0, 32"); // Shift 32 bits Left then Right
asm volatile ("srli a1, a1, 32"); // To clear the top 32 bits
register long r0 asm("a0") = (long)(nbr); // Will stay in A0
// `ecall` will jump from RISC-V User Mode
// to RISC-V Supervisor Mode
// to execute the System Call.
// Input + Output Registers: A0 to A1
// Clobbers the Memory
asm volatile
(
// ECALL for System Call to NuttX Kernel
"ecall \n"
// NuttX needs NOP after ECALL
".word 0x0001 \n"
// Input+Output Registers: None
// Input-Only Registers: A0 to A1
// Clobbers the Memory
:
: "r"(r0), "r"(r1)
: "memory"
);
// Return the result from Register A0
return r0;
}
And everything works OK now!
Wow this looks horribly painful... Are we doing any more of this?
Nope we won't do any more of this! Hand-crafting the NuttX System Calls in RISC-V Assembly was extremely painful.
(Maybe we'll revisit this when the RISC-V Registers are working OK in TCC)
TODO: Define the printf formats %jd, %zu
TODO: Iteratively handle printf formats
Read the article...
Based on ROM FS Spec
And our ROM FS Filesystem romfs.bin
...
hexdump -C tcc-riscv32-wasm/zig/romfs.bin
We see the ROM FS Filesystem Header...
[ Magic Number ] [ FS Size ] [ Checksm ]
0000 2d 72 6f 6d 31 66 73 2d 00 00 0f 90 58 57 01 f8 |-rom1fs-....XW..|
[ Volume Name: ROMFS ]
0010 52 4f 4d 46 53 00 00 00 00 00 00 00 00 00 00 00 |ROMFS...........|
Followed by File Header for .
...
---- File Header for `.`
[ NextHdr ] [ Info ] [ Size ] [ Checksm ]
0020 00 00 00 49 00 00 00 20 00 00 00 00 d1 ff ff 97 |...I... ........|
[ File Name: `.` ]
0030 2e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
(NextHdr & 0xF = 9 means Executable Directory)
Followed by File Header for ..
...
---- File Header for `..`
[ NextHdr ] [ Info ] [ Size ] [ Checksm ]
0040 00 00 00 60 00 00 00 20 00 00 00 00 d1 d1 ff 80 |...`... ........|
[ File Name: `..` ]
0050 2e 2e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
(NextHdr & 0xF = 0 means Hard Link)
Followed by File Header and Data for stdio.h
...
---- File Header for `stdio.h`
[ NextHdr ] [ Info ] [ Size ] [ Checksm ]
0060 00 00 0a 42 00 00 00 00 00 00 09 b7 1d 5d 1f 9e |...B.........]..|
[ File Name: `stdio.h` ]
0070 73 74 64 69 6f 2e 68 00 00 00 00 00 00 00 00 00 |stdio.h.........|
(NextHdr & 0xF = 2 means Regular File)
---- File Data for `stdio.h`
0080 2f 2f 20 43 61 75 74 69 6f 6e 3a 20 54 68 69 73 |// Caution: This|
....
0a20 74 65 72 20 41 30 0a 20 20 72 65 74 75 72 6e 20 |ter A0. return |
0a30 72 30 3b 0a 7d 20 0a 00 00 00 00 00 00 00 00 00 |r0;.} ..........|
Followed by File Header and Data for stdlib.h
...
---- File Header for `stdlib.h`
[ NextHdr ] [ Info ] [ Size ] [ Checksm ]
0a40 00 00 00 02 00 00 00 00 00 00 05 2e 23 29 67 fc |............#)g.|
[ File Name: `stdlib.h` ]
0a50 73 74 64 6c 69 62 2e 68 00 00 00 00 00 00 00 00 |stdlib.h........|
(NextHdr & 0xF = 2 means Regular File)
---- File Data for `stdio.h`
0a60 2f 2f 20 43 61 75 74 69 6f 6e 3a 20 54 68 69 73 |// Caution: This|
....
0f80 72 65 74 75 72 6e 20 72 30 3b 0a 7d 20 0a 00 00 |return r0;.} ...|
0f90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Read the article...
TCC calls surprisingly few External Functions! We might get it running on WebAssembly. Here's our analysis of the Missing Functions: zig/tcc-wasm.zig
Not sure why TCC uses Semaphores? Maybe we'll understand when we support #include
files.
TODO: Borrow Semaphore Functions from where?
- sem_init, sem_post, sem_wait
qsort isn't used right now. Maybe for the Linker later?
TODO: Borrow qsort from where?
- exit, qsort
Not used right now.
TODO: Borrow Time Functions from where?
- time, gettimeofday, localtime
Also not used right now.
TODO: Borrow Math Functions from where?
- ldexp
Varargs will be tricky to implement in Zig. Probably we should implement in C. Maybe MUSL?
Right now we're doing simple Pattern Matching. But it won't work for Real Programs.
- printf, snprintf, sprintf, vsnprintf
- sscanf
Will mock up these functions for WebAssembly. Maybe an Emulated Filesystem, similar to Emscripten File System?
- getcwd
- remove, unlink
Will mock up these functions for WebAssembly. Right now we read only 1 simple C Source File, and produce only 1 Object File. No header files, no libraries. And it works!
But later we might need an Emulated Filesystem, similar to Emscripten File System. And our File I/O code will support Multiple Files with proper Buffer Overflow Checks.
- open, fopen, fdopen,
- close, fclose
- fprintf, fputc, fputs
- read, fread
- fwrite
- fflush
- fseek, ftell, lseek
- puts
Borrow from foundation-libc and ziglibc
- atoi
- strcat, strchr, strcmp
- strncmp, strncpy, strrchr
- strstr, strtod, strtof
- strtol, strtold, strtoll
- strtoul, strtoull
- strerror