Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Establishing Coding Standards for lind-wasm #79

Open
Yaxuan-w opened this issue Jan 7, 2025 · 3 comments
Open

Establishing Coding Standards for lind-wasm #79

Yaxuan-w opened this issue Jan 7, 2025 · 3 comments
Labels
documentation Improvements or additions to documentation help wanted Extra attention is needed

Comments

@Yaxuan-w
Copy link
Member

Yaxuan-w commented Jan 7, 2025

While working on adding new syscalls to lind-wasm and discussing with Qianxi, I noticed that due to the code being contributed by multiple people, there is some confusion regarding the specific code structure and formatting in the codebase. Different approaches have been used, and it’s not always consistent. Additionally, after discussing with Nick, we currently lack documentation outlining the coding standards for lind-wasm. Considering that we will have new contributors joining in the future, I think it would be helpful to have a discussion about what we would like the final coding standards to be and add them to documentation for lind-wasm.

Current solution for adding new syscalls

On the rawposix side:

dispatcher.rs:

  • Add an arbitrary call number that should align with the one used in glibc.
  • Add type conversions and redirections to the actual syscall implementation in rawposix.

On the glibc side:

glibc/include/unistd.h: Modify visibility if necessary.

sysdeps/unix/syscall.list: Modify strong and weak names if necessary.

sysdeps/unix/sysv/linux/

  • Find the corresponding .c implementation (if the syscall is a public API).
  • Replace the syscall name with the _libc prefix.

Key Considerations

1. Use of __libc_ Prefix:

Glibc uses the __GI_ prefix to distinguish syscalls as part of its public API. I noticed that in previous reimplementations of syscalls in lind-wasm, we used the __libc_ prefix. Since I decided to expose readlinkat as public API as readlink and for consistency in the code base, I chose to adjust the naming accordingly (with the __libc_ prefix on both implementations). I’d like to confirm if this naming convention aligns with our long-term goals for the project.

2. Inclusion of sys-cancel.h:

While reviewing implementations of syscalls (e.g., open), I noticed that many syscalls include sys-cancel.h even those that are not directly blocking. From my understanding, this header is primarily used for handling thread cancellation. To maintain consistency and handle potential edge cases, I included sys-cancel.h in my implementation. I'd appreciate feedback on whether this is the best approach.

3. Which syscall.list to Modify:

From my understanding:

sysdeps/unix/syscalls.list defines general Unix syscall that glibc supports
sysdeps/unix/sysv/linux/syscalls.list adds Linux-specific extensions or overrides the former.

According to this, there are multiple approaches for exposing syscalls as public APIs. Below is the reasoning behind my thoughts.

In previous glibc implementation, readlinkat wasn’t directly exposed as a public API but was invoked via readlink (as seen with the __GI_ naming convention and original glibc source code). I modified sysdeps/unix/syscalls.list to support exposing both readlink and readlinkat as public APIs for lind-wasm.

Considering whether sysdeps/unix/sysv/linux/syscalls.list might overwrite these changes, I think the weak alias weak_alias(__libc_readlinkat, readlinkat) in readlinkat.c should bind readlinkat to __libc_readlinkat and prevent overwriting by original linux syscall.list. However, I’m not entirely certain if modifying only sysdeps/unix/sysv/linux/syscalls.list could achieve the same effect. I would appreciate any clarification on this matter.

@Yaxuan-w Yaxuan-w added documentation Improvements or additions to documentation help wanted Extra attention is needed labels Jan 7, 2025
@qianxichen233
Copy link
Contributor

qianxichen233 commented Jan 7, 2025

  1. The real file for syscalls may not always be inside sysdeps/unix/sysv/linux/. It could be inside some other folders like io. A syscall may have several file defined it but make system only used one of it. A large portion of syscalls are indeed inside sysdeps/unix/sysv/linux/ but I have seen a lot of exceptions. Currently the best approach to find the real file for a specific syscall is still using lindtool cpsrc [filename] -p, which works by looking up the makefile log and find the correct file location.
  2. For the glibc/include/unistd.h, not all the syscalls are defined inside here. Some syscalls like sockets are defined in socket.h, so we need to look for the correct header file for the syscall.
  3. syscall.list is a good point and I haven't noticed before. It looks like readlinkat is the first syscall I've encountered that is not exposed to public by default in glibc. So I think it should be pretty rare that we need to modify syscall.list but this is definitely a good instruction to fix the issue when we encountered undefined symbol error even the file is correctly compiled.
  4. And for the sys-cancel.h, realistically, I think whether we need to add this depends on the actual syscall implementation. If the syscall is just a direct call down to rawposix (e.g. just one line of MAKE_SYSCALL), I believe including sys-cancel.h is not nescessary and will just increase the code size. I think only syscalls that maintained part of its original glibc implemention that possibly used thread_cancel somewhere (e.g. via something like #ifdef USE_THREAD_CANCEL) is nescessary to add the header.

@rennergade
Copy link
Contributor

  1. The real file for syscalls may not always be inside sysdeps/unix/sysv/linux/. It could be inside some other folders like io. A syscall may have several file defined it but make system only used one of it. A large portion of syscalls are indeed inside sysdeps/unix/sysv/linux/ but I have seen a lot of exceptions. Currently the best approach to find the real file for a specific syscall is still using lindtool cpsrc [filename] -p, which works by looking up the makefile log and find the correct file location.

    1. For the glibc/include/unistd.h, not all the syscalls are defined inside here. Some syscalls like sockets are defined in socket.h, so we need to look for the correct header file for the syscall.

    2. syscall.list is a good point and I haven't noticed before. It looks like readlinkat is the first syscall I've encountered that is not exposed to public by default in glibc. So I think it should be pretty rare that we need to modify syscall.list but this is definitely a good instruction to fix the issue when we encountered undefined symbol error even the file is correctly compiled.

    3. And for the sys-cancel.h, realistically, I think whether we need to add this depends on the actual syscall implementation. If the syscall is just a direct call down to rawposix (e.g. just one line of MAKE_SYSCALL), I believe including sys-cancel.h is not nescessary and will just increase the code size. I think only syscalls that maintained part of its original glibc implemention that possibly used thread_cancel somewhere (e.g. via something like #ifdef USE_THREAD_CANCEL) is nescessary to add the header.

@yzhang71 Let's add all of this to the libc section of the docs.

@JustinCappos
Copy link
Member

On the rawposix side:

dispatcher.rs:

  • Add an arbitrary call number that should align with the one used in glibc.

Are you implying there would be a single system call receiver where you then split out into different system calls? If so, you don't need this. The entry in the 3i table will go to the correct system call function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants