-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up identifier detection #89
base: main
Are you sure you want to change the base?
Conversation
The original code in function 'read_punct' relies on heavy string specific function calls, resulting in slower execution. Instead, this function can be faster using straightforward control flow.
Thank you for the patches. Honestly performance hasn't been a priority, I consider the lexer less significant in the total pipeline compared to cache-misses from leaking and the overhead of external assembler. I do have some ideas as well, so a branch for performance experiments should make sense. In terms of performance I want to ensure a common basis to work on: which projects to benchmark and what kind of environment, is it self-hosted, another simple C compiler or full-on gcc/clang optimizations with custom allocator etc. Do you have a preference? |
On the pull request: I think the short-circuiting of ASCII characters can be done earlier, skipping |
Using uftrace, I identified the performance bottlenecks in the compilation process. Here's what I found: --- a/GNUmakefile
+++ b/GNUmakefile
@@ -1,4 +1,4 @@
-CFLAGS=-std=c99 -g -fno-common -Wall -pedantic -Wno-switch
+CFLAGS=-std=c99 -g -pg -fno-common -Wall -pedantic -Wno-switch
SRCS=$(wildcard *.c)
OBJS=$(SRCS:.c=.o) After running
Based on this analysis, I tweaked the |
By the way, I am working with my students to enhance the shecc project, a self-compiling C compiler with some optimizations. Since shecc only implements a subset of C, we are planning to modify slimcc to output compatible C code that shecc can process, rather than having slimcc generate x86-64 assembly directly. This way, slimcc would serve as a full C language frontend for shecc, while allowing shecc to focus on optimization and backend support. To achieve this, we plan to contribute improvements to slimcc's parsing, tokenization, and preprocessing components. |
I'm thrilled to know that the project has some real world value and am happy to target different backends as long as the IR is stable, but I imagine memory usage will be a major pain point on embedded targets that shecc currently self-hosts on. I thought about re-architecting, but then the code wouldn't be as accessible and one might as well work on other more optimized C frontend like tcc, cproc, tyfkda/xcc, libfirm/cparser etc. If C++ is allowed in your classroom, I do have an unfinished C++ port sitting around that use RAII for nearly identical code with much better memory consumption, a minimal C++ front-end bootstrappable with C99 is also something I've always wanted to work on. |
The C++ version is uploaded in If that particular style is okay to you, I'd prefer basing performance optimizations on that version, since it feels a bit bike shedding to optimize parsing while compiling larger projects hit OOM on an embedded device. |
Speed up recognize punctuation
I defer this pull request to @ChAoSUnItY, who is assigned to proceed with the above research prototype under my supervision. |
Previously, HashMap relied on division operations, which are expensive. This update replaces divisions with logical AND operations using an additional mask, improving performance.
Avoid expensive division in HashMap
I believe identifier detection is of decent speed after efcbb85, feel free to open new issues regarding the C backend. Btw thanks for mentioning uftrace, logging exact call counts is incredible. Can't believe it's not more well known. |
In 'preprocess2' function, 'expand_macro' was called multiple times, some of which were unnecessary. This change ensures that 'expand_macro' is called only when the preprocessor token is an identifier.
Expand macro only when preprocessor token is identifier
I've adopted identifier reading logic from efcbb85 and further refine it with the macro functions introduced previously in this PR, and also applied fool-proofing logic to the function. This way, it improves readability. |
892c6b6
to
7c97de8
Compare
Let's rebase the latest |
d4f5463
to
644be33
Compare
Both 'is_ident1' and 'is_ident2' are now macros instead of function calls, and they are tweaked for ASCII detection in advance with the fallback to table lookup for non-ASCII characters. Also discard unnecessary variable and add safe guard: - Variable is_first can be replaced with boolean expression "p == start" - Add safe guard for ascii character checking to ensure starting identifier character must not be numberic Additionally, replace first ascii character check with macro is_ident2_ascii to keep readability. The later is_ident2 function call is replaced with is_ident2_non_ascii because the expanded macro function will result in multiple decode_utf8 function call, also it's redundant to check if it's an ascii character or not. Co-authored-by: Jim Huang <[email protected]>
644be33
to
4b34032
Compare
d4283bc
to
c10050f
Compare
Both 'is_ident1' and 'is_ident2' are now macros instead of function calls, and they are tweaked for ASCII detection in advance with the fallback to table lookup for non-ASCII characters.