Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: implement a timeout when running formatter #139

Open
ccoVeille opened this issue Apr 1, 2024 · 10 comments
Open

Feature request: implement a timeout when running formatter #139

ccoVeille opened this issue Apr 1, 2024 · 10 comments
Labels
enhancement New feature or request

Comments

@ccoVeille
Copy link
Contributor

Your code launches formatter on each code snippet.

Based on what I can see the code is extracting the code snippet, look for the header, then call the formatter according to the one available

I was struggling with some files formating that were taking 700s to be formatted.

When I moved the file in /tmp/ it was instant.

I used strace to identify the issue. At first I thought the code was looking for tons of files because of a bug in discovering foldes.

It appears the code calls npx to launch formatter such as sqlformatter

npx relies on node_modules, so when the files were in a repository that needs a private npm registry, AND my VPN is down. It will try over and over to reach the npm registry that cannot be reached.

When I switched my VPN on, the mdsf was able to format instantly.

TL;DR; This should be considered as a source of problem. I think you should implement at timeout setting (that maybe configured) this was when formatting a file you could stop processing and inform a user there were an error.

@hougesen
Copy link
Owner

hougesen commented Apr 1, 2024

This makes a lot of sense. I'll look into it 🚀

@hougesen hougesen added the enhancement New feature or request label Apr 21, 2024
@hougesen
Copy link
Owner

Okay, so I have been looking into this a bit.

I cannot see an easy way to implement a timeout, without having to refactor large parts of the tool.

Currently the way the formatter works is by creating a temporary file for each code block and then invoking each formatter sequentially on the file. In the end the file is read and the markdown file is updated.

By calling sigkill on an invoked formatter mid process we risk the formatter being stopped during a write operation, which could leave the code in a broken state.

The only way I can see to make a timeout work, is by converting the formatters to instead rely on stdin/stdout (Something like cat $FILENAME | stylua -), but that would add overhead since each codesnippet will be written to stdin/stdout multiple times. It also has the added problem of dealing with formatters that don't support stdin as input.

@ccoVeille
Copy link
Contributor Author

Ouch indeed. I might be easy

What about a global timeout then? I mean not the lintee level and display error information.

But, it might be the exact way you thought about abd I didn't get it 😁

@ccoVeille
Copy link
Contributor Author

Thanks for checking anyway

@hougesen
Copy link
Owner

A global timeout could work!

I’ll take a look at it once I have some spare time 😃

@ccoVeille
Copy link
Contributor Author

Because my issue was your code npx so my VPN protected npm repository, but I wasn't on the VPN and without timeout I was waiting for nothing.

BTW, I think I saw you refactored and now local binary are tried before using an npx version

@hougesen
Copy link
Owner

BTW, I think I saw you refactored and now local binary are tried before using an npx version

Yes, local binaries are ran before trying npx/bunx/deno in most cases.

I have been looking into a global timeout and it is actually blocked by the multithreading implementation in #330 since Rust does not support stopping/killing native threads 😅

I am going to benchmark the difference between using native (std::thread) and green threads (tokio::task) to see how big a difference it makes, if any.

I am currently expecting file io performance to be worse when using Tokio (async runtime for Rust) as described in tokio-rs/tokio/issues/3664.

I do not expect the performance of difference of command spawning to have any big impact since we are dealing with such a small amount of processes. I did find the finding of Jakub Beránek in Process spawning performance in Rust rather interesting though.

@hougesen
Copy link
Owner

hougesen commented Jun 27, 2024

It looks like the difference between native and green threads is pretty negligible. Native threads did feel a lot smoother to look at, and impacted the overall performance of my computer a lot less. For that reason I am leaning towards not switching to green threads, and putting the timeout argument on hold.

I ran both on rust-lang/rfcs which has a lot of markdown files.

rust-lang/rfcs file breakdown

Left side is native, right is green threads.

  • mdsf format

mdsf format

  • mdsf format --cache with no cache

mdsf format --cache without cache

  • mdsf format --cache with cache from previous operation

mdsf format --cache with cache

  • mdsf verify

mdsf verify

@hougesen
Copy link
Owner

The async implementation can be tested here https://github.com/hougesen/mdsf/tree/refactor/async-runtime

This is also not a good sign for the async implementation 😅

image

@ccoVeille
Copy link
Contributor Author

I'm impressed by the benchmark you made ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants