Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pass a char buffer to simplecpp instead of a stream #6379

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

firewave
Copy link
Collaborator

@firewave firewave commented May 3, 2024

No description provided.

@firewave

This comment was marked as resolved.

@firewave

This comment was marked as outdated.

@firewave
Copy link
Collaborator Author

The callgrind CI job shows a small reduction in Ir: 64,070,509,276 -> 63,862,736,388.

@firewave

This comment was marked as outdated.

@firewave

This comment was marked as resolved.

@firewave firewave force-pushed the charbuf branch 2 times, most recently from 10eaa80 to 0b40ad2 Compare July 15, 2024 08:10
@firewave

This comment was marked as outdated.

Copy link
Owner

@danmar danmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the interface that we pass the istream to simplecpp. it makes the interface more strict. a const char* parameter can be a filename, code, configuration, etc.

cppcheck.check(FileWithDetails("test.cpp"), code);
template<std::size_t size>
void run(const char (&code)[size]) {
cppcheck.check(FileWithDetails("test.cpp"), reinterpret_cast<const uint8_t*>(code), size-1);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the buffer that we pass to this function is always 4000 bytes right? But the actual code is much shorter typically. I think it would make sense to pass only the data until the terminating zero to simplecpp.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that definitely felt off when I took a final look before I made it ready for review.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually changes the behavior since before it was doing implicitly doing strlen() on the input and now we provide a fixed size.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

lib/cppcheck.cpp Outdated
@@ -562,10 +562,9 @@ unsigned int CppCheck::check(const FileWithDetails &file)
return checkFile(file, emptyString);
}

unsigned int CppCheck::check(const FileWithDetails &file, const std::string &content)
unsigned int CppCheck::check(const FileWithDetails &file, const uint8_t* data, std::size_t size)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this interface is less safe.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could keep the std::string version for convenience. I got rid of it to make sure we always specify a buffer and because it was unused now.

@danmar
Copy link
Owner

danmar commented Sep 29, 2024

The callgrind CI job shows a small reduction in Ir: 64,070,509,276 -> 63,862,736,388.

Please let's take that with a grain of salt. It's the instruction count not a time estimate.

@danmar
Copy link
Owner

danmar commented Sep 29, 2024

Please let's take that with a grain of salt. It's the instruction count not a time estimate.

some kind of small benchmark that measures actual time would be interesting.

@firewave
Copy link
Collaborator Author

some kind of small benchmark that measures actual time would be interesting.

I will provide additional values. This will also help with value evaluation stuff which is not executed in that job. I just should it to highlight that it even has a measurable impact without that.

@firewave
Copy link
Collaborator Author

I like the interface that we pass the istream to simplecpp. it makes the interface more strict. a const char* parameter can be a filename, code, configuration, etc.

That just moves the potential mistakes to a different layer. You could still pass unterminated data to the stream or improperly size your buffer.

If we were using newer standards we could provide std::string_view and std::span overloads in the interface and deprecate the raw pointer one. I will file a ticket about that later. But those are also prone to error though because of lifetime stuff. None of the possible options appears to be 100% safe if you don't know what you are doing.

@danmar
Copy link
Owner

danmar commented Sep 29, 2024

That just moves the potential mistakes to a different layer.

Sorry I feel we talk about different things. I talked about type safety and you about bounds safety.

So ok we can still have bounds safety problems with the std::istream.

But developers sometimes mix up the parameters and with a const char* there is nothing that prevents that for instance the filename is passed by mistake to that parameter.

@firewave
Copy link
Collaborator Author

But developers sometimes mix up the parameters and with a const char* there is nothing that prevents that for instance the filename is passed by mistake to that parameter.

My idea would be to deprecate the raw pointers when you have the later standard wrappers. As I said I will file a ticket about it and try to address it in this dev cycle.

I also do not know how many people use the interface since we modified those several times and never get any reports about it so I assume those user might be non-existent.

@firewave
Copy link
Collaborator Author

a const char* parameter can be a filename, code, configuration, etc.

You should check the documentation before you use it but point taken.

Maybe we could use more descriptive names like Buffer or FromFile suffixes. That would also make sense for the TokenList::createTokens() functions.

I would do the function names in this PR and do the safer interface function in a follow-up since that needs some looking into.

@firewave
Copy link
Collaborator Author

Maybe we could use more descriptive names like Buffer or FromFile suffixes. That would also make sense for the TokenList::createTokens() functions.

I did that - please have a look.

@firewave
Copy link
Collaborator Author

I might also still need to add convenience stream versions to the interface. And we would need tests for that - I am not so sure anymore about adding essentially dead code.

@danmar
Copy link
Owner

danmar commented Oct 3, 2024

I just don't like this. A istream parameter is better than 2 const uint8_t*, int size parameters imo. An istream is also more explicit than a string_view or a span.

@firewave
Copy link
Collaborator Author

firewave commented Oct 3, 2024

I just don't like this. A istream parameter is better than 2 const uint8_t*, int size parameters imo. An istream is also more explicit than a string_view or a span.

I will add the other safer versions in this PR as well and try to find a way to opt into the unsafe version so it can still be used internally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants