Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AVRO-4019] [C++] Turn on even more compiler warnings #2966

Merged
merged 7 commits into from
Jul 16, 2024

Conversation

Gerrit0
Copy link
Contributor

@Gerrit0 Gerrit0 commented Jun 22, 2024

What is the purpose of the change

Followup to #2931, which turned on -Wextra, with this PR turning on more warnings

https://issues.apache.org/jira/browse/AVRO-4019

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Documentation

  • Does this pull request introduce a new feature? No
  • If yes, how is the feature documented? N/A

@github-actions github-actions bot added the C++ Pull Requests for C++ binding label Jun 22, 2024
@Fokko
Copy link
Contributor

Fokko commented Jun 26, 2024

Looks like it triggers some warnings:

/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:198:41: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  198 |         temp.push_back((checksum >> 24) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:199:41: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  199 |         temp.push_back((checksum >> 16) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:200:40: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  200 |         temp.push_back((checksum >> 8) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:201:33: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  201 |         temp.push_back(checksum & 0xFF);
      |                        ~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc: In function ‘avro::ValidSchema avro::makeSchema(const std::__debug::vector<unsigned char>&)’:
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:[45](https://github.com/apache/avro/actions/runs/9621526515/job/26619414933?pr=2966#step:6:46)8:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  458 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~

Gerrit0 added a commit to Gerrit0/avro that referenced this pull request Jun 28, 2024
@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jun 28, 2024

Ah, Snappy ifdefs! I didn't have Snappy set up locally, so didn't see it, sorry about that! Fixed.

int precision() const { return precision_; }
void setScale(int scale);
int scale() const { return scale_; }
void setPrecision(int64_t precision);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the scale and precision a int32 is more than enough. Precision probably won't go over 38 and scale not over 77 (or -77).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I've instead added static_cast to where we read this field

@@ -146,7 +146,7 @@ void Node::setLogicalType(LogicalType logicalType) {
if (type_ == AVRO_FIXED) {
// Max precision that can be supported by the current size of
// the FIXED type.
long maxPrecision = floor(log10(2.0) * (8.0 * fixedSize() - 1));
int32_t maxPrecision = static_cast<int32_t>(floor(log10(2.0) * (8.0 * static_cast<double>(fixedSize()) - 1)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko I didn't introduce this here, so I don't think I should fix it here... but I believe this calculation is incorrect.

According to the spec, the maximum precision should be:

$$ floor(log_{10}(2^{8 \times n - 1} - 1)) $$

Rust is the only language implementation that appears to implement this correctly. Python, Java, and C++ all implement this as:

$$ floor(log_{10}(2^{8 \times n - 1})) $$

JavaScript appears to have an example of a DecimalType with this same issue, but it doesn't appear to be in any library code shipped by the library.

@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jun 29, 2024

Out of curiosity I looked at the other open PRs, I believe this supersedes #1852 and #2300.

@Fokko
Copy link
Contributor

Fokko commented Jun 29, 2024

@Gerrit0 Got it, thanks again for working on this. There seems to be one pending issue:

/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc: In function ‘avro::ValidSchema avro::makeSchema(const std::__debug::vector<unsigned char>&)’:
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:458:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  458 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors

@Gerrit0 Gerrit0 force-pushed the wconversion branch 2 times, most recently from ae81e42 to a2c7576 Compare June 29, 2024 13:00
@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jun 29, 2024

@Fokko fixed and rebased, hopefully... I can't seem to get GCC to report that error locally

@Fokko
Copy link
Contributor

Fokko commented Jul 8, 2024

@Gerrit0 same error, different line:

/home/runner/work/avro/avro/lang/c++/impl/Resolver.cc: In member function ‘virtual void avro::EnumParser::parse(avro::Reader&, uint8_t*) const’:
/home/runner/work/avro/avro/lang/c++/impl/Resolver.cc:310:16: error: useless cast to type ‘size_t’ {aka ‘long unsigned int’} [-Werror=useless-cast]
  310 |         assert(static_cast<size_t>(val) < mapping_.size());
      |                ^~~~~~~~~~~~~~~~~~~~~~~~

@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jul 9, 2024

@Fokko hopefully fixed this time! Still can't seem to reproduce this locally... if this still doesn't fix it, I'll propose a change to the CI pipeline to print out the GCC version used so I can install that.

@Fokko
Copy link
Contributor

Fokko commented Jul 10, 2024

New errors seem to pop up all the time. I'm not too familiar with the C++ build, but it would be good to list all the warnings before exiting the build. It looks like it is failing on the first warning right now.

@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jul 10, 2024

make has a -k flag to do that, unfortunately CMake doesn't have a wrapper to delegate to the correct one per platform. Since the CI uses makefiles, I've tried adding -k to build.sh, hopefully that works?

Also rebased on main after the move of avro headers to include/avro

@Fokko
Copy link
Contributor

Fokko commented Jul 10, 2024

Almost there:

[ 63%] Building CXX object CMakeFiles/CodecTests.dir/test/CodecTests.cc.o
/home/runner/work/avro/avro/lang/c++/test/CodecTests.cc: In function ‘avro::ValidSchema avro::parsing::makeValidSchema(const char*)’:
/home/runner/work/avro/avro/lang/c++/test/CodecTests.cc:528:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  528 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/CodecTests.dir/build.make:76: CMakeFiles/CodecTests.dir/test/CodecTests.cc.o] Error 1
gmake[2]: Target 'CMakeFiles/CodecTests.dir/build' not remade because of errors.
gmake[1]: *** [CMakeFiles/Makefile2:826: CMakeFiles/CodecTests.dir/all] Error 2
[ 64%] Building CXX object CMakeFiles/StreamTests.dir/test/StreamTests.cc.o
[ 65%] Linking CXX executable StreamTests
[ 65%] Built target StreamTests
[ 66%] Building CXX object CMakeFiles/SpecificTests.dir/test/SpecificTests.cc.o
[ 67%] Linking CXX executable SpecificTests
[ 67%] Built target SpecificTests
[ 68%] Building CXX object CMakeFiles/DataFileTests.dir/test/DataFileTests.cc.o
/home/runner/work/avro/avro/lang/c++/test/DataFileTests.cc: In function ‘avro::ValidSchema makeValidSchema(const char*)’:
/home/runner/work/avro/avro/lang/c++/test/DataFileTests.cc:126:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  126 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/DataFileTests.dir/build.make:76: CMakeFiles/DataFileTests.dir/test/DataFileTests.cc.o] Error 1
gmake[2]: Target 'CMakeFiles/DataFileTests.dir/build' not remade because of errors.
gmake[1]: *** [CMakeFiles/Makefile2:904: CMakeFiles/DataFileTests.dir/all] Error 2

* Update CodecTests.cc

* Update DataFileTests.cc
@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jul 10, 2024

Hopefully all fixed now!

@martin-g
Copy link
Member

The build on linux aarch64 still fails ...

@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jul 11, 2024

I really wish I understood why different versions of gcc detect different things... I tried setting up a docker container with the same version used by CI, but ran into issues getting stuff to install, and haven't gotten back to it.

@Gerrit0
Copy link
Contributor Author

Gerrit0 commented Jul 14, 2024

Well, at this point my theory is that the GCC 9.4 used on ARM (vs 11.4 in the other job and 14.1 on my box) doesn't have a properly constexpr'd std::numeric_limits... for makeString.... I guess the check just got smarter, so newer GCC knows it's impossible for that result to change the value? Either way, using INT_MAX there should make GCC 9.4 happy, and makeString can be simplified with a character table lookup. fingers crossed

@martin-g
Copy link
Member

@mkmkme Do you have time to review this PR ?

@mkmkme
Copy link
Contributor

mkmkme commented Jul 15, 2024

@mkmkme Do you have time to review this PR ?

Yup, thanks for pinging me! I'll do this promptly

Copy link
Contributor

@mkmkme mkmkme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addeed some comments and nitpicks, please tell me if it makes sense

lang/c++/impl/FileStream.cc Outdated Show resolved Hide resolved
lang/c++/impl/Node.cc Outdated Show resolved Hide resolved
lang/c++/impl/NodeImpl.cc Outdated Show resolved Hide resolved
lang/c++/impl/NodeImpl.cc Outdated Show resolved Hide resolved
lang/c++/impl/parsing/ResolvingDecoder.cc Show resolved Hide resolved
@@ -135,7 +135,11 @@ protected:
memcpy(c, gptr(), toCopy);
c += toCopy;
bytesCopied += toCopy;
gbump(toCopy);
while (toCopy > INT_MAX) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer using numeric_limits here as using macros can be fragile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that first, switched to the macro when gcc 9.4 on the ARM agent was complaining about signedness... I can add a static_cast instead, hopefully newer compilers don't flag that as a useless cast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see... thanks!
You can leave it as it is now and add comment that this is only needed for old gcc compatibility.

We do need to upgrade gcc on ARM runner at some point. FYI @martin-g, that's not the first time this one bites

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASF Infra team promised to upgrade the Github builders to a newer Ubuntu, but there is no timeframe ...

lang/c++/test/buffertest.cc Outdated Show resolved Hide resolved
lang/c++/include/avro/Reader.hh Show resolved Hide resolved
@@ -35,7 +35,7 @@ public:
explicit NullValidator(const ValidSchema &) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file also introduces user-facing sign-ness changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I believe the new type is more appropriate here though, as a negative value isn't valid for any of these values... does this deserve a Jira issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes. Also let's hear from @martin-g as he knows better the release cycle.

But at least this should be properly documented as a breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.12 is really close to be released!
The JIRA ticket is needed mostly for the CHANGELOG file, so users are aware of the change and can update their code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created https://issues.apache.org/jira/browse/AVRO-4019 to track this issue

@Gerrit0 Gerrit0 changed the title [C++] Turn on even more compiler warnings [AVRO-4019] [C++] Turn on even more compiler warnings Jul 16, 2024
@martin-g martin-g merged commit c460d64 into apache:main Jul 16, 2024
4 checks passed
@martin-g
Copy link
Member

Thank you, @Gerrit0 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build C++ Pull Requests for C++ binding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants