[AVRO-4019] [C++] Turn on even more compiler warnings #2966

Gerrit0 · 2024-06-22T00:49:08Z

What is the purpose of the change

Follow-up to #2931, which turned on -Wextra. This PR turns on more warnings (-Wconversion and -Wuseless-cast).

https://issues.apache.org/jira/browse/AVRO-4019

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Documentation

Does this pull request introduce a new feature? No
If yes, how is the feature documented? N/A

Fokko · 2024-06-26T14:59:56Z

Looks like it triggers some warnings:

/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:198:41: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  198 |         temp.push_back((checksum >> 24) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:199:41: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  199 |         temp.push_back((checksum >> 16) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:200:40: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  200 |         temp.push_back((checksum >> 8) & 0xFF);
      |                        ~~~~~~~~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:201:33: error: conversion from ‘int32_t’ {aka ‘int’} to ‘char’ may change value [-Werror=conversion]
  201 |         temp.push_back(checksum & 0xFF);
      |                        ~~~~~~~~~^~~~~~
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc: In function ‘avro::ValidSchema avro::makeSchema(const std::__debug::vector<unsigned char>&)’:
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:458:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  458 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~

Gerrit0 · 2024-06-28T01:27:44Z

Ah, Snappy ifdefs! I didn't have Snappy set up locally, so didn't see it, sorry about that! Fixed.
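
For reference, a minimal sketch of how a `-Wconversion` error like the ones above is typically silenced: make the narrowing explicit with a `static_cast`. The helper name `appendChecksumBE` and the use of `std::uint32_t` (rather than the `int32_t` in the log) are assumptions for illustration, not the code in DataFile.cc.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 32-bit checksum in big-endian byte order.
// The explicit static_cast<char> tells the compiler the narrowing is intended,
// which is what -Wconversion asks for.
inline void appendChecksumBE(std::vector<char> &out, std::uint32_t checksum) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        // Mask to one byte first, then narrow explicitly.
        out.push_back(static_cast<char>((checksum >> shift) & 0xFFu));
    }
}
```

Using an unsigned type here avoids right-shifting a negative value; whether that matches the surrounding code is not something this sketch can confirm.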

Fokko · 2024-06-28T04:58:56Z

lang/c++/api/LogicalType.hh

-    int precision() const { return precision_; }
-    void setScale(int scale);
-    int scale() const { return scale_; }
+    void setPrecision(int64_t precision);


I think for the scale and precision an int32 is more than enough. Precision probably won't go over 38 and scale not over 77 (or -77).

Fair enough, I've instead added a static_cast where we read this field.
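
A rough sketch of narrowing a 64-bit value read from a schema into an `int32_t`. The thread only mentions a `static_cast`; the range check and the function name `toInt32Checked` are additions for illustration and are not taken from the PR.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: narrow a 64-bit integer (e.g. parsed from JSON)
// to int32_t, rejecting values that would not survive the conversion.
inline std::int32_t toInt32Checked(std::int64_t v) {
    if (v < std::numeric_limits<std::int32_t>::min() ||
        v > std::numeric_limits<std::int32_t>::max()) {
        throw std::out_of_range("value does not fit in int32_t");
    }
    // The cast is explicit, so -Wconversion has nothing to report.
    return static_cast<std::int32_t>(v);
}
```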

Gerrit0 · 2024-06-29T00:55:38Z

lang/c++/impl/Node.cc

@@ -146,7 +146,7 @@ void Node::setLogicalType(LogicalType logicalType) {
            if (type_ == AVRO_FIXED) {
                // Max precision that can be supported by the current size of
                // the FIXED type.
-                long maxPrecision = floor(log10(2.0) * (8.0 * fixedSize() - 1));
+                int32_t maxPrecision = static_cast<int32_t>(floor(log10(2.0) * (8.0 * static_cast<double>(fixedSize()) - 1)));


@Fokko I didn't introduce this here, so I don't think I should fix it here... but I believe this calculation is incorrect.

According to the spec, the maximum precision should be:

$$ \lfloor \log_{10}(2^{8 \times n - 1} - 1) \rfloor $$

Rust is the only language implementation that appears to implement this correctly. Python, Java, and C++ all implement this as:

$$ \lfloor \log_{10}(2^{8 \times n - 1}) \rfloor $$

JavaScript appears to have an example of a DecimalType with this same issue, but it doesn't appear to be in any library code shipped by the library.
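
To compare the two formulas concretely, here is a small sketch that evaluates both with exact integer arithmetic by counting decimal digits. It only covers fixed sizes 1 to 8, since it relies on a 64-bit integer, and all function names are hypothetical. It does not model the floating-point rounding of the `log10(2.0) * ...` expression in Node.cc.

```cpp
#include <cstdint>

// Hypothetical helper: floor(log10(v)) for v >= 1, computed by digit counting.
inline int floorLog10(std::uint64_t v) {
    int result = 0;
    while (v >= 10) {
        v /= 10;
        ++result;
    }
    return result;
}

// Spec form: floor(log10(2^(8n - 1) - 1)); valid here for 1 <= n <= 8.
inline int maxPrecisionSpec(unsigned n) {
    return floorLog10((std::uint64_t{1} << (8 * n - 1)) - 1);
}

// Form without the "- 1": floor(log10(2^(8n - 1))); valid here for 1 <= n <= 8.
inline int maxPrecisionNoMinusOne(unsigned n) {
    return floorLog10(std::uint64_t{1} << (8 * n - 1));
}
```

The two results can differ only when 2^(8n - 1) is an exact power of ten, so running both over the supported sizes is a quick way to see whether the discrepancy shows up in practice.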

Gerrit0 · 2024-06-29T01:44:57Z

Out of curiosity I looked at the other open PRs; I believe this supersedes #1852 and #2300.

Fokko · 2024-06-29T05:47:21Z

@Gerrit0 Got it, thanks again for working on this. There seems to be one pending issue:

/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc: In function ‘avro::ValidSchema avro::makeSchema(const std::__debug::vector<unsigned char>&)’:
/home/runner/work/avro/avro/lang/c++/impl/DataFile.cc:458:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  458 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors

Gerrit0 · 2024-06-29T13:00:33Z

@Fokko fixed and rebased, hopefully... I can't seem to get GCC to report that error locally
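
The `-Wuseless-cast` pattern reported above, reduced to a sketch. `DemoSchema` is a hypothetical stand-in for `avro::ValidSchema`; the point is only that wrapping a value in a cast to its own type adds nothing, so returning it directly is the usual fix.

```cpp
// Hypothetical stand-in for a schema type; not the Avro class.
struct DemoSchema {
    int id;
};

inline DemoSchema makeDemoSchema() {
    DemoSchema vs{42};
    // Before: return DemoSchema(vs);  // cast to the type vs already has
    return vs;                         // after: return the object directly
}
```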

Fokko · 2024-07-08T16:02:58Z

@Gerrit0 same error, different line:

/home/runner/work/avro/avro/lang/c++/impl/Resolver.cc: In member function ‘virtual void avro::EnumParser::parse(avro::Reader&, uint8_t*) const’:
/home/runner/work/avro/avro/lang/c++/impl/Resolver.cc:310:16: error: useless cast to type ‘size_t’ {aka ‘long unsigned int’} [-Werror=useless-cast]
  310 |         assert(static_cast<size_t>(val) < mapping_.size());
      |                ^~~~~~~~~~~~~~~~~~~~~~~~

Gerrit0 · 2024-07-09T12:04:33Z

@Fokko hopefully fixed this time! Still can't seem to reproduce this locally... if this still doesn't fix it, I'll propose a change to the CI pipeline to print out the GCC version used so I can install that.

Fokko · 2024-07-10T08:32:53Z

New errors seem to pop up all the time. I'm not too familiar with the C++ build, but it would be good to list all the warnings before exiting the build. It looks like it is failing on the first warning right now.

Gerrit0 · 2024-07-10T12:24:54Z

make has a -k flag to do that. Unfortunately, CMake doesn't have a wrapper that delegates to the correct flag per platform. Since the CI uses makefiles, I've tried adding -k to build.sh; hopefully that works?

Also rebased on main after the move of avro headers to include/avro

Fokko · 2024-07-10T12:36:32Z

Almost there:

[ 63%] Building CXX object CMakeFiles/CodecTests.dir/test/CodecTests.cc.o
/home/runner/work/avro/avro/lang/c++/test/CodecTests.cc: In function ‘avro::ValidSchema avro::parsing::makeValidSchema(const char*)’:
/home/runner/work/avro/avro/lang/c++/test/CodecTests.cc:528:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  528 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/CodecTests.dir/build.make:76: CMakeFiles/CodecTests.dir/test/CodecTests.cc.o] Error 1
gmake[2]: Target 'CMakeFiles/CodecTests.dir/build' not remade because of errors.
gmake[1]: *** [CMakeFiles/Makefile2:826: CMakeFiles/CodecTests.dir/all] Error 2
[ 64%] Building CXX object CMakeFiles/StreamTests.dir/test/StreamTests.cc.o
[ 65%] Linking CXX executable StreamTests
[ 65%] Built target StreamTests
[ 66%] Building CXX object CMakeFiles/SpecificTests.dir/test/SpecificTests.cc.o
[ 67%] Linking CXX executable SpecificTests
[ 67%] Built target SpecificTests
[ 68%] Building CXX object CMakeFiles/DataFileTests.dir/test/DataFileTests.cc.o
/home/runner/work/avro/avro/lang/c++/test/DataFileTests.cc: In function ‘avro::ValidSchema makeValidSchema(const char*)’:
/home/runner/work/avro/avro/lang/c++/test/DataFileTests.cc:126:12: error: useless cast to type ‘class avro::ValidSchema’ [-Werror=useless-cast]
  126 |     return ValidSchema(vs);
      |            ^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/DataFileTests.dir/build.make:76: CMakeFiles/DataFileTests.dir/test/DataFileTests.cc.o] Error 1
gmake[2]: Target 'CMakeFiles/DataFileTests.dir/build' not remade because of errors.
gmake[1]: *** [CMakeFiles/Makefile2:904: CMakeFiles/DataFileTests.dir/all] Error 2

Gerrit0 · 2024-07-10T13:02:05Z

Hopefully all fixed now!

martin-g · 2024-07-10T14:10:14Z

The build on linux aarch64 still fails ...

Gerrit0 · 2024-07-11T01:11:10Z

I really wish I understood why different versions of gcc detect different things... I tried setting up a docker container with the same version used by CI, but ran into issues getting stuff to install, and haven't gotten back to it.

Gerrit0 · 2024-07-14T17:39:39Z

Well, at this point my theory is that the GCC 9.4 used on ARM (vs 11.4 in the other job and 14.1 on my box) doesn't have a properly constexpr'd std::numeric_limits... for makeString.... I guess the check just got smarter, so newer GCC knows it's impossible for that result to change the value? Either way, using INT_MAX there should make GCC 9.4 happy, and makeString can be simplified with a character table lookup. fingers crossed

martin-g · 2024-07-15T08:53:53Z

@mkmkme Do you have time to review this PR ?

mkmkme · 2024-07-15T08:54:53Z

@mkmkme Do you have time to review this PR ?

Yup, thanks for pinging me! I'll do this promptly

lang/c++/impl/Compiler.cc

mkmkme

Added some comments and nitpicks, please tell me if they make sense

lang/c++/impl/FileStream.cc

lang/c++/impl/Node.cc

lang/c++/impl/NodeImpl.cc

lang/c++/impl/parsing/ResolvingDecoder.cc

mkmkme · 2024-07-15T09:32:22Z

lang/c++/include/avro/buffer/BufferStreambuf.hh

@@ -135,7 +135,11 @@ protected:
                memcpy(c, gptr(), toCopy);
                c += toCopy;
                bytesCopied += toCopy;
-                gbump(toCopy);
+                while (toCopy > INT_MAX) {


I'd prefer using numeric_limits here as using macros can be fragile.

I did that first, switched to the macro when gcc 9.4 on the ARM agent was complaining about signedness... I can add a static_cast instead, hopefully newer compilers don't flag that as a useless cast.

Ah I see... thanks!
You can leave it as it is now and add a comment that this is only needed for compatibility with old gcc.

We do need to upgrade gcc on ARM runner at some point. FYI @martin-g, that's not the first time this one bites

ASF Infra team promised to upgrade the Github builders to a newer Ubuntu, but there is no timeframe ...
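
The chunked `gbump` discussed in this thread can be sketched as follows. `std::streambuf::gbump` takes an `int`, so a `size_t` count has to be fed to it in pieces no larger than `INT_MAX`. `DemoBuf` and its members are hypothetical and only illustrate the pattern; this is not the code in BufferStreambuf.hh.

```cpp
#include <climits>
#include <cstddef>
#include <streambuf>
#include <string>

// Hypothetical streambuf over a std::string, just enough to call gbump().
class DemoBuf : public std::streambuf {
public:
    explicit DemoBuf(std::string &s) {
        char *begin = &s[0];
        setg(begin, begin, begin + s.size());
    }

    // Advance the get pointer by n bytes. The caller must ensure that
    // n <= remaining(); gbump() itself performs no bounds checking.
    void skip(std::size_t n) {
        const std::size_t maxStep = static_cast<std::size_t>(INT_MAX);
        while (n > maxStep) {
            gbump(INT_MAX);
            n -= maxStep;
        }
        // n now fits in an int, so the explicit cast cannot change the value.
        gbump(static_cast<int>(n));
    }

    // Bytes left between the get pointer and the end of the get area.
    std::size_t remaining() const {
        return static_cast<std::size_t>(egptr() - gptr());
    }
};
```

With `std::numeric_limits<int>::max()` in place of `INT_MAX` the logic is the same; per the thread, the macro was kept only because GCC 9.4 on the ARM runner complained about the alternative.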

lang/c++/test/buffertest.cc

lang/c++/include/avro/Reader.hh

mkmkme · 2024-07-15T09:38:31Z

lang/c++/include/avro/Validator.hh

@@ -35,7 +35,7 @@ public:
    explicit NullValidator(const ValidSchema &) {}


This file also introduces user-facing signedness changes

True, I believe the new type is more appropriate here though, as a negative value isn't valid for any of these values... does this deserve a Jira issue?

I think yes. Also let's hear from @martin-g, as he knows the release cycle better.

But at least this should be properly documented as a breaking change.

1.12 is really close to being released!
The JIRA ticket is needed mostly for the CHANGELOG file, so users are aware of the change and can update their code.

I created https://issues.apache.org/jira/browse/AVRO-4019 to track this issue

martin-g · 2024-07-16T06:46:31Z

Thank you, @Gerrit0 !

github-actions bot added the C++ Pull Requests for C++ binding label Jun 22, 2024

Gerrit0 added a commit to Gerrit0/avro that referenced this pull request Jun 28, 2024

apache#2966: Turn on more compiler warnings

7fd3ea8

Fokko reviewed Jun 28, 2024

Gerrit0 commented Jun 29, 2024

Gerrit0 force-pushed the wconversion branch 2 times, most recently from ae81e42 to a2c7576 Compare June 29, 2024 13:00

Gerrit0 force-pushed the wconversion branch from a2c7576 to 86e7c47 Compare July 9, 2024 12:02

Gerrit0 force-pushed the wconversion branch from 86e7c47 to 483c239 Compare July 10, 2024 02:24

github-actions bot added the build label Jul 10, 2024

Gerrit0 added 3 commits July 10, 2024 06:17

[C++] Turn on -Wuseless-cast and -Wconversion

6df100c

Print versions of compiler/tools used in CI

8df517b

Fix another compiler warning

886598f

Also attempt to make build.sh continue

Gerrit0 force-pushed the wconversion branch from 21d0928 to 886598f Compare July 10, 2024 12:23

Fix the last of the compiler warnings

8452341

* Update CodecTests.cc * Update DataFileTests.cc

Fix conversion warning on ARM64

cfba715

Hopefully make GCC 9.4 happy

e88bfff

martin-g approved these changes Jul 15, 2024

mkmkme reviewed Jul 15, 2024

mkmkme suggested changes Jul 15, 2024

[C++] Address review comments

94ec8a9

Gerrit0 changed the title ~~[C++] Turn on even more compiler warnings~~ [AVRO-4019] [C++] Turn on even more compiler warnings Jul 16, 2024

martin-g merged commit c460d64 into apache:main Jul 16, 2024
4 checks passed
