[BUG] NDK r28 beta1: issues with existing polyfills #2081
Comments
In gettext libintl, the mempcpy related errors also include:
|
Ooh, if vcpkg's CI has issues that gives us a great place to go look at the scope of the problem. Thanks! I should remember to try that whenever I'm making ambitious changes in the future. We'll have a look and see what can be done. My gut says that we'll re-hide the stuff that's causing problems for r28 and work on upstreaming fixes. If it were a shorter list of projects I'd say we should just fix those projects and ship r28 as-is, but if it's hit a non-trivial number of projects just in vcpkg, that's a lot of patches to upstream, and a lot of package updates that developers would need to take to get those fixes. A complete rollback is an option, but it's sort of the last resort. For the reasons mentioned in the changelog, this is a step in the right direction that we want to take, so I'd like to see us make progress wherever we can do so safely. I'm okay with gradual progress though, so if we need to re-hide the stuff that's broadly polyfilled already we can do that to avoid breaks while we fix the various open source projects to be compatible with the change. |
How did you do this? I was expecting to find a GitHub workflow for this, or instructions on how to build all the ports in their docs, but I'm not finding either. |
I eventually figured this out: |
This is very slow though so if there's a smarter way to do this please lmk. |
i agree with danalbert's plan to quickly fix all this by just adding ...but for the longer-term solution, here's a quick brain dump in case i go under a bus:
i don't really see any way to polyfill any of the unlocked stuff unless we re-expose the FILE implementation details. which maybe we should, but (a) that's a big decision and (b) we should definitely see what the wasm plan looks like in that area first. (since they might be the first actual beneficiaries of the increased opacity.)
large but doable, modulo similar FILE-internals leakage.
trivial one-liner. we should probably have done this as a polyfill in the first place and stopped there given that memcpy() can even be optimized to direct loads/stores, but mempcpy() is definitely going to cost you two function calls. where it should really just cost you an add.
so believe it or not, mblen() is a trivial wrapper for mbrlen() which is freely available. so surprisingly, this should be a trivial polyfill too.
this might seem like an obvious "no", but it's one where i wonder whether this is actually a great candidate for "let's just polyfill all the time". why? because Android has hard-coded answers to all the questions. and because every call i've ever seen asks a hard-coded question. so i'd expect that clang would be able to turn
the reliance on globals here makes this a "no".
even our limited set of supported encodings is probably too big for this to make sense.
as long as we don't care about not having the optimized arm64 implementation (which is kind of a moot point if your alternative is "nothing", as here), there's a trivial implementation of this.
certainly doable, though it's a weak symbol so that native bridge can override it, so doing so would likely cause a bit of trouble there.
nightmare.
this doesn't seem to touch any internal pthread state, and it's all relatively self-contained and based on <stdatomic.h>, so i think this is something we could do ... it would leak the currently-private internal implementation of pthread_barrier_t, though i suspect we could always cheat there by giving the polyfill struct and functions slightly different names?
conceptually easy, but really annoying as long as we're still supporting ILP32/32-bit off_t. which, realistically, is something we'll be doing for longer than we're still supporting API 24. certainly doable though. just gross.
getentropy() relies on getrandom() under the covers, and although that's been around since 3.17, it was only added to bionic at the end of 2017, so only 2018+ devices will have a seccomp policy that will let you use these. TL;DR: not likely. |
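The two items in the brain dump above that are described as trivial, mempcpy() and mblen(), might look roughly like this as header-only polyfills. This is a minimal sketch under stated assumptions, not bionic's actual code: the `sketch_` names are hypothetical (chosen so they cannot collide with any existing declaration), and the mblen() wrapper assumes a stateless multibyte encoding.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical name: a real polyfill would have to pick a name that cannot
 * clash with a libc or gnulib declaration of mempcpy(). */
static inline void *sketch_mempcpy(void *dst, const void *src, size_t n) {
    /* memcpy() returns dst; mempcpy() returns the byte just past the copy,
     * so the only extra work is one pointer add. */
    return (char *)memcpy(dst, src, n) + n;
}

/* Hypothetical name: mblen() expressed as a thin wrapper over mbrlen(). */
static inline int sketch_mblen(const char *s, size_t n) {
    static mbstate_t state; /* private conversion state for this wrapper */
    if (s == NULL) {
        /* Reset the state and report "no state-dependent encodings".
         * Assumption: the platform's encoding is stateless. */
        memset(&state, 0, sizeof(state));
        return 0;
    }
    size_t len = mbrlen(s, n, &state);
    if (len == (size_t)-1 || len == (size_t)-2) {
        /* Invalid or incomplete sequence: mblen() only has -1 for both. */
        memset(&state, 0, sizeof(state));
        return -1;
    }
    return (int)len;
}
```

Because both helpers are `static inline`, each translation unit gets its own copy and nothing is exported, which is what makes this kind of polyfill safe to ship in a header.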
It is a draft PR, microsoft/vcpkg#41293. It is in the timeline here. The failure logs are available as "assets" from Azure Pipelines.
Basically yes, but vcpkg CI also skips known failures with
Of course ... vcpkg builds debug+release, and you start with empty caches for assets and artifacts. You can look at individual ports with Maybe you can limit the build to the release config with |
I definitely looked for PRs... I must have accidentally done that in the wrong tab with a different repo. Thanks.
Ah, okay. If this is just a thing that's slow, alright. |
Note that when the Microsoft team added Android, they didn't fix all the existing ports. scripts/ci.baseline.txt really must be applied. Otherwise you will meet ports which will make you cry. |
Oh, I'd misunderstood you before and thought that was applied automatically. That explains why I've found mostly stuff that didn't work with r27 either. |
Bug: android/ndk#2081 Test: vcpkg install --triplet arm64-android gettext-libintl Change-Id: Id2eff76b929543f18d64accd5b6afb64bc63e117
I see (and have fixed) the |
No, this is more broken than you'd think. The I'm honestly not sure if I'm doing anyone any favors by "fixing" the NDK to make this port compatible again. I'm not convinced that the previous port which built would actually work. |
nothing says "Simple Authentication and Security Layer" like "we copied and merged multiple libcs" :-) |
I think this interpretation needs some adjustment, even if I didn't check gsasl in particular. Gnulib is not a standalone lib, but vendored into the configuration and build of software packages. It is organized into "modules", some meant to add, rectify or enhance implementations of ISO C or POSIX functions. I don't consider myself an expert on autotools or gnulib. I honestly also don't know what the best way to deal with this is, and I'm not asking for a particular solution. Gnulib providing an implementation under the same name may be hard to align with the Android ecosystem's variation over API level and actual runtime. The vcpkg problem is that the modules are littered into all those packages based on autotools. They all need to be patched individually until Gnulib improvements are integrated by each single package. |
Ah, that comment in the headers is pretty misleading if so. Odd. https://www.gnu.org/software/gnulib/manual/gnulib.html#Support-for-ISO-C-or-POSIX-functions_002e says that gnulib's polyfills are supposed to be prefixed to avoid exactly this kind of name conflict.
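To illustrate the prefixing idea from the gnulib manual, here is a minimal sketch of how a replacement can live under a prefixed name while callers keep writing the plain name. The `rpl_` prefix and the `join2` helper are assumptions for illustration; this is not copied from gnulib or from any of the affected ports.

```c
#include <stddef.h>
#include <string.h>

/* The replacement is defined under a prefixed name (assumed "rpl_" here),
 * so it cannot conflict with a mempcpy() declared by the libc headers. */
static void *rpl_mempcpy(void *dst, const void *src, size_t n) {
    return (char *)memcpy(dst, src, n) + n;
}

/* Redirect the plain name to the replacement. This has to come after every
 * system header has been included, otherwise the macro would rewrite the
 * libc prototype itself. */
#undef mempcpy
#define mempcpy rpl_mempcpy

/* Hypothetical caller: concatenates two strings into out and returns the
 * resulting length. The calls below expand to rpl_mempcpy(). */
static size_t join2(char *out, const char *a, const char *b) {
    char *p = out;
    p = mempcpy(p, a, strlen(a));
    p = mempcpy(p, b, strlen(b));
    *p = '\0';
    return (size_t)(p - out);
}
```

A replacement declared under the unprefixed name instead would clash as soon as the libc headers start declaring the same function, which is the kind of collision this issue is about.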
Yep, we do understand that. That's why we're partially reverting bionic's header changes for now. I am going to look through the existing ports while I do that though, because some of them may be problems with the ports. 7zip wasn't in your list, but it's the first one I looked at and it's just wrong, so soon 7zip should build for android: microsoft/vcpkg#41721. For the rest where it's not a port problem, I'll revert the bionic header change that's causing the problem. Like I said that's already been done for |
Ah, because I'd come to this bug prepared to find issues with polyfills, but that's not what this is. This looks to be an autoconf detection issue. It's trying to use |
I happened to be watching the log re-populate when rebuilding libuv after hiding the
It links though, so I guess those calls are probably culled by dead code elimination? |
Once this stack of commits is merged, the list you gave above all builds, aside from poppler, which is broken for other reasons. I think upstream forgot an It's going to be a bit before I can actually pull this update into the NDK. The OS repo is not in a state where I can pull a new sysroot, so I have to wait for that before I can update the NDK. We won't ship r28 without that, it's just going to be a while before we can get even a canary build to recheck vcpkg :( |
ah, funnily enough i saw your code change before i saw this, and my review comment there might be the answer --- i think they're actually checking for the corresponding constant #define instead. my guess is that if you undo the change you did make, and #if around the #define instead, you'll get the result you're looking for. (this was phrased more as a question on the code review because i wasn't sure whether i just didn't understand the cmake bumf...) |
I'd have to hide the
If I hide only the function decls, anyone that needs to call these via |
Bug: android/ndk#2081 Test: vcpkg install --triple=arm64-android gsasl Change-Id: If47f2676c312cd5e6a699b2fa0a2ce86a6f8724b
I'm considering another approach to this. What we're trying to accomplish by unguarding all these bionic APIs is to make it possible for stuff in libc to be runtime guarded with Exposing these decls when not using that mode (and not using it is the default behavior for the reasons explained on that page) is less important, and I'm not really sure how useful that would be. The main thing you get with this there is better error messages and marginally easier The third thing we'd like to accomplish by unhiding this stuff is adding more polyfills directly to bionic. It's silly for everyone to have to make their own polyfill for something that's trivially inlined into the libc headers. https://android-review.googlesource.com/c/platform/bionic/+/3319120 (very much a draft) I think accomplishes both goals: it exposes these APIs to the people that probably do want accurate answers, while maintaining source compatibility as the default behavior. It doesn't really solve the third thing, but it would allow us to expose the polyfills in that mode and not in the default mode so we can make some progress. |
(i'm pretty sure the last time we had to do something like that, it was for this same library!)
yeah, and i'm definitely sad about that, but at the same time, this seems like the "not a regression from an ndk perspective" way to remove versioner (which is a nice tech debt removal for us). probably best not to combine too many unrelated concerns here. "sgtm". |
Putting this behavior behind a macro so we can easily change it across bionic if needed. Bug: android/ndk#2081 Test: None Change-Id: Ia08b54a61b5b06f6fa4646f036ef1e788d7c3ece
This was generated mechanically by reverting my recent re-hide commits (except for mempcpy, but versioner cleaned that up for me anyway), reverting the commits which removed versioner, then:

```
m out/soong/ndk_headers.timestamp
cp -r out/soong/ndk/sysroot/usr/include/* bionic/libc/include
git -C bionic clean -df
```

Effectively, this has restored the `versioner` processed headers, but I'm checking the results of that in so we don't have to keep `versioner` around.

For the NDK, this restores r27 behavior by default. Anyone that's opted into weak APIs will get the new behavior. I think this is our best option. Anyone writing code with Android in mind should be using weak APIs, but any code being lightly ported (and thus using the default configuration) should not be, and it's those ports where we're having trouble with collisions.

Bug: android/ndk#2081 Test: None Change-Id: I370079d27566b0c1543fb5890c958c8d09b05006
This should be fixed as of https://ci.android.com/builds/branches/aosp-master-ndk/grid?head=12591196&tail=12591196&legacy=1. @dg0yt, mind verifying that? I'll try one of the packages above in a bit but it'd be nice to know if your vcpkg PR would be unblocked :) |
Confirmed that gsasl builds fine with that. It was all one automated fix so the rest should be good too. lmk if they aren't. |
There is no easily accessible URL to download the actual artifact into vcpkg CI. |
https://androidbuildinternal.googleapis.com/android/internal/build/v3/builds/12591196/linux/attempts/latest/artifacts/android-ndk-12591196-linux-x86_64.zip/url should work. It works on my non-corp logged in browser, so I think the "internal" in the URL is just a historical oddity. |
Okay, the CI finished much more successfully now. The "polyfills" topic is mostly resolved. Looking at the remaining failures, some patterns involve SDK headers. Initially I thought it was mostly about being strict with old C standards, but there are also issues with C++11 and C++17. For example:
This is just a short collection of notes.
Logs can be downloaded from the assets page, https://dev.azure.com/vcpkg/public/_build/results?buildId=109033&view=artifacts&type=publishedArtifacts (hover over a line, then use the menu on the right side). |
vcpkg (API level 21) with "r29-canary" from #2081 (comment):
Notes for all failing ports, grouped:
|
https://android-review.googlesource.com/c/platform/bionic/+/3343442 fixes the stray i think as for pthread_barrier_t --- that's not a regression, right? we've always had an unconditional definition of that type for anything that pulls in <pthread.h>... likewise the use of EOVERFLOW in an enum --- i don't think there's ever been a version of bionic where that wouldn't expand to |
Yeah, the libc++ thing is the same as I explained here: #2094 (comment). |
FTR all listed ports build successfully with r27 in vcpkg. (I don't claim they run correctly.) |
Bug: android/ndk#2081 Change-Id: I8d63fb003510837c28dad0da9b0c6703a3487f7b
In r27, the definition of pthread_barrier_t is behind |
The inline thing is fixed (the CL is up, it hasn't propagated through to the NDK yet). The libc++ thing is out of our hands. That's a 3p library that we don't have control over, and the change was correct to match the C++ spec afaict. The most we can do there would be to file an LLVM bug but I'm not at all optimistic that it would be "fixed", since it's not actually wrong.
That's an easy fix, but it's also kind of a weird one? If the problem in hidapi is also an easy patch, we always prefer to keep types visible even if the APIs are not, otherwise you can't reasonably use the APIs via
Ah, I think I need to fix this one separately from the others because it was a different problem than most of them: it was an explicit polyfill rather than just a decl. I'll do that in a moment.
In case it hasn't already been said, we do appreciate the work you do here :) This report in particular is some of the biggest help we've ever gotten in terms of pre-release feedback for making sure we don't ship a release that's hard for people to adopt. I'm pretty interested in making things easier for vcpkg, so keep the reports coming (and apologies in advance that for LLVM bugs we typically can't do anything until upstream fixes them, we're really just a distributor for that). |
The EOVERFLOW thing is local to a |
Well, basically it looks good to me now. |
https://r.android.com/3341580 rehides
Excellent, the |
This conflicts with an existing polyfill in libconfuse. Bug: android/ndk#2081 Test: None Change-Id: I4d0a1530ac0cd0e4e75f52106ca3f284e630c847
Description
This is in response to the r28 beta1 release notes.
A test build in vcpkg CI (which uses Android API level 21) shows a lot of breakage related to `__INTRODUCED_IN()` with polyfills from gnulib. There is no full picture yet because the build error cascade starts early. The first hit is `mempcpy` in GNU libiconv. I added a patch to rename the polyfill, but the next failure is already gettext-libintl.

In the non-gnulib group, openssl is affected, removing many ports from the build. For poppler, I assume that GNU libiconv used to be the polyfill, but I didn't verify that. "libiconv as a polyfill" might be a broader pattern.

I grepped the arm64 logs for `__INTRODUCED_IN`. Transformed and compressed:

Patching gettext-libintl would probably add many more.
(All those packages build with NDK r26. NDK r27 is still blocked by its bugs.)
Affected versions
r28
Canary version
beta1
Host OS
Linux
Host OS version
vcpkg CI image
Affected ABIs
arm64-v8a
Build system
Other (specify below)
Other build system
No response
minSdkVersion
21
Device API level
No response