Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to Ignition Garden 🔥🌱 #107

Merged
merged 16 commits into from
Dec 18, 2021
Merged

Migrate to Ignition Garden 🔥🌱 #107

merged 16 commits into from
Dec 18, 2021

Conversation

chapulina
Copy link
Contributor

@chapulina chapulina commented Dec 6, 2021


We have a 🐥 and 🥚 situation, because there's a circular dependency between lrauv and lrauv-application.

  • 🐥 The mission tests here will fail until we publish a new Docker image with lrauv-application using Garden,
  • 🥚 but we can't build lrauv-application using Garden without this PR.

All version bumps haven't finished for Garden yet. Until they are, our main branch will be unstable. Track remaining bumps here:

Signed-off-by: Louise Poubel <[email protected]>
Signed-off-by: Louise Poubel <[email protected]>
@chapulina chapulina self-assigned this Dec 6, 2021
@chapulina chapulina mentioned this pull request Dec 7, 2021
11 tasks
Copy link
Collaborator

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me here. I tried testing the binaries and it builds but it's crashing. I don't see the new garden tag of the upstream repo https://hub.docker.com/r/mbari/lrauv-ignition-sim/tags

For the rollout we could trivially change the tag name and update the garden tag when we push not just latest so it's explicit what version we're building off of. And then the rollout doesn't need to be commenting things in and out. We can test this PR against garden instead of having to comment things out and then bring them back. Once we have garden stablized and in the main channel we can choose to switch over the latest tag to point to the garden instances and remove the direct reference to garden in the tag.

@chapulina
Copy link
Contributor Author

chapulina commented Dec 9, 2021

I don't see the new garden tag of the upstream repo

I can't push one until this PR is in. The upstream repo depends on the messages from this repository, so we need at least the messages here to be using Garden. That's the 🥚 above.

I suppose I could build an image using this branch instead of main 🤔

@chapulina
Copy link
Contributor Author

chapulina commented Dec 9, 2021

For the rollout we could trivially change the tag name and update the garden tag when we push not just latest

I like that, so here's the new proposal:

    • Push an image tagged garden using this branch and the equivalent one upstream
    • Use that here and re-enable tests to make sure they pass
    • Merge this PR
    • Merge upstream's PR
    • Push a new garden tag from the new main branches

Signed-off-by: Louise Poubel <[email protected]>
@chapulina
Copy link
Contributor Author

Ok, the first 2 steps have been executed. I'm investigating if the test failures are due to the Garden bump or something else.

@arjo129
Copy link
Member

arjo129 commented Dec 10, 2021

Maybe we should just disable the mission tests first. In any case I don't expect the main branch to pass tests like testPitchMass.xml as the behavior on the main branch is wrong.

Also with the fix in gazebosim/gz-sim#1211 I would expect all our mission trajectories to change. I have updated the tests in my various branches.

Furthermore, till the RDI logging issue is fixed I dont expect the tests to not be flaky. Digging into why the tests are not passing seems like extra effort.

@chapulina
Copy link
Contributor Author

I would expect all our mission trajectories to change

Yeah that's why I opened #112, so we're less tied to a specific trajectory and can adjust tolerances quicker.

with the fix in gazebosim/gz-sim#1211

That should also affect Fortress once we make a new 6.X release, right?

till the RDI logging issue is fixed I dont expect the tests to not be flaky

I merged Ben's upstream PR and that's already included in the MBARI Garden image (but not on the Fortress one yet). I'm getting different controller failures from you peeps, mine are always on the AHRS, so I'm having a hard time validating those fixes...

Maybe we should just disable the mission tests first.

My only concern is that there's some behaviour change other than the one caused by 1211 and we fail to track it down. But I'm not opposed to doing that if it's faster than tweaking the tests to the new behaviour.

@chapulina chapulina requested a review from mabelzhang December 15, 2021 17:36
@tfoote tfoote self-requested a review December 15, 2021 17:37
@chapulina
Copy link
Contributor Author

CI is 🟢

Signed-off-by: Louise Poubel <[email protected]>
Copy link
Collaborator

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good and I've validated that it runs for me locally. There were some caching issues with caching of older packages but clearing my docker cache and starting over took care of that.

@tfoote
Copy link
Collaborator

tfoote commented Dec 17, 2021

CI failed DepthVBS the last two times. I'm going to retrigger. Possibly fix in #113

@chapulina
Copy link
Contributor Author

It looks like the new tag has been pushed for fortress, but not for garden yet, so garden may still have the critical RDI timeouts

@tfoote
Copy link
Collaborator

tfoote commented Dec 17, 2021

This time it failed LrauvTestFixture.YoYoCircle (Failed) so it's looking like it's the flakey tests.

@chapulina
Copy link
Contributor Author

It's the RDI

2021-12-17T00:39:29.391Z,1639701569.391 [CBIT](ERROR): Data Fault in component: RDI_Pathfinder

Copy link
Collaborator

@mabelzhang mabelzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the 3 goodish missions (testDepthVBS, testPitchMass, testYoYoCircle) with MBARI BitBucket branch feature/2021-12-07-ignition-garden.
Approving to unblock, but I have questions.

I haven't run the missions in so long that, yoyo seems to look a lot better now, in that it continues in circles instead of going into a straight line after a while. Does that come from the time sync fix or the actuator fix?

testDepthVBS I cannot tell whether it's closing the loop visually. It seems to go back up quite a bit after going down - was that the case before Garden? I haven't checked the plots. If this behavior visually looks correct to you guys, then I'm not going to check the plots (or you guys can double-check the VBS plots, I haven't run the plotting scripts in a while):
2021-12-17-030759_2400x2000_scrot

Independent of this PR, is it just me, or does Ignition Garden seg fault on exit? It does that independent of the osrf/lrauv repo for me, e.g. ign gazebo alone seg faults on exit too. I compiled Ignition Garden after removing the build and install directories.

QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-developer'
Stack trace (most recent call last):
...
Segmentation fault (Address not mapped to object [0x7fdc71ed7ef0])

@chapulina
Copy link
Contributor Author

yoyo seems to look a lot better now, in that it continues in circles instead of going into a straight line after a while. Does that come from the time sync fix or the actuator fix?

I only see that when the controller crashes and the actuators are left in the last command (i.e. rudder is not zero).

is it just me, or does Ignition Garden seg fault on exit?

testDepthVBS I cannot tell whether it's closing the loop visually. It seems to go back up quite a bit after going down

That mission has been flaky. I think it's one of the ones @arjo129 mentioned will be improved with the hydrodynamics fixes. Related tickets:

@chapulina
Copy link
Contributor Author

Something's up with the new garden Docker image. CI used to take around 10 mins, but now it's still running past 36 mins. It's the 2nd time, I cancelled the last one after 38 mins. I'll dig more soon to see what's happening. FYI @braanan

@chapulina
Copy link
Contributor Author

Cancelled it after 58 mins

@chapulina
Copy link
Contributor Author

Ok, so the problem is that the LRAUV application is never quitting. I can reproduce it even with the latest fortress code. I'll open a PR adding a timeout just so tests never run forever, and check with Ben how to proceed.

@mabelzhang
Copy link
Collaborator

mabelzhang commented Dec 17, 2021

I only see that when the controller crashes and the actuators are left in the last command (i.e. rudder is not zero).

Hmm, you mean, it's not supposed to keep going in circles? Then maybe that's why things are stuck. Then it seems the mission is never finishing.

That mission has been flaky.

Okay. I just wanted to mention it because I remember seeing plots from Arjo saying VBS is finally closing the loop now, but visually it's hard to tell. Maybe that's not merged yet.

@chapulina
Copy link
Contributor Author

Ok, this is it folks, we're going to Garden! 🌱

@chapulina chapulina merged commit 701852c into main Dec 18, 2021
@chapulina chapulina deleted the chapulina/garden branch December 18, 2021 04:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bump to Garden 🌱
4 participants