Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression test for submodules with relative paths #557

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 104 additions & 0 deletions tests/test_project.py
Original file line number Diff line number Diff line change
Expand Up @@ -871,6 +871,110 @@ def test_update_submodules_strategy(repos_tmpdir):
assert net_tools_project.sha('HEAD', cwd=kconfiglib_dst_dir) \
== kconfiglib_new_sha

@pytest.mark.xfail
def test_update_submodules_relpath(tmpdir):
# Regression test for
# https://github.com/zephyrproject-rtos/west/issues/545.
#
# We need to make sure that submodules with relative paths are
# initialized correctly even when ther is no "origin" remote.
#
# Background: unless the config variable git.$SUBMODULENAME.url is
# set, "git submodule init" relies on the "default remote" for
# figuring out how to turn that relative path into something
# useful. The "default remote" ends up being "origin" in the
# situation where we are initializing a submodule from a
# superproject which is a west project.
#
# However, west neither makes nor expects any guarantees
# about any particular git remotes being present in a project, so
# that only works if we set the config variable before
# initializing the module.

# The following paths begin with 'project' if they are west projects,
# and 'submodule' if they are direct or transitive submodules of a
# west project. Numeric suffixes reflect the tree structure of the
# workspace:
#
# workspace
# ├── manifest
# ├── project-1
# │ └── submodule-1-1
# └── project-2
# ├── submodule-2-1
# └── submodule-2-2
# └── submodule-2-2-1
#
# Some of them are declared to west directly, some implicitly.
pseudo_remotes = tmpdir / 'remotes' # parent directory for 'remote' repos
remote_repositories = [
pseudo_remotes / path for path in
['manifest',
'project-1',
'submodule-1-1',
'project-2',
'submodule-2-1',
'submodule-2-2',
'submodule-2-2-1',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to think of different names for "top" submodules known to west versus sub-submodules not know to west. This is obviously a big difference. How about top-submodule-x-y and rec-submodule-x-y-z? Or project-1-submodule-1 for the top ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the names are clear enough as-is, especially given the generous comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

submodule-2-2 and submodule-2-2-1 look the same. They look like one is "only one level down" and that's it. But from a west perspective they are totally different. The last one is invisible to west.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, but I find this comment very nitpicky. This is just a test case, and a heavily commented test case at that. It isn't like this is user-facing documentation. Can we please just leave this be?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure this can be modified later if someone finds better names, no urgent need to do anything right now.

I thought you missed my point so I merely tried to rephrase it.

I think good names beat comments any day; one less level of indirection.

Nesting one hierarchy of git repos inside another hierarchy of git repos typically makes my head spin, so I was just trying to find a way to keep it more still. No big deal.

]
]
for remote_repo in remote_repositories:
create_repo(remote_repo)

add_commit(pseudo_remotes / 'manifest', 'manifest west.yml',
files={'west.yml': f'''
manifest:
remotes:
- name: not-origin
url-base: file://{pseudo_remotes}
defaults:
remote: not-origin
projects:
- name: project-1
submodules: true
- name: project-2
submodules:
- path: submodule-2-1
- name: sub-2-2
path: submodule-2-2
mbolivar-nordic marked this conversation as resolved.
Show resolved Hide resolved
'''})

def add_submodule(superproject, submodule_name,
submodule_url, submodule_path):
subprocess.check_call([GIT, '-C', os.fspath(superproject),
'submodule', 'add',
'--name', submodule_name,
submodule_url, submodule_path])
mbolivar-nordic marked this conversation as resolved.
Show resolved Hide resolved
add_commit(superproject, f'add submodule {submodule_name}')

add_submodule(pseudo_remotes / 'project-1', 'sub-1-1',
'../submodule-1-1', 'submodule-1-1')
add_submodule(pseudo_remotes / 'project-2', 'sub-2-1',
'../submodule-2-1', 'submodule-2-1')
add_submodule(pseudo_remotes / 'project-2', 'sub-2-2',
'../submodule-2-2', 'submodule-2-2')
add_submodule(pseudo_remotes / 'project-2' / 'submodule-2-2', 'sub-2-2-1',
'../submodule-2-2-1', 'submodule-2-2-1')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to add these from the bottom up, not top down otherwise the commits at the top will be lagging behind (git status should show that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow, sorry. How can I add a sub-submodule for a submodule that isn't added yet?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so let's say we want A->B->C from parent to child. You certainly don't need A to add C to B (B->C). Plus it's faster to link B->C first because when you add B to A later then C comes with it "for free" without having to go back and perform another commit.

If you do A->B first, then after doing B->C then you need to go back to A to update B to the newer and "bigger" B that includes C.

BTW this test code does not seem to have any commit yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You certainly don't need A to add C to B (B->C).

I must be missing something. I think I definitely do need A to add C to B.

I need somewhere to add C. That place is B. I cannot create B without creating A, since B lives inside A.

What am I missing? Can you phrase this suggestion in the form of a diff?

Copy link
Collaborator

@marc-hb marc-hb Dec 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the repo initialization routines and I'm not sure I have the best solution, but the initialization problem I noticed when reading the code seems confirmed when trying to run it:

tox -- -k submodules_rel --runxfail
cd /tmp/blah/pytest-6/test_update_submodules_relpath0/remotes/

tree project-2

project-2/
├── submodule-2-1
└── submodule-2-2
    └── submodule-2-2-1

git -C project-2/ status

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   submodule-2-2 (new commits)

git clone --recursive project-2 incomplete

tree incomplete/
incomplete/
├── submodule-2-1
└── submodule-2-2

Also, the remotes should all be at the same level. In a real-world example they would all be bare repos at the same level. Relative paths are complicated enough, better not add relative path divergences. It may be possible to simply move the remotes after initialization. As long as they're only used as remotes, only their .git/ will be used so it should hopefully not matter whether the checkout is in some inconsistent state.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot create B without creating A, since B lives inside A

I think you're confusing or conflating initial repo creation with adding a submodule.

$ tree
├── A
├── B
└── C

$ git -C B submodule add ../C
Cloning into 'submtest/test/B/C'...

$ git -C B commit -m 'add ../C to B'
[master 616d32033d4a] add ../C to B
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 C

$ git -C A submodule add ../B
Cloning into 'submtest/test/A/B'...
done.

$ tree A/
A/
└── B
    └── C


workspace = tmpdir / 'workspace'
remote_manifest = pseudo_remotes / 'manifest'
cmd(f'init -m "{remote_manifest}" "{workspace}"')

workspace.chdir()
cmd('update')
expected_dirs = [
workspace / expected for expected in
['project-1',
'project-1' / 'submodule-1-1',
'project-2',
'project-2' / 'submodule-2-1',
'project-2' / 'submodule-2-2',
'project-2' / 'submodule-2-2' / 'submodule-2-2-1',
]
]
for expected_dir in expected_dirs:
expected_dir.check(dir=1)

def test_update_recovery(tmpdir):
# Make sure that the final 'west update' can recover from the
# following turn of events:
Expand Down