Nuget packages namespace #1004

Open
RomanIakovlev opened this issue Sep 25, 2023 · 1 comment
Comments

@RomanIakovlev

The vast majority of nuget packages have "-" as their namespace, and rightfully so, because nuget doesn't support namespacing (short of ID prefix reservation: https://learn.microsoft.com/en-us/nuget/nuget-org/id-prefix-reservation). However, there are about 900 packages of the nuget type whose namespace is present and not equal to "-". I've put those packages into a gist here: https://gist.github.com/RomanIakovlev/d6e3e36175c184c802d17f088c829d1b.
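
For illustration, here is a minimal sketch of how such coordinates could be picked out of a list. It assumes the `type/provider/namespace/name/revision` layout seen in the coordinates quoted in this thread; the function name and the sample list are hypothetical and are not taken from the crawler or from the gist.

```python
def has_unexpected_namespace(coordinates: str) -> bool:
    """Return True for a nuget coordinate whose namespace is not "-".

    Assumes the layout type/provider/namespace/name/revision.
    """
    parts = coordinates.strip("/").split("/")
    if len(parts) < 4:
        raise ValueError(f"not a full coordinate: {coordinates!r}")
    pkg_type, namespace = parts[0], parts[2]
    return pkg_type == "nuget" and namespace != "-"


# Hypothetical sample; real data would come from the definitions store.
sample = [
    "nuget/nuget/-/alphafs/2.0.1",
    "nuget/nuget/a/alphafs/2.0.1",
    "maven/mavencentral/org.apache.commons/commons-lang3/3.12.0",
]

# Only nuget coordinates with a non-"-" namespace are flagged.
suspicious = [c for c in sample if has_unexpected_namespace(c)]
print(suspicious)
```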

I think this is a bug in the crawler. I could try to find and fix the problem, given some guidance on where to start.

@qtomlinson
Collaborator

qtomlinson commented Nov 15, 2023

Given any coordinates, the crawler attempts to fetch the component as specified and marks it as missing when the package is not found. NuGetFetch (in the crawler) uses only the name and version, so the crawler was able to fetch the package even when the provided namespace does not exist. There is a mechanism for the crawler to rewrite the coordinates to match what was actually fetched (casedSpec). This casedSpec determines how the harvested information is stored. The drawback of using casedSpec to fix this issue is that components nuget/nuget/a/alphafs/2.0.1 and nuget/nuget/b/alphafs/2.0.1 will trigger two separate harvests, which is not ideal.
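
A rough model of that drawback, not the crawler's actual code: the `Coordinates` type, `fetch_key`, and the `harvests` dictionary below are hypothetical stand-ins. The sketch assumes harvests are tracked per requested coordinate while the fetch itself looks only at name and version.

```python
from typing import NamedTuple


class Coordinates(NamedTuple):
    type: str
    provider: str
    namespace: str
    name: str
    revision: str

    @classmethod
    def parse(cls, text: str) -> "Coordinates":
        # Expects exactly five slash-separated segments.
        return cls(*text.split("/"))


def fetch_key(c: Coordinates) -> tuple:
    """What a name-and-version-only fetch would look up (assumption)."""
    return (c.name, c.revision)


requested = [
    Coordinates.parse("nuget/nuget/a/alphafs/2.0.1"),
    Coordinates.parse("nuget/nuget/b/alphafs/2.0.1"),
]

# Harvests keyed by the full requested coordinates, namespace included.
harvests = {}
for c in requested:
    harvests.setdefault(c, fetch_key(c))

print(len(harvests))                # two separate harvests
print(len(set(harvests.values())))  # one underlying package
```

Both requests resolve to the same package, yet each one counts as its own harvest because the namespace is part of the key.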

There is a CoordinatesMapper in the service that normalizes coordinates to what actually exists in the component registry. With this approach, all coordinates are corrected before they are sent to the crawler, so the crawler can simply harvest components. No change to the crawler is necessary.
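
As a sketch of the idea only: the `normalize` function below is hypothetical and does not reflect the real CoordinatesMapper API. It assumes that nuget has no namespaces, so the only valid namespace value is "-", and it does not consult the registry the way a real mapper might.

```python
def normalize(coordinates: str) -> str:
    """Rewrite a nuget coordinate so its namespace is "-" (assumption:
    nuget has no namespaces). Other types pass through unchanged."""
    parts = coordinates.split("/")
    if len(parts) != 5:
        raise ValueError(f"expected 5 segments, got: {coordinates!r}")
    if parts[0] == "nuget":
        parts[2] = "-"
    return "/".join(parts)


incoming = [
    "nuget/nuget/a/alphafs/2.0.1",
    "nuget/nuget/b/alphafs/2.0.1",
    "nuget/nuget/-/alphafs/2.0.1",
]

# Normalizing before queuing collapses the variants into one request.
queue = sorted({normalize(c) for c in incoming})
print(queue)
```

With normalization applied up front, the three variants above produce a single queued request, so the crawler harvests the package once.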
