Make intersphinx (a.k.a. external references) more user friendly #12152
Replies: 5 comments 16 replies
-
OK, that's my ten cents, happy to hear feedback. cc @picnixz @danieleades @jayaddison, as people I know actively working on sphinx.
-
I'd be happy to have a CLI indeed. Especially when I want to know which ID I need to use when referring to sections for instance or whether something is generated as a class, data or anything. For instance, I'd be happy to write something like
and get something like
I would get, for instance I would also expect the tool to be able to output something that is as short as possible in terms of characters, if people are concerned about docstrings for instance. I also agree that we could have filters like --
Agreed.
I prefer JSON over yml because it's in the library and easy to handle with jq externally (also YAML is not supported natively and needs an additional dependency).
No, because extensions can add additional information if needed or process them differently. I personally don't want to change that logic.
I'm not sure whether it's the same for images. But referencing labelled equations would indeed be good.
-
Regarding the In regards to the proposed steps above:
-
My phrasing might not have been too clear, it was the old syntax which was icky. I think it was something like
I agree that it probably can't be the only tool.
The idea was that one should write the same as if the target is in the same document. If there is no role to cross-reference a confval, then I think that is the missing piece for this case.
Then I suggest those roles to be added. In general, if an entity can be declared with some directive, then there should be a role to cross-reference it.
Due to domains being the only ones really knowing how to do lookups, I don't see a reasonable way in general to avoid that users must have those extensions as well to be able to cross-reference external projects.
If there is better tooling to inspect inventories, including role information, wouldn't that help?
Part of the plan for generalizing intersphinx was to let domains write blobs with their objects, including versioning information. I think modernizing the inventory to include necessary information is the best first step for solving this.
While it may address the problem, it seems to me that it complicates intersphinx, and makes it harder to generalize. Once there is a release with this, it becomes very hard to change without breaking projects.
-
In addition to the CLI, I'd also love to see the ability to add a page to projects (default could be on or off) that shows all the references available, something similar to intersphinx-untangled that @webknjaz created. hrm, I suppose that could be converted to an extension first 🤔 to prove things out, but I would love for it to be more of a default to promote cross-linking out in the wild. I also created https://github.com/orgs/sphinx-doc/discussions/12151, coincidentally in tandem, and it's mostly covered here, so I closed that and moved it here. In summary, it would be nice to be able to use intersphinx to cross-reference
(edit: as pointed out below, this code originally likely comes from #5562) Semi-related, I think the
-
Note I've already made a change in this direction (#12133) and plan some more, but I thought it would be helpful to open a "meta" discussion to outline my thinking.
There is also a disclaimer: I have already implemented some of the things discussed below in MyST 😅 (https://myst-parser.readthedocs.io/en/latest/syntax/cross-referencing.html#cross-project-intersphinx-links)
Edit: see also https://github.com/orgs/sphinx-doc/discussions/12204, for a more specific discussion on changing the current objects.inv format.
I think intersphinx is a great feature of Sphinx, and a key selling point:
That said, I think it could be a bit easier for the user to understand / use...
Desirable (simple) user workflow
At the very simplest, each objects.inv (generated in an HTML build) provides a unique mapping from domain name -> object type -> target name -> relative URL (plus also an optional display name, a.k.a. implicit reference text).
Combine this with the intersphinx_mapping configuration, and you get a unique mapping from inv key -> domain name -> object type -> target name -> absolute URL
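To make the two mappings concrete, here is a minimal Python sketch. The inventory contents, the `proj` key and the `absolute_url` helper are all made up for illustration; this is not how intersphinx stores things internally.

```python
from urllib.parse import urljoin

# Hypothetical inventory contents (not taken from a real project):
# domain name -> object type -> target name -> (relative URL, display name or None)
demo_inventory = {
    "py": {
        "function": {"compile": ("api.html#compile", None)},
    },
    "std": {
        "label": {"intro": ("index.html#intro", "Introduction")},
    },
}

# inv key -> (base URL, inventory), loosely mirroring what intersphinx_mapping provides
inventories = {"proj": ("https://example.org/docs/", demo_inventory)}


def absolute_url(inv_key, domain, obj_type, target):
    """Walk inv key -> domain -> object type -> target and build the absolute URL."""
    base_url, inventory = inventories[inv_key]
    relative_url, display_name = inventory[domain][obj_type][target]
    # Fall back to the target name when no display name is stored
    return urljoin(base_url, relative_url), display_name or target


print(absolute_url("proj", "py", "function", "compile"))
print(absolute_url("proj", "std", "label", "intro"))
```

The point is only that a full "key set" identifies exactly one URL; everything else in this post is about how a user discovers and writes that key set.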
As a user, I feel the "minimal" workflow should be:
1. Have a sphinx CLI I can call, that reads the intersphinx_mapping then outputs a "human readable" file with all available mappings.
2. Pick something I want to reference.
3. Have a sphinx "syntax" that I can use, to specify a (inv key, domain name, object type, target name) "key set" that I want to reference, together with an optional "explicit display name".
4. Have sphinx check my "key set" is valid and, if so, render the absolute URL, with the explicit or implicit display name (and warn if neither is available).
For brevity / flexibility, one might also want the syntax to allow for one or more of the (inv key, domain name, object type) to be omitted, and have sphinx check that there is exactly one match for the remaining keys.
Current sphinx implementation (and its problems)
For (1), this is semi-implemented with: python -m sphinx.ext.intersphinx <inv file/url>.
This is a good start, but:
conf.py and output all mappings (possibly with some "filtering" options).
YAML might be nicer.
For (3), you now have the external role:
:external+inv_name:domain:type:`compile`
:external:type:`compile`
:external:type:`explicit display name <compile>`
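For anyone not familiar with the syntax, the role name itself carries the optional inventory key and domain. A rough sketch of how such a name could be split is below; `split_external_role` is a hypothetical helper written for this post, not the function Sphinx uses.

```python
def split_external_role(role_name):
    """Split 'external[+inv_key]:[domain:]name' into (inv_key, domain, name).

    Hypothetical helper for illustration only; omitted parts come back as None.
    """
    prefix, _, rest = role_name.partition(":")
    head, plus, inv_key = prefix.partition("+")
    if head != "external" or not rest or (plus and not inv_key):
        raise ValueError(f"not an external role name: {role_name!r}")
    parts = rest.split(":")
    if len(parts) == 1:
        domain, name = None, parts[0]
    elif len(parts) == 2:
        domain, name = parts
    else:
        raise ValueError(f"too many components in: {role_name!r}")
    return (inv_key or None), domain, name


print(split_external_role("external+inv_name:domain:type"))
print(split_external_role("external:type"))
```

Note that the last component is called `name` here rather than `type` on purpose, for the reason explained next.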
BUT this is where it gets a bit problematic: what the documentation does not tell you is that the external role expects a role name INSTEAD of an object type; these are two "inter-linked" but distinct concepts.
The first problem with requiring role names (fixed in #12133) is that they are not always the same as the object type name. For example, in the sphinx objects.inv, you have:
Previous to #12133, :external:py:function:`compile` would fail because function is the object type, but func is the role name; there is currently no obvious way for a user to know this.
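To illustrate the distinction, here is a small Python sketch of a lookup that accepts either spelling. The table is loosely modelled on the object type tables that Sphinx domains declare, but the entries are illustrative and `candidate_object_types` is a hypothetical helper, not the change made in #12133.

```python
# Illustrative excerpt: (domain, object type) -> role names that can reference it.
# Treat the exact entries as an assumption, not a copy of Sphinx's tables.
ROLES_FOR_OBJECT_TYPE = {
    ("py", "function"): ("func", "obj"),
    ("py", "class"): ("class", "exc", "obj"),
    ("std", "label"): ("ref", "keyword"),
}


def candidate_object_types(domain, name):
    """Return the object types that `name` could mean, as a role name or an object type."""
    matches = []
    for (dom, obj_type), roles in ROLES_FOR_OBJECT_TYPE.items():
        if dom == domain and (name == obj_type or name in roles):
            matches.append(obj_type)
    return matches


print(candidate_object_types("py", "func"))      # role name
print(candidate_object_types("py", "function"))  # object type
print(candidate_object_types("std", "confval"))  # unknown locally
```

An empty result for `confval` is exactly the situation described in the second problem below: the inventory knows the object type, but the local project has no entry for it.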
The second problem (not yet fixed) is that an objects.inv may contain (domain name, object type) pairs that do not have a corresponding role name available in your local sphinx project.
For example, in the sphinx objects.inv, you have:
But, if you try to use :external:std:confval:`add_function_parentheses` in your local project, it will likely fail because confval is a bespoke object type created in sphinx/doc/conf.py::setup.
Again, it would not be obvious to the user why this is failing, and how they could fix it.
Fixing it would anyway basically entail ensuring you have all the same extensions / bespoke setup as the project you are referencing, which is definitely not ideal.
Another issue, related to (4), is that intersphinx reference resolution does not emit any warnings if an external reference has multiple matches when the inv_key is not specified; it silently chooses one (see use of main_inventory in intersphinx.py).
I don't see the justification for this.
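As a sketch of what I would expect instead: collect every match for the keys that were given, and report when there is more than one. The inventories and the `find_matches` / `resolve` helpers below are hypothetical, and a real implementation would go through Sphinx's logging rather than returning a message.

```python
# Hypothetical loaded inventories: inv key -> "domain:type" -> target name -> absolute URL
INVENTORIES = {
    "projA": {"py:function": {"compile": "https://a.example/api.html#compile"}},
    "projB": {
        "py:function": {"compile": "https://b.example/api.html#compile"},
        "std:label": {"intro": "https://b.example/intro.html"},
    },
}


def find_matches(target, inv_key=None, domain=None, obj_type=None):
    """Return every (inv key, domain, object type, URL) compatible with the given keys."""
    matches = []
    for key, inventory in INVENTORIES.items():
        if inv_key is not None and key != inv_key:
            continue
        for qualified_type, targets in inventory.items():
            dom, _, typ = qualified_type.partition(":")
            if domain is not None and dom != domain:
                continue
            if obj_type is not None and typ != obj_type:
                continue
            if target in targets:
                matches.append((key, dom, typ, targets[target]))
    return matches


def resolve(target, inv_key=None, domain=None, obj_type=None):
    """Return (url, message); message is None only when exactly one match exists."""
    matches = find_matches(target, inv_key, domain, obj_type)
    if not matches:
        return None, f"external target not found: {target!r}"
    url = matches[0][3]
    if len(matches) > 1:
        found = ", ".join(f"{k}:{d}:{t}" for k, d, t, _ in matches)
        return url, f"multiple matches for {target!r}: {found}"
    return url, None


print(resolve("compile"))                   # ambiguous: first match plus a message
print(resolve("compile", inv_key="projB"))  # unique once the inv key is given
```

Here the first match is still returned on ambiguity, so existing projects keep building; treating it as a hard failure would be an equally reasonable choice.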
So why does external use role names and not object types?
Well, there are some advantages here: internally, external actually locates/calls the role to generate the "reference" nodes.
This allows for "special features" of that role to be utilised.
For example, with py:meth you can prepend the target with "." and/or "~" to implement partial target matching (see bottom of https://www.sphinx-doc.org/en/master/usage/domains/python.html).
This allows for "special formatting" of that role to be utilised.
For example, :external:std:ref:`a` creates:
whereas :external:std:keyword:`a` creates:
(note one uses inline and the other uses literal for the inner node)
So yes, I understand why it uses role names, but I think this can also lead to issues and user confusion.
Another consequence to note is that, as you can see above, external ends up creating pending_xref nodes, which then need to be resolved in a later (post-processing) stage.
Unless I'm missing something, there is really no need to "postpone" the resolution of these references, since we already have all the information we need to resolve them (i.e. loaded inventories).
Proposed solutions
An extended CLI could be implemented, that reads the intersphinx_mapping from conf.py then outputs a "human readable" file with all available mappings (possibly with some "filtering" options).
The output from the CLI could additionally list what role names are available for each domain/object type in the local project (if any).
The external role should fall back to matching against the object type if a role name is not available, with a "standard" reference format.
A warning can be emitted for such fallbacks, but with a type/subtype that can be suppressed if desired.
The external role should emit a warning if an external reference has multiple matches, e.g. when the inv_key is not specified.
A related outstanding issue is that currently the math domain does not output anything in get_objects (used to create objects.inv), and so it's not possible to reference any labelled math equations using external.
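As a rough illustration of the "human readable" output the extended CLI could produce, here is a self-contained Python sketch that builds a tiny inventory in memory, parses it, and dumps it as nested JSON. It assumes the version 2 objects.inv layout (four plain-text header lines followed by a zlib-compressed body of "name domain:type priority uri dispname" lines); the `build_inventory` and `parse_inventory` helpers and the demo entries are made up, and JSON is used only because it is in the standard library.

```python
import json
import re
import zlib

# Assumed version 2 header; project name and version are placeholders
HEADER = (
    b"# Sphinx inventory version 2\n"
    b"# Project: demo\n"
    b"# Version: 1.0\n"
    b"# The remainder of this file is compressed using zlib.\n"
)
ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def build_inventory(entries):
    """Serialise (name, 'domain:type', priority, uri, dispname) tuples."""
    body = "".join(f"{n} {t} {p} {u} {d}\n" for n, t, p, u, d in entries)
    return HEADER + zlib.compress(body.encode("utf-8"))


def parse_inventory(data, base_url=""):
    """Return domain -> object type -> target name -> {url, display}."""
    if not data.startswith(b"# Sphinx inventory version 2"):
        raise ValueError("unsupported inventory version")
    header_end = 0
    for _ in range(4):  # skip the four plain-text header lines
        header_end = data.index(b"\n", header_end) + 1
    text = zlib.decompress(data[header_end:]).decode("utf-8")
    result = {}
    for line in text.splitlines():
        match = ENTRY.match(line.rstrip())
        if match is None:
            continue
        name, qualified_type, _priority, uri, dispname = match.groups()
        domain, _, obj_type = qualified_type.partition(":")
        if uri.endswith("$"):  # "$" is shorthand for the target name
            uri = uri[:-1] + name
        if dispname == "-":    # "-" means "same as the target name"
            dispname = name
        targets = result.setdefault(domain, {}).setdefault(obj_type, {})
        targets[name] = {"url": base_url + uri, "display": dispname}
    return result


data = build_inventory([
    ("compile", "py:function", 1, "api.html#$", "-"),
    ("intro", "std:label", -1, "index.html#intro", "Introduction to demo"),
])
mapping = parse_inventory(data, "https://example.org/docs/")
print(json.dumps(mapping, indent=2))
```

The same nested structure could carry an extra "roles" entry per object type, which would cover the suggestion of listing the locally available role names.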
Additional thoughts 1 (user generated inventories)
Let's say you want to make lots of references to a website that was not written in sphinx (boo) and so does not have an objects.inv.
It might be nice to allow users to create their own objects.inv (without having to look into the depths of sphinx, on how to create the binary file format).
We could either offer a CLI tool to do this, e.g. converting a YAML file to an objects.inv, or you could allow intersphinx to directly read a .yaml file.
Additional thoughts 2 (internal references)
If you think about it, internal references are generally just a special case of external references.
It would be nice to give users a quick overview of all the internal references in their project, and how they can be referenced.
At present this is a bit convoluted:
objects.inv for your project
objects.inv to a human-readable format
It would be nice if there was a builder that just did all this in one.
The ideal output could also be slightly different to external:
InventoryFile.dump where the docname is converted to the URI)
Domain.get_objects method (perhaps Domain.get_objects_with_line) for domains to implement. A bit of a pain but achievable in theory.
This could also be useful for Language Service Providers to provide features like auto-completions and "jump to references/target".