The Signed Exchanges (SXG) spec introduces a new format for delivery of web content. AMP's use of SXG requires additional information to enable proper content negotiation on a URL.
This format should be sent in two cases:
- delivery from origin server to intermediary
- delivery from intermediary to user
Ideally, it would not be sent in direct delivery from origin server to user, as that would best be served by a traditional HTTP exchange (e.g. requiring less computational overhead, and able to modify state).
Therefore, the need arises for the origin to distinguish requests from users and requests from SXG intermediaries. That is, there is a difference between "I can understand the SXG format" and "I prefer an SXG if available". Accept: application/signed-exchange indicates the former. No currently-defined header indicates the latter.
AMP SXG are intended for privacy-preserving prefetch from a referring page (such as a Google Search results page) to a coordinating AMP cache (such as the Google AMP Cache). If the referrer wishes to prefetch subresources as well, they must also be served from a coordinating AMP cache, in order to preserve privacy. In order for those subresources to be useful, they must be referenced by the signed HTML page.
Therefore, the requestor of an SXG may require the origin to produce an SXG tailored to the AMP Cache that is requesting it, by rewriting its subresource URLs appropriately.
AMP caches may require the origin to apply AMP transforms, and may only accept specific versions of those transforms. This allows the AMP cache to:
- Make continuous improvements to the AMP transforms and the transformed AMP validation code.
- Try to satisfy AMP's design principles, especially as deficiencies in the transforms are found, by guaranteeing that its cache of SXGs doesn't contain those deficiencies.
- Keep its validation code of bounded complexity, by not needing to validate all possible versions of the transforms.
- Guarantee that all responses it fetches from publishers are useful, and don't require a second fetch for unsigned content.
The presence of the AMP-Cache-Transform header indicates that the requestor would prefer an application/signed-exchange variant of the resource at the given URL, but would accept a non-SXG variant. If a requestor sends this, it should also explicitly include the relevant application/signed-exchange;v=something in its Accept header, so that the responder knows which versions of the SXG standard are supported by the requestor.
The value of the header indicates target-specific constraints on the transformed AMP within the SXG. If a server is unable to meet those constraints, it should respond with non-SXG (unsigned) AMP, as the AMP Cache will need to apply those transforms itself, and thus be unable to use the provided signature.
If the server responds with an SXG, it should include an AMP-Cache-Transform outer response header, specifying which of the alternatives it chose to satisfy. This allows intermediary caching proxies to cache responses with minimal understanding of the underlying format.
The header value is a parameterised list from header-structure-07.
The list represents an ordered set of constraints. The server should respond with an SXG variant matching the first parameterised identifier that it can satisfy. If it cannot satisfy any of them, then it should respond with non-SXG content.
For each identifier:
- If the identifier contains a v parameter, then its value represents a set of AMP transform versions. The server should respond with an SXG only if it can produce one of the versions in that set (see Version negotiation).
- If the identifier contains any parameters besides those mentioned above, then this identifier cannot be satisfied. The server should attempt to match the next one. (This reserves the parameter space for future additional constraints to be defined.)
- If the identifier is any, then the SXG is not intended for a particular prefetching intermediary, and therefore its subresource URLs needn't be (but may be) rewritten.
- Otherwise, if the identifier is an id from the list in caches.json, then the SXG should have its subresource URLs rewritten. That id's corresponding cacheDomain indicates the fully-qualified domain name that forms the basis for the URL rewrites.
- Otherwise, the identifier is invalid and cannot be satisfied. The server should attempt to match the next one.
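The matching rules above can be sketched in Python. This is a minimal illustration, not a reference implementation. It assumes the header has already been parsed into (identifier, parameters) pairs. The names select_constraint, KNOWN_CACHES and can_produce_version are hypothetical, and KNOWN_CACHES stands in for a local copy of caches.json.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a local copy of caches.json: cache id -> cacheDomain.
KNOWN_CACHES: Dict[str, str] = {"google": "cdn.ampproject.org"}

# One parsed parameterised identifier: (identifier, {parameter name: value}).
Item = Tuple[str, Dict[str, str]]


def select_constraint(
    items: List[Item],
    can_produce_version: Callable[[str], bool],
) -> Optional[Item]:
    """Return the first parameterised identifier the server can satisfy.

    Returns None when nothing matches, in which case the server should
    respond with non-SXG content.
    """
    for identifier, params in items:
        # Any parameter other than "v" is reserved for future constraints,
        # so this identifier cannot be satisfied.
        if any(name != "v" for name in params):
            continue
        # A "v" parameter restricts the acceptable AMP transform versions.
        if "v" in params and not can_produce_version(params["v"]):
            continue
        # "any" needs no rewriting; a known cache id needs URL rewriting.
        if identifier == "any" or identifier in KNOWN_CACHES:
            return identifier, params
        # Otherwise the identifier is invalid; try the next one.
    return None
```

The version check is passed in as a callback so that the ordering logic stays separate from how a server decides which transform versions it can produce.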
The server should ensure its copy of caches.json is no more than 60 days out-of-date with the canonical copy linked above.
This section uses the ABNF rules of RFC5234, augmented with the list extension defined in RFC7230 section 7, the OWS rule from RFC7230 section 3.2.3, and the "sh-" rules in header-structure-07.
The v parameter value must be a string. Its value (after parsing as a string) must conform to the following ABNF:
v_spec = #v_range
v_range = sh-integer / sh-integer OWS ".." OWS sh-integer
If the contents of the parameter string fail to parse as v_spec, then the server should fail to satisfy this parameterised identifier, but should not immediately fail the entire parameterised list.
There are additional semantic constraints on the spec:
- Each sh-integer must be non-negative.
- Two integers in a pair X..Y must satisfy X <= Y.
- No two ranges in a spec should intersect. (A..B and C..D intersect if B >= C && A <= D.)
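As a rough sketch of the grammar and semantic constraints above, the following Python function parses a v parameter string into closed ranges. It rests on several assumptions: sh-integer is treated as an optionally signed run of 1 to 15 digits, empty list elements are tolerated as the RFC7230 list extension allows recipients to do, and a spec that violates a semantic constraint is simply rejected. The name parse_v_spec is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# v_range = sh-integer / sh-integer OWS ".." OWS sh-integer
# Assumption: sh-integer is an optional "-" followed by 1 to 15 digits.
_V_RANGE = re.compile(r"(-?[0-9]{1,15})(?:[ \t]*\.\.[ \t]*(-?[0-9]{1,15}))?")


def parse_v_spec(value: str) -> Optional[List[Tuple[int, int]]]:
    """Parse a v parameter string into sorted closed (low, high) ranges.

    Returns None if the string does not parse as v_spec or violates one of
    the semantic constraints.
    """
    ranges: List[Tuple[int, int]] = []
    for element in value.split(","):
        element = element.strip(" \t")  # OWS is space or horizontal tab
        if not element:
            continue  # tolerate empty list elements
        match = _V_RANGE.fullmatch(element)
        if match is None:
            return None  # not a v_range
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) is not None else low
        if low < 0 or low > high:
            return None  # negative integer, or X..Y with X > Y
        ranges.append((low, high))
    # After sorting by lower bound, any intersection shows up between neighbours.
    ranges.sort()
    for (_, prev_high), (next_low, _) in zip(ranges, ranges[1:]):
        if prev_high >= next_low:
            return None  # two ranges intersect
    return ranges
```

For example, parse_v_spec("1..3,5") yields [(1, 3), (5, 5)], while "3..1" and "1..3,2" are both rejected.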
If the parameter fails to meet these criteria, then the server may choose not to satisfy the parameterised identifier. Otherwise, the server can satisfy the identifier if any of the following is true for any v_range in the list:
- The v_range is a single integer, and the server can produce exactly that version.
- The v_range is a pair of integers x..y, and the server can produce a version in the closed interval [x, y].
If the server can respond with multiple versions in the set, it should respond with the highest version it can produce, but may respond with older versions in that set if other reasons dictate it (e.g. cache efficiency).
If the server does not know what version of the transforms it provides, then it cannot satisfy any v_spec.
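The version selection described in this section could look like the following Python sketch. It assumes the v_spec has already been parsed into closed (low, high) integer ranges and that the server knows which transform versions it can produce. The name choose_version is hypothetical. This sketch always picks the highest matching version, although a server may prefer an older one for reasons such as cache efficiency.

```python
from typing import Iterable, List, Optional, Tuple


def choose_version(
    ranges: List[Tuple[int, int]],
    supported: Iterable[int],
) -> Optional[int]:
    """Return the highest supported version that falls in any closed range.

    Returns None when no supported version matches, including the case where
    the server does not know its transform version (empty `supported`).
    """
    candidates = [
        version
        for version in supported
        if any(low <= version <= high for low, high in ranges)
    ]
    return max(candidates) if candidates else None
```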
If the server responds with an SXG, it should include an AMP-Cache-Transform outer response header, with a value equal to the most specific constraint that it can satisfy -- that is, a list of size 1. For now, that means:
- If it rewrote subresource URLs for a particular cache, the identifier should be the id of the cache.
- Otherwise, the identifier should be any.
- It should have a v parameter whose value is the AMP transform version it responded with.
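A small Python sketch of building that response header value follows. The function name response_header_value is hypothetical. The version is serialised as a quoted string on the assumption that the response's v parameter follows the same "must be a string" rule as the request's.

```python
from typing import Optional


def response_header_value(rewritten_for: Optional[str], version: int) -> str:
    """Build a single-item AMP-Cache-Transform response header value.

    `rewritten_for` is the id of the cache that subresource URLs were
    rewritten for, or None if they were left unchanged.
    """
    identifier = rewritten_for if rewritten_for is not None else "any"
    # Assumption: the v parameter is serialised as a quoted string.
    return f'{identifier};v="{version}"'
```

Under these assumptions, a response rewritten for the Google AMP Cache using transform version 3 would carry the value google;v="3".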
For a given URL, if the server content-negotiates on AMP-Cache-Transform, it must include Vary: AMP-Cache-Transform in all responses, whether signed or unsigned.
Note that this also likely means it's negotiating on Accept, so it should include Vary: Accept in these cases, too. The high-entropy nature of Accept causes cache fragmentation in default setups; publishers may wish to configure caches under their control to convert incoming Accept headers into lower-entropy forms, e.g. by performing the content negotiation (using hard-coded knowledge about what variants are available at a given URL) and including only the negotiated media-type, without q-values. The publisher may also specify Variants to aid caching proxies that understand that header.
The exact set of rewrites is not yet fully specified; a few examples are available, and a reference implementation will soon be available. In the interim, the Google AMP Cache will not require any rewrites (and, as a result, will not prefetch any subresources).
The list of rewrites is limited to base URLs within caches.json in order to provide the publisher some assurance that the rewritten subresources are faithful representations of the original subresources.
For the sake of security, all script source URLs will need to be on cdn.ampproject.org, regardless of the target AMP cache. This provides the publisher additional assurance that the JS is not an arbitrary payload. It would be nice to get rid of this dependency; something like signature-based SRI might be feasible.
If the URL serves multiple variants, and is thus subject to HTTP proactive negotiation, then AMP-Cache-Transform should only take effect after proactive negotiation has selected a resource of content type application/signed-exchange. In theory, there may be an interaction with content negotiation. For instance, assume the request is:
Accept-Language: en, de
AMP-Cache-Transform: google
and the server can only deliver a resource of de+google or en+cloudflare.
In this case, content negotiation may select en, and then AMP-Cache-Transform negotiation would see that the constraint cannot be satisfied. In practice, it is expected that this will not happen. Servers should avoid such pessimizing interactions with HTTP content negotiation, by being able to serve SXGs on all variants of an AMP URL.
An intermediary proxy may choose to cache these SXG responses and serve them to future requestors. Strict adherence to Vary would mean that, e.g. a response to a request containing AMP-Cache-Transform: any would not match a request containing AMP-Cache-Transform: google, any, since the two requests are not semantically equivalent. However, this would lead to unnecessary duplication in the cache, as the response to the former request can obviously satisfy the latter request.
If the proxy can ensure that a cached response satisfies a new request, then it can serve that response. It can do that by comparing the AMP-Cache-Transform response header of the cached response to the AMP-Cache-Transform request header of the new request. The response matches the request if there exists a parameterised identifier spec in the request list for which all of the following is true:
- Any of:
  - spec's identifier is any.
  - spec's identifier is identical to the response's identifier.
- Any of:
  - spec does not include a v parameter.
  - spec's v parameter is a valid v_spec, the response has a v parameter (specifying a single version as per above), and the response's v is an element of the request's v.
- spec does not include any parameter other than those mentioned above.
The above is merely informational; a cache may choose any strategy that doesn't serve mismatched responses (i.e. obeys the "Server behavior" specification above).
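One possible proxy-side matching strategy is sketched below in Python. It assumes both headers have already been parsed into (identifier, parameters) pairs, with the response being a single pair. The names cached_response_matches and _parse_ranges are hypothetical, and the v_spec parsing is deliberately simplified: non-negative integers only, with no check for intersecting ranges.

```python
import re
from typing import Dict, List, Optional, Tuple

# One parsed parameterised identifier: (identifier, {parameter name: value}).
Item = Tuple[str, Dict[str, str]]

_V_RANGE = re.compile(r"([0-9]{1,15})(?:[ \t]*\.\.[ \t]*([0-9]{1,15}))?")


def _parse_ranges(v_spec: str) -> Optional[List[Tuple[int, int]]]:
    """Simplified v_spec parser; returns None when the spec is invalid."""
    ranges: List[Tuple[int, int]] = []
    for element in v_spec.split(","):
        element = element.strip(" \t")
        if not element:
            continue
        match = _V_RANGE.fullmatch(element)
        if match is None:
            return None
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) is not None else low
        if low > high:
            return None
        ranges.append((low, high))
    return ranges


def cached_response_matches(request: List[Item], response: Item) -> bool:
    """Return True if a cached SXG response can serve the new request."""
    response_id, response_params = response
    for spec_id, spec_params in request:
        # Parameters other than "v" are not understood, so spec cannot match.
        if any(name != "v" for name in spec_params):
            continue
        # spec's identifier must be "any" or equal to the response's identifier.
        if spec_id != "any" and spec_id != response_id:
            continue
        if "v" not in spec_params:
            return True
        ranges = _parse_ranges(spec_params["v"])
        response_v = response_params.get("v", "")
        if ranges is None or re.fullmatch(r"[0-9]+", response_v) is None:
            continue
        if any(low <= int(response_v) <= high for low, high in ranges):
            return True
    return False
```

With this sketch, a cached response labelled any with version 1 satisfies a request for google, any, but not a request for google alone.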
As this defines a new content negotiation header field, we should ensure that it meets the criteria set for integration with HTTP Variants.
Alternatively, one could use q-values for specifying preference of application/signed-exchange over other variants, and media type parameters for specifying target and version requirements. These are idiomatic applications of existing syntaxes, but may come with some downsides. This is an area under investigation and discussion; feel free to get involved.
A requestor wishing to receive an SXG, without any constraints on its subresource URLs, would send:
AMP-Cache-Transform: any
The responder may send an SXG with subresource URLs rewritten for a particular cache or with the original subresource URLs, or a non-SXG response.
A requestor wishing to receive an SXG to be served from and prefetched from the Google AMP Cache (e.g. Googlebot) would send:
AMP-Cache-Transform: google
The responder must either send an SXG with subresource URLs rewritten for the Google AMP Cache, or a non-SXG response.
A requestor wishing to receive transformed AMP of a specific version may send a request like:
AMP-Cache-Transform: google;v="1..3,5"
The responder must either send an SXG with subresource URLs rewritten for the Google AMP Cache and whose AMP transform version is 1, 2, 3, or 5, or else a non-SXG response.