Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add client JSON storage API endpoint #1405

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

cpeel
Copy link
Member

@cpeel cpeel commented Dec 21, 2024

We want the new proofreading interface to be able to persist client configuration details across browsers and devices. This requires that some configuration values are stored server-side. To enable this, allow clients to store opaque JSON blobs on the server, up to one blob per client per user.

Please read the updates to SETUP/API.md on important caveats about this endpoint and ensure we're OK with this from a server-side perspective as well as the client side.

This PR introduces a new generalized JsonStorage class and database table with an ApiClientStorage layer on top of that wired into the API. When JSON blobs are updated in the JsonStorage class their timestamp field in the database is also updated. This field isn't currently made available via the code right now but seems like it would be useful if, for instance, we want to clean up client blobs older than some period of time. I'm open to thoughts on this.

curl examples to test this:

export API_KEY=review

# Save a json blob
curl -i -H "Accept: application/json" -H "X-API-KEY: $API_KEY" \
    -X PUT -d '{"key": 1}' "https://www.pgdp.org/~cpeel/c.branch/client-storage/api/v1/storage/clients/newpi"

# Fetch a json blob
curl -i -H "Accept: application/json" -H "X-API-KEY: $API_KEY" \
    -X GET "https://www.pgdp.org/~cpeel/c.branch/client-storage/api/v1/storage/clients/newpi"

# Delete a json blob
curl -i -H "Accept: application/json" -H "X-API-KEY: $API_KEY" \
    -X DELETE "https://www.pgdp.org/~cpeel/c.branch/client-storage/api/v1/storage/clients/newpi"

This is likely to have minor merge conflicts with #1398 and I'll resolve those after that gets merged.

@cpeel cpeel self-assigned this Dec 21, 2024

Some important notes about this feature:
* Client storage is one blob per user per client. Said another way: clients are
only able to store one blob per `clientid` and that blob is only for the user
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouln't this be 'one blob per userid ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's both. It's one per clientid and by the user that is authenticated. I'll change it slightly to maybe make it clearer.

// just shy of 32MB. That seems plenty big.
$sql = sprintf(
"
INSERT INTO json_storage
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this amount to the same thing as 'REPLACE'? Sorry if this is a silly question but I have very little experience of databases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, yes. This is a MySQL-ism and not part of an SQL standard and we use it pretty frequently in the code.


$storage = new ApiClientStorage($clientid, $pguser);
if ($method == "GET") {
return json_decode($storage->get());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not the client decode and enode the JSON data so we are just dealing with strings here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API automatically converts returns to JSON values so if we don't decode this into an object we stringify it twice.

{
global $api_client_storage_keys;

if (!in_array($client, $api_client_storage_keys)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the client not already have been validated before we get here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would. I did it here too to ensure it's enforced at the data layer not just the user input layer if this is used elsewhere for some reason.

Copy link
Collaborator

@70ray 70ray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good overall structure, just a few comments about details.

Comment on lines +1325 to +1377
description: JSON blob that was persisted
content:
application/json:
schema:
type: object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to return the timestamp (something the client could use for caching) rather than the JSON blob that was sent by the client?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp is arguably just as useless as the JSON blob because it's always now(). I was uncertain what a good return value would be so I modeled what we do for PUTs for word lists -- and what the product at my job does for PUTs and PATCHes -- which is to return the object that was actually serialized.

Open to ideas / thoughts on what API clients should "expect" from a successful PUT / PATCH / etc beyond the HTTP status code.

api/v1_storage.inc Show resolved Hide resolved
pinc/ApiClientStorage.inc Outdated Show resolved Hide resolved
*/
public function get(string $setting): ?string
{
$value = null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move defining $value where need it before the loop on l.69.

api/v1_validators.inc Outdated Show resolved Hide resolved
Comment on lines +74 to +76
To enable this feature, add a string for the client to the
`_API_CLIENT_STORAGE_KEYS` configuration setting and have the client use that
string with the endpoint as the `clientid`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having a hard time understanding the purpose of the API client storage keys. If the end goal is to allow any client to store this configuration down the road (which means that a user could trivially get one of our keys), is this some temporary measure to prevent arbitrary storage from storing JSON in our DB?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have the gist of it, but it's not temporary.

Keep in mind that every user has de facto API access via PHP sessions -- that's how the Page Browser users the API and how the new proofreading UI will too. If we don't have some restriction, any logged in user could fill up our server with random JSON blobs. The clientid means that each user can have only at most len(API_CLIENT_STORAGE_KEYS) JSON blobs and those have a maximum size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "client" is the wrong word for this? I am using it in the "JS client that uses the APIs". Would "app" or some other word be clearer?

Copy link
Member Author

@cpeel cpeel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the questions and updates. I'll get the (excellent) code suggestions done later today.


$storage = new ApiClientStorage($clientid, $pguser);
if ($method == "GET") {
return json_decode($storage->get());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API automatically converts returns to JSON values so if we don't decode this into an object we stringify it twice.

// just shy of 32MB. That seems plenty big.
$sql = sprintf(
"
INSERT INTO json_storage
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, yes. This is a MySQL-ism and not part of an SQL standard and we use it pretty frequently in the code.

Comment on lines +74 to +76
To enable this feature, add a string for the client to the
`_API_CLIENT_STORAGE_KEYS` configuration setting and have the client use that
string with the endpoint as the `clientid`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have the gist of it, but it's not temporary.

Keep in mind that every user has de facto API access via PHP sessions -- that's how the Page Browser users the API and how the new proofreading UI will too. If we don't have some restriction, any logged in user could fill up our server with random JSON blobs. The clientid means that each user can have only at most len(API_CLIENT_STORAGE_KEYS) JSON blobs and those have a maximum size.

Comment on lines +1325 to +1377
description: JSON blob that was persisted
content:
application/json:
schema:
type: object
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp is arguably just as useless as the JSON blob because it's always now(). I was uncertain what a good return value would be so I modeled what we do for PUTs for word lists -- and what the product at my job does for PUTs and PATCHes -- which is to return the object that was actually serialized.

Open to ideas / thoughts on what API clients should "expect" from a successful PUT / PATCH / etc beyond the HTTP status code.

{
global $api_client_storage_keys;

if (!in_array($client, $api_client_storage_keys)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would. I did it here too to ensure it's enforced at the data layer not just the user input layer if this is used elsewhere for some reason.

pinc/ApiClientStorage.inc Outdated Show resolved Hide resolved
Add a generic JSON-based storage class and an extension of it that
will be used in the API to store client blobs. This is a similar
structure to the Settings class / user_settings table but enforces
a JSON value.
@cpeel
Copy link
Member Author

cpeel commented Dec 23, 2024

PR feedback so far incorporated and resolved the merge conflicts with the documents API PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants