Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unusual behaviour when Page Reference field contains trashed page #620

Open
Toutouwai opened this issue Jun 13, 2018 · 12 comments
Open

Unusual behaviour when Page Reference field contains trashed page #620

Toutouwai opened this issue Jun 13, 2018 · 12 comments

Comments

@Toutouwai
Copy link

Toutouwai commented Jun 13, 2018

Short description of the issue

Page Reference fields behave in an unusual way when they contain a page that has been trashed. Take the following scenario...

In my home template I have two Page Reference fields, "test_page_single" and "test_page_multiple". In each of these fields I add the same "To trash" page (and only that page).

2018-06-13_194806

I then trash the "To trash" page. Now when I edit the Home page these fields appear to be empty.

2018-06-13_194845

But in the fields' tables in the database the value is retained. And I can save the Home page with these fields in their apparently empty state but still the trashed page is retained. This makes it impossible to remove the trashed page as a value and save these fields in a genuinely empty state. It is also potentially misleading as there is no way to tell the difference between an empty field and one that actually contains a trashed page.

When I get the value of these fields via the API they behave as if they are empty.

2018-06-13_195444

However, I cannot match the Home page in a selector that you would expect to work if these fields were empty.

2018-06-13_200237

I find this counter-intuitive. And if you are wanting to find both pages where the field is empty and appears to be empty due to containing a trashed page, what selector can you use for this?

So it's difficult to identify pages that are in this situation both via the admin interface and via the API, which can lead to a lot of head-scratching and time spent trying to work out what's going on.

I'm not sure what a good solution would be. It would be desirable if Page Reference fields containing trashed pages responded the same way as when they contain a page that is permanently deleted, but on the other hand I can imagine scenarios where a user trashes a page in error and you want to later restore it and retain the references to it. Does anyone have any ideas on how we could have the best of both worlds?

Setup/Environment

  • ProcessWire version: 3.0.105
@szabeszg
Copy link

szabeszg commented Jun 13, 2018

Does anyone have any ideas on how we could have the best of both worlds?

I guess at the minimum, the system should ask if the user really wants to trash the page as it is being referenced at least once "somewhere". It should also point out where the reference(s) are and what to expect if they decide to trash it...

@Toutouwai
Copy link
Author

I'm now using the hook below, because for me the unusual behaviour of Page Reference fields containing trashed pages is more problematic than losing the references to trashed pages that are later restored.

// Remove references to trashed pages - based on hookPagesDelete() in FieldtypePage
$pages->addHookAfter('trash', function(HookEvent $event) {
    if(!$event->return) return; // if trash failed, then don't continue
    $page_id = $event->arguments[0]->id;
    /* @var Database $database */
    $database = $this->wire('database');
    foreach($this->wire('fields') as $field) {
        if(!$field->type instanceof FieldtypePage) continue;
        $table = $database->escapeTable($field->table);
        // Delete references to this page
        /* @var \PDOStatement $query */
        $query = $database->prepare("DELETE FROM `$table` WHERE data=:page_id");
        $query->bindValue(":page_id", $page_id, \PDO::PARAM_INT);
        $query->execute();
    }
});

@ryancramerdesign
Copy link
Member

Couldn't it be solved just by emptying the trash? I don't think deleting the page references is what we should do, until the page is actually deleted (trash is emptied). The current behavior seems preferable to me—Trash is there for this reason, to temporarily contain these things before permanently deleting them. Until the trash is emptied, no data is removed, so that it can easily be restored. Perhaps an option to block trashing of pages that contain references to them would be helpful for some?

@Toutouwai
Copy link
Author

Toutouwai commented Jun 15, 2018

Couldn't it be solved just by emptying the trash?

No, because only superusers can empty the trash.

The current behavior seems preferable to me—Trash is there for this reason, to temporarily contain these things before permanently deleting them. Until the trash is emptied, no data is removed, so that it can easily be restored.

I have no issue with the current behaviour in terms of keeping/recovering references to pages when they are restored from the trash - that part is great. What I have an issue with is the current side-effects on Page Reference fields when they contain trashed pages.

I already described these above, but I'll reiterate if it helps to clarify:

  1. Why should it be impossible to remove a trashed page from a Page Reference field via Page Edit? Why should it be impossible to save a Page Reference field that contains a trashed page in an empty state via Page Edit? I suggest that trashed pages be visible in the Page Reference field in some kind of greyed-out or otherwise marked state so they can be removed if desired.

  2. The API value of a Page Reference field that is empty and a Page Reference field that contains only trashed pages is identical. So how can we find pages with such Page Reference fields using a selector?

An example showing how issue 2 can be a problem...

You have two types of pages: room and person. A room is available when it has no person assigned to its "person" Page Reference field. You have some listing of available rooms that finds those rooms like so...

$available = $pages->find("template=room, person.count=0");

Now suppose RoomB has Charlie assigned, and then Charlie is trashed. Now RoomB has no valid person assigned (the value of person is NullPage or an empty PageArray depending on the field settings) but cannot not be listed as an available room because it doesn't match the selector. And I've racked my brain but I can't think of any selector that will find such pages reliably and efficiently (consider a "multiple" Page Reference field that might contain two or more trashed pages and no untrashed pages).

@ryancramerdesign
Copy link
Member

It's not impossible to remove such a page in the page editor. It's just that no change is detected unless you make some change to the field. Once you make some change, then PW will know to save it. PW only saves fields that have changes made to them. So if you really needed to, you could select some other value for the field, save, then save it back to blank. But if a page is in the trash, then it's in an unknown state where it'll either get deleted or it'll end up back in the site. So I think PW's behavior here is consistent with retaining that state, where we don't know for sure what's going to happen with that page.

I don't think PW should automatically delete anything from the DB until a page is actually deleted. Now we could show it as a trashed page in the selection, but only users that have permission to view pages in the trash would be able to see it, so I'm not sure that helps here. Plus, it wouldn't solve the find() issue you mentioned, which seems like the real issue.

As for the find() issue, I understand and agree. It's true there's no simple way to pull those pages in with a "count=0", because that selector is searching the database, and the value is present in the database. There are sometimes runtime factors like access control that mean what's in the DB and what's actually available to the user, can differ. This is one of them, and depending on the case, it might be an undesirable side effect, but it's not a bug, so I think what @szabeszg was getting at seems like a good direction?

One solution for your particular case would be the advanced template setting "prevent pages from being in trash", for your "person" pages in the example above. That would prevent the possibility of those "person" pages being in the trash state, allowing only deletion for them, solving that find() issue. I also think your hook solution is a good one, but just for your case where your particular need outweighs the actual purpose of trash state—so I don't think this is something we should do in the core. I also proposed another potential solution in my previous message—where it doesn't let you trash a page until you remove the references to it first. You've not said anything about my previous or szabeszg's suggestions as far as I can tell, so I'm unclear what you are requesting, but of course open to any ideas.

@Toutouwai
Copy link
Author

Toutouwai commented Jun 16, 2018

So if you really needed to, you could select some other value for the field, save, then save it back to blank.

The thing is, why would it occur to you to do this? There is no indication that the field is anything but empty, even to superusers. A big component of these issues is that there is nothing that communicates to the user, superuser or non-superuser, from either the API side or the admin side, that the Page Reference fields are not empty.

Once you know about the quirks then it's true that there are ways to work around them. But I've been developing with PW daily for years and up until now never realised that trashed pages had these effects on Page Reference fields. I imagine this is the same for many other PW devs, as I don't know how you would discover this information until something in a site that depends on identifying empty Page Reference fields breaks (as was the case for me) or you happen to be poking around in the DB and comparing DB values to admin/API values (not likely).

I can imagine several possible changes to address these issues, some of which probably would not be simple to implement:

1. User can see and remove trashed pages in the inputfield (so the core is changed in whatever ways are needed to support this).

2. User cannot see the trashed pages, but does see a notice informing them that the field contains one or more trashed pages. To be useful though I think they would have to told which trashed pages are in the field so probably no easier than 1 but less functional.

In my opinion 1 or 2 would be some improvement to the status quo, but they're not a solution to the selector issue because in order to make a decision to remove the trashed pages or not the user would have to know that the presence of the trashed pages in the field affects the selector (and therefore some other part of the site) and I don't think it's reasonable to expect anyone apart from the site developer to understand the technical workings of a site. Also, a user may have page-delete permission for a page but not page-edit permission for a page that references the trashable page.

So maybe the issues could be addressed at the DB level...

3. Some magic to identify the IDs of trashed pages in Page Reference tables in the DB, perhaps with a prefix or suffix. The idea being that such IDs would not be matched in a selector (without include=all). Probably not possible/practical.

4. A new table in the DB for storing metadata of trashed pages. This table would record that page 1234 was referenced on page 4321 in sort position 3, and if page 1234 is restored from the trash the reference is restored. This would not be a perfect solution regarding the sort position because new pages may be added to the Page Reference field over time, but has the benefit that the reference will survive edits to 4321 while 1234 is in the trash (seeing as the deliberate keeping of references to trashed pages is why this issue exists in the first place).

But 3 and 4 would not be simple to implement.

These issues raise a question for me about what the expected and typical use of the trash is. I think the ability to restore pages from the trash is great, but I find I very rarely need to do that. And I can honestly say I have never needed to restore a page from the trash on a production website. This is probably due to the fact that the non-superuser (who does virtually all edits on production websites and whose experience is my primary concern) has no idea the trash even exists. To them there is no distinction between "trashed" and "deleted" - when they no longer need a page they go to a tab labelled "Delete" and tick a box and then the page is gone. So in terms of the client's needs, the trash is only used in a disaster recovery scenario ("I've made a huge mistake and deleted a critical page - is there anything you can do to help?"), one step less severe than a DB restore.

Of course I'm not suggesting that the trash isn't needed, but just that when some compromise has to be made - as I think is the case here - it should be factored in that restoring a page from the trash is not a very common need and restoring a page referenced by another page is less common again. So my opinion is that losing references to a trashed page that is later restored is less bad than having invisible pages in a field (difficult to debug) and having the selector issues I described (can have some quite adverse consequences for site functionality), which persist indefinitely given that pages remain in the trash indefinitely.

So a couple more possibilities...

5. References to pages are removed when they are trashed (what the hook above does). There could be $config option to enable this.

6. A LazyCron removes references to pages that have been in the trash for some period of time.

Now 5 and 6 could be done by the developer without requiring any core change. I'd be happy with that (solution 5 is the one that I would use myself) and I could build this into my default site profile. But the issue then is one of awareness - how can we make PW developers aware that the issues I've raised exist? Maybe something in the documentation, or a notice in the Page Reference field config maybe with links to example hook code for solutions 5 and 6?

You've not said anything about my previous or szabeszg's suggestions as far as I can tell, so I'm unclear what you are requesting, but of course open to any ideas.

Sorry I didn't respond earlier. I appreciate the ideas, but they're not solutions from my perspective. Warning the user that other pages reference a page they are about to trash assumes they know enough about how their site is coded to make an informed decision. I don't think we can expect that of typical users. Preventing the user from trashing the page has a similar problem in that unless the user has a fairly deep understanding of how their site functions they will probably just be confused by a generic message from the core, plus experience some added frustration that they can't perform the action they want to. And as I mentioned earlier, they may not have edit access to the page doing that is doing the referencing so could not fix the situation anyway.

One solution for your particular case would be the advanced template setting "prevent pages from being in trash", for your "person" pages in the example above. That would prevent the possibility of those "person" pages being in the trash state, allowing only deletion for them, solving that find() issue.

That option is less desirable than the earlier hook (which could target pages by template if needed), because it results in not just the discarding the reference but discarding the trashed page altogether, so loses the disaster recovery aspect of the trash.

@matjazpotocnik
Copy link
Collaborator

That reminds me of the old forum thread about preventing page delete when the page is "in use" https://processwire.com/talk/topic/9727-prevent-page-delete/ ...

@Toutouwai
Copy link
Author

@matjazpotocnik, I posted a hook in that forum topic in case you're still needing a solution for that.

@matjazpotocnik
Copy link
Collaborator

@Toutouwai thanks. Do you think preventing trashing/deleting the page is one of the possible solutions to your (our) problem? I guess it depends on use case. Sometimes deleting the page that is trashed is a viable solution, but in some cases preventing the page trashing/deleting is a must. How would the solution you posted on the forum scale on many pages? Maybe those two options could somehow be supported in the core?

@Toutouwai
Copy link
Author

Do you think preventing trashing/deleting the page is one of the possible solutions to your (our) problem? I guess it depends on use case. Sometimes deleting the page that is trashed is a viable solution, but in some cases preventing the page trashing/deleting is a must.

For me it isn't a solution - I mention why in my earlier comment. But as you said it might depend on the the use case. My preference is not to delete trashed pages but rather to remove references to them, and just bite the bullet on that in the unlikely event I need to restore a trashed page.

How would the solution you posted on the forum scale on many pages?

I think it would be fine - via the admin you can only trash one page at a time so the hook only has a very minor impact.

@gmclelland
Copy link

Here I agree with Robin's # 1.

  1. User can see and remove trashed pages in the inputfield (so the core is changed in whatever ways are needed to support this).

Drupal has a similar entity reference field which is like Processwire's page reference field.

  • If an entity reference field references a page that is unpublished, then the entity reference field will continue to show that value in the input field.
  • If an entity reference field references a page that is deleted, then the entity reference field will not show the deleted page in the input field. Well it does show a blank value in the input field until the page is saved.
  • After the page is saved the referenced deleted item is removed from the input field.

Note: Drupal doesn't have a trash like Processwire. In Drupal a page is either published/unpublished or deleted.

I don't think PW should automatically delete anything from the DB until a page is actually deleted. Now we could show it as a trashed page in the selection

Here I agree with Ryan. I don't think Processwire should delete any references from page fields until after the referenced page is deleted. If you restore a page from trash, all the references should still point to it.

but only users that have permission to view pages in the trash would be able to see it

Only users with permission to "view pages in trash" can see these references? Maybe we can change that to anybody that can edit a page can see the references? Maybe the trashed pages could be styled differently in the Page reference fields? greyed out, striked through, or prefixed with a trash icon? - I don't know if that is possible? May be to database intensive to check each value statusTrash before rendering in the input field?

I find this counter-intuitive. And if you are wanting to find both pages where the field is empty and appears to be empty due to containing a trashed page, what selector can you use for this?

So it's difficult to identify pages that are in this situation both via the admin interface and via the API, which can lead to a lot of head-scratching and time spent trying to work out what's going on.

Maybe Ryan can add some kind of selector to include/exclude trashed pages from the $page->pagereferences and $pages->find() calls?

I guess at the minimum, the system should ask if the user really wants to trash the page as it is being referenced at least once "somewhere". It should also point out where the reference(s) are and what to expect if they decide to trash it...

Sounds like a good idea. Drupal does part of this as well. When I click the delete button while editing a page, it redirects to a confirmation page. Are you sure you want to delete this page? I guess we might be able to use the new references and links api for showing a message like this to the user before a page is deleted or trashed? https://processwire.com/blog/posts/processwire-3.0.107-core-updates/#page-gt-references Pages that get trashed via the Page List could be redirected to a confirmation page that also shows the references and links as well.

Thats just my 2 cents. I know others here are far more knowledgeable than I am, so I may be missing a lot.

@szabeszg
Copy link

Is it possible to utilize the new API method: $page->references();
https://processwire.com/blog/posts/processwire-3.0.107-core-updates/#page-gt-references
so that at least admin users get some sort of warning before "destroying" page references?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants