This repository has been archived by the owner on Dec 21, 2023. It is now read-only.

Merging actions in the timeline based on the subject and/or components. #90

Open
Nemesisprime opened this issue Feb 9, 2013 · 56 comments

Comments

@Nemesisprime

I haven't looked into this yet, so I'm prematurely throwing this out there, but in my application I'll begin to get riddled with multiple related actions like:

User 1 commented on Photo A
User 2 commented on Photo A
User 1 uploaded Photo A
User 1 uploaded Photo B

A much more elegant method would be to merge them based on components:

User 1 and User 2 commented on Photo A
User 1 uploaded Photo A and Photo B

Like I said, I haven't looked into how to implement something like this yet, but I'm curious whether anyone has suggestions on methods, ideas, or related features before I get started.

@brentc
Contributor

brentc commented Feb 9, 2013

For the 1.x codebase, I created a filter, CollectingFilter, that did this. Mine only collects direct complements for a specific indirect complement model and the verb 'added', but it could be modified to do what you're doing, probably.

I've collected (and genericised) most of how it works into a gist: https://gist.github.com/brentc/f0303bad7cb8b319856f

The only issue with this implementation is that filters are applied AFTER pagination occurs, which means your "items per page" will be reduced every time actions are collected. For example, if your per page limit is 10, and you collect 3 of those items into 1, you'll end up only displaying 7 actions on that page. In an extreme case, 9 actions could even be collected into 1, and only 1 action will be displayed.
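
A minimal sketch of the shrinking-page effect described above. It does not use CollectingFilter or any bundle class: actions are plain arrays and `collectActions()` is a hypothetical helper that merges actions sharing a verb and complement.

```php
<?php
// Hypothetical, simplified actions: plain arrays, not the bundle's Action objects.

/**
 * Collapse actions sharing the same verb and complement into one entry
 * that lists every subject. Order of first appearance is preserved.
 */
function collectActions(array $actions): array
{
    $collected = [];
    foreach ($actions as $action) {
        $key = $action['verb'] . '|' . $action['complement'];
        if (!isset($collected[$key])) {
            $collected[$key] = [
                'subjects'   => [],
                'verb'       => $action['verb'],
                'complement' => $action['complement'],
            ];
        }
        $collected[$key]['subjects'][] = $action['subject'];
    }

    return array_values($collected);
}

// Ten stored actions: four comments on Photo A, then six unrelated uploads.
$stored = [];
foreach (['User 1', 'User 2', 'User 3', 'User 4'] as $user) {
    $stored[] = ['subject' => $user, 'verb' => 'commented on', 'complement' => 'Photo A'];
}
for ($i = 1; $i <= 6; $i++) {
    $stored[] = ['subject' => 'User 1', 'verb' => 'uploaded', 'complement' => "Photo $i"];
}

$perPage = 10;
$page    = array_slice($stored, 0, $perPage); // pagination happens first
$visible = collectActions($page);             // the filter runs afterwards

echo count($page) . ' fetched, ' . count($visible) . " displayed\n";
```

Three comments are folded into the fourth, so a page that fetched 10 rows only has 7 entries left to display.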

@stephpy
Owner

stephpy commented Feb 9, 2013

Hi, there are IMO two solutions:

@brentc's solution

pro: This solution works when the 3 actions are created at different times.
con: As @brentc said, the number of results per page can be incorrect.

Hacking the "logic" of actions, so that you store an action like this:

Subject => One of users
Verb => uploaded
complement:number_actors => 2
complement:actor_1 => Your user 1
complement:actor_2 => Your user 2

This solution is a bit tricky but you would be able to group each

con: This solution does not work when the X actions are created at different times.

@stephpy
Owner

stephpy commented Feb 9, 2013

This behavior is very interesting and it would be cool to support it in the bundle :) Thanks for bringing it up.

@Nemesisprime
Author

Unfortunately, the second option won't suffice. My interactions are being done individually through a rest API, so there'd be no realistic way to implement something of that style across requests.

@brentc's solution is a great start, thanks. I'm still working my way around the inner workings of Timeline, but I'd like to come up with an easily extendable system in the end.

A way to avoid the problem of shrunken results is to use infinite scrolling instead of paged scrolling, which I would suggest for a Timeline anyway.

@thomask

thomask commented Feb 10, 2013

I'm also very interested in this behaviour!

@stephpy
Owner

stephpy commented Feb 11, 2013

Yes @Nemesisprime, "infinite scrolling" is a good solution, but it'll still be a problem if there are 20 actions to group and you paginate on 10.

But it's complex to make something better ...

Maybe we could add a system which detects that some actions were filtered and then re-fetches the missing actions? It should be optional ...

Or add ManyToMany component relations to the timeline table and store the other components (here, the users) who comment on the photo, via a post-deploy listener.

@Nemesisprime
Author

Instead of ALL of the related actions in a timeline merging, they should be merged based on their "chunked" position in timeline. Like:

Jim ate potatoes
Mike ate potatoes
Freddy went skiing
Cara ate potatoes
Jeff commented on Cara
Kasey ate potatoes
Wes commented on Cara

The result of that list split into two "chunks" would be similar to:
Jim, Mike, and Cara ate potatoes
Freddy went skiing
Jeff and Wes commented on Cara
Kasey ate potatoes

That way, time still plays a role in the order of results. That would also help if there are illogical breaks (such as 20 items to a group and 10 paginated), they'll look more natural.
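
A minimal sketch of the chunked merging described above, under stated assumptions: actions are hypothetical `[subject, verb, complement]` triples, a chunk is a fixed number of consecutive timeline entries, and two actions are related when verb and complement match. `mergeInChunks()` and `joinNames()` are illustrative names, not part of the bundle.

```php
<?php
/**
 * Merge related actions inside fixed-size chunks of the timeline.
 * Each action is a hypothetical [subject, verb, complement] triple.
 */
function mergeInChunks(array $actions, int $chunkSize): array
{
    $lines = [];
    foreach (array_chunk($actions, $chunkSize) as $chunk) {
        $groups = [];
        foreach ($chunk as [$subject, $verb, $complement]) {
            // Actions are related when verb and complement match.
            $groups["$verb $complement"][] = $subject;
        }
        foreach ($groups as $predicate => $subjects) {
            $lines[] = joinNames($subjects) . ' ' . $predicate;
        }
    }

    return $lines;
}

/** Format names as "A", "A and B" or "A, B, and C". */
function joinNames(array $names): string
{
    if (count($names) < 3) {
        return implode(' and ', $names);
    }
    $last = array_pop($names);

    return implode(', ', $names) . ', and ' . $last;
}

$timeline = [
    ['Jim', 'ate', 'potatoes'],
    ['Mike', 'ate', 'potatoes'],
    ['Freddy', 'went', 'skiing'],
    ['Cara', 'ate', 'potatoes'],
    ['Jeff', 'commented on', 'Cara'],
    ['Kasey', 'ate', 'potatoes'],
    ['Wes', 'commented on', 'Cara'],
];

$merged = mergeInChunks($timeline, 4);
echo implode("\n", $merged) . "\n";
```

With a chunk size of 4, the seven actions split into two chunks and produce the four lines listed above, so Kasey's potatoes stay separate from the earlier group.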

@stephpy
Owner

stephpy commented Feb 11, 2013

Yes, the time will be important.

For sure it's more complicated to use this solution:

Or add ManyToMany component relations to the timeline table and store the other components (here, the users) who comment on the photo, via a post-deploy listener.

But IMHO it's the best way. Because:

  • Heavy processing when deploying an action does not impact the user experience.
  • Pagination will not be faked

The process will be:

User1 comments Photo1

deployment action to User1, User2, User3 timelines.

User2 comments Photo1

-> deployment action to User1 and User2 timelines
-> delete first timeline entry and add on second entry a reference to User1
-> User 3 does not follow User2, no changes for him.

I have not yet thought about how to do that ...

I am thinking of a configuration like this:

group_actions:
    timelimit: 3600 # more than hour and actions will not be grouped
    verbs:
        ate:
            timelimit: ~ #overload root timelimit
            expect_identical_complements: [directComplement, indirectComplement]
        comment:    ~
        # expect_identical_complements: false

Any thoughts?
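
A rough sketch of how such a configuration might be evaluated. The semantics are assumptions, not something the bundle implements: only verbs listed under `verbs` are grouped, a `~` (null) timelimit falls back to the root value, and every complement named in `expect_identical_complements` must match. `canGroup()` and the array shape of an action are hypothetical.

```php
<?php
// The YAML above as the array a parser would produce (~ becomes null).
$config = [
    'timelimit' => 3600,
    'verbs' => [
        'ate' => [
            'timelimit' => null,
            'expect_identical_complements' => ['directComplement', 'indirectComplement'],
        ],
        'comment' => null,
    ],
];

/**
 * Decide whether $new may be grouped with the earlier action $previous.
 * Actions are hypothetical arrays with 'verb', 'time' and complement keys.
 */
function canGroup(array $config, array $previous, array $new): bool
{
    $verb = $new['verb'];
    if ($previous['verb'] !== $verb || !array_key_exists($verb, $config['verbs'])) {
        return false; // only identical, configured verbs are grouped
    }

    $verbConfig = $config['verbs'][$verb] ?? [];
    $limit = $verbConfig['timelimit'] ?? $config['timelimit'];
    if ($new['time'] - $previous['time'] > $limit) {
        return false; // too far apart in time
    }

    foreach ($verbConfig['expect_identical_complements'] ?? [] as $name) {
        if (($previous[$name] ?? null) !== ($new[$name] ?? null)) {
            return false; // a required complement differs
        }
    }

    return true;
}

$jim  = ['verb' => 'ate', 'time' => 0,    'directComplement' => 'potatoes'];
$mike = ['verb' => 'ate', 'time' => 1800, 'directComplement' => 'potatoes'];
$cara = ['verb' => 'ate', 'time' => 4000, 'directComplement' => 'potatoes'];

var_dump(canGroup($config, $jim, $mike)); // within the hour
var_dump(canGroup($config, $jim, $cara)); // more than an hour apart
```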

@Nemesisprime
Author

I like the idea as far as configuration and timeline storage go, but I am not convinced about using time limits as the grouping factor. One of the reasons I would support some sort of auto-grouping would be to avoid having to manually monitor your website's traffic. Active sites could run the risk of grouping everything in the active timeline, and slower sites might never see any grouping even though ideally they should.

@stephpy
Owner

stephpy commented Feb 12, 2013

Another factor could be the interval between actions.

User 1 has in timeline (top is the more recent):

action1: User1 eat potatoes
action2: User1 comment photo1

New action: User2 comment photo1.

We have to group it with action2. By the way, fetching the last X actions on User1's timeline would be another solution.

Automatically grouping timelines cannot guarantee that they are grouped ideally, and of course all action entries will stay stored. Only timeline entries will change.
To avoid a "mass" group on a timeline we could add a limit too... No more than X users commenting on PHOTO1, otherwise it'll create a new "chunk".

@brentc
Contributor

brentc commented Feb 12, 2013

I suspect each project will have its own conditions for grouping actions, and trying to encapsulate a wide array of options in configuration and stock implementations will be problematic.

I think the right approach here is to probably implement a standard means for grouping actions together (either at fetch-time or storage-time, there are pros-and-cons for each), an interface for the grouping class, (e.g. ActionGrouperInterface), and a simple stock implementation (e.g. SequentialActionGrouper extending AbstractActionGrouper).

Provide a simple way for developers to extend the grouping class with their own logic. (e.g. AbstractActionGrouper::shouldGroup($action, $timeline) or perhaps an event base system ActionGroupingEvent::PRE_GROUP, ::GROUP_PASS, ::POST_GROUP)

This should help prevent over-designing the feature and painting developers into a corner with the provided implementation, and should allow developers to provide their own logic based on their unique, concrete action/component implementations.
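
A minimal sketch of what that class layout could look like. The class names come from the proposal above; everything else is an assumption: actions are plain arrays, `group()` is a hypothetical method, and `shouldGroup()` here receives the current group instead of the whole timeline to keep the example short.

```php
<?php
interface ActionGrouperInterface
{
    /** @return array[] list of groups, each group being a list of actions */
    public function group(array $timeline): array;
}

abstract class AbstractActionGrouper implements ActionGrouperInterface
{
    /** Extension point: decide whether $action joins the group $group. */
    abstract protected function shouldGroup(array $action, array $group): bool;

    public function group(array $timeline): array
    {
        $groups = [];
        foreach ($timeline as $action) {
            $last = count($groups) - 1;
            if ($last >= 0 && $this->shouldGroup($action, $groups[$last])) {
                $groups[$last][] = $action; // extend the current group
            } else {
                $groups[] = [$action];      // start a new group
            }
        }

        return $groups;
    }
}

/** Stock implementation: groups consecutive actions with the same verb and complement. */
class SequentialActionGrouper extends AbstractActionGrouper
{
    protected function shouldGroup(array $action, array $group): bool
    {
        $first = $group[0];

        return $first['verb'] === $action['verb']
            && $first['complement'] === $action['complement'];
    }
}

$timeline = [
    ['subject' => 'User 1', 'verb' => 'commented on', 'complement' => 'Photo A'],
    ['subject' => 'User 2', 'verb' => 'commented on', 'complement' => 'Photo A'],
    ['subject' => 'User 1', 'verb' => 'uploaded',     'complement' => 'Photo A'],
    ['subject' => 'User 1', 'verb' => 'uploaded',     'complement' => 'Photo B'],
];

$groups = (new SequentialActionGrouper())->group($timeline);
echo count($groups) . " groups\n";
```

A project with different rules would only override `shouldGroup()`; for instance, the two uploads stay separate here because their complements differ.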

@stephpy
Owner

stephpy commented Feb 12, 2013

+1, you are right.

@thomask

thomask commented Feb 12, 2013

Sounds great!

@Nemesisprime
Author

@brentc - Something like this is the most flexible and more in the area I would like to go, so I agree. It seems like doing this during fetch time would be the best solution as well, as we have the most complete set of information about the timeline to base our custom groups on, and we don't have to worry about a failure during the creation/modification of actions when performing an action.

Superficially, perhaps we could use another phrase instead of "grouping". AbstractActionConstraint or AbstractActionCongregator sounds more appropriate.

@brentc
Contributor

brentc commented Feb 12, 2013

ActionAggregator?

@Nemesisprime
Author

Aggregator and Aggregation sound descriptive enough.

I need to better familiarize myself with components and the twig rendering of timeline before I start making any progress though, so bear with me if I'm a little sluggish at committing any work. But, unless someone beats me to it, I'll attempt to draft something up soonish.

@stephpy
Owner

stephpy commented Feb 13, 2013

Thank you very much ;)

@Nemesisprime
Author

I'm having a little trouble with this set up, it's proving far more complicated as far as implementing an interface to determine if an action should be grouped since I can't seem to get anything feasible down.

Made progress

@Nemesisprime
Author

Here is a basic use-case example I've drafted up.


/**
 * PhotoCommentsAggregator example for the aggregator.
 *
 * @implements ActionAggregatorInterface
 */
class PhotoCommentsAggregator implements ActionAggregatorInterface
{
    /**
     * {@inheritdoc}
     */
    public function shouldAggregate(ActionInterface $action, TimelineInterface $timeline, ConstraintManager $constraint_manager)
    {
        $verb = ...

        if ($verb == "commented on") {
            /* This is a commenting action, so we constrain the action on its
               directComplement (the photo), i.e. all comments on photo 4 will be grouped. */
            return $constraint_manager->acceptAggregation(
                array(
                    $action->getComponent("directComplement") // The DC is the photo, and the factor we group on.
                ),
                array(
                    $action->getComponent("subject"), // This tells the ConstraintManager to group these components into a ComponentCollection.
                    $action->getComponent("indirectComplement") // The IC is the comment; like the subject, it is collected into a ComponentCollection.
                )
            );
        }

        /* Some other verb... */
        return $constraint_manager->declineAggregation();
    }
}

@stephpy
Owner

stephpy commented Feb 17, 2013

Passing both the action AND the timeline is not useful, is it? It's a one-to-one relation.

Otherwise it looks good for fetch post-processing.

@Nemesisprime
Author

One of the problems I've come across is with the idea of chunking timeline results so they're not all mashed together in a single group. Ideally I'd like to make this as flexible as possible, but at the same time it shouldn't require the user to jump through hoops to decide how the timeline is chunked.

I think, for sake of simplicity at this point, there will be an additional option of ConstraintResolver (ConstraintManager in the example above)'s acceptAggregation to specify a method for chunking.

return $constraint_resolver->acceptAggregation(
    array("directComplement"), // The DC is the photo, and the factor we group on.
    array("subject"), // This tells the ConstraintResolver to group these components into a ComponentCollection.
    $constraint_resolver->setChunkingMethod(ConstraintResolver::SUBGROUP, $options)
);

Ideally we'd support a NONE, SUBGROUP, and TIME method to determine which sets of actions should be grouped. This could also provide BC in the future if/when we allow custom chunking methods

$constraint_resolver->setChunkingMethod(new ChunkingMethodClass());

@stephpy
Owner

stephpy commented Feb 17, 2013

Using at this time a ChunkingMethodInterface would be cool imo.

$constraintResolver->setChunkingMethod(new TimeChunkingMethod($options));
$constraintResolver->setChunkingMethod(new SubGroupChunkingMethod($options));
...

btw, we won't have to be BC in the future.
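
A possible shape for such an interface, as a sketch only. The thread does not define what TIME or SUBGROUP chunking mean, so the behaviour here is an assumption: the time method starts a new chunk when two consecutive actions are further apart than `interval` seconds, and the sub-group method cuts fixed-size chunks. The option names `interval` and `size` and the `chunk()` method are hypothetical.

```php
<?php
interface ChunkingMethodInterface
{
    /** Split an ordered list of actions into chunks (a list of lists). */
    public function chunk(array $actions): array;
}

class TimeChunkingMethod implements ChunkingMethodInterface
{
    private $interval;

    public function __construct(array $options)
    {
        $this->interval = $options['interval']; // seconds, hypothetical option
    }

    public function chunk(array $actions): array
    {
        $chunks = [];
        $previousTime = null;
        foreach ($actions as $action) {
            // Start a new chunk on the first action or after a large time gap.
            if ($previousTime === null || abs($action['time'] - $previousTime) > $this->interval) {
                $chunks[] = [];
            }
            $chunks[count($chunks) - 1][] = $action;
            $previousTime = $action['time'];
        }

        return $chunks;
    }
}

class SubGroupChunkingMethod implements ChunkingMethodInterface
{
    private $size;

    public function __construct(array $options)
    {
        $this->size = $options['size']; // actions per chunk, hypothetical option
    }

    public function chunk(array $actions): array
    {
        return array_chunk($actions, $this->size);
    }
}

// Newest first, timestamps in seconds.
$actions = [['time' => 100], ['time' => 90], ['time' => 80], ['time' => 10], ['time' => 5]];

$byTime = (new TimeChunkingMethod(['interval' => 30]))->chunk($actions);
$bySize = (new SubGroupChunkingMethod(['size' => 2]))->chunk($actions);
echo count($byTime) . ' time chunks, ' . count($bySize) . " sub-group chunks\n";
```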

@Nemesisprime
Author

+1 I'll see what I can do

@stephpy
Owner

stephpy commented Feb 17, 2013

Thank you for all ;)

@Nemesisprime
Author

Sorry about the lack of support coming from my end. I've been in the middle of launching a project and decided not to bother until later on, but I should be able to start adding some work into it soon.

@vkartaviy
Contributor

@Nemesisprime Good news 👍

@stephpy
Owner

stephpy commented Apr 11, 2013

No problem @Nemesisprime ;) Thanks

@marcospassos

This feature is very nice. I hope to see this here soon 👍

@Nemesisprime
Author

If you want to see the way I've decided to implement it, I have a rough draft commit of the modified Timeline and TimelineBundle in my repo at Nemesisprime/timeline@1e58f71 and Nemesisprime@a2e107f respectively.

@marcospassos

It seems promising! Waiting for news.

@Zeichen32

Is there any news on this topic?

Maybe we could simply add a grouping key?

$action->setGroupingKey(sprintf('image_gallery_%d', $gallery->getId()));

The behavior could be similar to the duplicate key, but instead of deleting the entries we could group them?

@pppdns
Contributor

pppdns commented Nov 7, 2013

There's another notification system, for Python (https://feedly.readthedocs.org/en/latest/notification_systems.html). I just found it; maybe we could get some ideas on how to aggregate in an elegant way. I haven't had a chance to look into its source yet though, it's just an idea.

@stephpy
Owner

stephpy commented Nov 8, 2013

@Zeichen32 Grouping Key is indeed a simple implementation of this feature and could do the job ...

This group key field should be inserted in timeline table.

We must find a way to find x elements of a timeline with this group key parameter:

Example:

user1 commented article1
user2 commented article1
user eat tacos

->getTimeline(array('limit' => 2)) should return these 3 actions and merge them later ...

It could be done easily with 2 requests, but I'm not sure it'll be easy to make it work with pagination systems (since the offset parameter could no longer follow the page*per_page logic).

I have to think about this ;)

Thanks @pppdns, I'll take a look at this library.

@Zeichen32

@stephpy Maybe we could group the timeline actions on group_key to respect the pagination in the first request.

Then we send a second request to get all related actions based on the group_keys we received in the first request.

At step three we can merge the results.
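
The three steps can be sketched in memory as follows. This is only an illustration under assumptions: `$rows` stands in for the timeline table (newest first), `group_key` is the hypothetical column discussed above, and `fetchPage()` is a made-up helper. With a real database, step 1 would be a grouped query on the key and step 2 a lookup by the returned keys.

```php
<?php
// In-memory stand-in for the timeline table, newest first.
$rows = [
    ['id' => 4, 'group_key' => 'comment_article1', 'text' => 'user2 commented article1'],
    ['id' => 3, 'group_key' => 'eat_tacos',        'text' => 'user3 ate tacos'],
    ['id' => 2, 'group_key' => 'comment_article1', 'text' => 'user1 commented article1'],
    ['id' => 1, 'group_key' => 'like_photo1',      'text' => 'user4 liked photo1'],
];

/**
 * Return one page of grouped entries: group_key => list of action texts.
 */
function fetchPage(array $rows, int $limit, int $offset): array
{
    // Step 1: paginate over distinct group keys, ordered by their newest row.
    $keys     = array_values(array_unique(array_column($rows, 'group_key')));
    $pageKeys = array_slice($keys, $offset, $limit);

    // Step 2: fetch every row that belongs to one of those keys.
    // Step 3: merge the rows under their key.
    $page = array_fill_keys($pageKeys, []);
    foreach ($rows as $row) {
        if (isset($page[$row['group_key']])) {
            $page[$row['group_key']][] = $row['text'];
        }
    }

    return $page;
}

$first  = fetchPage($rows, 2, 0);
$second = fetchPage($rows, 2, 2);
echo count($first) . ' entries on page 1, ' . count($second) . " on page 2\n";
```

Because the limit and offset apply to groups instead of raw rows, a page always holds up to `limit` displayed entries, however many actions each group absorbs.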

@stephpy
Owner

stephpy commented Nov 8, 2013

@Zeichen32 Yes, indeed it's a good way to do that too. Surely easier for pagination ...

@abenmoussa

Hello,

Any news for this feature please?

@stephpy solution (Hacking logic of...) may be a good idea..
I suggest the following solution:

for example if we have these two actions to persist:

User 1 commented on Photo A
User 2 commented on Photo A

1/ Before synchronizing the action, we check whether Photo A has already been commented on (search actions by verb and directComplement).
- If it has not been commented on yet, we save the action.
- If the photo has already been commented on, we update the existing action with the complements
number_actors => 2
actor_1 => User 2
then we render the result

{{ timeline_component_render(timeline, 'subject') }}
and
{{ timeline_component_render(timeline, 'actor_1') }}
commented
{{ timeline_component_render(timeline, 'directComplement') }}

What are the possible drawbacks of this solution?

Thank you very much for your help ;)

@abenmoussa

Hello,

Any suggestions please?
Thank you!

@stephpy
Owner

stephpy commented May 12, 2014

Hi,

It is hard to find a way to implement this ... and harder still to implement it.
But if you would like to make a PR, feel free to do it. :)

@abenmoussa

Hi stephy,

Ok, but do you have any comments on my solution please?
Thank you for Help.

@mpclarkson

I've previously implemented this terrific bundle using the Redis Driver and am now using it on another project with the Doctrine ORM. The reason for choosing Doctrine this time is to make it easier to implement aggregation of actions.

I've done some work today and it looks like the following steps will work to aggregate actions and remove duplicates (without using filters as this screws up pagination):

  • Implement the ORM entities as per the bundle instructions.
  • In the Actions entity implement a ManyToMany self-referencing association so actions can be linked with other actions.
  • Set a duplicate key for each action. We want to aggregate most actions by each 24 hour period, verb, and the direct complement Id so this will work for us:
sprintf('%s:%s:%s', $date->format('Y-m-d'), $verb, $id)
  • Implement a prePersist doctrine listener to do the following
    • Find previous Actions with the same duplicate key as the new action
    • For each duplicate action create the many-to-many bi-directional association with the new action and remove the associated timeline entries
$duplicateActions = $em->getRepository("UmmTimelineBundle:Action")->findBy(array(
    'duplicateKey' => $entity->getDuplicateKey(),
));

foreach ($duplicateActions as $association) {
    // Link the new action and the earlier duplicate in both directions.
    $entity->addAssociation($association);
    $association->addAssociation($entity);

    // Remove the earlier action's timeline entries so only the newest action is shown.
    $timelines = $association->getTimelines();
    foreach ($timelines as $timeline) {
        $em->remove($timeline);
    }
}

This basically means there are no duplicate actions in timelines and the only duplicate action in the timeline is the most recent action. The associated actions (and consequently the counts) are then available via the Action entity (e.g. getAssociations).

This allows us to create open graph style stories such as "Matt created a new project called XYZ and 3 others" or "Matt and 5 others commented on Article A", etc.

Thoughts?

@stephpy
Owner

stephpy commented Aug 20, 2014

Hi, this is indeed a good way 👍

I have some questions:

  • What is the $id you are talking about in the duplicateKey?
  • Is the duplicate_key you used the same as this one?
    I guess we should keep that behavior but add a new key for merging actions. Thoughts?

And if we add a key, imo we could have two keys:

  • merge_duplicate_key => $verb.'#'.$id
  • merge_duplicate_time => time()

This way, you'll be able to configure that you want to merge actions for 1 hour, 2 hours, 2 days, etc...
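
A small sketch of how the two keys could drive the merge decision. The field names come from the comment above; `buildMergeFields()`, `shouldMerge()` and the array representation are assumptions for illustration.

```php
<?php
/** Build the two proposed fields for an action. */
function buildMergeFields(string $verb, int $id, int $now): array
{
    return [
        'merge_duplicate_key'  => $verb . '#' . $id,
        'merge_duplicate_time' => $now,
    ];
}

/** Merge only when the keys match and the new action falls inside the window. */
function shouldMerge(array $existing, array $new, int $windowSeconds): bool
{
    return $existing['merge_duplicate_key'] === $new['merge_duplicate_key']
        && $new['merge_duplicate_time'] - $existing['merge_duplicate_time'] <= $windowSeconds;
}

$first  = buildMergeFields('comment', 42, 1000);
$second = buildMergeFields('comment', 42, 1000 + 1800); // 30 minutes later
$third  = buildMergeFields('comment', 42, 1000 + 7200); // 2 hours later

var_dump(shouldMerge($first, $second, 3600)); // inside a 1 hour window
var_dump(shouldMerge($first, $third, 3600));  // outside a 1 hour window
```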

@abenmoussa

Hello Mpclarkson ;),

I have also some questions about your solution:

For example, suppose we have users A, B, C, D and E.
A, B, C and D are friends, and E is friends with A and C.
A publishes a new status. If B and C comment on A's status, D will see "B and C commented the status of A" on his timeline, but E will see "C commented the status of A".
Can your solution cover this case?

Thank you very much for your help!!

@mpclarkson

@stephpy The $id I am using for the duplicateKey is the id of the directComplement component. We are only using objects as action components (no text) so this works for this specific implementation.

The duplicate key field I am using while testing this is the same one that you have implemented in the bundle, but it should probably be a separate mergeKey to ensure no BC breaks if it were implemented into the bundle.

@mpclarkson

@abenmoussa This merges all actions for all users - we are building a B2B app and everyone in a project needs to see all activities in a project (irrespective of whether they know each other). However, it should still work for you. All you'd need to do is to merge the subjects in the association actions with the user's friends to only show actions from users the viewer is friends with. Does that make sense?
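
A minimal sketch of that filtering step, using the A to E example from the question. `visibleSubjects()` is a hypothetical helper and the friend lists are plain arrays; in the bundle the subjects would come from the associated actions.

```php
<?php
/**
 * Keep only the merged subjects that the viewer is friends with.
 */
function visibleSubjects(array $mergedSubjects, array $friendsOfViewer): array
{
    return array_values(array_intersect($mergedSubjects, $friendsOfViewer));
}

// B and C both commented on A's status; their actions were merged.
$mergedSubjects = ['B', 'C'];

$friends = [
    'D' => ['A', 'B', 'C'],
    'E' => ['A', 'C'],
];

foreach ($friends as $viewer => $friendList) {
    $names = visibleSubjects($mergedSubjects, $friendList);
    echo $viewer . ' sees: ' . implode(' and ', $names) . " commented the status of A\n";
}
```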

@stephpy
Owner

stephpy commented Aug 21, 2014

@mpclarkson ok, since we may create actions with:
subject verb somethingElseThanDirectComplement

I guess we should find something to define how to create this mergeKey.
By default it could be the "object_class" and "ids" of the components, but maybe the user could define it.

@mpclarkson

Yes, that would be more flexible. Ideally, you should be able to define the key on an action by action basis, as you may want to merge certain actions differently. Perhaps use a default mergeKey formula defined in the configuration tree that can be overridden prior to calling updateAction?

@stephpy
Owner

stephpy commented Aug 22, 2014

👍

@cordoval
Contributor

👍

@semiaLi

semiaLi commented Apr 28, 2015

I need your help. I have the same problem, but I want to group actions that have the same verb and the same component.

@mpclarkson

@semiaLi You should be able to use the methodology I have described above. Just create a "mergeKey" that uses the verb + some sort of component identifier (e.g. hash of class and id).
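
For instance, a key of that kind could be built like this. It is a sketch only: `buildMergeKey()` is a hypothetical helper, and md5 is just one possible hash for the class and id pair.

```php
<?php
/**
 * Build a merge key from the verb plus a hash of the component's class and id.
 */
function buildMergeKey(string $verb, string $componentClass, $componentId): string
{
    return $verb . ':' . md5($componentClass . '#' . $componentId);
}

$keyA = buildMergeKey('comment', 'Acme\\Entity\\Photo', 4);
$keyB = buildMergeKey('comment', 'Acme\\Entity\\Photo', 4);
$keyC = buildMergeKey('comment', 'Acme\\Entity\\Photo', 5);

var_dump($keyA === $keyB); // same verb and component: same key
var_dump($keyA === $keyC); // different component: different key
```

`Acme\Entity\Photo` is a placeholder class name.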

@mpclarkson

Here's some more info...

The additional properties and associations on the Action:

/**
 *
 * @ORM\Entity(repositoryClass="TimelineBundle\Repository\ActionRepository")
 * @ORM\Table(name="timeline_action", indexes={@ORM\Index(name="idx_action_merge_key", columns={"merge_key"})})
 */
class Action extends BaseAction implements MergeableActionInterface
{
    //Other standard mappings

    /**
     * @ORM\Column(name="merge_key", type="string", nullable=true)
     * @JMS\Exclude
     */
    private $mergeKey;

    /**
     * @ORM\ManyToMany(targetEntity="Action", inversedBy="associatedWith", cascade={"persist"})
     * @ORM\JoinColumn(name="parent_id", referencedColumnName="id", onDelete="CASCADE")
     * @ORM\JoinTable(name="timeline_action_x_action")
     * @JMS\Exclude
     */
    private $associations;

    /**
     *
     * @ORM\ManyToMany(targetEntity="Action", mappedBy="associations", cascade={"persist"})
     * @ORM\JoinColumn(onDelete="CASCADE")
     * @JMS\Exclude
     *
     */
    private $associatedWith;

    /**
     * Constructor
     */
    public function __construct()
    {
        parent::__construct();

        $this->associations = new ArrayCollection();
        $this->associatedWith = new ArrayCollection();

    }

    /**
     * @param ActionInterface $association
     * @return $this
     */
    public function addAssociation(ActionInterface $association)
    {
        $this->associations->add($association);

        return $this;
    }

    public function setMergeKey($mergeKey)
    {
        $this->mergeKey = $mergeKey;

        return $this;
    }

    public function getMergeKey()
    {
        return $this->mergeKey;
    }

    public function hasMergeKey()
    {
        return $this->mergeKey ? true : false;
    }

   // Other getters and setters

}

Then my listener to aggregate actions using the merge key

class AggregateActionsListener
{

    public function prePersist(LifecycleEventArgs $args)
    {
        $entity = $args->getEntity();

        if($entity instanceof MergeableActionInterface && $entity->hasMergeKey()) {

            $em = $args->getEntityManager();

            $associations = $em->getRepository("TimelineBundle:Action")->findBy(array(
                    'mergeKey' => $entity->getMergeKey(),
                ));

            foreach($associations as $association) {

                //Create the associations between grouped actions
                $entity->addAssociation($association);

                //Remove the associated timeline entries
                $timelines = $association->getTimelines();
                foreach($timelines as $timeline) {
                    $association->removeTimeline($timeline);
                    $em->remove($timeline);
                }
                $em->persist($association);
            }
        }
    }
} 

I hope this helps.

@semiaLi

semiaLi commented Apr 29, 2015

Thank you, but don't you think that removing the timeline entry will prevent an action done by a subject from appearing on his own wall?
I haven't understood what MergeableActionInterface is. Is it an entity that you need in your work?
Is the AggregateActionsListener a service?

@mpclarkson

We are only removing duplicate timeline entries for actions that have a merge key. In our case we are aggregating by day, verb and a component hash for certain actions. Also, we don't spread automatically on the subject - in most cases we spread to project members. It works as we require but you might need to handle it differently.

The MergeableActionInterface is just an interface we use to make it clear internally how the implementation is meant to work.

Yes, the AggregateActionsListener is a service that listens to a doctrine pre-persist event.

Matthew Clarkson


@semiaLi

semiaLi commented Apr 29, 2015

OK, thank you. I will try it later, because right now I need to make progress on other functionality so that I have a real interface.

@seltzlab

seltzlab commented May 6, 2016

@mpclarkson a bit late probably, many thanks for sharing your solution. Could you also share if your solution worked in the real life? I noticed that grouping by date wouldn't solve some edge cases (User1 Likes A at 23:59, User2 Likes A at 00:00)

@mpclarkson

Hey @seltzlab - yes it's been working in production for https://hilenium.com for quite a while now. Some of the specific implementation details have changed a little but the principle is broadly the same. And yes, you are correct that the approach above has some edge cases but you could work something out to deal with these. The duplicate / merge key above is just an example approach - you could do something similar for different time ranges (eg hourly) with a bit of tweaking.
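
To make the midnight edge case concrete, here is a small sketch. `bucketKey()` reuses the `sprintf` key format shown earlier in the thread; `withinWindow()` is a hypothetical alternative that compares timestamps instead of calendar buckets, and the dates are made up.

```php
<?php
/** Key made of a date bucket, the verb and the component id. */
function bucketKey(DateTimeImmutable $date, string $verb, int $id, string $format = 'Y-m-d'): string
{
    return sprintf('%s:%s:%s', $date->format($format), $verb, $id);
}

/** Alternative check: are two actions at most $seconds apart? */
function withinWindow(DateTimeImmutable $earlier, DateTimeImmutable $later, int $seconds): bool
{
    return $later->getTimestamp() - $earlier->getTimestamp() <= $seconds;
}

$utc    = new DateTimeZone('UTC');
$first  = new DateTimeImmutable('2016-05-05 23:59:00', $utc); // User1 likes A
$second = new DateTimeImmutable('2016-05-06 00:00:00', $utc); // User2 likes A

// One minute apart, but the daily buckets differ, so the keys do not match.
echo bucketKey($first, 'like', 1) . "\n";
echo bucketKey($second, 'like', 1) . "\n";

// A sliding one hour window treats them as mergeable.
var_dump(withinWindow($first, $second, 3600));
```

Switching the format to an hourly bucket (`'Y-m-d H'`) only moves the boundary to the top of each hour; a window relative to the previous action avoids a fixed boundary altogether.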
