Week 09 and 10

AlvaroJoseLopes edited this page Sep 19, 2023 · 1 revision

TL;DR

During weeks 9 and 10, I implemented one Recommender System model.

In summary, I have implemented a Recommender System (RS) model that learns node embeddings using the Node2Vec method. The model uses cosine similarity over the learned embeddings to recommend the top-k items to each user.

Implementing a RS model

This framework was designed to allow practitioners to easily evaluate and compare their new proposed RS algorithm.

To create a new Recommendation Model, it is necessary to:

  • Create a Recommender subclass
  • For your subclass, override the following methods:
    • name(): returns the model name as a string.
    • train(G_train, ratings_train): defines the training strategy of the model.
    • get_recommendations(k): returns the k-first recommendations for each user.
    • get_user_recommendation(user, k): returns the k-first recommendations for a given user.
  • Add your subclass to recommender/model2class.py by adding a dictionary key with your model name (to be used in the config file), its submodule, and its class name.
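The steps above can be sketched as follows. This is a hypothetical subclass with a toy training strategy; the framework's actual base class, and the exact entry format expected by recommender/model2class.py, may differ.

```python
class CosineRecommender:
    """Hypothetical Recommender subclass; the real base class is omitted.

    The model2class.py registration would then look something like
    (exact format depends on the framework):
        "cosine_based": {"submodule": "cosine", "class": "CosineRecommender"}
    """

    def name(self):
        # Model name as used in the config file.
        return "cosine_based"

    def train(self, G_train, ratings_train):
        # Toy training strategy: remember which items each user rated.
        self.seen = {u: set(items) for u, items in ratings_train.items()}
        self.items = {i for items in ratings_train.values() for i in items}

    def get_user_recommendation(self, user, k):
        # Recommend up to k items the user has not rated yet.
        unseen = sorted(self.items - self.seen.get(user, set()))
        return unseen[:k]

    def get_recommendations(self, k):
        # k-first recommendations for each user.
        return {u: self.get_user_recommendation(u, k) for u in self.seen}


model = CosineRecommender()
model.train(G_train=None, ratings_train={"u1": ["i1"], "u2": ["i2"]})
print(model.get_user_recommendation("u1", 2))  # -> ['i2']
```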

In the .yaml file, the models directive defines the list of models to be evaluated during the experiment pipeline. Example:

experiment:
  models:
    - name: deepwalk_based
      config:
        save_weights: True
      parameters:
        parameter1: 10
        parameter2: value
    - # Other model to be evaluated...

Where:

  • models: specifies a list of models to be evaluated. (mandatory)
    • name: model name (mandatory)
    • config: metadata config
      • save_weights: boolean that indicates if the model parameters must be saved after training.
    • parameters: model parameters in the format parameter_name: parameter_value
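Once parsed, the config amounts to a nested dictionary, and the mandatory fields can be checked before running the pipeline. This is a minimal sketch assuming the parsed structure mirrors the YAML example above; the framework's actual loader and validation logic may differ.

```python
# Hypothetical in-memory form of the parsed experiment config,
# mirroring the YAML example above.
experiment = {
    "models": [
        {
            "name": "deepwalk_based",          # mandatory
            "config": {"save_weights": True},  # metadata config
            "parameters": {"parameter1": 10},  # model hyperparameters
        },
    ],
}


def validate(experiment):
    # 'models' is mandatory and must be a non-empty list.
    models = experiment.get("models")
    if not models:
        raise ValueError("'models' is mandatory and must be non-empty")
    for m in models:
        # Each model entry needs at least a 'name'.
        if "name" not in m:
            raise ValueError("each model entry needs a 'name'")
    return [m["name"] for m in models]


print(validate(experiment))  # -> ['deepwalk_based']
```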

Node2Vec RS based model

This baseline model uses Node2Vec to learn embeddings for each node.

Node embedding is a technique for transforming the nodes of a graph into dense vectors in a low-dimensional space. The goal is to capture the underlying patterns and relationships between nodes, which is useful for various tasks, including recommendation systems. Several node embedding techniques are available, such as DeepWalk and Node2Vec. These methods take an NLP-based approach, using random walks as the sampling strategy to build "sentences" of nodes. The goal of this step is to learn user and item embeddings such that users end up close (as neighbors) in the embedding space to the items they prefer. For enriched datasets, the items' properties influence the resulting embeddings, which then reflect the underlying relationships between items that share similar properties.

With an embedding for each node, the model recommends the k items closest to each user, using cosine similarity as the distance metric.
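The ranking step can be sketched in a few lines. The embeddings below are toy values standing in for the vectors Node2Vec would learn; in practice a vectorized library (e.g. NumPy or scikit-learn) would compute the similarities.

```python
from math import sqrt


def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def top_k_items(user_vec, item_vecs, k):
    # Rank items by cosine similarity to the user's embedding, descending.
    ranked = sorted(item_vecs,
                    key=lambda i: cosine(user_vec, item_vecs[i]),
                    reverse=True)
    return ranked[:k]


# Toy embeddings; real ones come from the Node2Vec training step.
user = [1.0, 0.0]
items = {"i1": [0.9, 0.1], "i2": [0.0, 1.0], "i3": [0.7, 0.7]}
print(top_k_items(user, items, 2))  # -> ['i1', 'i3']
```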

Main parameters:

  • walk_len: random walk length.
  • n_walks: number of random walks for each node.
  • p (return parameter): controls how likely the walk is to revisit the previous node; lower values encourage backtracking, promoting more exploration of local structures.
  • q (in-out parameter): controls how likely the walk is to move away from the previous node; lower values promote more exploration of distant parts of the graph.
  • embedding_size: embedding size.
  • window_size: Word2Vec window size.
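The effect of p and q is easiest to see in a single biased walk step. The sketch below computes the unnormalized transition weights Node2Vec assigns from the current node, given the node the walk came from; the graph, p, and q values are assumptions for illustration, and real implementations precompute these weights (e.g. with alias tables) for efficiency.

```python
def step_weights(graph, prev, curr, p=1.0, q=2.0):
    """Unnormalized Node2Vec transition weights out of `curr`.

    Given the walk arrived at `curr` from `prev`:
      - weight 1/p to return to `prev`,
      - weight 1   for common neighbors of `prev` and `curr`,
      - weight 1/q for nodes farther from `prev`.
    """
    weights = {}
    for nxt in graph[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p      # return to the previous node
        elif nxt in graph[prev]:
            weights[nxt] = 1.0          # stay close to the previous node
        else:
            weights[nxt] = 1.0 / q      # move farther away
    return weights


# Toy undirected graph as an adjacency dict (an assumption for the sketch).
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
print(step_weights(graph, prev="a", curr="b"))
# With q > 1, the distant node "d" gets a lower weight than "a" and "c".
```

A full walk repeats this step walk_len times from each node, n_walks times per node, and the resulting node sequences are fed to Word2Vec with the given embedding_size and window_size.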

Next Steps

Implement evaluation metrics for the experiment pipeline.

References

Perozzi, Bryan, Rami Al-Rfou, and Steven Skiena. "DeepWalk: Online Learning of Social Representations."
