Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the algorithm used to identify the endpoints to be connected #83

Open
shravandoda opened this issue Feb 16, 2019 · 6 comments
Open

Comments

@shravandoda
Copy link
Contributor

shravandoda commented Feb 16, 2019

The algorithm used right now for identifying the endpoints to be connected traverses the whole graph to find the ones to connect. This method is really slow. One idea is to use python dictionaries for storing the nodes to be connected while they are being created.

@Mec-iS
Copy link
Contributor

Mec-iS commented Feb 16, 2019

we are working with the perspective of storing hundreds of millions of nodes, that is why we use Redis. Graph traversal has to be done using built-in Redis data structures and redis-graph querying system

@shravandoda
Copy link
Contributor Author

shravandoda commented Feb 17, 2019

@Mec-iS the issue is not with how edges are constructed between 2 nodes. It's with how the nodes to be connected are identified in the agent. Sorry for improper language. Currently the agent uses 4 nested loops(2 of which traverse the whole graph) to identify the nodes to be connected.

@shravandoda shravandoda changed the title Improve the algorithm used to connect endpoints Improve the algorithm used to identify the endponits to be connected Feb 17, 2019
@shravandoda shravandoda changed the title Improve the algorithm used to identify the endponits to be connected Improve the algorithm used to identify the endpoints to be connected Feb 17, 2019
@Mec-iS
Copy link
Contributor

Mec-iS commented Feb 17, 2019

in any case we cannot keep a dictionary in memory with nodes data. Redis is already an in-memory datastore, anything about the graph should be queried from the graph.

@shravandoda
Copy link
Contributor Author

shravandoda commented Feb 17, 2019

The way program works right now is that it uses a function to create nodes and then connect them. How about we use a dictionary as a local variable in the function? This way after all the concerned endpoints have been connected, the dictionary will be destroyed and node data will only be present in the graph. Using 4 nested loops is going to be very inefficient especially as the number of nodes increase.

@Mec-iS
Copy link
Contributor

Mec-iS commented Feb 17, 2019

@sandeepsajan0 ^

@Mec-iS
Copy link
Contributor

Mec-iS commented Feb 23, 2019

Using 4 nested loops is going to be very inefficient especially as the number of nodes increase.

yeah that's true. but any query or operation on the graph has to be done in Redis, we cannot allow to have the graph/nodes replicated by Python memory at any time. So I would try to refactor that code by leveraging Redis commands. The first version of the agent will be strictly coupled to Redis, working on it requires an understanding about what the datastore can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants