Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use look-up-table and threading for major speed improvements #2

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

carsten-AEI
Copy link

Currently, our netbox instance holds more than 3000 devices and more than 10k interfaces and IPs and creating an updated set of configuration files took more than 10 minutes.

The root problem seemed to be the API call to get the MAC address of the associated interfaces within the prefix loop. Adding a look-up-table greatly reduced run time to about 80s. Adding threading to the API instantiation got us another factor of 2 in speed-up.

(I hope this PR is good enough, I rarely do these on github. And sorry that unrelated commit 375d41e is also in here - I needed that to be able to call the script directly (./netbox_dnsmasq.py) - if you disagree with that one, I can try removing it)

@p-rintz
Copy link
Owner

p-rintz commented Jul 25, 2023

Hi,

Thank you for your contribution!

Could you please fix the linting issues the CI pipeline threw?

I'll have a look at the PR later tonight.

@carsten-AEI
Copy link
Author

gladly, if I completely understood them :-( .

Shall I also try to address the previously present linting issues? pylint throws a lot of stuff like

netbox_dnsmasq.py:102:32: C0103: Variable name "v2" doesn't conform to snake_case naming style (invalid-name)
netbox_dnsmasq.py:86:4: R1702: Too many nested blocks (8/5) (too-many-nested-blocks)
netbox_dnsmasq.py:46:0: R0912: Too many branches (16/12) (too-many-branches)
netbox_dnsmasq.py:116:41: C0103: Argument name "nb" doesn't conform to snake_case naming style (invalid-name)
netbox_dnsmasq.py:147:8: C0103: Variable name "ip" doesn't conform to snake_case naming style (invalid-name)

already at aa6b134 at me

@p-rintz
Copy link
Owner

p-rintz commented Jul 25, 2023

I do not use pylint.

You can find the linters that are being run and their options in the GitHub workflow here https://github.com/p-rintz/netbox-dnsmasq/blob/master/.github/workflows/linter.yml

Otherwise, I can also have a look later tonight.

@p-rintz
Copy link
Owner

p-rintz commented Jul 25, 2023

I will likely need a few more days, as Im using this as a chance to enable others to run the tests more easily.
The previous way I ran tests was rather limited to me, which is not optimal, obviously.

@carsten-AEI
Copy link
Author

no problem and after "fixing" the black linter issue I figured I had no idea how to fix the other coverage issues. Thus, take your time, I have the changes on my local branch anyway and am in the middle of trying to understand if I can make dsnmasq work all the way I would like it to.

@p-rintz
Copy link
Owner

p-rintz commented Jul 30, 2023

I have a question that Im a bit confused about.
From what I can see, accessing the 'ip.assigned_object.mac_address' object does not spawn a new API call.
It already is part of the API call that gets the IP itself.

A look-up-table doesnt seem to help there, since the API call for the IP is happening regardless.

Is the speed-up tied to the threading and not the lut?
Or am I misunderstanding something?
I dont currently have a larger Netbox instance to test this on.

@carsten-AEI
Copy link
Author

I have a question that Im a bit confused about. From what I can see, accessing the 'ip.assigned_object.mac_address' object does not spawn a new API call. It already is part of the API call that gets the IP itself.

Interesting. When it was slow, I ran tcpdump in the background and print in the main code and it seemed to correlate.

A look-up-table doesnt seem to help there, since the API call for the IP is happening regardless.

Yes and no. In my testing, getting all LUT information in one go was much faster the individual calls. Weird.

Is the speed-up tied to the threading and not the lut? Or am I misunderstanding something? I dont currently have a larger Netbox instance to test this on.

LUT brought the major speed-up, threading "only" a factor of 2 at the end.

I"m currently on vacation, but maybe I'll try to setup a larger "fake" netbox instance and test some more.

@p-rintz
Copy link
Owner

p-rintz commented Aug 1, 2023

The call that I meant (that is happening regardless) is in this line.

I started a debugger and set various breakpoints (including in pynetbox) to see when the call was happening, and the mac address always seemed available right from the start of the for loop.

I might simply be missing something, but Im trying to understand what exactly is happening and when 😃

@carsten-AEI
Copy link
Author

I have not forgotten this PR but way too much on my plate at the moment.

Odd think, yesterday I created a mock-up netbox system with 1000 servers and 3 IPs/system and both main tree and the PR branch were equally fast (13s on my laptop). Thus, I'll need to dig deeper why our evolved netbox instance is behaving differently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants