Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pping #7

Merged
merged 14 commits into from
Feb 4, 2021
Merged

Add pping #7

merged 14 commits into from
Feb 4, 2021

Conversation

simosund
Copy link
Contributor

This is the initial version of pping that I showed @tohojo and @netoptimizer yesterday (but with a little bit of cleanup). It only supports IPv4 at the moment (IPv6 support is on the TODO list).

Regarding the issue you saw with it reporting RTTs of around 100ms when adding just 30ms latency (on the veth0 interface in the test environment), I'm not sure what's causing it. After the first couple of packets it seems to approach 100ms for any delay <100ms, and the correct value for delays >100ms. I get the same issue when running Pollere's pping as well as my bpftrace pping implementation.

@tohojo
Copy link
Member

tohojo commented Jan 12, 2021

A few high-level comments:

  • You should be able to use the existing functionality to reuse maps in libbpf instead of rolling your own. To do this, you'll need to switch to BTF-defined maps; then you can set the pinning attribute in the map definition, and all you'll have to do is set the pin_root_path attribute in bpf_object_open_opts when calling bpf_object__open_file(). See the XDP code file in the traffic-pacing-edt folder for an example of the BTF map definition format; you'll need to add __uint(pinning, LIBBPF_PIN_BY_NAME); to that to enable pinning.
  • Please reformat the C code to the kernel style. See https://www.kernel.org/doc/html/latest/process/coding-style.html - there's a .clang-format file in the kernel source tree that may help; Emacs also has a linux C style mode.
  • For loading XDP programs, you should probably use libxdp (in https://github.com/xdp-project/xdp-tools). That way this will work alongside other XDP programs, and for this use, loading the program should be quite simple as well. I guess we should add some code to this repository to make it simple to include libxdp; will look into that (so feel free to wait for that before switching).
  • Before we merge this, you should rewrite the history to fix up the commit messages a bit. At a minimum, you should add a Signed-off-by tag by each commit to signify that you are submitting all this as open source (see https://developercertificate.org/). But if you want to improve commit messages in general and/or squash commits together to form logical units, that's totally fine as well. In general, getting into the habit of writing commits in the style where the first line is a short heading (that can be displayed in one line of 80-100 characters), and more text is in a separate paragraph below, is a good idea (notice how GitHub also tries to display commits like that but fails because you put the entire commit message on a single line in some places).
  • Please add SPDX license identifier tags to the source files that don't have it.

@simosund
Copy link
Contributor Author

Thank you for your comments @tohojo. I will probably be busy with some other things the next couple of days, but after that I'll get back to addressing these points.

@tohojo
Copy link
Member

tohojo commented Jan 13, 2021

Alright, that's totally fine of course. You can just force-push to the existing branch to update the PR, but please leave a comment as well when you do that - Github doesn't always send out notifications on branch updates...

@simosund simosund force-pushed the add_pping branch 2 times, most recently from f7b4d04 to 8981833 Compare January 18, 2021 10:18
@simosund
Copy link
Contributor Author

Hi @tohojo,
I've made the switch over to BTF-defined maps now (for the XDP-program at least), and it is kinda working. I am however having issues getting the map fd from userspace (which I need for the periodic cleanup of the map). After just opening the object, the maps do not seem to have gotten a valid file descriptor yet (-1), and after loading the object, the name of the ts_start map disappears for some reason. This is the output I get from running the program currently (as of commit c4b75a9):

Maps after opening object
Map name: ts_start, fd: -1
Map name: rtt_events, fd: -1
 
Maps after loading object
Map name: , fd: 5
Map name: rtt_events, fd: 6

Any advise on how to handle this? My current workaround is to simply take the fd of the first map from the object, but that does not feel like a very good long term solution.

pping/pping.c Outdated Show resolved Hide resolved
@tohojo
Copy link
Member

tohojo commented Jan 20, 2021

Hi @tohojo,
I've made the switch over to BTF-defined maps now (for the XDP-program at least), and it is kinda working. I am however having issues getting the map fd from userspace (which I need for the periodic cleanup of the map). After just opening the object, the maps do not seem to have gotten a valid file descriptor yet (-1), and after loading the object, the name of the ts_start map disappears for some reason. This is the output I get from running the program currently (as of commit c4b75a9):

Maps after opening object
Map name: ts_start, fd: -1
Map name: rtt_events, fd: -1
 
Maps after loading object
Map name: , fd: 5
Map name: rtt_events, fd: 6

Any advise on how to handle this? My current workaround is to simply take the fd of the first map from the object, but that does not feel like a very good long term solution.

libbpf will reload the name from the kernel when reusing the fd, so if there's no name that would be because your kernel doesn't have one? Which kernel version are you running? And what output does bpftool map show?

@simosund
Copy link
Contributor Author

libbpf will reload the name from the kernel when reusing the fd, so if there's no name that would be because your kernel doesn't have one? Which kernel version are you running? And what output does bpftool map show?

I'm running kernel version 5.4.0-60 (under Ubuntu 20.04.1LTS), and the output of bpftool map is:

1: hash  flags 0x0
        key 16B  value 16B  max_entries 16384  memlock 1576960B
7: perf_event_array  name rtt_events  flags 0x0
        key 4B  value 4B  max_entries 8  memlock 4096B

Also tried it on a VM running Ubuntu 20.10 with kernel version 5.8.0-38, with the same result.

I tc version tc-utility, iproute2-ss200127 (and tc-utility, iproute2-ss200602 on the VM) if that matters, as the map is pinned by tc.

@tohojo
Copy link
Member

tohojo commented Jan 21, 2021 via email

@simosund
Copy link
Contributor Author

Ah, 'tc' doesn't have BTF support so it can't read the name of a
BTF-defined map. With the changes you made it shouldn't be necessary to
pin it with TC, though. You can just call bpf_map__pin() and libbpf will
pin it in the same path as it would reuse it from. You can check with
bpf_map__is_pinned() if it's necessary to pin or if it already was.

No, my tc (or iproute2) version doesn't have libbpf support, so therefore the tc-bpf program doesn't use a BTF-defined map but rather struct bpf_elf_map. I guess I could do something similar as in traffic-pacing-edt/edt_pacer_vlan.c to use BTF-defined maps if they're supported.

But you are completely right in that I don't need to pin the map with TC. I tried simply switching order so that the XDP-program which is loaded with libbpf is loaded first (and thus pins the map if it doesn't already exist), and then TC gets to reuse that map instead and the name is preserved.

I'm not sure if I understand why I would need to "manually" pin the map with bpf_map__pin() though. Just leaving libbpf to handle that automatically based on the pinning attribute in the BTF-defined map seems to work fine? As far as I understand it, libbpf handles the automatic pinning similarly to iproute2, in that it will automatically reuse the map if it already exists, or otherwise pin it.

@tohojo
Copy link
Member

tohojo commented Jan 21, 2021 via email

@simosund
Copy link
Contributor Author

Hi again @tohojo. As of commit 5a40220 I think I've addressed all points from your initial high-level comments, with exception of using libxdp (think I'll wait with that until you've made it easy to include from this repository). So if you have the time, it would be great if you could take another look at this and see if there are any other issues. Otherwise I'll probably start working on adding IPv6 support.

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall in pretty good shape; these are mostly smallish nits and style issues :)

@@ -0,0 +1,85 @@
#!/bin/bash
#
# Author: Jesper Dangaaard Brouer <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably add some indication here that you modified the script (which I'm assuming you did?); either adding yourself, or a comment that you modified it, or turning the 'author' into a series of copyright lines. @netoptimizer any opinion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a comment here that I've made extended it by adding a "section" option (and have also changed a default fallback value, although this comment doesn't state that, but could be rectified).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely just a copy of my script.
My plan is to generalize this script and place it one level up.
It is ugly that we have several copies of the same script, but it is fine for this PR... I should be assigned to clean it up (maybe create a github issue to track this?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #9 for this...

pping/functions.sh Show resolved Hide resolved
pping/parameters.sh Show resolved Hide resolved
pping/pping.c Outdated Show resolved Hide resolved
pping/pping.c Outdated Show resolved Hide resolved
pping/pping_helpers.h Show resolved Hide resolved
pping/pping_helpers.h Outdated Show resolved Hide resolved
pping/pping_helpers.h Outdated Show resolved Hide resolved
pping/pping_helpers.h Outdated Show resolved Hide resolved
pping/pping.c Show resolved Hide resolved
pping/pping.c Outdated Show resolved Hide resolved
pping/pping.c Outdated Show resolved Hide resolved
pping/pping.c Outdated Show resolved Hide resolved
Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few runtime issues as well, now that I ran out of code style issues ;)

Also, a couple of points about commit messages:

  • The git formatter expects a completely blank line between the subject (first line) and the rest of the text.
  • Please break lines at 80 characters so it doesn't look weird when looking at 'git log' in a console.
  • Try to keep the language imperative in the changes of the code ("change something" instead of "I changed something" or "this patch changes something")

So instead of:

Minor refactor and fixed weird spacing
Have refactored tc_bpf_load and tc_bpf_clear to use a common run_program function which does the fork+execv.
Have also fixed some weird spacing in pping_helpers.h, and ran clang-format with the kernel source tree .clang-format on the other source files which resultet in some minor changes.
    
Signed-off-by: Simon Sundberg <[email protected]>

this could be something like:

Minor refactor and whitespace fixes
    
Refactor tc_bpf_load and tc_bpf_clear to use a common run_program
function which does the fork+execv.
    
Also fix some weird spacing in pping_helpers.h, and fix some formatting
issues, using clang-format with the kernel source tree .clang-format on the
whole tree.
    
Signed-off-by: Simon Sundberg <[email protected]>

pping/pping.c Show resolved Hide resolved
pping/pping.c Show resolved Hide resolved
@simosund
Copy link
Contributor Author

Found a few runtime issues as well, now that I ran out of code style issues ;)

Also, a couple of points about commit messages:

  • The git formatter expects a completely blank line between the subject (first line) and the rest of the text.
  • Please break lines at 80 characters so it doesn't look weird when looking at 'git log' in a console.
  • Try to keep the language imperative in the changes of the code ("change something" instead of "I changed something" or "this patch changes something")

So, would you like me to rebase in order to rewrite the commit messages in this style, or is messing with the history at this point an even worse sin?

@tohojo
Copy link
Member

tohojo commented Jan 27, 2021

So, would you like me to rebase in order to rewrite the commit messages in this style, or is messing with the history at this point an even worse sin?

Yeah, please go ahead and rewrite history. As long as it is pre-merge this is fine; and in fact often encouraged. The final version of the commits is what will be preserved in the history for posterity, so the important thing is the readability of those.

For kernel submissions this also means folding fixes into the original patches. We're not as strict with that in this repository (as you can probably tell by looking at the history), so you don't have to do that for this PR. But fixing the commit messages is more important as that will go into the history, and right now git log --oneline looks... less than ideal... on this branch :)

If you do want to play with reshuffling commits and folding fixes into other patches (for this or for future series), I can recommend stgit for managing this: https://github.com/stacked-git/stgit

Oh, and if you're not already using magit in Emacs, I'd recommend taking a look at that as well; that also has a plugin for stgit...

@simosund
Copy link
Contributor Author

simosund commented Feb 2, 2021

Hi @tohojo,
Have rewritten all the commit messages now, so they're hopefully ok (or at least better). Didn't think about it until now, but I see that for example the headers of all commit messages to ex. traffic-pacing-edt are prefixed with "traffic-pacing-edt:". Is that something I should have done (but with "pping:" then)?

Have also added the check for if the program is run as root or not (stealing your code snippet) and attempt to create the /sys/fs/bpf/tc folder before the map is pinned.

So feel free to take another look at it and see if there's anything else that needs fixing (or if the fixes themselves need fixing...).

@simosund
Copy link
Contributor Author

simosund commented Feb 2, 2021

Also, thanks for the tip about magit, seems pretty nice.

pping/pping.c Show resolved Hide resolved
@tohojo
Copy link
Member

tohojo commented Feb 3, 2021

Have rewritten all the commit messages now, so they're hopefully ok (or at least better). Didn't think about it until now, but I see that for example the headers of all commit messages to ex. traffic-pacing-edt are prefixed with "traffic-pacing-edt:". Is that something I should have done (but with "pping:" then)?

Looks much better! But yeah, now that you mention it, prefixing the subjects with pping: would be good - makes it obvious which component a commit modifies when browsing the history of the whole repository. We haven't been completely consistent with this, but we really should :)

So feel free to take another look at it and see if there's anything else that needs fixing (or if the fixes themselves need fixing...).

Other than the above and the comment on mkdir() I think we're done nitpicking this for now ;) Good job!

@simosund
Copy link
Contributor Author

simosund commented Feb 3, 2021

Looks much better! But yeah, now that you mention it, prefixing the subjects with pping: would be good - makes it obvious which component a commit modifies when browsing the history of the whole repository. We haven't been completely consistent with this, but we really should :)

Can rewrite the commits prepending pping: to the header. Also learned that commit messages should normally keep lines below 72 characters (instead of 80), so that git can indent it a bit and still keep it below 80, so will probably redo the word-wrapping based on 72 instead of 80 characters.

Add a XDP program to parse TCP timestamps and a simple loader

Signed-off-by: Simon Sundberg <[email protected]>
The XDP program pushes the calculated RTTs to userspace through the
perf-buffer and the userspace program polls it to print them out

Signed-off-by: Simon Sundberg <[email protected]>
Split and rename files so there is one userspace program (pping) and
two kernel-space ones (one for XDP and one for TC-BPF).

Copy the shell script for loading the TC-BPF program from
traffic-pacing-edt folder, but add support for loading a specific
section.

The XDP and TC-BPF programs do not share the ts_start map, so program
does not work.

Signed-off-by: Simon Sundberg <[email protected]>
Make tc pin the ts_start map when loading the TC-BPF program, and
rewrite XDP loader to reuse map pinned by tc.

Also add comment with TODO list in pping.c.

Testing pping by adding a delay through a netem qdisc in the test
environment shows that the reported RTT will approach 100ms for any
delay lower than 100ms, but the correct RTT for any delay over
100ms. Root cause is unknown, but Pollere's original pping
implementation (as well as a bpftrace based pping implementation)
shows the same issue. This issue has not been observed when running on
real interfaces without a netem qdisc.

Signed-off-by: Simon Sundberg <[email protected]>
Format the code using the .clang-format from the kernel source tree,
with a few manual tweaks here and there. Also, remove the TODO list
from comment of pping.c and instead put it in TODO.md.

Signed-off-by: Simon Sundberg <[email protected]>
Format the header files in the Linux kernel style (missed in previous
commit). Also fix a formating error in TODO.md that cause empty
checkboxes to not display correctly.

Signed-off-by: Simon Sundberg <[email protected]>
Make loader use libbpf's existing functionality for reusing pinned
maps. The name for map not kept by tc, so cannot get fd of map by
name. Use fd of first encountered map as temporary workaround.

Signed-off-by: Simon Sundberg <[email protected]>
Switch order so XDP program loads first, so the ts_start map is
automatically pinned by libbpf (solves issue with tc not preserving
the name of the map).

Unload the TCP-BPF program (or rather remove the entire clsact qdisc
it is attached to) using bpg_egress_loader script once program
exits. Also unpin ts_start map on program shutdown.

Signed-off-by: Simon Sundberg <[email protected]>
Copy setup from traffic-pacing-edt to use BTF-defined map if configure
detects that iproute2 has libbpf support, otherwise fall back on
bpf_elf_map. Also fix a minor bug with setting default value for SEC
in bpf_egress_loader.sh.

Signed-off-by: Simon Sundberg <[email protected]>
Split the print statements for RTTs into two parts to avoid inet_ntoa
overwriting one of the IP-addresses (causing both source and
destitionation address to appear the same). Also flip the order of
source and destination to be the same as Pollere's pping.

Signed-off-by: Simon Sundberg <[email protected]>
Perform various fixes and tweaks:
- Rename several defines to make them more informative
- Remove unrolling of loop in BPF programs
- Reuse defines for program sections between userspace and kernel
  space programs
- Perform fork+exec to run bpf_egress_loader script instead of
  system()
- Add comment to copied scripts indicating I've modified them
- Add pping.h and pping_helpers.h as dependencies in Makefile

Also, add a brief description of what PPing is and how it works to
README

Signed-off-by: Simon Sundberg <[email protected]>
Refactor tc_bpf_load and tc_bpf_clear to use a common run_program
function which does the fork+execv.

Enclose compound statement defines in parenthesis.

Removed argument CLOCK_MONOTONIC from callers to parameterless
function get_time_ns().

Also fix some weird spacing in pping_helpers.h, and fix some
formatting issues, using clang-format with the kernel source tree
.clang-format on the whole tree.

Signed-off-by: Simon Sundberg <[email protected]>
Create the /sys/fs/bpf/tc folder if it does not exist. Also check if
pping is run as root, otherwise inform user that it must run as root.

Libbpf will attempt to create the /sys/fs/bpf/tc/globals directory
when pinning the map, however it will not do so recursivly (so will
fail if /sys/fs/bpf/tc does not exist). So as a temporary solution,
attempt to create /sys/fs/bpf/tc (however, if sys/fs/bpf is not
mounted this will still fail).

Signed-off-by: Simon Sundberg <[email protected]>
@simosund
Copy link
Contributor Author

simosund commented Feb 4, 2021

So @tohojo, now I've finally gotten around to refactoring mkdir_if_noexist() (squshed it with the previous commit) and fixing up all the commit messages (prepending pping and wrapping at 72 instead of 80 characters). So hopefully it should be ready to go now, and I can move on to adding some new features instead.

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, looks good - nice work! :)

@tohojo tohojo merged commit ea11864 into xdp-project:master Feb 4, 2021
@simosund simosund deleted the add_pping branch February 8, 2021 10:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants