Feature request: mapping mode #466

PenelopeFudd · 2023-02-27T14:20:28Z

Hi;

I'm trying to map my network using MTR, and wondered if there was a command line option that would send the minimum number of packets required in order to identify each hop.

For instance, I'm currently using the command line mtr -r -c 5 -j -n -i 0.1 -G 0.1 $host, which sends five requests and exits.

Some hops return ???, which seems to indicate that the router is throttling the number of ICMP responses coming back. One solution is to increase the number of packets sent, but for those hops that sent a response on the first try, the additional packets are a waste. Additionally, some hops won't respond at all, so we should give up after a certain number of tries.

A more advanced mapping algorithm would involve caching, and starting with higher hop numbers first. When you get the ip address of a hop that's in the cache, you could skip the rest of the smaller hops, under the assumption that there is only one route from that hop, and it's already in the cache. For networks that violate that assumption, users wouldn't want to use the cache option.

In my particular case, I'm trying to compare the results produced by using three different VPN servers in different parts of the world, and so I'd use different cache files for each of the maps.

Thanks for considering this, and for producing MTR in general!

The text was updated successfully, but these errors were encountered:

rewolff · 2023-02-27T14:35:32Z

I think that this might be done by writing a new application that uses mtr-packet.

On the other hand "if host-is-known, don't send packet" option would be easy to add to mtr.

My first thought was to preload from the cache and then consider those "known" but then you wouldn't find out when to revert to real probing.

Probing in the reverse direction is complex. If you start probing at 32 hops, then you'll find "reached target in 32, 31, 30, 29 ... hops", wasting way more than when you start at the other side.

I get the hint that you're going to automate this and send a LOT of probes. I'd suggest you pass --min-hops (or whatever it is) to skip probing the first few hops once you learn that you'll get the same responses for those. Aim to find an overlap in the first hop (or two) you probe and schedule for retrying with no first hop if you don't get the first hop you expected.

The -c option already limits the amount of probes. If we simply don't probe once we've found a hit, that'd be quite a lot along the way to what you'd want.

PenelopeFudd · 2023-02-27T15:07:02Z

Yes, exactly; making the program stop probing when it's found a hit is a straightforward optimization; basically leave the -c option as it is, but add another option to enable this.

I think your idea of preloading from a cache would work: the user would either decide to use a different cache file when they know the network changed, throwing away the cache if it's too old, or the cache would perhaps need to be spot-checked on startup to see if it still describes reality. Actually, rechecking the entire cache would be a useful function too, just to see if the network has been altered since the cache was made, although this starts to wander into "network monitoring" territory. :-)

To reduce wasted packets, it might be possible to examine the TTL on the first "reached target" response packet and use a heuristic to decide how far away the host is, or perhaps do a binary search (1,2,4,8,... or 32, 16, 24, 20, 18...).

matt-kimball · 2023-02-27T15:11:41Z

If you'd like to write the logic for this process in Python, you might consider using mtr-packet-python: https://github.com/matt-kimball/mtr-packet-python

rewolff · 2023-02-27T15:54:38Z

the option to start at xxx is called: first-ttl

rewolff · 2023-02-27T16:02:21Z

The "cache" doesn't work.

I know for my connection, the first 7 hops leave me in Amsterdam and from there, things usually start to diverge. But that's likely "5 hops" for you, etc. And tracing to say newyork or LA may end up with 10 hops the same as "last trace" (i.e. new york) but then the next one to france (which you may know by hostname: not obvious it is in france) might start to differ at 7 again.

besides, I might every once in a while trace to a host that is hosted on the network of my provider, i.e. it might branch off earlier than those 7 hops.

So... I don't like mtr "guessing" things. It should just be a tool and when it reports something, that should be exactly what it measured.

PenelopeFudd · 2023-03-09T20:21:35Z

Oh, I agree; when I usually use mtr, I want it to report what it sees right now, not what it remembers (or imagines). That's what diagnostic tools like mtr are designed for: debugging the problems of the moment.

However, when I'm trying to map my network (using mtr to reach every host I've got, and then using Graphviz to draw a tree diagram), my nearby routers throttle the TTL exceeded messages, and mtr starts showing "???" for the first few hops. In addition, those first hops are placing a significant load on my network but returning the same information, which is a waste.

For long traces that go halfway around the world, any hops that are in "the cloud" are of little interest because they won't be part of the map I'm generating. The routes converge once they leave the cloud at the far datacenter (for example), so it's not a problem if the cloud hops aren't 100% correct.

rewolff · 2023-03-10T07:51:01Z

Then you get to indicate to mtr that you don't need probes sent to hops 1-10.

PenelopeFudd · 2023-03-10T20:37:15Z

Here's what I came up with. It uses jq and sed, and outputs lines suitable for inclusion in a Graphviz "dot" file:

#!/bin/bash
CACHE=~/.cache/mtr-cached
DEFAULT_MAX=20

if [[ "$1" == "" ]]; then
  echo "Usage: ${0##*/} host"
  echo "  This does a traceroute combined with a cache."
  echo "  The cache is in $CACHE"
  exit 1
fi

if [[ ! -e "$CACHE" ]]; then
  echo "Creating $CACHE"
  mkdir -p "$CACHE"
fi

function do_mtr() {
  if [[ "$EUID" == 0 ]]; then # go faster:
    mtr -r -c 5 -j -n -i 0.1 -G 0.1 "$@"
  else
    mtr -r -c 5 -j -n -i 1 -G 0.1 "$@"
  fi
}

function get_hops() {
  do_mtr "$@" | jq -r '.report.hubs[].host | sub("^[?][?][?]$";"")'
}

function trace_one() {
  DESTINATION="$1"
  route=()
  NEXTHOP="${DESTINATION}"
  for x in $(seq "$DEFAULT_MAX" -1 2); do
    if [[ "$NEXTHOP" != "" && -e "$CACHE/$NEXTHOP" ]]; then
      THISHOP="${NEXTHOP}"
      NEXTHOP=$(cat "$CACHE/$THISHOP")
    else
      hops=( $(get_hops -f $(( x - 1 )) -m $x "$DESTINATION" ) )

      if [[ "${#hops[@]}" == 2 ]]; then
        NEXTHOP="${hops[0]}"
        THISHOP="${hops[1]}"
        echo "${NEXTHOP}" > "$CACHE/$THISHOP"
      fi
    fi
    route[$x]="$THISHOP"
    route[$(( x - 1 ))]="$NEXTHOP"
  done

  routespace=""
  sep=""
  for x in $(seq 1 "$DEFAULT_MAX"); do
    routespace="${routespace}${sep}${route[$x]}"
    sep=" "
    if [[ "${route[$x]}" == "${DESTINATION}" ]]; then break; fi
  done
  # Append destination to last hop:
  routespace="${routespace}(${DESTINATION})"
  # Delete destination if same as last hop:
  routespace=$(echo "$routespace" | sed 's/ \([0-9.][0-9.]*\)[(]\1[)]/ \1/')

  # Turn list of hops into dot syntax:
  echo "${routespace}" | sed 's/^/"/; s/$/";/; s/  */" -> "/g'
}

for n in "$@"; do
  trace_one "$n"
done

Example output:

$ mtr-cached.sh 8.8.8.8

"209.205.110.1" -> "209.205.90.122" -> "142.250.160.222" -> "142.251.229.137" -> "142.251.48.211" -> "8.8.8.8";

PenelopeFudd · 2023-03-10T20:39:45Z

It's not the greatest, but it conveys the gist of what I'm trying to achieve.

yvs2014 · 2023-03-12T07:01:53Z

@PenelopeFudd > nearby routers throttle the TTL exceeded messages, and mtr starts showing "???"

If that's only about to skip first nearby routers, filter '???' hops out, and leave unique ones,
what about in one long line without caching?

$ mtr -f3 -rnC -c5 yahoo.com | tail +2 | cut -d',' -f6 | grep -v '???' | uniq | sed 's/\(.*\)/"\1"/g' | sed ':a; N; $!ba; s/\n/ -> /g; s/$/;/'
"91.210.251.25" -> "91.210.248.77" -> "216.66.93.221" -> "184.104.192.165" -> "72.52.92.166" -> "184.104.197.54" -> "198.32.118.24" -> "209.191.64.155" -> "74.6.227.61" -> "74.6.122.65" -> "74.6.123.242" -> "74.6.98.138" -> "74.6.143.25";

update: plus AS path

$ mtr -y0 -f3 -rnC -c5 yahoo.com | tail +2 | cut -d',' -f7 | grep -v '???' | uniq | sed 's/\(.*\)/"\1"/g' | sed ':a; N; $!ba; s/\n/ -> /g; s/$/;/'
"AS48438" -> "AS6939" -> "AS10310" -> "AS36646";

PenelopeFudd · 2023-03-12T08:18:08Z

That's 100% efficient (no hops seen twice) for distant (more than 3 hops away) widely-diverging hosts (i.e. after 3 hops they've got no hosts in common).

I'd like to generalize the solution a bit more, making only the assumption that if the current hop has been seen before, the nearer hops will be identical. It won't be 100% efficient, because you won't know if a hop has been seen before until you see it a second time, but I don't see a better way.

As noted by rewolff, that assumption doesn't always hold true in the cloud, which is one reason why this caching shouldn't be turned on by default. Of course, it doesn't matter if the hops in the cloud change, unless we're mapping the cloud.

The other reason is that diagnostic tools should report what the situation is now, not what the cache says it is (or rather, was). Here, we're trying to use a diagnostic tool as a mapping tool, so that's ok.

yvs2014 · 2023-03-12T09:45:49Z

if the current hop has been seen before, the nearer hops will be identical

What does it mean identical in this context?
Possible identity (besides ip-address) could be ASN, route-prefix, org-name, or difference of average-delays with some post-processing.
For example, how a path looks like with prefix/asn/org comparing to addresses and delays

rewolff · 2023-03-12T10:20:04Z

The original assumption in mtr is that you don't know the TTL to the destination, and also want to find out the path a packet will travel. Thus probing from the start of the travels and stopping when you hit the destination is most efficient.

But if you know the TTL to the destination, then you could just as well probe the path in descending order.

So... how about a tool that guesses the TTL to the destination by simply sending it a ping and looking at the TTL of the result? If you guess the starting TTL of the packet, you can calculate the number of hops.

Then we can implement: "If you find a host in the cache, stop probing", and "probe paths starting at the highest TTL".

My guess is that most hosts will nowadays use 64 (IIRC). But we need to check that. So what we'd need is a tool that sends a ping and reports the TTL of the returned packet and then starts mtr with min TTL set to one less than the guessed number and max-TTL set to that number. With the two probes that mtr sends you will often get your answer: Did we guess correctly.

I propose to randomly select an IPV4 address and try this on that host. With all IPV4 addresses having been exhausted, we should have a better than 50% chance of hitting a real host, right? Hmmm.... tested: I pinged 10 random IP addresses and all I got was ONE error. This disputes with 99.9% certainty the hypothesis: the chances are > 50%....
So I don't know how to go about finding "willing" IP4V addresses to participate in our study.

rewolff · 2023-03-12T12:59:22Z

What does it mean identical in this context?

What Charles wants is to make a map of a subset of IPV4 (or -6?) expecting it to be mostly static. What he expects to be static is the info that can be derived from the IP address like the secondary info that MTR can look up for you. He doesn't care about delay times. Those are also assumed to be "mostly static". (e.g. when I trace a server for the domain of a USA college/university I always get first hop .2 ms (local network), then 11 ms to my provider's first host, then a few more ms to amsterdam and possibly 20ms across the ocean. (mit teleported to amsterdam as I was writing this! ... Static my a$$. )

yvs2014 · 2023-03-12T15:22:15Z

expecting it to be mostly static

Just thought that the more general information, the more static it is and less likely to be changed. Like for the path above:
-- about 15 hops in ip-addresses
-- 8 hops counted in route-prefixes
-- 4 hops in ASNs
-- 3 hops in org-names
-- 2 hops in country-codes

but yeah it doesn't reduce the number of sending packets, so maybe it's not very suitable in this case

PenelopeFudd · 2023-03-12T19:00:53Z

Thank you, you've helped me I realize I wasn't thinking about the bigger picture. When I thought of "making a map", I was only considering ipV4 addresses, and ignoring ipV6 addresses, round-trip times, route-prefixes, ASNs, org-names or country-codes. It would be neat to colour the edges or place the nodes based on those values.

Also, when I said "static", I was thinking about how, if you trace the route to the same destination two times, you'd get the same list of hops both times. Another assumption was that, if the routes to two different destinations go through the same hop, then all the nearer hops would be the same as well. These assumptions do not hold true in all cases, but when you're mapping a network that doesn't use multi-path protocols in the parts you're interested in, they work.

There was one more assumption that may or may not make a difference: if you trace a route to a destination, and then trace a route to each of the hops along that route, you'll get the same hops. For example, if the first trace is "A -> B -> C -> D", I'd assume that the route to "C" would be "A -> B -> C", but it may very well be "A -> E -> F -> C". That's why it's probably good to treat destinations and hops differently in the cache.

rewolff · 2023-03-12T19:26:36Z

Why do you think that "destination" vs "hop" would make a difference, but you are assuming dest-A vs dest-B would lead to the same intermediate hops???

You're making (big) assumptions: that the network doesn't change much. This means that if the map is

A-B-C-D-E
.......C-F-G

you're assuming that A->G also goes through B as soon as you see "C" show up.

Then you might as well assume that if you trace to C, it also goes through B.

PenelopeFudd · 2023-03-12T19:48:12Z

I'm saying "destination" and "hop" might make a difference because I'm not sure which interface routers might report in their ICMP TTL exceeded message (admin interface?), and because of programs like fakeroute that can twist the answer to be anything. It's an assumption that's probably not important, but I thought I'd throw it out there for completeness.

And yes, I'm assuming the parts of the network I'm interested in don't change at all between when I start using the cache and when I finish, which I'm hoping will be about 5 minutes or less.

The alternative to using a cache is just to retry hops that don't respond a few more times.

yvs2014 · 2023-03-13T01:03:07Z

@PenelopeFudd

when you're mapping a network that that doesn't use multi-path protocols
not sure which interface routers might report in

traceroute (or mtr) shows ip-addresses of routers from sender's side. It's usually that ip-address of network interface through which traffic is routed to a sender back. So, changes in backward direction can impact to.

The idea to dynamically reduce ttl set in probes to destination during some short period of time (i.e. probing only not responded hops at this time) looks quite reasonable at least in some cases, sorry I haven't got this idea before, thought it's about tracing transit hops directly.

yvs2014 · 2023-03-14T17:29:56Z

I've tried to add some cache mode inside mtr (to older version so far) with the next algorithm:
if n-th hop is 'up' and seen less than 'cache-timeout' period, then don't ping that hop. Then after 'cache-timeout', probe it again, and so on.
In few tests, looks like it works in this "no ping to known hops" mode. On the picture it is seen in "Snt" column

yvs2014 · 2023-03-15T05:37:32Z

adaptation of this cache-mode mentioned in previous message to current mtr
https://github.com/yvs2014/mtr085/tree/mtr20230314-with-cachemode

PenelopeFudd · 2023-03-15T06:10:06Z

Looks good, I'll have to give it a try! :-)

PenelopeFudd · 2023-03-15T06:45:08Z

Yep, it does what it needs to do!

It tries to ping each hop until it gets an answer or runs out of (-c) cycles.
When the cache expires for a given hop, it pings again.

The next level of optimization would be caching the same data so that multiple copies of mtr could use it in parallel, or have mtr accept a list of hosts to trace in parallel. We'd still have to come up with a way to decide if any of the cached data contains the answer that you'd get if you pinged a given hop. Not easy. :-/

yvs2014 · 2023-03-15T07:21:24Z

caching the same data so that multiple copies of mtr could use it in parallel

it's far enough out of scope mtr itself, and a bit complicated because of different context of those copies and their data. I.e. no idea about this kind of synchronization. Maybe someone else will find it out.

PenelopeFudd · 2023-03-15T20:45:33Z

I just want to do parallel mtr -j {} :::: inputlist.txt > output.json and not flood our poor router if I can help it. :-)
(BTW, Gnu Parallel rocks!)

yvs2014 · 2023-03-16T05:38:26Z

not flood our poor router

a bit calculations with the longest imaginable path (30 hops):
-- without cache it's 30 packets per second (30 pps) on permanent basis
-- with cache when 1/6 of hops are silent - 5 pps
both these numbers are not big even for the poorest router in the world, so it's all about how many entries in that inputlist.txt, or something else

On the aim to reduce generated traffic with already available options: with default 60sec "cache" - it's 1sec for unresponded packets and 60sec for responded ones. If you set '-i 10 --use-cache --cache-timeout 300', it will be 50 times less (0.1pps per target for the above example). And if absolute majority of unresponded packets are designed to nearby routers, then cutting them off from a path can drastically decrease the number of pings (almost to negligible comparing to prenumbers). I.e. something like '-f3 -i10 --use-cache' would be not too bad option (and a poor router on the hop#1 will be a little less poor, not?)

PenelopeFudd · 2023-03-17T01:38:39Z

Oh, I meant that the input list has a couple hundred hosts in it, and starting up that many copies of mtr in parallel would end up getting no response from the first hop for a majority of them. So the -f 3 option would be of great benefit, and I guess I can fix the ??? entries in post-production. :-)

Thanks for the caching changes!

yvs2014 · 2023-03-17T07:37:26Z

@PenelopeFudd You're welcome
in addition I've combined two cache options in one (-X/--cache SECONDS), maybe it's a bit more common with a short option

@rewolff Could this cache mode be interesting to other people?
In the branch master...yvs2014:mtr085:mtr20230314-with-cachemode
first commit to fix wait-time calculation according first-ttl, other ones are cachemode related

yvs2014 mentioned this issue Jun 23, 2023

Using the -f option changes the packet rate. #479

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: mapping mode #466

Feature request: mapping mode #466

PenelopeFudd commented Feb 27, 2023

rewolff commented Feb 27, 2023

PenelopeFudd commented Feb 27, 2023

matt-kimball commented Feb 27, 2023

rewolff commented Feb 27, 2023

rewolff commented Feb 27, 2023

PenelopeFudd commented Mar 9, 2023

rewolff commented Mar 10, 2023

PenelopeFudd commented Mar 10, 2023 •

edited

Loading

PenelopeFudd commented Mar 10, 2023

yvs2014 commented Mar 12, 2023 •

edited

Loading

PenelopeFudd commented Mar 12, 2023

yvs2014 commented Mar 12, 2023

rewolff commented Mar 12, 2023

rewolff commented Mar 12, 2023

yvs2014 commented Mar 12, 2023

PenelopeFudd commented Mar 12, 2023

rewolff commented Mar 12, 2023 •

edited

Loading

PenelopeFudd commented Mar 12, 2023

yvs2014 commented Mar 13, 2023

yvs2014 commented Mar 14, 2023

yvs2014 commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

yvs2014 commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

yvs2014 commented Mar 16, 2023

PenelopeFudd commented Mar 17, 2023

yvs2014 commented Mar 17, 2023

Feature request: mapping mode #466

Feature request: mapping mode #466

Comments

PenelopeFudd commented Feb 27, 2023

rewolff commented Feb 27, 2023

PenelopeFudd commented Feb 27, 2023

matt-kimball commented Feb 27, 2023

rewolff commented Feb 27, 2023

rewolff commented Feb 27, 2023

PenelopeFudd commented Mar 9, 2023

rewolff commented Mar 10, 2023

PenelopeFudd commented Mar 10, 2023 • edited Loading

PenelopeFudd commented Mar 10, 2023

yvs2014 commented Mar 12, 2023 • edited Loading

PenelopeFudd commented Mar 12, 2023

yvs2014 commented Mar 12, 2023

rewolff commented Mar 12, 2023

rewolff commented Mar 12, 2023

yvs2014 commented Mar 12, 2023

PenelopeFudd commented Mar 12, 2023

rewolff commented Mar 12, 2023 • edited Loading

PenelopeFudd commented Mar 12, 2023

yvs2014 commented Mar 13, 2023

yvs2014 commented Mar 14, 2023

yvs2014 commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

yvs2014 commented Mar 15, 2023

PenelopeFudd commented Mar 15, 2023

yvs2014 commented Mar 16, 2023

PenelopeFudd commented Mar 17, 2023

yvs2014 commented Mar 17, 2023

PenelopeFudd commented Mar 10, 2023 •

edited

Loading

yvs2014 commented Mar 12, 2023 •

edited

Loading

rewolff commented Mar 12, 2023 •

edited

Loading