
Conversation

@rcgoodfellow (Contributor) commented Jan 8, 2026:

This PR adds support for IPv4 routes over IPv6 nexthops. This is needed for BGP unnumbered, where nexthops are always IPv6 link-local addresses.

No new tables are added in this PR. Rather, IPv6 variants of the existing IPv4 actions on the IPv4 router index table are added.

The API has a version bump to accommodate creating IPv4 routes over IPv6 nexthops, as well as listing and fetching IPv4 routes.
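To illustrate the shape of such a route, here is a minimal sketch; the type and field names below are hypothetical stand-ins, not dendrite's actual API:

use std::net::{Ipv4Addr, Ipv6Addr};

/// Hypothetical request shape for an IPv4 route over an IPv6 nexthop.
pub struct Ipv4RouteOverIpv6 {
    /// IPv4 destination prefix address, e.g. 192.168.0.0.
    pub dest: Ipv4Addr,
    /// Prefix length, e.g. 24 for a /24.
    pub prefix_len: u8,
    /// For BGP unnumbered the nexthop is an IPv6 link-local
    /// address (fe80::/10) learned from the peer.
    pub nexthop: Ipv6Addr,
    /// Link-local addresses are only meaningful per interface,
    /// so the egress port is carried alongside the nexthop.
    pub port_id: String,
}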

This code has been tested in Maghemite CI, where we have an mgd daemon feeding unnumbered routes to dendrite, and they are installed successfully.

@rcgoodfellow rcgoodfellow added this to the 18 milestone Jan 8, 2026
@rcgoodfellow rcgoodfellow marked this pull request as ready for review January 9, 2026 23:02
@Nieuwejaar (Collaborator) left a comment:

Could you add some tests for this? It's not something that's going to get exercised in a normal racklette or a4x2 run.

  wd=`pwd`
  export WS=$wd
- STARTUP_TIMEOUT=${STARTUP_TIMEOUT:=15}
+ STARTUP_TIMEOUT=${STARTUP_TIMEOUT:=45}
@Nieuwejaar (Collaborator):

I've never seen this be an issue before. Are you seeing this in CI or somewhere else?

@rcgoodfellow (Contributor, Author):

I see it regularly when running the tests locally on a development VM.

path = "/route/ipv4",
versions = ..VERSION_V4_OVER_V6_ROUTES
}]
async fn route_ipv4_list_v2(
@Nieuwejaar (Collaborator):

Why is this v2? What was v1?

I've taken a very different approach to route versioning in the PR I sent you, which I think is cleaner, especially with a larger-scale change like I had to make. Basically, all the v1 compatibility stuff gets put in a v1 module. I think this will also make it easier to pull things out when we decide to drop support for an older version.

I'm not necessarily wedded to that implementation style, but we do need to get our version scheme straight. Right now both PRs represent the first stab at backwards compatibility, and we chose different version numbers.
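For context, the layout being described is roughly the following. This is a minimal sketch with illustrative names, assuming the old IPv4 route type only carried IPv4 nexthops; it is not the actual dendrite code:

use std::net::IpAddr;

/// Pre-bump compatibility types live together in one module, so
/// dropping support for the old version later is a single deletion.
mod v1 {
    use std::net::Ipv4Addr;

    /// Old wire format: IPv4 routes could only have IPv4 nexthops.
    pub struct Ipv4Route {
        pub nexthop: Ipv4Addr,
    }
}

/// Current wire format: the nexthop may be IPv4 or IPv6.
pub struct Ipv4Route {
    pub nexthop: IpAddr,
}

/// v1 handlers convert at the boundary and call the current
/// implementation, keeping the core logic version-agnostic.
impl From<v1::Ipv4Route> for Ipv4Route {
    fn from(r: v1::Ipv4Route) -> Self {
        Ipv4Route { nexthop: IpAddr::V4(r.nexthop) }
    }
}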

@Nieuwejaar (Collaborator):

Sorry - I have too many PRs in my head right now. I meant to point you at the underlay/isolation branch: https://github.com/oxidecomputer/dendrite/blob/isolation/dpd-api/src/lib.rs#L69

@rcgoodfellow (Contributor, Author):

I believe I've oriented things in this direction now.

  display the hex of each packet body as well.
  - `DENDRITE_TEST_TIMEOUT`: The amount of time to wait for any single test's
-   network traffic to complete, specified in milliseconds. The default is 500,
+   network traffic to complete, specified in milliseconds. The default is 1500,
@Nieuwejaar (Collaborator):

I never (almost never?) hit this in CI. When running on the colo, 500ms works for everything except a single multicast test. I'd rather not change the default if it's just failing on a single machine somewhere, since it seems like it could triple the time it takes to run the tests.

@rcgoodfellow (Contributor, Author):

I've hit what I believe is this several times in CI recently; I would even characterize it as happening often. It also happens for me regularly on a development VM.

@Nieuwejaar (Collaborator):

I suspect any timing issues you hit in CI were #172.

@rcgoodfellow (Contributor, Author) commented:

Thanks @Nieuwejaar. I've added a packet test. That packet test surfaced some issues that I fixed.

This code will likely be on the default path for a4x2, as the datacenter topology will probably move to BGP unnumbered.

Commit message:

I've been noticing random packet test failures in CI, so I started to investigate locally. What I've found is that some tests sometimes take longer than 500ms, over 1000ms in some cases, to get through the simulator, so we are not waiting long enough to see the expected packets in tests. This commit bumps the packet test wait time from 500ms to 1500ms.