Fix service mesh auto config after migration to gRPC#756
Conversation
|
Hey team, this change should be invisible, but it's probably worth mentioning in release notes, where do you recommend I mention this? Do you figure any additional testing is needed? |
|
I think the best place we have for notes about changes specific to deployment types are these old pages: https://sourcegraph.com/docs/admin/updates/kubernetes#notes-2 Pretty good testing imo, although I wouldn't mind if @keegancsmith laid eyes upon this change |
keegancsmith
left a comment
There was a problem hiding this comment.
Please announce it somewhere. It could actually break if someone, but I think in practice that is so tiny that we should rather fix it like you are doing. Thanks!
|
|
> It could actually break if someone, but I think in practice
> that is so tiny that we should rather fix it like you are
> doing.
@keegancsmith I think I'm missing something after "if someone,"
am I?
@marcleblanc2 if someone targetted the names / configured istio
(or something else) based on the names? I think that is possible
but unlikely / easy to fix for the admin.
|
Noted, thank you! Deep Search provided similar feedback, not likely, but some self-hosted customers may require manual intervention as part of their upgrade:
I'll include these in an update to https://sourcegraph.com/docs/admin/updates/kubernetes#notes-2 |
|
Hey @DaedalusG, I've applied the same fixes to the remaining services which the customer found broken, and to match, and have made some minor formatting adjustments for readability. I've re-requested your review in case you'd like to. |
|
PR for doc update here: sourcegraph/docs#1407 |
- Customer was blocked on an issue with our Helm chart + Istio - Istio uses the first part of a port name, and service definitions, to configure its service mesh sidecar proxy containers, based on the expected traffic type; docs: https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/#explicit-protocol-selection - The `http-` prefix in the port name wasn't updated when we switched to gRPC, so Istio was auto-configuring for http, and failing to send the gRPC traffic - There was an old `unused` port in the gitserver service definition, which I couldn't find any history for, it's been there for 7+ years, so I added gRPC port 3178 in its place - Making these updates automagically fixes Istio issues, no longer requiring the Envoy patch in [charts/sourcegraph/examples/envoy](https://github.com/sourcegraph/deploy-sourcegraph-helm/tree/main/charts/sourcegraph/examples/envoy) - Helpful background on Istio here https://istio.io/latest/docs/ops/deployment/architecture/ ### Checklist - [x] Follow the [manual testing process](https://github.com/sourcegraph/deploy-sourcegraph-helm/blob/main/TEST.md) - [x] Update [changelog](https://github.com/sourcegraph/deploy-sourcegraph-helm/blob/main/charts/sourcegraph/CHANGELOG.md) - [x] Update [Kubernetes update doc](https://docs.sourcegraph.com/admin/updates/kubernetes) This change should be invisible, except for customers using the Envoy patch for gitserver, so it's worth mentioning in release notes, where do you recommend I mention this? ### Test plan Developed and tested with a customer, on their self-hosted instance with Istio Tested on a k3s instance: - Deployed the 6.9.0 release - Upgraded the Helm install with this branch [marc-test-fix-gitserver-ports-for-istio](https://github.com/sourcegraph/deploy-sourcegraph-helm/tree/marc-test-fix-gitserver-ports-for-istio), which is the same code changes, but on top of the v6.9.0 release commit instead of main, to test for any issues for existing customers as they upgrade - Configured a code host - Cloned some repos - Ran some searches - Checked pod logs - No errors / issues Tested on an AWS EKS instance <br> Backport 0074a7e from #756 ## Changelog - Update port names to re-enable service mesh auto config, after migration to gRPC - Self-hosted customers using our Helm chart and a Kubernetes service mesh (ex. Istio / Envoy) who had to add an Istio [EnvoyFilter](https://github.com/sourcegraph/deploy-sourcegraph-helm/tree/main/charts/sourcegraph/examples/envoy) to enable trailers on gitserver may need to remove it, if gitserver can no longer communicate with other pods after upgrading Co-authored-by: Marc <7050295+marcleblanc2@users.noreply.github.com>
Customer was blocked on an issue with our Helm chart + Istio
Istio uses the first part of a port name, and service definitions, to configure its service mesh sidecar proxy containers, based on the expected traffic type; docs: https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/#explicit-protocol-selection
The
http-prefix in the port name wasn't updated when we switched to gRPC, so Istio was auto-configuring for http, and failing to send the gRPC trafficThere was an old
unusedport in the gitserver service definition, which I couldn't find any history for, it's been there for 7+ years, so I added gRPC port 3178 in its placeMaking these updates automagically fixes Istio issues, no longer requiring the Envoy patch in charts/sourcegraph/examples/envoy
Helpful background on Istio here https://istio.io/latest/docs/ops/deployment/architecture/
Checklist
This change should be invisible, except for customers using the Envoy patch for gitserver, so it's worth mentioning in release notes, where do you recommend I mention this?
Test plan
Developed and tested with a customer, on their self-hosted instance with Istio
Tested on a k3s instance:
Tested on an AWS EKS instance