calico-typha fails to start when encountering malformed network policies #11529

@dzacball

Description

Expected Behavior

Calico Typha should start successfully even when malformed NetworkPolicies exist.

Current Behavior

When a NetworkPolicy contains an invalid port specification such as:

ports:
  - "[443 9000]"

Typha fails to parse the policy, rejects the entire list of policies, and remains stuck in NotReady. This cascades into calico-node also staying NotReady, effectively breaking pod networking.

Example Typha log:

Failed to perform list of current data during resync
ListRoot="/calico/resources/v3/projectcalico.org/networkpolicies"
error=invalid name for named port ([443 9000])
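
To illustrate why this value is rejected, here is a minimal sketch of a strict port parser. It is not Calico's actual code: `parsePort`, the `Port` type, and the name pattern are assumptions made for illustration. Only the error message format is taken from the log above. The idea is that a string that is not a number is treated as a named port, and "[443 9000]" is not a valid name.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
)

// namedPortPattern is an ASSUMED pattern for valid port names;
// the real rule used by Calico may differ.
var namedPortPattern = regexp.MustCompile(`^[a-zA-Z0-9_.-]{1,128}$`)

// Port is a hypothetical parsed port: either a number or a name.
type Port struct {
	Number uint16
	Name   string
}

// parsePort is a hypothetical strict parser. Numeric strings become
// numbered ports; anything else must be a valid port name.
func parsePort(s string) (Port, error) {
	n, err := strconv.ParseUint(s, 10, 16)
	if err == nil {
		return Port{Number: uint16(n)}, nil
	}
	if errors.Is(err, strconv.ErrRange) {
		return Port{}, fmt.Errorf("port number out of range (%s)", s)
	}
	// Not a number, so treat it as a named port and validate the name.
	if !namedPortPattern.MatchString(s) {
		return Port{}, fmt.Errorf("invalid name for named port (%s)", s)
	}
	return Port{Name: s}, nil
}

func main() {
	for _, in := range []string{"443", "http", "[443 9000]"} {
		p, err := parsePort(in)
		fmt.Printf("%q -> %+v, err=%v\n", in, p, err)
	}
}
```

If an error like this is returned while decoding a whole list, a single bad item is enough to fail the list, which matches the behavior described above.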

Possible Solution

  • stop validating the named port name at parse time (and make sure we validate it at the next step instead)
  • add CRD/CEL validation to reject invalid CRs
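
The first option could look roughly like the sketch below: ports are kept as raw strings at parse time, and validation runs afterwards on each policy individually, so a malformed policy is skipped and reported instead of failing the whole list. All names here (`Policy`, `validate`, `filterValid`, `validPort`) are hypothetical, and the port check is a simplified assumption, not Calico's real validation.

```go
package main

import (
	"fmt"
	"regexp"
)

// Policy is a simplified, hypothetical stand-in for a NetworkPolicy.
// Ports are kept as raw strings so that decoding itself never fails.
type Policy struct {
	Name  string
	Ports []string
}

// validPort is an ASSUMED, simplified check covering both numbers
// and names; the real validation rules may differ.
var validPort = regexp.MustCompile(`^[a-zA-Z0-9_.-]{1,128}$`)

// validate is the deferred validation step, applied per policy.
func validate(p Policy) error {
	for _, port := range p.Ports {
		if !validPort.MatchString(port) {
			return fmt.Errorf("policy %q: invalid port %q", p.Name, port)
		}
	}
	return nil
}

// filterValid validates each policy on its own, so one malformed
// policy is skipped and reported rather than rejecting the list.
func filterValid(policies []Policy) (valid []Policy, errs []error) {
	for _, p := range policies {
		if err := validate(p); err != nil {
			errs = append(errs, err)
			continue
		}
		valid = append(valid, p)
	}
	return valid, errs
}

func main() {
	policies := []Policy{
		{Name: "allow-https", Ports: []string{"443"}},
		{Name: "broken", Ports: []string{"[443 9000]"}},
		{Name: "allow-named", Ports: []string{"http"}},
	}
	valid, errs := filterValid(policies)
	fmt.Println("valid policies:", len(valid))
	for _, err := range errs {
		fmt.Println("skipped:", err)
	}
}
```

Whether a skipped policy should be ignored silently, surfaced in status, or treated as deny-all is a separate design question that this sketch does not settle.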

Steps to Reproduce (for bugs)

  1. Edit a Calico NetworkPolicy via the CRD API group (crd.projectcalico.org), bypassing Calico validation.
  2. Insert a malformed port value, e.g. "[443 9000]".
  3. Restart Typha or wait for resync.
  4. Observe that Typha becomes stuck in NotReady and logs the parsing error.

Context

A cluster experienced a full networking outage because Typha and calico-node could not become Ready due to a single malformed policy.

Your Environment

  • Calico version: v3.29.6
  • Calico dataplane: iptables
