fix: tls+plaintext ports for zookeeper -> only tls (2281) #204
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough
Replaces the plaintext ZooKeeper client port with a TLS-only port and adds a CA-backed certificate generation flow producing keystores/truststores for ZooKeeper and a Kafka client. docker-compose now requires and mounts the generated certs and uses TLS ports; the docs healthcheck is updated to use the TLS port; tests are updated to get saga_id from the SAGA_STARTED event; SSE event construction is delegated to adapter-based validation.
Sequence Diagram
sequenceDiagram
actor Operator
participant CertGen as Certificate Generator
participant CA as Certificate Authority
participant ZK as ZooKeeper
participant Kafka as Kafka
participant Store as Keystore/Truststore
Operator->>CertGen: trigger cert generation
CertGen->>CA: create CA key & self-signed cert
CA-->>CertGen: CA cert
CertGen->>CA: submit ZooKeeper CSR
CA-->>CertGen: ZooKeeper cert (CA-signed)
CertGen->>CA: submit Kafka client CSR
CA-->>CertGen: Kafka client cert (CA-signed)
CertGen->>Store: build PKCS#12 -> JKS keystore & truststore for ZooKeeper
CertGen->>Store: build PKCS#12 -> JKS keystore & truststore for Kafka client
Operator->>ZK: start ZooKeeper with TLS config (port 2281) and mounted keystores
ZK->>Store: load keystore & truststore
Operator->>Kafka: start Kafka with TLS config and client keystore
Kafka->>Store: load client keystore & truststore
Kafka->>ZK: connect via TLS/mTLS (2281)
ZK-->>Kafka: TLS handshake & client verification
Operator->>ZK: healthcheck (TLS/metrics port)
ZK-->>Operator: metrics/health response
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
2 issues found across 4 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/zookeeper/generate-certs.sh">
<violation number="1" location="backend/zookeeper/generate-certs.sh:5">
P2: Incomplete existence check: truststores are not verified. If the script is interrupted between keystore and truststore creation, a re-run will skip generation, leaving the system without truststores and causing TLS connection failures.</violation>
<violation number="2" location="backend/zookeeper/generate-certs.sh:11">
P1: The new CA private key (`ca.key`) and `kafka-client.key` will be made world-readable by the existing `chmod 644 /certs/*`. A leaked CA key allows signing arbitrary certificates trusted by both Zookeeper and Kafka. Consider restricting private key permissions to 600 and only setting 644 on certificates and keystores.</violation>
</file>
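The selective-permission split described in violation 2 could be sketched as follows. This is a hypothetical helper, not the project's actual script; file names are taken from the review comment and the function name is invented for illustration:

```shell
#!/bin/sh
# Hypothetical sketch of the suggested fix: private keys become
# owner-only (600), everything else in the cert directory gets 644.
fix_cert_perms() {
  dir="$1"
  for f in "$dir"/*; do
    case "$f" in
      *.key) chmod 600 "$f" ;;  # e.g. ca.key, zookeeper.key, kafka-client.key
      *)     chmod 644 "$f" ;;  # certificates, keystores, truststores
    esac
  done
}

# In the real script this would replace the single `chmod 644 /certs/*`:
#   fix_cert_perms /certs
```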
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/zookeeper/generate-certs.sh (1)
33-44: ⚠️ Potential issue | 🟡 Minor
Add explicit `-deststoretype JKS` to ensure the keystore format matches the filename and intended format.
The `keytool -importkeystore` command without `-deststoretype` uses the JDK's default keystore type, which is PKCS12 on Java 9+ (JEP 229). Confluent Platform images use Java 17 or 21 (both defaulting to PKCS12), so without the explicit flag, the resulting keystore will be PKCS12 despite the `.jks` filename. Add `-deststoretype JKS` to ensure predictable behavior and match the filename's format intent.
Suggested fix:
```diff
 keytool -importkeystore \
   -deststorepass zookeeper_keystore_password \
   -destkeypass zookeeper_keystore_password \
   -destkeystore /certs/zookeeper.keystore.jks \
+  -deststoretype JKS \
   -srckeystore /certs/zookeeper.p12 \
   -srcstoretype PKCS12 \
   -srcstorepass zookeeper_keystore_password \
   -alias zookeeper \
   -noprompt
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `backend/zookeeper/generate-certs.sh` around lines 33-44: the keytool import step currently relies on the JDK default keystore type and may produce a PKCS12 keystore despite the .jks filename; update the `keytool -importkeystore` invocation (the command that references `-destkeystore /certs/zookeeper.keystore.jks` and `-srckeystore /certs/zookeeper.p12`) to include the explicit flag `-deststoretype JKS` so the destination keystore format matches the .jks intent.
docker-compose.yaml (1)
256-259: ⚠️ Potential issue | 🟡 Minor
Remove the unused `zoo.cfg` from the mounted `./backend/zookeeper/conf` directory.
The Confluent `cp-zookeeper` image generates its ZooKeeper configuration dynamically from `ZOOKEEPER_*` environment variables into `/etc/kafka/zookeeper.properties` at startup, not from a `zoo.cfg` file. The mounted `zoo.cfg` is ignored and creates a maintenance trap where changes to it will have no effect. However, the directory mount is still needed for `log4j.properties`, which configures ZooKeeper's logging.
Recommendation: Either remove `zoo.cfg` from the directory and document that only `log4j.properties` is mounted, or move `zoo.cfg` elsewhere to clarify that it's not actively used.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 256 - 259, The mounted ./backend/zookeeper/conf directory currently contains an unused zoo.cfg which is ignored by the cp-zookeeper image (it generates /etc/kafka/zookeeper.properties from ZOOKEEPER_* env vars); remove zoo.cfg from ./backend/zookeeper/conf (or move it to a separate docs/ or backups/ directory) and update any documentation to state that only log4j.properties in that mount is required for ZooKeeper logging, keeping the volumes entry that mounts ./backend/zookeeper/conf:/etc/kafka/conf:ro solely for log4j.properties.
🧹 Nitpick comments (5)
backend/zookeeper/conf/zoo.cfg (1)
8-11: Hardcoded keystore/truststore passwords in config file.
Passwords like `zookeeper_keystore_password` and `zookeeper_truststore_password` are committed in plaintext. For a local dev stack this is pragmatic, but if this config is ever used in a non-dev environment, these should be injected via environment variables or secrets management. Consider adding a comment noting these are dev-only values.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `backend/zookeeper/conf/zoo.cfg` around lines 8-11: the config currently contains hardcoded secrets for ssl.keyStore.password and ssl.trustStore.password; change these to read from environment variables or a secrets manager (e.g., replace the literal values with references like an env variable) so passwords are injected at runtime, and add a clear comment next to ssl.keyStore.location / ssl.trustStore.location indicating the current values are dev-only placeholders and must not be used in non-dev environments.
backend/zookeeper/generate-certs.sh (2)
5-8: Skip-check doesn't cover truststores.
If a previous run partially failed after creating keystores but before creating truststores, re-running the script would skip generation entirely, leaving a broken PKI state. Consider checking all four output files, or alternatively, check only the CA cert (the first artifact produced).
💡 Suggested improvement
```diff
-if [ -f /certs/zookeeper.keystore.jks ] && [ -f /certs/kafka-client.keystore.jks ]; then
+if [ -f /certs/zookeeper.keystore.jks ] && [ -f /certs/zookeeper.truststore.jks ] && \
+   [ -f /certs/kafka-client.keystore.jks ] && [ -f /certs/kafka-client.truststore.jks ]; then
```
Verify each finding against the current code and only fix it if needed. In `@backend/zookeeper/generate-certs.sh` around lines 5 - 8, The skip-check currently only verifies /certs/zookeeper.keystore.jks and /certs/kafka-client.keystore.jks, which can falsely skip when truststores are missing; update the generation guard in generate-certs.sh to either verify all four expected output files (e.g., /certs/zookeeper.keystore.jks, /certs/kafka-client.keystore.jks, /certs/zookeeper.truststore.jks, /certs/kafka-client.truststore.jks) or, more simply and robustly, check for the CA certificate file produced first (the CA cert path used earlier in the script) and only skip when that CA cert exists, so partial runs don't leave PKI in a broken state.
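The all-four-artifacts guard suggested above can also be written as a small helper so the existence check stays in one place. A sketch (the function name is hypothetical; file names come from the review comment):

```shell
#!/bin/sh
# Hypothetical sketch: treat the four expected outputs as one unit and
# skip generation only when every one of them already exists.
all_tls_artifacts_exist() {
  dir="$1"
  for f in zookeeper.keystore.jks zookeeper.truststore.jks \
           kafka-client.keystore.jks kafka-client.truststore.jks; do
    [ -f "$dir/$f" ] || return 1
  done
  return 0
}

# Usage in the generation script might look like:
#   if all_tls_artifacts_exist /certs; then echo "certs exist, skipping"; exit 0; fi
```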
10-21: Add Subject Alternative Name (SAN) extensions to certificates for best practice.
The generated certificates use CN (e.g., `CN=zookeeper`) but lack SAN entries. While Java 17 will fall back to CN for DNS hostname matching, SAN is the modern standard per RFC 2818 and future-proofs the certificates against stricter hostname verification policies. Adding SAN is especially important if the environment later enforces stricter verification or uses IP-based connections (Java requires SAN for IP matching and does not fall back to CN).
Example fix for ZooKeeper cert (apply similarly for Kafka client):
```diff
 openssl x509 -req -days 365 -in /certs/zookeeper.csr \
   -CA /certs/ca.crt -CAkey /certs/ca.key -CAcreateserial \
-  -out /certs/zookeeper.crt
+  -out /certs/zookeeper.crt \
+  -extfile <(printf "subjectAltName=DNS:zookeeper,DNS:localhost")
```
Verify each finding against the current code and only fix it if needed. In `@backend/zookeeper/generate-certs.sh` around lines 10 - 21, The certificates lack Subject Alternative Names (SANs); update generate-certs.sh so the zookeeper CSR and final certificate include SAN entries (e.g., DNS:zookeeper and any relevant IPs). Concretely, modify the openssl req invocation that creates /certs/zookeeper.csr to add SANs (use -addext "subjectAltName=DNS:zookeeper,IP:127.0.0.1" or an -config/-req_ext section), and ensure the openssl x509 signing step that writes /certs/zookeeper.crt includes the same SANs via -extfile/-extensions (or -addext if available) so the issued /certs/zookeeper.crt carries the subjectAltName extension. Ensure the CA cert generation remains unchanged.docker-compose.yaml (2)
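As a standalone illustration of the SAN flow this comment describes (dev CA plus a SAN-bearing leaf cert), the following sketch uses assumed names and relative paths, and a file-based `-extfile` rather than bash process substitution so it stays POSIX-sh compatible:

```shell
#!/bin/sh
set -e
# Hypothetical end-to-end sketch: a throwaway CA, then a "zookeeper"
# cert that carries subjectAltName in both the CSR and the signed cert.
SAN="subjectAltName=DNS:zookeeper,DNS:localhost,IP:127.0.0.1"

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout ca.key -out ca.crt -subj "/CN=dev-ca"

openssl req -new -newkey rsa:2048 -nodes \
  -keyout zookeeper.key -out zookeeper.csr \
  -subj "/CN=zookeeper" -addext "$SAN"

# `openssl x509 -req` does not copy CSR extensions by default,
# so the SAN must be passed again when signing.
printf '%s\n' "$SAN" > san.ext
openssl x509 -req -days 365 -in zookeeper.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out zookeeper.crt -extfile san.ext
```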
285-286: Kafka mounts the entire `zookeeper_certs` volume, including ZooKeeper's private key.
Both `zookeeper` (line 259) and `kafka` (line 335) mount the same `zookeeper_certs` volume. This gives Kafka read access to `zookeeper.key` and `ca.key`, which it doesn't need; Kafka only requires `kafka-client.keystore.jks` and `kafka-client.truststore.jks`.
For a dev setup this is acceptable, but in production you'd want separate volumes or a sidecar that copies only the needed files. Worth a TODO comment at minimum.
Also applies to: 335-335
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 285 - 286, zookeeper currently exposes the entire zookeeper_certs volume (including zookeeper.key and ca.key) to kafka; restrict kafka to only the certs it needs by creating a separate volume (e.g., zookeeper_public_certs) or a sidecar that copies only kafka-client.keystore.jks and kafka-client.truststore.jks into a kafka-only volume, update the kafka service to mount that new volume instead of zookeeper_certs, and update the zookeeper service or its init job (zookeeper-certgen) to populate both volumes (private keys stay in zookeeper_certs, public keystores go to zookeeper_public_certs); add a TODO comment near the services noting this separation for production.
227-231: Hardcoded TLS passwords in docker-compose environment.
Keystore/truststore passwords (`zookeeper_keystore_password`, `kafka_keystore_password`, etc.) are hardcoded across the compose file, `zoo.cfg`, and `generate-certs.sh`. For a local dev stack this is pragmatic; for any non-dev use, these should be externalized (e.g., Docker secrets, `.env` file, or `secrets:` top-level element).
Also applies to: 297-300
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 227 - 231, Replace hardcoded keystore/truststore passwords by externalized secrets or environment variables: change the environment keys ZOOKEEPER_SSL_KEYSTORE_PASSWORD and ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD (and the analogous kafka_* variables referenced elsewhere) in the compose service definitions to reference Docker secrets or an env var (e.g., use secrets: and file-based secret mounts or ${ZOOKEEPER_SSL_KEYSTORE_PASSWORD} from an env_file), update the secret definitions at top-level and ensure secrets are created at deploy time; then update any places in zoo.cfg and generate-certs.sh that currently embed the literal passwords to read them from the corresponding secret file or environment variable (e.g., read from /run/secrets/<name> or $ENV_VAR), and ensure .env with real passwords is excluded from VCS and documented for local dev.
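The env-substitution pattern this prompt describes might look like the following compose fragment. The variable names mirror the existing env keys; the dev-only default values are illustrative, not the project's actual configuration:

```yaml
# Hypothetical compose fragment: passwords come from the host environment
# (or an .env file excluded from VCS), with dev-only fallback defaults.
environment:
  ZOOKEEPER_SSL_KEYSTORE_PASSWORD: ${ZOOKEEPER_SSL_KEYSTORE_PASSWORD:-zookeeper_keystore_password}
  ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD: ${ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD:-zookeeper_truststore_password}
```

The cert-generation script would then read the same variables so the generated keystores stay in sync with the compose configuration.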
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/zookeeper/generate-certs.sh`:
- Line 74: The script currently sets permissions for all files in /certs to 644
which makes private keys world-readable; update the permission logic in
generate-certs.sh so that private key files (ca.key, zookeeper.key,
kafka-client.key) are set to 600 (owner read/write only) while leaving
certificate/public files (e.g., *.crt, *.pem) at 644; locate the chmod
invocation (the line containing "chmod 644 /certs/*") and replace it with
targeted chmod calls or a selective loop that applies chmod 600 to those key
filenames and chmod 644 to the rest.
---
Outside diff comments:
In `@backend/zookeeper/generate-certs.sh`:
- Around line 33-44: The keytool import step currently relies on the JDK default
keystore type and may produce a PKCS12 keystore despite the .jks filename;
update the keytool -importkeystore invocation (the command that references
-destkeystore /certs/zookeeper.keystore.jks and -srckeystore
/certs/zookeeper.p12) to include the explicit flag -deststoretype JKS so the
destination keystore format matches the .jks intent.
In `@docker-compose.yaml`:
- Around line 256-259: The mounted ./backend/zookeeper/conf directory currently
contains an unused zoo.cfg which is ignored by the cp-zookeeper image (it
generates /etc/kafka/zookeeper.properties from ZOOKEEPER_* env vars); remove
zoo.cfg from ./backend/zookeeper/conf (or move it to a separate docs/ or
backups/ directory) and update any documentation to state that only
log4j.properties in that mount is required for ZooKeeper logging, keeping the
volumes entry that mounts ./backend/zookeeper/conf:/etc/kafka/conf:ro solely for
log4j.properties.
---
Nitpick comments:
In `@backend/zookeeper/conf/zoo.cfg`:
- Around line 8-11: The config currently contains hardcoded secrets for
ssl.keyStore.password and ssl.trustStore.password; change these to read from
environment variables or a secrets manager (e.g., replace the literal values
with references like an env variable) so passwords are injected at runtime, and
add a clear comment next to ssl.keyStore.location / ssl.trustStore.location
indicating the current values are dev-only placeholders and must not be used in
non-dev environments.
In `@backend/zookeeper/generate-certs.sh`:
- Around line 5-8: The skip-check currently only verifies
/certs/zookeeper.keystore.jks and /certs/kafka-client.keystore.jks, which can
falsely skip when truststores are missing; update the generation guard in
generate-certs.sh to either verify all four expected output files (e.g.,
/certs/zookeeper.keystore.jks, /certs/kafka-client.keystore.jks,
/certs/zookeeper.truststore.jks, /certs/kafka-client.truststore.jks) or, more
simply and robustly, check for the CA certificate file produced first (the CA
cert path used earlier in the script) and only skip when that CA cert exists, so
partial runs don't leave PKI in a broken state.
- Around line 10-21: The certificates lack Subject Alternative Names (SANs);
update generate-certs.sh so the zookeeper CSR and final certificate include SAN
entries (e.g., DNS:zookeeper and any relevant IPs). Concretely, modify the
openssl req invocation that creates /certs/zookeeper.csr to add SANs (use
-addext "subjectAltName=DNS:zookeeper,IP:127.0.0.1" or an -config/-req_ext
section), and ensure the openssl x509 signing step that writes
/certs/zookeeper.crt includes the same SANs via -extfile/-extensions (or -addext
if available) so the issued /certs/zookeeper.crt carries the subjectAltName
extension. Ensure the CA cert generation remains unchanged.
In `@docker-compose.yaml`:
- Around line 285-286: zookeeper currently exposes the entire zookeeper_certs
volume (including zookeeper.key and ca.key) to kafka; restrict kafka to only the
certs it needs by creating a separate volume (e.g., zookeeper_public_certs) or a
sidecar that copies only kafka-client.keystore.jks and
kafka-client.truststore.jks into a kafka-only volume, update the kafka service
to mount that new volume instead of zookeeper_certs, and update the zookeeper
service or its init job (zookeeper-certgen) to populate both volumes (private
keys stay in zookeeper_certs, public keystores go to zookeeper_public_certs);
add a TODO comment near the services noting this separation for production.
- Around line 227-231: Replace hardcoded keystore/truststore passwords by
externalized secrets or environment variables: change the environment keys
ZOOKEEPER_SSL_KEYSTORE_PASSWORD and ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD (and the
analogous kafka_* variables referenced elsewhere) in the compose service
definitions to reference Docker secrets or an env var (e.g., use secrets: and
file-based secret mounts or ${ZOOKEEPER_SSL_KEYSTORE_PASSWORD} from an
env_file), update the secret definitions at top-level and ensure secrets are
created at deploy time; then update any places in zoo.cfg and generate-certs.sh
that currently embed the literal passwords to read them from the corresponding
secret file or environment variable (e.g., read from /run/secrets/<name> or
$ENV_VAR), and ensure .env with real passwords is excluded from VCS and
documented for local dev.
Actionable comments posted: 1
🧹 Nitpick comments (1)
docker-compose.yaml (1)
221-234: Duplicate 4lw whitelist configuration.
`-Dzookeeper.4lw.commands.whitelist=*` is set in `KAFKA_OPTS` (line 224) and again as `ZOOKEEPER_4LW_COMMANDS_WHITELIST: '*'` (line 234). The env-var form is the canonical way for the Confluent image; the JVM property in `KAFKA_OPTS` is redundant. Also, whitelisting all commands (`*`) is more permissive than needed; in production you'd typically restrict to `mntr, ruok, srvr, stat` or the subset you actually use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 221 - 234, Remove the redundant 4lw whitelist JVM property from KAFKA_OPTS (remove "-Dzookeeper.4lw.commands.whitelist=*") and rely on the canonical env var ZOOKEEPER_4LW_COMMANDS_WHITELIST for the Confluent image; also tighten ZOOKEEPER_4LW_COMMANDS_WHITELIST from '*' to a restricted comma-separated list such as "mntr,ruok,srvr,stat" (or the minimal subset your deployment requires) to avoid permitting all 4lw commands.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker-compose.yaml`:
- Around line 226-231: Replace hardcoded JKS passwords by parameterizing the
compose file to use environment substitution with defaults (like how
MONGO_ROOT_PASSWORD is handled): change ZOOKEEPER_SSL_KEYSTORE_PASSWORD and
ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD to reference env vars (e.g.,
${ZOOKEEPER_SSL_KEYSTORE_PASSWORD:-default}) and do the same for the
kafka_keystore_password/kafka_truststore_password variables; also update the
cert generation script to read those same environment variables so generated
keystore/truststore passwords stay in sync with the compose configuration.
---
Nitpick comments:
In `@docker-compose.yaml`:
- Around line 221-234: Remove the redundant 4lw whitelist JVM property from
KAFKA_OPTS (remove "-Dzookeeper.4lw.commands.whitelist=*") and rely on the
canonical env var ZOOKEEPER_4LW_COMMANDS_WHITELIST for the Confluent image; also
tighten ZOOKEEPER_4LW_COMMANDS_WHITELIST from '*' to a restricted
comma-separated list such as "mntr,ruok,srvr,stat" (or the minimal subset your
deployment requires) to avoid permitting all 4lw commands.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docker-compose.yaml (1)
268-273: ⚠️ Potential issue | 🟠 Major
`nc -z` healthcheck only verifies TCP reachability; TLS misconfiguration still reports healthy.
`nc -z localhost 2281` opens a TCP connection and immediately closes it without performing a TLS handshake. ZooKeeper's Netty listener accepts the TCP SYN before any certificate exchange occurs, so this check passes even when the keystore or truststore is misconfigured and Kafka's mTLS negotiation would fail. Since the deployment enforces `ZOOKEEPER_SSL_CLIENT_AUTH: need` and Kafka connects with `KAFKA_ZOOKEEPER_SSL_CLIENT_ENABLE: 'true'`, a healthy healthcheck followed by TLS handshake failure creates a harder-to-diagnose crash loop.
The suggested openssl s_client probe below is executable: openssl is available in the image, and the PEM certificates generated by the certgen service are mounted at `/etc/kafka/certs/` in the ZooKeeper container.
💡 Stronger healthcheck alternative (openssl is available in the image)
```diff
 healthcheck:
-  test: ["CMD-SHELL", "nc -z localhost 2281"]
+  test: ["CMD-SHELL", "echo '' | openssl s_client -connect localhost:2281 -CAfile /etc/kafka/certs/ca.crt -cert /etc/kafka/certs/zookeeper.crt -key /etc/kafka/certs/zookeeper.key -quiet 2>&1 | grep -q 'Verify return code: 0'"]
   interval: 3s
   timeout: 5s
   retries: 15
   start_period: 5s
```
This probe verifies the TLS handshake completes successfully, catching misconfigured keystores before Kafka attempts to connect.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 268 - 273, The current healthcheck block uses "test: [\"CMD-SHELL\", \"nc -z localhost 2281\"]" which only verifies TCP reachability; replace the test command in the healthcheck (the "healthcheck" stanza / "test" field) to run an openssl TLS handshake probe (e.g. using "openssl s_client") against localhost:2281 and point it to the certs mounted at /etc/kafka/certs/ so the handshake and client auth are validated; keep the rest of the healthcheck settings (interval, timeout, retries, start_period) but ensure the new test returns non-zero on TLS failure so the container is marked unhealthy when keystore/truststore or mTLS is misconfigured.
🧹 Nitpick comments (2)
docker-compose.yaml (2)
278-282: `zookeeper-certgen` dependency in Kafka's `depends_on` is redundant.
Kafka already depends on `zookeeper: service_healthy`, and `zookeeper` itself depends on `zookeeper-certgen: service_completed_successfully`. By the time ZooKeeper is healthy, cert generation has already finished, so the explicit `zookeeper-certgen` entry in Kafka's `depends_on` is unreachable dead config.
♻️ Proposed cleanup
```diff
 depends_on:
-  zookeeper-certgen:
-    condition: service_completed_successfully
   zookeeper:
     condition: service_healthy
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 278 - 282, Remove the redundant depends_on entry for zookeeper-certgen under the Kafka service: since kafka already depends on zookeeper with condition service_healthy and the zookeeper service itself depends on zookeeper-certgen with condition service_completed_successfully, the explicit zookeeper-certgen dependency in Kafka is dead config; update the Kafka service's depends_on block to only reference zookeeper (condition: service_healthy) and remove the zookeeper-certgen entry to simplify the compose file.
222-224: `KAFKA_OPTS` duplicates settings already expressed through typed env vars.
`ZOOKEEPER_SERVER_CNXN_FACTORY` (line 223) already translates to `serverCnxnFactory=…` in `zoo.cfg`; the `-Dzookeeper.serverCnxnFactory=…` JVM system property in `KAFKA_OPTS` (line 224) sets the same thing via a different mechanism. Likewise `ZOOKEEPER_4LW_COMMANDS_WHITELIST: '*'` (line 234) duplicates the `-Dzookeeper.4lw.commands.whitelist=*` flag already in `KAFKA_OPTS`. The env-var approach is idiomatic for `cp-zookeeper` and the JVM property duplication can be removed to reduce confusion.
♻️ Proposed cleanup
```diff
 KAFKA_OPTS: '-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
-  -Dzookeeper.4lw.commands.whitelist=*
   -Djava.security.auth.login.config=/etc/kafka/secrets/kafka_jaas.conf'
```
Remove the `serverCnxnFactory` and `4lw` flags from `KAFKA_OPTS`, retaining only the JAAS config system property there (which cannot be expressed as a `ZOOKEEPER_` env var).
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yaml` around lines 222 - 224, KAFKA_OPTS currently duplicates settings already provided via ZOOKEEPER_SERVER_CNXN_FACTORY and ZOOKEEPER_4LW_COMMANDS_WHITELIST; remove the redundant JVM system properties "-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory" and "-Dzookeeper.4lw.commands.whitelist=*" from the KAFKA_OPTS value and leave only the JAAS config property "-Djava.security.auth.login.config=/etc/kafka/secrets/kafka_jaas.conf", keeping ZOOKEEPER_SERVER_CNXN_FACTORY and ZOOKEEPER_4LW_COMMANDS_WHITELIST env vars as the authoritative configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker-compose.yaml`:
- Around line 304-306: Set KAFKA_ZOOKEEPER_SET_ACL to 'true' so ZooKeeper znodes
get ACLs when brokers authenticate via the existing ZOOKEEPER_SSL_CLIENT_AUTH
and KAFKA_OPTS (SASL/mTLS) settings; update the docker-compose env value for
KAFKA_ZOOKEEPER_SET_ACL and ensure the cluster operator runs the
zookeeper-security-migration tool to apply ACLs to existing znodes before
rolling brokers; also verify KAFKA_AUTHORIZER_CLASS_NAME and KAFKA_SUPER_USERS
are configured consistently with ACL-based authorization.
- Around line 211-213: The cp-zookeeper:7.8.2 image requires
ZOOKEEPER_CLIENT_PORT to be set even when using ZOOKEEPER_SECURE_CLIENT_PORT, so
add ZOOKEEPER_CLIENT_PORT (e.g., "2181") to the environment alongside
ZOOKEEPER_SECURE_CLIENT_PORT in the docker-compose service for
confluentinc/cp-zookeeper:7.8.2, do not expose or bind that plaintext port in
the compose ports mapping, and ensure ZooKeeper is configured to disable the
plaintext listener (clientPort) at the application level so only the secure
listener (ZOOKEEPER_SECURE_CLIENT_PORT) is active.
---
Outside diff comments:
In `@docker-compose.yaml`:
- Around line 268-273: The current healthcheck block uses "test: [\"CMD-SHELL\",
\"nc -z localhost 2281\"]" which only verifies TCP reachability; replace the
test command in the healthcheck (the "healthcheck" stanza / "test" field) to run
an openssl TLS handshake probe (e.g. using "openssl s_client") against
localhost:2281 and point it to the certs mounted at /etc/kafka/certs/ so the
handshake and client auth are validated; keep the rest of the healthcheck
settings (interval, timeout, retries, start_period) but ensure the new test
returns non-zero on TLS failure so the container is marked unhealthy when
keystore/truststore or mTLS is misconfigured.
---
Duplicate comments:
In `@docker-compose.yaml`:
- Around line 226-231: The compose snippet currently hardcodes
ZOOKEEPER_SSL_KEYSTORE_PASSWORD and ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD; change
this to pull secrets or environment variables instead (e.g., reference external
Docker secrets or use env_file/${VAR_NAME} substitution) and remove the literal
passwords. Update the service to use secret names or variables (retain the env
var keys ZOOKEEPER_SSL_KEYSTORE_PASSWORD and ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD
so callers remain the same) and add corresponding secret definitions or .env
entries so the actual passwords are not stored in the repo.
- Around line 290-294: The docker-compose currently hardcodes
keystore/truststore passwords in the env vars
KAFKA_ZOOKEEPER_SSL_KEYSTORE_PASSWORD and
KAFKA_ZOOKEEPER_SSL_TRUSTSTORE_PASSWORD (and similarly any other plaintext
creds); remove the literal values and wire these to a secure source instead —
e.g., use Docker secrets (declare secrets and reference them in the service), or
read from an external .env that is excluded from version control, or mount a
credential file and reference it, and update the compose service to use the
secret names (and ensure the secret files/secret store are provisioned and not
checked in).
---
Nitpick comments:
In `@docker-compose.yaml`:
- Around line 278-282: Remove the redundant depends_on entry for
zookeeper-certgen under the Kafka service: since kafka already depends on
zookeeper with condition service_healthy and the zookeeper service itself
depends on zookeeper-certgen with condition service_completed_successfully, the
explicit zookeeper-certgen dependency in Kafka is dead config; update the Kafka
service's depends_on block to only reference zookeeper (condition:
service_healthy) and remove the zookeeper-certgen entry to simplify the compose
file.
- Around line 222-224: KAFKA_OPTS currently duplicates settings already provided
via ZOOKEEPER_SERVER_CNXN_FACTORY and ZOOKEEPER_4LW_COMMANDS_WHITELIST; remove
the redundant JVM system properties
"-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory"
and "-Dzookeeper.4lw.commands.whitelist=*" from the KAFKA_OPTS value and leave
only the JAAS config property
"-Djava.security.auth.login.config=/etc/kafka/secrets/kafka_jaas.conf", keeping
ZOOKEEPER_SERVER_CNXN_FACTORY and ZOOKEEPER_4LW_COMMANDS_WHITELIST env vars as
the authoritative configuration.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/sse/sse_service.py`:
- Around line 149-154: The dict currently spreads **msg.data after the explicit
"event_type": msg.event_type which allows msg.data["event_type"] to overwrite
msg.event_type; change the merge order in the return passed to
_sse_event_adapter.validate_python so that **msg.data comes first and then
"event_type": msg.event_type (keeping execution_id and result as-is) to ensure
the explicit msg.event_type wins; update the call site where
_sse_event_adapter.validate_python is invoked (the return statement building the
dict) to reorder keys accordingly.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/app/db/repositories/saga_repository.py">
<violation number="1" location="backend/app/db/repositories/saga_repository.py:88">
P2: `atomic_cancel_saga` bypasses the revision check but doesn’t bump `revision_id`, so concurrent `save_saga()` calls can still succeed after cancellation and mutate a cancelled saga. Update the revision as part of this atomic update to preserve optimistic concurrency guarantees.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Summary by cubic
Switch Kafka to single-node KRaft and remove ZooKeeper; bind external ports to localhost. Make saga cancellation atomic with a revision bump and tighten SSE event typing; update compose, CI/CD, and docs.
New Features
Bug Fixes
Written for commit cb84794. Summary will update on new commits.
Summary by CodeRabbit
Chores
Documentation
Tests