@mgmeier mgmeier commented Dec 16, 2025

Description

This PR addresses issue #6389. It aims to mitigate unresponsiveness or even crashes of the Node's internal PrometheusSimple metrics backend that may be caused by factors in the host environment.

It contains the following changes to the backend, improving robustness:

  • Make the backend automatically retry or restart when startup fails or its thread receives an exception
  • Trace start and stop events of the backend to provide basic observability
  • Be more eager in reaping dangling socket connections
  • Fine-tune the backend's DoS protection settings somewhat

It has no impact on Node configuration or CLI arguments.
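
For readers unfamiliar with the pattern, the first two bullets (auto-restart plus start/stop tracing) can be illustrated with a minimal sketch. This is not the code from this PR: `BackendEvent`, `runWithRestart`, the fixed delay, and the attempt limit are all hypothetical names and choices made for illustration, and only `base` is used. Asynchronous exceptions are re-thrown so that the supervising thread can still be cancelled.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception
  ( SomeAsyncException
  , SomeException
  , fromException
  , throwIO
  , try
  )

-- Hypothetical trace events; the real backend's trace messages may differ.
data BackendEvent
  = BackendStarted Int          -- attempt number
  | BackendStopped Int String   -- attempt number and rendered exception
  deriving (Eq, Show)

-- 'try' specialised to catch every exception type.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- True if the exception is asynchronous (e.g. thread cancellation).
isAsync :: SomeException -> Bool
isAsync err =
  case fromException err :: Maybe SomeAsyncException of
    Just _  -> True
    Nothing -> False

-- Run a backend action, restarting it after a fixed delay whenever it
-- throws a synchronous exception, for at most 'maxAttempts' attempts.
-- Returns 'Nothing' if every attempt failed.
runWithRestart
  :: (BackendEvent -> IO ())  -- tracer for start/stop events
  -> Int                      -- delay between attempts, in microseconds
  -> Int                      -- maximum number of attempts (assumed limit)
  -> IO a                     -- the backend action to supervise
  -> IO (Maybe a)
runWithRestart trace delayMicros maxAttempts backend = go 1
  where
    go attempt
      | attempt > maxAttempts = pure Nothing
      | otherwise = do
          trace (BackendStarted attempt)
          result <- tryAny backend
          case result of
            Right value -> pure (Just value)
            Left err
              -- Never swallow cancellation: pass it on unchanged.
              | isAsync err -> throwIO err
              | otherwise -> do
                  trace (BackendStopped attempt (show err))
                  threadDelay delayMicros
                  go (attempt + 1)
```

A call such as `runWithRestart print 1000000 5 startServer`, where `startServer` stands in for whatever action binds the socket and serves metrics, would print one start event per attempt and one stop event per failure. A long-running service would more likely retry indefinitely; the attempt limit here only keeps the sketch easy to exercise.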

Checklist

  • Commit sequence broadly makes sense and commits have useful messages
  • New tests are added if needed and existing tests are updated. These may include:
    • golden tests
    • property tests
    • roundtrip tests
    • integration tests
      See Running tests for more details
  • Any changes are noted in the CHANGELOG.md for the affected package
  • The version bounds in .cabal files are updated
  • CI passes. See note on CI. The following CI checks are required:
    • Code is linted with hlint. See .github/workflows/check-hlint.yml to get the hlint version
    • Code is formatted with stylish-haskell. See .github/workflows/stylish-haskell.yml to get the stylish-haskell version
    • Code builds on Linux, macOS and Windows for ghc-9.6 and ghc-9.12
  • Self-reviewed the diff

@mgmeier mgmeier force-pushed the mkarg/restart-prometheussimple branch 2 times, most recently from 71a1ac4 to 603c46b, December 17, 2025 08:29
@mgmeier mgmeier marked this pull request as ready for review December 17, 2025 08:34
@mgmeier mgmeier requested review from a team as code owners December 17, 2025 08:34

@Russoul Russoul left a comment

LGTM

@mgmeier mgmeier force-pushed the mkarg/restart-prometheussimple branch from 603c46b to d67d0e3, December 18, 2025 13:03
@mgmeier mgmeier added this pull request to the merge queue Dec 18, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 18, 2025
@mgmeier mgmeier added this pull request to the merge queue Dec 18, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 18, 2025
@mgmeier mgmeier added this pull request to the merge queue Dec 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 19, 2025
@mgmeier mgmeier added this pull request to the merge queue Dec 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 19, 2025
@mgmeier mgmeier added this pull request to the merge queue Dec 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 19, 2025
@mgmeier mgmeier added this pull request to the merge queue Dec 19, 2025
Merged via the queue into master with commit 986d049 Dec 19, 2025
25 checks passed
@mgmeier mgmeier deleted the mkarg/restart-prometheussimple branch December 19, 2025 23:28