Multivariate detector by abaranov25 · Pull Request #52 · sintel-dev/sigllm

abaranov25 · 2025-12-10T18:21:26Z

Resolve #57
Added a multivariate detector pipeline with various formatting methods.

sarahmish · 2026-02-08T22:51:52Z

sigllm/primitives/forecasting/huggingface.py

        raw=False,
        samples=1,
        padding=0,
+        multivariate_allowed_symbols = [],


add multivariate_allowed_symbols to the docstrings above

sarahmish · 2026-02-08T22:53:05Z

sigllm/primitives/formatting/digit_interleave.py

@@ -0,0 +1,72 @@
+from .multivariate_formatting import MultivariateFormattingMethod


we rely on absolute imports rather than relative in our packaging:

Suggested change

-from .multivariate_formatting import MultivariateFormattingMethod
+from sigllm.primitives.formatting.multivariate_formatting import MultivariateFormattingMethod

sarahmish · 2026-02-08T23:00:24Z

sigllm/primitives/formatting/digit_interleave.py

+from .multivariate_formatting import MultivariateFormattingMethod
+import numpy as np


We typically use the following structure for imports:

# Python standard library (e.g. import os)
# 3rd party libraries (e.g. import numpy)
# this library (e.g. import sigllm)

This follows the Google Python style guide, so in your case it will be:

import numpy as np

from sigllm.primitives.formatting.multivariate_formatting import MultivariateFormattingMethod

sarahmish · 2026-02-08T23:01:49Z

sigllm/primitives/formatting/digit_interleave.py

+if __name__ == "__main__":
+    method = DigitInterleave(digits_per_timestamp=3)
+    method.test_multivariate_formatting_validity(verbose=False)
+    errs, y_hat, y = method.run_pipeline(return_y_hat=True)
+    print(errs)
+    print(y_hat)
+    print(y)


after you finish testing, this can be removed.

sarahmish · 2026-02-08T23:07:22Z

sigllm/primitives/formatting/multivariate_formatting.py

+        })
+
+
+    def run_pipeline(self, data=create_test_data(), 


what's the purpose of this method? It can be removed or moved to utils since it doesn't belong in formatting

sarahmish · 2026-02-08T23:11:01Z

tutorials/pipelines/detector-pipeline.ipynb

can you remove this file from the PR? I don't think it's related.

sarahmish · 2026-02-08T23:11:29Z

tutorials/pipelines/multivariate-pipeline.ipynb

rename it to multivariate-detector-pipeline

Can you make it an end-to-end tutorial of using the pipeline? In addition to the new formatting, you can have a full detection process and show the anomalies.

sarahmish

Overall, the PR is great in terms of functionality but still needs to be cleaned up. I have a few comments on different aspects.

1. Unittests

Unit tests are a great way to ensure the validity of your code and to make sure that, over time, each function keeps behaving as expected. If an underlying dependency changes its behavior, solid unit tests let us catch it immediately.

There are existing tests under tests/primitives that you can mimic to create your own. I typically like there to be 3 blocks in a test function:

def test_example():
    # setup, here you create your variables and instances.

    # run, here you run the function you want to test.

    # assert, here you check that the expected value matches the output.

This makes the test function easier to read.
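As a minimal sketch of the three-block layout described above, the test below exercises a small stand-in helper. `join_values` is a hypothetical function written only for this illustration; it is not part of sigllm.

```python
def join_values(values, sep=','):
    """Join integer values into a single string (hypothetical helper)."""
    return sep.join(str(value) for value in values)


def test_join_values():
    # setup: create the inputs and the expected output
    values = [1, 20, 300]
    expected = '1,20,300'

    # run: call the function under test
    result = join_values(values)

    # assert: compare the output against the expected value
    assert result == expected


test_join_values()
```

Keeping each block separate means a failing assertion points directly at the run step that produced it.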

2. Docstrings

A couple of things should be considered regarding the docstrings for this and other PRs as well:

The - can be removed when listing the Args, so each line starts directly with the argument name.
A Returns block should be added, listing the return type with a description on the next line.
A blank line should always exist between the first line and the rest, and also before Args and Returns.

Here's the recommended way to write docstrings.

"""Short description in a single line ending with a dot.

Longer description that can span across multiple lines. Longer
description that can span across multiple lines. Longer description
that can span across multiple lines.

Args:
    arg_name (arg_type):
        argument description.
    arg_name (arg_type):
        Argument description that spans across multiple lines. Argument
        description that spans across multiple lines. Argument description
        that spans across multiple lines.

Returns:
    return_type:
        description of the returned objects.
"""
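To illustrate, here is the template applied to a small function. `rescale` and its parameters are hypothetical and exist only for this example; they do not come from the PR.

```python
def rescale(values, factor=1.0):
    """Multiply every value by a constant factor.

    This helper is purely illustrative. It returns a new list and
    leaves the input untouched.

    Args:
        values (list):
            Numeric values to rescale.
        factor (float):
            Multiplier applied to each value. Default to ``1.0``.

    Returns:
        list:
            The rescaled values, in the same order as the input.
    """
    return [value * factor for value in values]
```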

3. Unnecessary files

This applies to this PR only. tutorials/pipelines/detector-pipeline.ipynb should not be changed in this PR.

sarahmish · 2026-02-23T12:31:36Z

sigllm/primitives/forecasting/huggingface.py

        padding (int):
            Additional padding token to forecast to reduce short horizon predictions.
            Default to `0`.
    """


multivariate_allowed_symbols should be added to the Args docstrings here

sarahmish · 2026-02-23T12:38:40Z

sigllm/primitives/formatting/json_format.py

+        results_by_step = {step: [] for step in steps_ahead}
+
+        for window in X:
+            step_samples = {step: [] for step in steps_ahead}


If you want to set up a dictionary whose values default to an empty list, you can do so using:

from collections import defaultdict
step_samples = defaultdict(list)

Then any key in the dictionary will have an empty list by default.
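As a small illustration of the suggestion above (the variable names mirror the diff, but the values are made up for this sketch), both approaches end up with the same contents. One difference worth keeping in mind: a defaultdict creates keys only when they are first accessed, so it starts empty rather than pre-populated with every step.

```python
from collections import defaultdict

steps_ahead = [1, 2, 3]  # made-up values for illustration

# explicit initialization, as in the diff
explicit = {step: [] for step in steps_ahead}

# defaultdict: a missing key gets an empty list on first access
step_samples = defaultdict(list)

for step in steps_ahead:
    explicit[step].append(step * 10)
    step_samples[step].append(step * 10)

same_contents = dict(step_samples) == explicit

# keys are created lazily: 4 is absent until it is looked up
absent_before = 4 not in step_samples
step_samples[4]
present_after = 4 in step_samples
```

If later code iterates over the dictionary and expects every step to be present even when no samples were collected, the explicit comprehension may still be the safer choice.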

sarahmish · 2026-02-23T12:46:39Z

sigllm/primitives/formatting/utils.py

+    })
+
+
+def test_multivariate_formatting_validity(method, verbose=False):


This function, along with create_test_data, can be moved to unit tests since that's what it's doing.
Create a file in tests/primitives/formatting/test_{file_name}.py, e.g. tests/primitives/formatting/test_json_format.py.

There you can test whether format_as_string and format_as_integer return the expected output. I would strongly recommend doing this for every formatting method.
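A round-trip test for a formatting method might look like the sketch below. `ToyFormat` is a stand-in written for this example; its `format_as_string` and `format_as_integer` signatures and its string layout are assumptions and may not match the ones in sigllm.

```python
class ToyFormat:
    """Stand-in formatter (hypothetical, not the sigllm implementation)."""

    def format_as_string(self, X):
        # one string per window; values of a timestamp are joined by
        # spaces and timestamps are joined by commas
        return [
            ','.join(' '.join(str(value) for value in row) for row in window)
            for window in X
        ]

    def format_as_integer(self, strings):
        # inverse of format_as_string
        return [
            [[int(value) for value in row.split(' ')] for row in string.split(',')]
            for string in strings
        ]


def test_format_round_trip():
    # setup
    method = ToyFormat()
    X = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

    # run
    strings = method.format_as_string(X)
    recovered = method.format_as_integer(strings)

    # assert
    assert strings == ['1 2,3 4', '5 6,7 8']
    assert recovered == X


test_format_round_trip()
```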

AllenBaranov added 3 commits December 8, 2025 01:17

Added multivariate detector pipeline with formatting methods

088592f

Add verbose flag to formatting methods and clean up comments

df34b4d

Added multi-step-ahead predictions and disentangled this branch from K-means branch

24c96d1

sarahmish reviewed Feb 8, 2026

AllenBaranov added 4 commits February 9, 2026 02:53

Addressing comments for PR

317620e

Addressing PR Comments

8159622

Tutorial Notebook + trunc behavior

f4ea7f1

Added multivariate dataset to tutorial notebook

cf4c192

abaranov25 requested a review from sarahmish February 17, 2026 23:05

Fixed lints

8864f14

abaranov25 self-assigned this Feb 19, 2026

Merge remote-tracking branch 'origin/master' into Multivariate-Detector

6468a71

sarahmish reviewed Feb 23, 2026

AllenBaranov added 9 commits March 2, 2026 02:07

Added unit tests

46ed9ce

Fixed lints

6ef46d7

Removing unrelated tutorial from PR

7cb05ba

Addressing PR comments:

d5653cf

Addressing PR comments

57b5325

restoring detector pipeline

b998546

Fixing lint issues

2c015f0

Ran tutorial notebook to completion

056ffa9

Slight change to docstrings

36b82b3

abaranov25 wants to merge 18 commits into sintel-dev:master from abaranov25:Multivariate-Detector

2 participants
