fix: overhaul tests #286

@jakelorocco

Description

We currently have many duplicate tests that perform unnecessary requests and generations. We should refactor our tests to make mocked inputs and outputs easier to use, and de-duplicate tests across the backends.

We should have a few types of tests:

  • Conformance tests: for each backend, we should ensure that we receive the expected response format for a given input (e.g. structured outputs, tool requests).
  • Backend unit tests: for each backend, we should mock these response values and ensure that our backends process the responses as expected.
  • Std library tests: these resemble regular unit tests; if generation is required, we should mock that generation.
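As a rough illustration of the backend unit test idea, here is a minimal sketch using only `unittest.mock` from the standard library. `ExampleBackend`, `_send`, `generate_structured`, and the response shape are all hypothetical stand-ins, not names from our codebase; the point is only that the network call is replaced by a canned response while the post-processing runs for real.

```python
import json
from unittest.mock import patch


class ExampleBackend:
    """Hypothetical backend; stands in for any real provider wrapper."""

    def _send(self, prompt: str) -> dict:
        # A real backend would perform the network request here.
        raise RuntimeError("network access is not available in unit tests")

    def generate_structured(self, prompt: str) -> dict:
        # Post-processing under test: extract the text and parse it as JSON.
        raw = self._send(prompt)
        return json.loads(raw["choices"][0]["message"]["content"])


def test_generate_structured_parses_mocked_response():
    # Canned response in an assumed (hypothetical) provider format.
    canned = {"choices": [{"message": {"content": '{"answer": 42}'}}]}
    with patch.object(ExampleBackend, "_send", return_value=canned) as fake_send:
        result = ExampleBackend().generate_structured("What is 6 * 7?")
    # The backend forwarded the prompt once and parsed the canned payload.
    fake_send.assert_called_once_with("What is 6 * 7?")
    assert result == {"answer": 42}
    return result
```

A conformance test would be the mirror image: it would make the real request and only check that the response has the shape the canned value assumes.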

We should also allow mocking to be disabled so that the full test suite can be run with real requests.
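One possible way to make mocking optional is a single switch that test helpers consult before patching. The sketch below assumes an environment variable named `TEST_REAL_REQUESTS` and a helper named `maybe_mock`; both are invented for illustration, as is `FakeClient`.

```python
import contextlib
import os
from unittest.mock import patch

# Hypothetical switch: set TEST_REAL_REQUESTS=1 to run against real services.
REAL_REQUESTS_ENV = "TEST_REAL_REQUESTS"


def mocking_enabled(environ=None) -> bool:
    """Return True unless the (assumed) opt-out variable is set."""
    env = os.environ if environ is None else environ
    return env.get(REAL_REQUESTS_ENV, "0").lower() not in {"1", "true", "yes"}


def maybe_mock(target, attribute, return_value, environ=None):
    """Patch target.attribute only when mocking is enabled."""
    if mocking_enabled(environ):
        return patch.object(target, attribute, return_value=return_value)
    # No-op context manager: the real attribute is left in place.
    return contextlib.nullcontext()


class FakeClient:
    """Stand-in for a client that would normally hit the network."""

    def complete(self, prompt: str) -> str:
        return "real response for " + prompt


def run_example(environ):
    with maybe_mock(FakeClient, "complete", "canned", environ=environ):
        return FakeClient().complete("hi")
```

With this shape, the same test body runs in both modes; only the environment decides whether the canned value or the real call is used.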

These changes should decouple testing of the standard library and backend functionality from the tests that actually check LLM generation. That would let us run backend unit tests as part of our regular test suite and require special testing pipelines only for the conformance tests.
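To sketch how the split between the regular suite and a special pipeline might look, the example below gates a whole test class on a flag using the standard library's `unittest.skipUnless`. The `RUN_CONFORMANCE` variable and both test classes are hypothetical; a pytest marker would be an equally reasonable way to do this.

```python
import io
import os
import unittest

# Hypothetical flag that only the conformance pipeline would set.
RUN_CONFORMANCE = os.environ.get("RUN_CONFORMANCE") == "1"


class StdLibTests(unittest.TestCase):
    """Runs everywhere; needs no backend."""

    def test_runs_everywhere(self):
        self.assertEqual("a,b".split(","), ["a", "b"])


@unittest.skipUnless(RUN_CONFORMANCE, "conformance tests need a live backend")
class ConformanceTests(unittest.TestCase):
    """Only reached when the flag is set."""

    def test_live_response_shape(self):
        # Placeholder: a real test would call the backend and check the format.
        self.skipTest("replace with a real backend call")


def run_suite():
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(StdLibTests))
    suite.addTests(loader.loadTestsFromTestCase(ConformanceTests))
    # Capture runner output so the example stays quiet.
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

In the regular suite the conformance class is reported as skipped rather than failing, so a missing backend never blocks unit tests.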
