Collaborator

@jcrichlake jcrichlake commented Dec 19, 2025

Summary

Changes proposed

What was added, updated, or removed in this PR.

This PR adds the library that auto-generates the pydantic objects, a script for calling that library, and a README tracking the work done so far and the future work planned as part of this effort.

There is also a baseline for property-based testing that we can build on in future work toward this objective.
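For reviewers unfamiliar with the approach, here is a minimal sketch of what a property-based check looks like. It is not the baseline in this PR: `Money`, `random_money`, and `round_trips` are hypothetical stand-ins, and the random generator is hand-rolled with the standard library so the example has no dependencies. A library such as Hypothesis would normally supply the generators.

```python
import json
import random
import string
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical stand-in for a generated model; not the SDK's class."""

    amount: str
    currency: str


def random_money(rng: random.Random) -> Money:
    """Generate an arbitrary Money instance from the given random source."""
    amount = f"{rng.randint(0, 10**9)}.{rng.randint(0, 99):02d}"
    currency = "".join(rng.choice(string.ascii_uppercase) for _ in range(3))
    return Money(amount=amount, currency=currency)


def round_trips(model: Money) -> bool:
    """Property: serializing to JSON and parsing back yields an equal object."""
    payload = json.dumps(asdict(model))
    return Money(**json.loads(payload)) == model


def check_property(trials: int = 200, seed: int = 0) -> int:
    """Run the property against many random inputs; return the number of trials run."""
    rng = random.Random(seed)
    for _ in range(trials):
        sample = random_money(rng)
        assert round_trips(sample), f"round trip failed for {sample!r}"
    return trials
```

The point of the pattern is that the test states an invariant (here, a serialization round trip) instead of listing individual example values.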

Context for reviewers

Testing instructions, background context, more in-depth details of the implementation, and anything else you'd like to call out or ask reviewers. Explain how the changes were verified.

You may have to run `poetry install` again as part of this change.

Additional information

Screenshots, GIF demos, code examples or output to help show the changes working as expected.

@github-actions github-actions bot added labels python (Issue or PR related to Python tooling), sdk (Issue or PR related to our SDKs), py-sdk (Related to Python SDK) Dec 19, 2025
@jcrichlake jcrichlake changed the title [Issue #412] Adding library for auto-generation and documentation of script Do not merge [Issue #412] Adding library for auto-generation and documentation of script Dec 23, 2025
@jcrichlake jcrichlake changed the title Do not merge [Issue #412] Adding library for auto-generation and documentation of script [Issue #412] Adding library for auto-generation and documentation of script Jan 8, 2026
poetry run python ./validate.py <PATH TO YAML FILE >

## Issues to be Resolved
The following issues come from comparing the generated schemas against the existing hand-built schemas. Note that this is not an exhaustive list.
Collaborator

These are possibly showstopper issues. The PR should not be merged until it can be proven that spec-compliant schemas can effectively be generated from code.

For reference, issues like these were what caused me to abandon the effort to auto-generate marshmallow schemas several months ago. My experience was that tooling and automation could generate 99.x% of the schemas but the last handful of edge cases were intractable, and it turned into a huge time sink to solve the edge cases.

YMMV but let's not commit new scaffolding and dependencies to the repo until it is proven to work.

Collaborator

Where are the generated schemas? Do they work with the example API implementations? (e.g. simpler-grants-protocol/examples/ca-opportunity-example)

The example API implementations are the main use case for CommonGrants pydantic schemas. Acceptance criteria for this or any auto-generation scaffolding or library should therefore include validation against one or all of those use cases: generate the schemas, import them into the example API, run `cg check-spec`, and confirm there are no errors.
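The acceptance check described above could be scripted roughly as follows. This is a sketch under assumptions: the script name and `cg check-spec` are taken from this thread, but their arguments and working directories are not specified here, so the commands in `STEPS` are placeholders. The step of importing the generated schemas into the example API is left manual.

```python
import subprocess

# Placeholder commands: the names come from this thread, but the exact
# arguments and working directories are assumptions to be filled in.
STEPS: list[tuple[str, list[str]]] = [
    ("generate schemas", ["bash", "generate_models.sh"]),
    ("check example API against the spec", ["cg", "check-spec"]),
]


def run_steps(steps, run=subprocess.run) -> list[str]:
    """Run each step in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    `run` is injectable so the flow can be exercised without the real tools.
    """
    completed = []
    for name, command in steps:
        result = run(command)
        if result.returncode != 0:
            break
        completed.append(name)
    return completed
```

A run passes the acceptance check only when every step name in `STEPS` appears in the returned list.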

Collaborator Author

The thinking was to commit this baseline to the repo in an isolated way that wouldn't impact the existing hand-built schemas and their implementations. This would give us a foundation to build on without letting the completion of this feature become an all-consuming task at the expense of other planned work.

The generated schemas are not committed to the repo, since that would clog up the codebase with 103 additional files for something that isn't complete and could cause confusion with the existing schemas. The idea is that, for now, anyone who wants to view the schemas can run the bash script, which creates them in the generated directory where the README lives.

@jcrichlake
Collaborator Author

Type Checking Discovery

Steps Taken

  1. Update the common-grants-sdk setting in /templates/fast-api/pyproject.toml: comment out line 10 and uncomment line 11.
  2. !! Add the following script to your local codebase and run it to add a title to the JSONSchema YAML files:
```bash
#!/bin/bash

# Script to set the 'title' attribute in each YAML schema file
# to match the filename (without the .yaml extension).
# Note: `sed -i ''` is the BSD/macOS form; GNU sed expects `sed -i`
# without the empty argument.

SCHEMAS_DIR="${1:-$(dirname "$0")/../../public/schemas/yaml}"

# Check if the directory exists
if [ ! -d "$SCHEMAS_DIR" ]; then
  echo "Error: Directory not found: $SCHEMAS_DIR" >&2
  exit 1
fi

echo "Processing YAML files in: $SCHEMAS_DIR"

# Find .yaml files only at the root level (not in subdirectories)
find "$SCHEMAS_DIR" -maxdepth 1 -name "*.yaml" -type f | while IFS= read -r file; do
  # Get the filename without extension
  filename=$(basename "$file" .yaml)

  if grep -q "^title:" "$file"; then
    # Update existing title
    sed -i '' "s/^title:.*$/title: $filename/" "$file"
    echo "Updated title in: $file"
  elif grep -q '^\$id:' "$file"; then
    # Add title after $id line
    sed -i '' "/^\$id:/a\\
title: $filename
" "$file"
    echo "Added title to: $file"
  elif grep -q '^\$schema:' "$file"; then
    # If no $id line, add title after $schema line
    sed -i '' "/^\$schema:/a\\
title: $filename
" "$file"
    echo "Added title after \$schema in: $file"
  else
    # Prepend title to the file
    sed -i '' "1i\\
title: $filename
" "$file"
    echo "Prepended title to: $file"
  fi
done

echo "Done processing all YAML files."
```
  3. Run the `generate_models.sh` script.
  4. Update the `__init__.py` inside `lib/python-sdk/common_grants_sdk/schemas/pydantic/generated` to export the generated models.
  5. Update `templates/fast-api/src/common_grants/schemas/__init__.py` to point to `common_grants_sdk.schemas.pydantic.generated`.
  6. !! Run `make check-types`; this will return an error.
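Before running the generation step, it may help to confirm that the title workaround actually reached every schema file. The sketch below is an assumption-laden helper, not part of this PR: `files_missing_title` is a hypothetical name, and it uses the same heuristic as the shell script (a line starting with `title:`) instead of parsing the YAML.

```python
import re
from pathlib import Path

# Same heuristic as the shell script: a line that starts with `title:`.
TITLE_LINE = re.compile(r"^title:", re.MULTILINE)


def files_missing_title(schemas_dir: Path) -> list[Path]:
    """Return the top-level .yaml files in schemas_dir that have no `title:` line."""
    missing = []
    for path in sorted(schemas_dir.glob("*.yaml")):
        if not TITLE_LINE.search(path.read_text(encoding="utf-8")):
            missing.append(path)
    return missing
```

An empty result means every top-level schema file has a title; like the script, it does not descend into subdirectories.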


## Gaps Identified
1. The title field is missing from the JSONSchema YAML files. This should be fixed in the TypeSpec generation, but as a workaround for verifying our auto-generation, a temporary script will be included that adds this field to a separate copy of the JSONSchema YAML files.
2. The manual pydantic models have small naming variations from the JSONSchema (ArrayOperator vs ArrayOperators). As a workaround, we will try to alias these to minimize the impact of the changes while we are testing.
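The aliasing workaround in item 2 could look something like the sketch below. It assumes the generated model carries the plural name and the hand-built code imports the singular one; the thread does not say which side uses which, so swap the names if it is the other way around. The enum and its members are illustrative stand-ins, not the generated code.

```python
from enum import Enum


class ArrayOperators(str, Enum):
    """Stand-in for a generated model (members are illustrative only)."""

    IN = "in"
    NOT_IN = "notIn"


# Alias so that existing imports of the other name keep resolving
# to the same class while the generated models are being tested.
ArrayOperator = ArrayOperators
```

Because the alias is the same object and not a subclass or copy, `isinstance` checks and enum lookups behave identically under either name.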

Development

Successfully merging this pull request may close these issues.

[SDK versioning] Support toggling versions in Python SDK

3 participants