Fix HookedTransformerConfig rotary_base types #1231

Open
brendanlong wants to merge 1 commit into TransformerLensOrg:main from
brendanlong:brendanlong/hooked-transformer-rope-base-type

Conversation

@brendanlong
Contributor

@brendanlong brendanlong commented Apr 3, 2026

Description

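rotary_base is frequently set to floats in the code but was typed as
an int, causing beartype errors if the configs get loaded in a test:
https://github.com/TransformerLensOrg/TransformerLens/blob/9c5a2a81674d5bcefa641c816b66e9827ccdf637/transformer_lens/loading_from_pretrained.py#L1984

HF configs allegedly always have rope_theta as a float: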
https://github.com/huggingface/transformers/blob/c38b2fb78eaedd4261a0e446f7976345cd1c7f1b/src/transformers/modeling_rope_utils.py#L645

But sometimes they're actually ints, and beartype doesn't consider int
to be a subtype of float:
beartype/beartype#66

This updates the type to Union[float, int] to be accurate while keeping
beartype happy.
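
To illustrate the mismatch described above, here is a minimal sketch that uses plain `isinstance` as a stand-in for a strict runtime type check. It does not call beartype itself, and the `accepts` helper and the three tuples are hypothetical names introduced only for this example; the assumption is that a strict checker applies no int-to-float promotion.

```python
from typing import Tuple, Type


def accepts(value: object, allowed: Tuple[Type, ...]) -> bool:
    """Strict runtime check: plain isinstance, with no int-to-float promotion."""
    return isinstance(value, allowed)


# Hypothetical stand-ins for the three possible annotations on rotary_base.
INT_ONLY = (int,)            # the old annotation: int
FLOAT_ONLY = (float,)        # the alternative: float
FLOAT_OR_INT = (float, int)  # the annotation proposed here: Union[float, int]

# Example values of the kind that show up for rope_theta / rotary_base.
int_base = 10000
float_base = 500000.0

# The old int annotation rejects float values.
int_only_ok = accepts(float_base, INT_ONLY)

# A float-only annotation rejects int values under a strict check.
float_only_ok = accepts(int_base, FLOAT_ONLY)

# The union accepts both kinds of value.
union_ok = accepts(int_base, FLOAT_OR_INT) and accepts(float_base, FLOAT_OR_INT)
```

Under this assumption, only the union annotation passes for both an int and a float base, which is why neither `int` nor `float` alone is enough.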

See:

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@brendanlong
Contributor Author

I'm tempted to add a TypedDict for the configs so this kind of mismatch would be caught by mypy instead of surfacing as a runtime error during tests. I also noticed a bunch of missing return type annotations in loading_from_pretrained.py, but I wanted to avoid making a massive PR and wasn't sure whether those improvements would actually be helpful.
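
A rough sketch of what the TypedDict idea could look like. `RotaryConfigDict` and `example_cfg` are hypothetical names, the two keys are only an illustrative subset, and none of this is existing TransformerLens code.

```python
from typing import TypedDict, Union


class RotaryConfigDict(TypedDict, total=False):
    """Hypothetical typed view of a few config keys (illustrative subset only)."""

    rotary_base: Union[float, int]
    rotary_dim: int


def example_cfg() -> RotaryConfigDict:
    # With the annotation above, a static checker such as mypy can compare each
    # value against the declared key type, so a mismatch (for example a str
    # for rotary_base) would be reported before any test runs.
    return {"rotary_base": 500000.0, "rotary_dim": 64}


cfg = example_cfg()
```

At runtime a TypedDict is an ordinary dict, so this only adds static checking and would not change how the config dicts are consumed.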

@brendanlong brendanlong force-pushed the brendanlong/hooked-transformer-rope-base-type branch 2 times, most recently from ccce13e to 6fec5c4 Compare April 3, 2026 06:10
rotary_base is frequently set to floats in the code but was typed as
an int, causing beartype errors if the configs get loaded in a test:
https://github.com/TransformerLensOrg/TransformerLens/blob/9c5a2a81674d5bcefa641c816b66e9827ccdf637/transformer_lens/loading_from_pretrained.py#L1984

HF configs allegedly always have rope_theta as a float:
https://github.com/huggingface/transformers/blob/c38b2fb78eaedd4261a0e446f7976345cd1c7f1b/src/transformers/modeling_rope_utils.py#L645

But sometimes they're actually ints, and beartype doesn't consider int
to be a subtype of float:
beartype/beartype#66

This updates the type to Union[float, int] to be accurate while keeping
beartype happy.
@brendanlong brendanlong force-pushed the brendanlong/hooked-transformer-rope-base-type branch from 6fec5c4 to a36e7c9 Compare April 3, 2026 06:11
@brendanlong
Contributor Author

I could change the type to just float, but we'd have to sprinkle float() casts everywhere: ints are valid in HF configs despite the float annotation, but beartype rejects them. We can't centralize the cast in HookedTransformerConfig, since it would have to happen before the value is passed into that class.
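
For comparison, a minimal sketch of the float-only alternative, where the cast happens on the config dict before it reaches the config class. `normalize_rotary_base` is a hypothetical helper, not something in the codebase, and each call site that builds a config dict would need to call it (or cast inline).

```python
from typing import Any, Dict


def normalize_rotary_base(cfg_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of cfg_dict with rotary_base cast to float, if present."""
    out = dict(cfg_dict)
    if "rotary_base" in out:
        out["rotary_base"] = float(out["rotary_base"])
    return out


# An int value, as can appear in an HF config despite the float annotation.
raw_cfg = {"rotary_base": 10000, "rotary_dim": 64}
clean_cfg = normalize_rotary_base(raw_cfg)
```

The cast has to run before the config object is constructed, which is the reason it cannot live inside the class itself.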
