Feat/new parsing layer #20
Merged
Normalises an *ast.CommentGroup into a slice of positioned Lines: comment markers stripped, leading-asterisk style handled, verbatim YAML bodies preserved between fences. Each Line records its absolute file:line:col so downstream diagnostics and (future) LSP positions stay anchored to the source. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
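A minimal sketch of the normalisation step described above, assuming a hypothetical `Line` type and `Normalise` function (the names and fields are illustrative, not the package's actual API). It strips comment markers and a leading asterisk and records file, line, and column. Unlike the real preprocessor, it trims block-comment indentation and therefore does not preserve verbatim YAML bodies.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// Line is a hypothetical positioned line; the real type may differ.
type Line struct {
	Text string
	File string
	Line int
	Col  int // column of the comment marker, not of the stripped text
}

// Normalise strips "//" and "/* */" markers and a leading "*" on each line.
// Simplification: block-comment lines are whitespace-trimmed, so fenced YAML
// indentation is not preserved here.
func Normalise(fset *token.FileSet, cg *ast.CommentGroup) []Line {
	var out []Line
	if cg == nil {
		return out
	}
	for _, c := range cg.List {
		pos := fset.Position(c.Pos())
		if strings.HasPrefix(c.Text, "//") {
			text := strings.TrimPrefix(strings.TrimPrefix(c.Text, "//"), " ")
			out = append(out, Line{text, pos.Filename, pos.Line, pos.Column})
			continue
		}
		body := strings.TrimSuffix(strings.TrimPrefix(c.Text, "/*"), "*/")
		for i, raw := range strings.Split(body, "\n") {
			text := strings.TrimSpace(raw)
			text = strings.TrimPrefix(strings.TrimPrefix(text, "*"), " ")
			col := 1
			if i == 0 {
				col = pos.Column
			}
			out = append(out, Line{text, pos.Filename, pos.Line + i, col})
		}
	}
	return out
}

// NormaliseSource parses src and normalises its first comment group.
func NormaliseSource(filename, src string) ([]Line, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	if len(f.Comments) == 0 {
		return nil, nil
	}
	return Normalise(fset, f.Comments[0]), nil
}

func main() {
	src := "package p\n\n// swagger:model Pet\n// A pet.\ntype Pet struct{}\n"
	lines, err := NormaliseSource("pet.go", src)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Printf("%s:%d:%d %q\n", l.File, l.Line, l.Col, l.Text)
	}
}
```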
Stateful lexer over the preprocessed []Line surface. Emits positioned tokens classifying annotation headers, keyword left-hand-sides, structured values, list markers, prose runs, and verbatim YAML bodies. Keyword vocabulary lives in keywords.go and covers every swagger:* keyword surface. The lexer is free of regular expressions. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
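To illustrate what regex-free classification can look like, here is a small sketch that sorts a single preprocessed line into annotation header, keyword, list item, or prose using only `strings` functions. The `Token` type, the `Classify` function, and the four-entry keyword table are assumptions made for the example; the actual lexer is stateful and its vocabulary lives in keywords.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is a hypothetical token class.
type Kind int

const (
	Prose Kind = iota
	Annotation
	Keyword
	ListItem
)

// Token is a hypothetical, position-free token used only in this sketch.
type Token struct {
	Kind  Kind
	Name  string
	Value string
}

// keywords is a tiny illustrative subset, not the real vocabulary.
var keywords = map[string]bool{"maximum": true, "minimum": true, "pattern": true, "in": true}

// Classify inspects one line without regular expressions.
func Classify(line string) Token {
	s := strings.TrimSpace(line)
	if strings.HasPrefix(s, "swagger:") {
		rest := strings.TrimPrefix(s, "swagger:")
		name, args := rest, ""
		if i := strings.IndexByte(rest, ' '); i >= 0 {
			name, args = rest[:i], strings.TrimSpace(rest[i+1:])
		}
		return Token{Annotation, name, args}
	}
	if strings.HasPrefix(s, "- ") {
		return Token{ListItem, "", strings.TrimSpace(s[2:])}
	}
	if i := strings.IndexByte(s, ':'); i >= 0 {
		key := strings.ToLower(strings.TrimSpace(s[:i]))
		if keywords[key] {
			return Token{Keyword, key, strings.TrimSpace(s[i+1:])}
		}
	}
	return Token{Prose, "", s}
}

func main() {
	for _, l := range []string{"swagger:model Pet", "maximum: 10", "- application/json", "Note: plain prose"} {
		fmt.Printf("%+v\n", Classify(l))
	}
}
```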
Top-down parser consuming the lexer's token stream. Dispatches into a typed Block family (SchemaBlock, ParameterBlock, ResponseBlock, OperationBlock, MetaBlock, RouteBlock, NameBlock, ModelBlock, EnumBlock), each exposing typed accessors for its annotation arg, keyword properties, body, and embedded sub-languages. NewParser returns a Parser and accepts an optional WithDiagnosticSink option. The ParseAll entry point yields one Block per annotation on the comment group; ParseBlock is a convenience wrapper that returns the primary one. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Walker is the visitor that drives parsed Blocks through caller-supplied callbacks (one per keyword class, plus a diagnostic sink). Builders register their typed handlers once and let the Walker schedule them in source order. Diagnostic types carry severity + positioned source span for downstream consumption (typically the consumer's OnDiagnostic callback). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
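The callback-driven design can be sketched as follows. This is an illustration only: `Property`, `Diagnostic`, `NewWalker`, `Handle`, and `Walk` are hypothetical names, and the sketch registers one handler per keyword name rather than one per keyword class. It shows handlers being invoked in source order, with unknown keywords and handler errors routed to a diagnostic sink.

```go
package main

import (
	"fmt"
	"strconv"
)

// Property is a hypothetical parsed keyword with its source line.
type Property struct {
	Keyword string
	Value   string
	Line    int
}

// Diagnostic is a hypothetical severity + position + message triple.
type Diagnostic struct {
	Severity string
	Line     int
	Message  string
}

// Walker drives properties through registered handlers.
type Walker struct {
	handlers     map[string]func(Property) error
	OnDiagnostic func(Diagnostic)
}

func NewWalker(sink func(Diagnostic)) *Walker {
	return &Walker{handlers: map[string]func(Property) error{}, OnDiagnostic: sink}
}

// Handle registers the handler for one keyword.
func (w *Walker) Handle(keyword string, h func(Property) error) { w.handlers[keyword] = h }

// Walk visits properties in the order given (source order).
func (w *Walker) Walk(props []Property) {
	for _, p := range props {
		h, ok := w.handlers[p.Keyword]
		if !ok {
			w.report("warning", p, "no handler for keyword "+strconv.Quote(p.Keyword))
			continue
		}
		if err := h(p); err != nil {
			w.report("error", p, err.Error())
		}
	}
}

func (w *Walker) report(sev string, p Property, msg string) {
	if w.OnDiagnostic != nil {
		w.OnDiagnostic(Diagnostic{sev, p.Line, msg})
	}
}

// Demo wires a "maximum" handler and walks three properties.
func Demo() (float64, []Diagnostic) {
	var maximum float64
	var diags []Diagnostic
	w := NewWalker(func(d Diagnostic) { diags = append(diags, d) })
	w.Handle("maximum", func(p Property) error {
		v, err := strconv.ParseFloat(p.Value, 64)
		if err != nil {
			return err
		}
		maximum = v
		return nil
	})
	w.Walk([]Property{{"maximum", "10", 3}, {"colour", "red", 4}, {"maximum", "ten", 5}})
	return maximum, diags
}

func main() {
	maximum, diags := Demo()
	fmt.Println(maximum)
	for _, d := range diags {
		fmt.Printf("%s at line %d: %s\n", d.Severity, d.Line, d.Message)
	}
}
```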
Thin YAML decoder used by the grammar's typed-extensions surface (x-* vendor extensions) and by operation / meta body unmarshal. Dedent and per-line position preservation are kept so error sites map back to the original source. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
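One way to dedent a YAML body while keeping positions is sketched below. The `Line` type and `Dedent` function are hypothetical; the idea is to remove the common leading-space prefix from non-blank lines and shift each column by the same amount, so an error reported against the dedented text can still be mapped to the original source.

```go
package main

import (
	"fmt"
	"strings"
)

// Line is a hypothetical positioned line of a YAML body.
type Line struct {
	Text string
	Line int
	Col  int
}

// Dedent strips the smallest leading-space indent shared by all non-blank
// lines and advances Col accordingly. Blank lines are left untouched.
func Dedent(lines []Line) []Line {
	indent := -1
	for _, l := range lines {
		if strings.TrimSpace(l.Text) == "" {
			continue
		}
		n := len(l.Text) - len(strings.TrimLeft(l.Text, " "))
		if indent < 0 || n < indent {
			indent = n
		}
	}
	out := make([]Line, len(lines))
	copy(out, lines)
	if indent <= 0 {
		return out
	}
	for i, l := range lines {
		if strings.TrimSpace(l.Text) == "" {
			continue
		}
		out[i].Text = l.Text[indent:]
		out[i].Col = l.Col + indent
	}
	return out
}

func main() {
	body := []Line{{"  a:", 10, 4}, {"    b: 1", 11, 4}, {"", 12, 4}}
	for _, l := range Dedent(body) {
		fmt.Printf("%d:%d %q\n", l.Line, l.Col, l.Text)
	}
}
```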
Parses the multi-line body grammar nested under swagger:route — parameters, responses, consumes, produces, schemes, security — and surfaces it via synthetic grammar.Block construction so downstream builders reuse the schema/parameters/response code paths without route-specific branches. Companion security parser handles inline security-requirement lines (e.g. `oauth2: [read, write]`) used by both routes and the spec meta block. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
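For the inline security-requirement form, a rough sketch of parsing a line such as `oauth2: [read, write]` might look like this. `ParseSecurityLine` is a hypothetical name, and the accepted syntax (a scheme name, a colon, and an optional bracketed comma-separated scope list) is an assumption based on the example in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseSecurityLine splits "name: [scope, scope]" into its parts.
// An empty or missing scope list yields an empty slice.
func ParseSecurityLine(line string) (string, []string, error) {
	i := strings.IndexByte(line, ':')
	if i < 0 {
		return "", nil, fmt.Errorf("security requirement %q: missing ':'", line)
	}
	name := strings.TrimSpace(line[:i])
	if name == "" {
		return "", nil, fmt.Errorf("security requirement %q: empty scheme name", line)
	}
	rest := strings.TrimSpace(line[i+1:])
	scopes := []string{}
	if rest == "" {
		return name, scopes, nil
	}
	if !strings.HasPrefix(rest, "[") || !strings.HasSuffix(rest, "]") {
		return "", nil, fmt.Errorf("security requirement %q: scopes must be bracketed", line)
	}
	for _, s := range strings.Split(rest[1:len(rest)-1], ",") {
		if s = strings.TrimSpace(s); s != "" {
			scopes = append(scopes, s)
		}
	}
	return name, scopes, nil
}

func main() {
	name, scopes, err := ParseSecurityLine("oauth2: [read, write]")
	fmt.Println(name, scopes, err)
}
```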
common.Builder holds the per-decl state embedded by every concrete builder: scanner context, active declaration, parsed-block cache (memoised by *ast.CommentGroup pointer), diagnostic accumulator, post-decl queue with per-Builder dedup, and the slog logger. MakeRef writes `$ref: "#/definitions/<name>"` onto a SwaggerTypable target. AppendPostDecl enqueues the referenced decl on the post-decl queue for the orchestrator's discovery loop. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
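The three mechanisms named above (a cache memoised by comment-group pointer, a de-duplicated post-decl queue, and `$ref` construction) can be sketched together. Everything here is a simplified stand-in: `Decl`, the `Builder` fields, and a `MakeRef` that writes into a plain map rather than a SwaggerTypable target are assumptions for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
)

// Decl is a hypothetical stand-in for a discovered declaration.
type Decl struct{ Name string }

// Builder sketches the shared per-decl state.
type Builder struct {
	blocks map[*ast.CommentGroup]string // parsed-block cache keyed by pointer
	parses int                          // how many times parsing actually ran
	queue  []*Decl                      // post-decl queue
	queued map[*Decl]bool               // per-Builder dedup set
}

func NewBuilder() *Builder {
	return &Builder{blocks: map[*ast.CommentGroup]string{}, queued: map[*Decl]bool{}}
}

// ParsedBlock "parses" a comment group once and serves later calls from cache.
func (b *Builder) ParsedBlock(cg *ast.CommentGroup) string {
	if blk, ok := b.blocks[cg]; ok {
		return blk
	}
	b.parses++
	blk := cg.Text() // placeholder for a real grammar parse
	b.blocks[cg] = blk
	return blk
}

// AppendPostDecl enqueues d unless this Builder already queued it.
func (b *Builder) AppendPostDecl(d *Decl) {
	if b.queued[d] {
		return
	}
	b.queued[d] = true
	b.queue = append(b.queue, d)
}

// MakeRef writes a definitions reference onto the target.
func MakeRef(target map[string]any, name string) {
	target["$ref"] = "#/definitions/" + name
}

// Demo exercises the cache, the queue, and MakeRef.
func Demo() (int, int, string) {
	b := NewBuilder()
	cg := &ast.CommentGroup{List: []*ast.Comment{{Text: "// swagger:model Pet"}}}
	b.ParsedBlock(cg)
	b.ParsedBlock(cg)
	pet := &Decl{Name: "Pet"}
	b.AppendPostDecl(pet)
	b.AppendPostDecl(pet)
	target := map[string]any{}
	MakeRef(target, pet.Name)
	return b.parses, len(b.queue), target["$ref"].(string)
}

func main() {
	fmt.Println(Demo())
}
```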
Reusable Walker keyword handlers shared across schema, parameters, responses, headers: Number, Integer, UniqueBool, PatternString, Extension, and the parameter-level / schema-level dispatch factories. Built once here so each per-decl builder wires them into its Walker rather than re-implementing. keywords.go catalogs the SimpleSchema-allowed subset for the parameter and header surfaces. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
CoerceEnum, ParseDefault, IsLegalForType: convert grammar-level values to Go-typed Swagger payloads after checking the keyword is legal for the field's underlying Go type. Used by the Walker handlers to reject malformed input early with positioned diagnostics rather than silently producing invalid spec. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
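A sketch of type-aware gating and coercion, under stated assumptions: the function names mirror the ones in the commit message, but the signatures, the keyword groupings, and the `basic` lookup helper are invented for this example. It uses `go/types` basic-type flags to decide whether a keyword applies and to convert a comma-separated enum list.

```go
package main

import (
	"fmt"
	"go/types"
	"strconv"
	"strings"
)

// basic looks up a predeclared type by name (hypothetical helper).
func basic(name string) types.Type {
	obj := types.Universe.Lookup(name)
	if obj == nil {
		return types.Typ[types.Invalid]
	}
	return obj.Type()
}

// info returns the basic-type flags of t's underlying type, or 0.
func info(t types.Type) types.BasicInfo {
	if b, ok := t.Underlying().(*types.Basic); ok {
		return b.Info()
	}
	return 0
}

// IsLegalForType reports whether keyword makes sense for t.
// The keyword groupings are an illustrative assumption.
func IsLegalForType(keyword string, t types.Type) bool {
	switch keyword {
	case "maximum", "minimum", "multipleOf":
		return info(t)&types.IsNumeric != 0
	case "pattern", "minLength", "maxLength":
		return info(t)&types.IsString != 0
	}
	return true
}

// CoerceEnum converts "a, b, c" into Go values matching t.
func CoerceEnum(list string, t types.Type) ([]any, error) {
	var out []any
	for _, raw := range strings.Split(list, ",") {
		s := strings.TrimSpace(raw)
		switch i := info(t); {
		case i&types.IsString != 0:
			out = append(out, s)
		case i&types.IsInteger != 0:
			n, err := strconv.ParseInt(s, 10, 64)
			if err != nil {
				return nil, fmt.Errorf("enum value %q is not an integer", s)
			}
			out = append(out, n)
		case i&types.IsFloat != 0:
			f, err := strconv.ParseFloat(s, 64)
			if err != nil {
				return nil, fmt.Errorf("enum value %q is not a number", s)
			}
			out = append(out, f)
		default:
			return nil, fmt.Errorf("enum not supported for type %s", t)
		}
	}
	return out, nil
}

func main() {
	fmt.Println(IsLegalForType("maximum", basic("string")))
	fmt.Println(CoerceEnum("1, 2, 3", basic("int")))
}
```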
SwaggerSchemaForType resolves a go/types.Type to its Swagger representation (basic types, named types, slices, maps, pointers, interfaces). Also adds identity and assertion helpers, plus the ItemsTypable / ItemsValidations interface adapters that let array-of-array descent share Walker handlers with the top-level schema builder. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Builds OAI2 Swagger schemas from discovered Go types, consuming grammar-parsed comment blocks. Supports the full Schema surface (allOf compounds for $ref overrides, struct / embedded field walks, named-type aliases, enum inlining, special-type recognition for `error` and `time.Time`) and the SimpleSchema mode — the cut-down keyword surface used by non-body parameters and response headers. Field walker dispatches per-call-site classifiers in a single pass over the doc; allOf $ref overrides are wrapped so vendor extensions surface at the outer compound; DescWithRef toggles the description-on-$ref behaviour; SkipExtensions suppresses x-go-* vendor extensions. Invalid constructs emit a diagnostic (not consumed for now). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Parses swagger:parameters and swagger:response struct declarations, dispatching each field through the schema builder in full or SimpleSchema mode based on the in: location. Parameters and responses share Walker dispatch, the Typable adapter, and the doc-signal classifier. Body parameters and body responses delegate fully to the schema builder; non-body parameters and response headers use SimpleSchema; swagger:file gates the file-upload form. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Parses swagger:operation annotations. The annotation body is YAML; the parser delegates to parsers/yaml for unmarshal then maps the resulting structure onto spec.Operation, surfacing parameter / response / consumer / producer / scheme / security properties to the orchestrator. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Parses swagger:route annotations: header line (method + path + operation ID + tags), then a multi-line body grammar consumed via parsers/routebody. Surfaces parameters, responses, consumers, producers, schemes, and security to the orchestrator. Serves the same purpose as swagger:operation, but is radically different in shape. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Top-level orchestrator: parses swagger:meta to seed Info / Schemes / Host / BasePath / Consumes / Produces / Security / SecurityDefinitions / Tags / extensions. Then drives the discovery loop — visiting each per-decl Builder's post-decl queue, re-deduping at consumption time, and assembling the final *spec.Swagger document returned by the public Run entry point. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
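The discovery loop is essentially a worklist traversal. The sketch below assumes a hypothetical `Discover` function and models "building a declaration" as a lookup in a map of references; it shows de-duplication happening when an item is consumed from the queue, which keeps cycles between types from looping forever.

```go
package main

import "fmt"

// Discover walks from the root declarations and returns every reachable
// name once, in discovery order. refs maps a declaration to the names it
// references (a stand-in for what a real builder would enqueue).
func Discover(roots []string, refs map[string][]string) []string {
	queue := append([]string(nil), roots...)
	seen := map[string]bool{}
	var defs []string
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		if seen[name] {
			continue // dedup at consumption time
		}
		seen[name] = true
		defs = append(defs, name)
		queue = append(queue, refs[name]...)
	}
	return defs
}

func main() {
	refs := map[string][]string{
		"Pet":   {"Owner", "Tag"},
		"Owner": {"Pet", "Tag"}, // cycle back to Pet
	}
	fmt.Println(Discover([]string{"Pet"}, refs))
}
```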
Adapts the existing scanner to the grammar surface with minimal change. Wires Options.OnDiagnostic into the per-decl Builders, exposes the ScanCtx FileSet so grammar parsers produce position-stable diagnostics, and surfaces the scanner-level annotation classifiers (ExtractAnnotation, ModelOverride, ResponseOverride, ParametersOverride) under internal/parsers/. Note: a handful of regular expressions remain in the scanner — annotation discovery, model / response / parameters override matching, route / operation path-annotation tokenisation. Removing these is deferred to a forthcoming scanner-focused change. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
The keystone that warrants this refactor: every change is validated
by comparing produced spec JSON against captured golden fixtures,
regenerable in bulk via UPDATE_GOLDEN=1 go test ./...
  internal/scantest           - load helpers and CompareOrDumpJSON
  internal/integration        - black-box scanner runs across fixture trees
  fixtures/goparsing          - historic corpus (classification, petstore,
                                go118/go119/go123 variants, invalid inputs)
  fixtures/enhancements       - one sub-tree per isolated branch-coverage
                                scenario (swagger-type-array, alias-expand,
                                allof-edges, named-basic, interface-methods, …)
  fixtures/bugs               - minimised repros for specific upstream IDs
  fixtures/integration/golden - captured spec.Swagger JSON
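The golden-file workflow can be sketched as below. This is not the actual internal/scantest helper: the signature is an assumption (the real helper presumably reports through the test framework rather than returning an error), and `selfCheck` exists only to exercise the sketch. The behaviour shown is the one described above: write the fixture when UPDATE_GOLDEN=1 is set, otherwise compare against it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// compareOrDump writes v as indented JSON to path when update is true,
// and otherwise compares it with the file's current content.
func compareOrDump(path string, v any, update bool) error {
	got, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err
	}
	if update {
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(bytes.TrimSpace(want), bytes.TrimSpace(got)) {
		return fmt.Errorf("golden mismatch for %s", path)
	}
	return nil
}

// CompareOrDumpJSON picks the mode from the UPDATE_GOLDEN environment variable.
func CompareOrDumpJSON(path string, v any) error {
	return compareOrDump(path, v, os.Getenv("UPDATE_GOLDEN") == "1")
}

// selfCheck dumps a document, re-compares it, then expects a mismatch.
func selfCheck() error {
	dir, err := os.MkdirTemp("", "golden")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "golden", "pet.json")
	doc := map[string]any{"swagger": "2.0"}
	if err := compareOrDump(path, doc, true); err != nil {
		return err
	}
	if err := compareOrDump(path, doc, false); err != nil {
		return err
	}
	doc["host"] = "example.com"
	if err := compareOrDump(path, doc, false); err == nil {
		return fmt.Errorf("expected a mismatch after changing the document")
	}
	return nil
}

func main() {
	if err := selfCheck(); err != nil {
		panic(err)
	}
	fmt.Println("golden round-trip ok")
}
```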
Quirks surfaced and fixed during the refactor, now reflected in the
goldens:
- Enum/const detection: comma-list whitespace trim;
  no-matching-const warning; stale x-go-enum-desc cleared on
  inline override.
- Parameter/response parity: explicit `in: header` default;
  diagnostic on invalid `in:` values; $ref on response headers
  suppressed (SetRef no-ops under non-body mode); swagger:file
  gated to in:body with diagnostic on misuse; buildFieldAlias
  gate brought to parity with parameters; unexported-field skip
  logged for parity with parameters.
- Multi-line bodies: YAML-list sub-parser handles nested route
  body blocks; annotation terminator must start at line-start.
- Description accumulation: leading-space artifact stripped on
  routebody response descriptions when description: leads a
  multi-token line.
- $ref / description coexistence: DescWithRef revived as the
  description-on-$ref toggle; $ref-with-overrides wrapped as
  allOf compounds so vendor extensions surface on the outer
  compound; SkipExtensions=true produces bare $ref for
  description-only $ref'd fields.
- Schema validation gating: type-aware shape check rejects
  keywords illegal for the field's underlying Go type, with
  positioned diagnostics rather than silent invalid spec.
- Post-decl propagation: map-body sub-build propagates its
  PostDeclarations back to the outer Builder so referenced types
  reach the orchestrator's discovery loop.
Remaining quirks tracked as known issues for follow-up.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
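The "$ref-with-overrides wrapped as allOf" item in the commit above can be illustrated with a small sketch. `WrapRef` is a hypothetical function operating on plain maps; it deliberately does not model the DescWithRef or SkipExtensions toggles, whose exact semantics are not spelled out here. It only shows the general shape: a bare `$ref` when nothing is overridden, and an outer compound carrying the description and vendor extensions otherwise.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WrapRef returns a bare {"$ref": ...} when there are no overrides, and an
// allOf compound carrying the overrides on the outer object otherwise.
func WrapRef(ref, description string, extensions map[string]any) map[string]any {
	target := map[string]any{"$ref": ref}
	if description == "" && len(extensions) == 0 {
		return target
	}
	out := map[string]any{"allOf": []any{target}}
	if description != "" {
		out["description"] = description
	}
	for k, v := range extensions {
		out[k] = v
	}
	return out
}

// WrapRefJSON renders the result as compact JSON (map keys are sorted).
func WrapRefJSON(ref, description string, extensions map[string]any) string {
	b, err := json.Marshal(WrapRef(ref, description, extensions))
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(WrapRefJSON("#/definitions/User", "", nil))
	fmt.Println(WrapRefJSON("#/definitions/User", "The owner.", map[string]any{"x-go-name": "Owner"}))
}
```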
Four documents under ./docs:
  annotations.md   - author cheatsheet: every swagger:* annotation
                     with argument shape, Go code samples, fixture
                     pointers, and a compatibility matrix.
  keywords.md      - per-keyword reference card with value shapes,
                     annotation contexts, and aliases.
  sub-languages.md - embedded mini-languages: flex-list, route
                     body, response body, YAML extensions, security
                     requirements, contact/license inline forms.
  grammar.md       - formal EBNF: preprocessor, lexer, parser,
                     Walker, diagnostics, and what the grammar does
                     not cover.
Hugo frontmatter (title + weight) is present in anticipation of a
future docs site; the documents render fine as plain markdown
today.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Catch-all for non-code housekeeping that lands with this branch:
  go.mod / go.sum   - version bumps; no new dependencies introduced.
  .golangci.yml     - linter config updates.
  .gitattributes    - line-ending normalisation hints.
  .claude/CLAUDE.md - project notes refreshed to reflect the new
                      package layout.
  internal/parsers/grammar/README.md - long-form maintainer notes
                      for the grammar package (parallels the
                      per-package READMEs added under builders/).
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
…Model Mechanical migration of the eight remaining FindModel call sites in the parameters and responses builders to the explicit pair GetModel (pure read) + AppendPostDecl (queueing for the orchestrator's discovery loop). FindModel's implicit registration in ExtraModels — which fires on the FindDecl-fallback path — surprises readers and would pull stdlib types like time.Time / json.RawMessage into top-level definitions when they should be inlined where referenced. No golden impact: the current fixture corpus reaches these paths only through schema.applyStdlibSpecials, which short-circuits stdlib types before any model lookup. The change closes the hole defensively rather than as a regression fix. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Adds an alias-to-unannotated-target fixture exercised under RefAliases=true. The body of WitnessParams and WitnessResponse is a Go alias whose RHS is an unannotated struct — exactly the shape that, under the old FindModel, would have triggered the implicit ExtraModels registration on the FindDecl-fallback path. Verified A/B against 21148c1^: the golden is byte-identical under the deprecated FindModel and the explicit GetModel + AppendPostDecl pair. PlainTarget reaches spec.definitions through the orchestrator's discovery loop in both cases. The fixture locks the equivalence and serves as institutional memory. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>