fix: restore lenient variable coercion for built-in scalars in GraphQL runtime#655
fix: restore lenient variable coercion for built-in scalars in GraphQL runtime#655vsseixaso wants to merge 1 commit into
Conversation
…L runtime
graphql-js 14.5+ (bundled with @vtex/api since v6) made variable coercion
strict for the built-in `Int`, `Float`, and `String` scalars: `Int.parseValue`
rejects strings, `Float.parseValue` rejects strings, and `String.parseValue`
rejects numbers. This broke apps migrating from `node@4.x` (which used
`@vtex/api@^3` with graphql 0.13, where these conversions were implicit) to
`node@7.x` — well-formed clients started receiving errors like:
Variable "$categoryId" got invalid value "20"; Expected type Int.
Int cannot represent non-integer value: "20"
Pre-coerce variable values against the operation's variable definitions
before calling `execute()`, replicating the lenient conversions of
graphql-js 0.13 (string of finite integer → Int, number → String, etc.).
The traversal walks NonNull, List, and InputObject wrappers so nested
fields are coerced as well. Custom scalars and enums are left untouched.
Values that cannot be safely converted (e.g. `"abc"` for Int, floats with
fractional parts for Int, arrays for scalars) are passed through unchanged,
so graphql-js still produces the same native error.
Disable with `VTEX_API_DISABLE_LENIENT_VARIABLE_COERCION=true`.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
@mendescamara dropped an interesting suggestion offline that we're evaluating as an alternative to (or in addition to) the pre-processor in this PR. Posting it here so the analysis lives next to the code and we can decide together. The alternative: schema-side
|
| Input | parseInt(value, 10) |
graphql 0.13 (Number + integer check) |
What we want |
|---|---|---|---|
"20.5" |
20 (silently truncates) |
throws | throws |
"20abc" |
20 (stops at non-digit) |
throws | throws |
"3e2" |
3 (not 300) |
300 |
300 |
true |
NaN |
1 |
1 |
[20] |
20 (Array.toString) |
throws | throws |
The first row is the one that worries me most — silently truncating "20.5" to 20 would mask a real client bug instead of surfacing it. Using Number(value) + Number.isInteger() + Int32 range check would address this and match 0.13.x semantics exactly. Easy to fix in either approach.
Where the schema patch would clearly be the better tool
If we ever decide to also relax inline literals (category(id: 20.5) written directly in the query string, no variables), parseLiteral is the right hook and the schema-side approach is the natural fit. That's a separate, narrower scope — the prod logs don't show any such cases today, so it's not blocking this PR, but it's a good reason to keep the schema-side approach on the table.
Suggested next step
Happy to take the schema-side route if we can find a way to scope the mutation per schema (or convince ourselves the global mutation is acceptable). Until then, the pre-processor in this PR is a conservative way to stop the bleeding without touching shared state. WDYT?
| rootValue: null, | ||
| schema, | ||
| variableValues, | ||
| variableValues: coercedVariables, |
There was a problem hiding this comment.
Vamos precisar testar, mas acho que esse approach não vai contemplar os casos de variáveis não literais inline, quando o valor é informado diretamente na Query sem usar o parâmetro variables 🤔 (ex. user(id: "123"))
Diferente de quando passa por variável:
// A query permanece estática (não muda)
const GET_USER = `
query GetUser($userId: ID!) {
user(id: $userId) {
name
}
}
`;
// Os valores são passados separadamente
const variables = { userId: "123" };
client.query({ query: GET_USER, variables });
There was a problem hiding this comment.
Um outro approach seria fazer o Override dos Scalars Nativos via Schema Transform
Após makeExecutableSchema em schema/index.ts, manipular o _typeMap da schema para substituir os scalars built-in
There was a problem hiding this comment.
Validei aqui e inline literal não é coberto mesmo. E também o inline já era rejeitado no graphql 0.13, peguei os erros do log de prod e todos passam as variáveis pelo objeto variables, nenhum usa inline literal. Então o escopo atual resolve os casos reais, sem regressão.
Fiz um teste mandando uma query com inline literal pra mostrar pra identificar a diferença na mensagem de erro:
{
"errors": [{
"message": "Expected type Int, found \"20\".",
"stack": "GraphQLError: Expected type Int, found \"20\".\n at isValidScalar (.../ValuesOfCorrectType.js:159)\n at validate (.../validate.js:73)\n at parseAndValidateQueryToSchema (.../query.js:28)"
}]
}
Erro quando vem com Variable:
{
"errors": [{
"message": "Variable \"$categoryId\" got invalid value \"20\"; Expected type Int. Int cannot represent non-integer value: \"20\"",
"stack": "TypeError: Int cannot represent non-integer value: \"20\"\n at GraphQLScalarType.coerceInt [as parseValue] (.../scalars.js:55)\n at coerceInputValueImpl (.../coerceInputValue.js:127)\n at coerceVariableValues (.../values.js:119)\n at runHttpQuery (.../run.js:9)"
}]
}
No caso, a decisão fica sendo sobre querer suportar algo além do graphql 0.13 ou se fazer "Override dos Scalars Nativos via Schema Transform" teria algum ganho adicional
There was a problem hiding this comment.
Top! Então na versão antiga (graphql 0.13) já tinha essa proteção para o inline?
To super confortável de seguir com esse approach 👊🏻

Summary
Restore the pre-
graphql@14.5lenient coercion of GraphQL variables for the built-inInt,Float, andStringscalars in the runtime that ships with@vtex/api. Apps migrating fromnode@4.x(which used@vtex/api@^3withgraphql@^0.13.2) tonode@7.x(@vtex/api@^7withgraphql@^14.5.8) started seeing GraphQL validation errors for requests that previously worked, even though no resolver, schema, or client code had changed.This patch is a runtime-only fix in
node-vtex-api— no schema changes, no app changes, and no API surface changes.Problem
graphql-js14.5+ madecoerceInt,coerceFloat, andcoerceStringstrict. Before 14.5, those scalars usedNumber(value)andString(value)and accepted these implicit conversions; in 14.5+ they throw immediately, e.g.:These errors come from
coerceVariableValuesdeep insidegraphql-js, called from this very runtime'srun.tsmiddleware. They reproduce reliably by hitting an app's internal__graphqlroute directly (i.e. without going throughvtex.graphql-server, which already does its own pre-coercion before proxying).The same requests succeeded under
@vtex/api@^3becausegraphql@0.13.2'scoerceIntdidNumber(value)on a string andcoerceStringdidString(value)on a number. The migration fromnode@4.x→node@7.xmandates@vtex/api@^7, which transitively bumps graphql-js to^14.5.8, surfacing this regression for every app onnode@7.xwhose clients send numeric strings forInt(or vice versa).How the fix works
A new pure utility
coerceVariableValuesLenient(schema, document, variables, operationName)is invoked from the existingrunmiddleware beforeexecute():OperationDefinitionNodefor the request (resolvesoperationNamewhen there are multiple operations, no-ops on ambiguity).VariableDefinitionNode, resolves the AST type to aGraphQLInputTypeviatypeFromAST.NonNull,List, andInputObjectwrappers, applying lenient coercion at the leaves only when the target is a built-in scalar:Int← string of a finite, in-range integer; booleanFloat← string of a finite number; booleanString← number, booleanID← number → stringCustom scalars and enums are left untouched. Values that are not safely convertible (e.g.
"abc"forInt,"20.5"forInt, arrays for scalars, objects forString, integers outside Int32 range) are passed through unchanged so that graphql-js still emits its native error. We do not silently mask invalid client input.The whole pre-processor is wrapped in a try/catch that falls back to the original variables on any internal failure, so the worst case is identical to today's behavior.
Disabling
Set the env var to opt out without redeploying:
Files changed
src/service/worker/runtime/graphql/utils/coerceVariablesLenient.tssrc/service/worker/runtime/graphql/utils/coerceVariablesLenient.test.tssrc/service/worker/runtime/graphql/middlewares/run.tsexecute()CHANGELOG.md[Unreleased]entryCompatibility & safety
GraphQLInt,GraphQLString, etc. continue to be the standard built-ins; resolvers receive the same JS types as before (numberforInt,stringforString).typeFromAST,isScalarType,isListType,isNonNullType,isInputObjectType).coerceVariableValuesafter our pass; we only feed it a value of the expected JS type when a safe conversion exists.Test plan
$categoryId: Int = "20",ShippingItem.quantity = 1,ShippingItem.seller = 1,ShippingItemsent as a single object instead of a list,geoCoordinates: [String]sent as[Float],postalCode: String = 1000,pageSize: Int = "2")."abc","","20.5", arrays, objects, ints outside Int32 range) are passed through untouched.[Int!]!, nestedInputObjects, unknown fields preserved.operationNamedisambiguation, single-operation default, multi-operation no-op).VTEX_API_DISABLE_LENIENT_VARIABLE_COERCIONfeature flag.execute()call confirming the production error disappears with the fix and reappears without it.LRUCache/axios/metricsfiles — the same 4 fail onmasterwithout this change).Rollout plan
7.3.2release.node@7.xpick it up viayarn upgrade @vtex/api@^7.3.2.VTEX_API_DISABLE_LENIENT_VARIABLE_COERCION=trueto the affected runner — instant, no redeploy. Or revert the commit and ship7.3.3.Made with Cursor