JS: Do not taint whole array when storing into ArrayElement #18790

asgerf · 2025-02-14T15:07:21Z

When a flow summary uses ArrayElement in its output column, we previously generated a taint step in addition to the store step into the array. This PR removes that rule, so that reading from ArrayElement generates a taint step, but storing into ArrayElement does not.

The rule was there in order to be consistent with how things worked with the old data flow library, where arrays were generally considered tainted if they contained a tainted element, but we don't really do that anymore. This PR makes a couple of semi-related changes in order to recover from observed FPs/FNs resulting from the first commit.
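To make the difference between the two rules concrete, here is a minimal toy model. It is only a sketch: the abstract value shape and the helper names (`storeElement`, `readElement`, `joinElements`, `useArrayItself`) are hypothetical and are not how the CodeQL data flow library is implemented.

```javascript
// Toy abstract value: `self` means the value itself is tainted,
// `element` means some array element inside it is tainted.
// Illustrative model only, not CodeQL's implementation.
const tainted = { self: true, element: false };

// Store step into ArrayElement. With `alsoTaintArray` set (the old rule),
// storing a tainted value also marks the whole array as tainted.
function storeElement(value, alsoTaintArray) {
  return { self: alsoTaintArray && value.self, element: value.self };
}

// Read step out of ArrayElement, e.g. `arr[0]`.
function readElement(array) {
  return { self: array.element, element: false };
}

// Taint step on read: an operation that consumes the elements,
// such as `arr.join(",")`, yields a tainted result.
function joinElements(array) {
  return { self: array.self || array.element, element: false };
}

// A use of the array object itself, e.g. passing `arr` directly to a sink.
function useArrayItself(array) {
  return { self: array.self, element: false };
}

const oldRule = storeElement(tainted, true);
const newRule = storeElement(tainted, false);

console.log(useArrayItself(oldRule).self); // true
console.log(useArrayItself(newRule).self); // false
console.log(readElement(newRule).self);    // true
console.log(joinElements(newRule).self);   // true
```

In this model the taint is still reachable under the new rule, but only through an operation that actually reads the elements.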

The DCA run looks good:

- 4 fixed FPs.
- 1 lost result from ExceptionXss. That query has an AP limit of 1, which is no longer enough to find the flow. The result was not exploitable; it seemed like the sort of FP some users might have appreciated and fixed anyway, but on the whole it's not critical to preserve it.
- A minor but clean speedup: 3.5% on average, with 8% and 9% on the two slowest projects.

erik-krogh

Looks good, just two minor questions.

erik-krogh · 2025-02-18T13:40:01Z

javascript/ql/lib/semmle/javascript/internal/flow_summaries/AmbiguousCoreMethods.qll

 }
+
+class ToString extends SummarizedCallable {
+  ToString() { this = "Object#toString / Array#toString" }


Is this a model of Object#toString? It only handles array elements.

I was a bit torn about what to name the summary. It applies to all method calls to .toString(), so I thought it would look confusing if someone were to look at the call graph and see all toString calls resolving to Array#toString. The combined name makes it clear that this summary is responsible for toString calls in general.

The "correct" model for Object#toString is an empty set of flow summaries so there's just nothing to do for that case.
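For reference, the runtime behaviour behind this distinction can be checked with a short snippet. The payload string is a made-up stand-in for attacker-controlled data; the rest uses only standard JavaScript semantics.

```javascript
// Stand-in for attacker-controlled data (hypothetical payload).
const tainted = "<img src=x onerror=alert(1)>";

// Array#toString joins the elements with commas, so element content
// ends up in the resulting string.
const fromArray = [tainted, "ok"].toString();

// Implicit stringification of an array goes through the same path.
const fromConcat = "" + [tainted];
const fromTemplate = `${[tainted]}`;

// The default Object#toString ignores property values entirely.
const fromObject = ({ value: tainted }).toString();

console.log(fromArray.includes(tainted)); // true
console.log(fromConcat === tainted);      // true
console.log(fromTemplate === tainted);    // true
console.log(fromObject);                  // [object Object]
```

This is why a summary that only reads from ArrayElement covers the array case, while the plain object case needs no flow at all.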

erik-krogh · 2025-02-18T15:24:34Z

javascript/ql/lib/semmle/javascript/frameworks/LodashUnderscore.qll

+
+    override predicate propagatesFlow(string input, string output, boolean preservesValue) {
+      input = "Argument[0].ArrayElement" and
+      output = ["Argument[1].Parameter[1]", "ReturnValue"] and


Suggested change

-      output = ["Argument[1].Parameter[1]", "ReturnValue"] and
+      output = ["Argument[1].Parameter[0]", "ReturnValue"] and

I found this usage example: _.sortBy(users, [function(o) { return o.user; }]);
Seems the relevant value is the first parameter.

Also, you can specify an array of functions, but I'm not sure we need to model that.

Good catch, fixed the parameter index. I was just trying to reach parity with the old model, but without jump steps.
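The parameter index can be sanity-checked with a small stand-in. The `sortBy` function below is a hypothetical re-implementation that only mimics the calling convention of `_.sortBy` (each iteratee is called with a single argument, the element); it is not lodash's code, and it accepts either one function or an array of functions.

```javascript
// Hypothetical stand-in that mimics the calling convention of _.sortBy:
// each iteratee is invoked with a single argument, the element.
function sortBy(collection, iteratees) {
  const fns = Array.isArray(iteratees) ? iteratees : [iteratees];
  return [...collection].sort((a, b) => {
    for (const fn of fns) {
      const ka = fn(a);
      const kb = fn(b);
      if (ka < kb) return -1;
      if (ka > kb) return 1;
    }
    return 0;
  });
}

const users = [{ user: "fred" }, { user: "barney" }];
const seen = [];
const sorted = sortBy(users, [
  function (o) {
    seen.push(arguments.length); // record how many arguments arrive
    return o.user;
  },
]);

console.log(sorted.map((u) => u.user));  // [ 'barney', 'fred' ]
console.log(seen.every((n) => n === 1)); // true
```

The element of the first argument arrives as the callback's first parameter and also reappears as an element of the returned array, which is the flow the corrected `Argument[1].Parameter[0]` output describes.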

We need a more scalable way to make tests for libraries like lodash. I actually started writing tests for these but realised I was basically writing the same code twice in two different forms and would be likely to repeat the same mistakes in both places. I tried a fairly simple use of Copilot autocompletion to generate the tests, but it did not work well.

JS: Do not store into arrays implicitly

283954d

github-actions bot added the JS label Feb 14, 2025

asgerf added 10 commits February 17, 2025 10:36

JS: Update changes to nodes/edges/subpaths

d79f429

No changes in actual alerts

JS: Add test for flow through Buffer.concat

e8d1703

This flow was lost since the existing model of concat() boxes its return value in ArrayElement. There is no explicit model of Buffer.concat.
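As a rough illustration of why the two `concat` calls differ, the snippet below compares them at runtime in Node.js. The payload is a made-up stand-in, and the explanation in the comments is one reading of the commit message rather than a statement about the model's internals.

```javascript
// Stand-in for attacker-controlled data (hypothetical payload).
const tainted = "user-controlled";

// Array#concat returns an array, so the input elements end up
// as elements of the result.
const joinedArray = ["prefix"].concat([tainted]);

// Buffer.concat takes an array of buffers and returns a single Buffer,
// not an array: the element content ends up in the result itself.
const joinedBuffer = Buffer.concat([Buffer.from("prefix:"), Buffer.from(tainted)]);

console.log(Array.isArray(joinedArray));    // true
console.log(Array.isArray(joinedBuffer));   // false
console.log(joinedBuffer.toString("utf8")); // prefix:user-controlled
```

A summary that places its output in ReturnValue.ArrayElement matches the first call but not the second, so without the whole-array taint rule the Buffer.concat flow needs its own coverage.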

JS: Model Array#toString

d87534c

JS: Add test with implicit array stringification

a74b203

JS: Handle Array.prototype.toString calls

33ab7db

JS: Handle a few other stringification contexts

352924f

JS: Add a negative test

08b9d93

JS: Convert some exception steps to legacy

4e325d9

JS: Port lodash callback steps to flow summaries

6e074c3

Not all of lodash, just the callbacks we already modeled plus a few easy ones

JS: Target post-update node instead of getALocalSource

a54f0a7

getAPropertyWrite() contains getALocalSource() under the hood. Don't rely on that to find the successor of a mutation.

github-advanced-security bot found potential problems Feb 17, 2025

javascript/ql/lib/semmle/javascript/frameworks/LodashUnderscore.qll (fixed)

asgerf added 2 commits February 17, 2025 20:30

JS: Accept some unproblematic consistency warnings

c958702

JS: Linter fix

e610683

asgerf changed the title ~~JS: Do not store into arrays implicitly~~ JS: Do not taint whole array when storing into ArrayElement Feb 18, 2025

JS: Change note

82a4b17

github-actions bot added the documentation label Feb 18, 2025

asgerf marked this pull request as ready for review February 18, 2025 08:44

asgerf requested a review from a team as a code owner February 18, 2025 08:44

erik-krogh previously approved these changes Feb 18, 2025

JS: Fix model of _.sortBy

7486742

asgerf dismissed erik-krogh’s stale review via 7486742 February 18, 2025 15:56

JS: Handle array of sorting criteria

804a1a6

erik-krogh approved these changes Feb 18, 2025

asgerf merged commit 58c8b5f into github:main Feb 19, 2025
13 checks passed
