Skip to content

Conversation

@asgerf
Copy link
Contributor

@asgerf asgerf commented Feb 14, 2025

When a flow summary uses ArrayElement in its output column, we'd previously generate a taint step in addition to a store step into the array. This PR removes that rule, so that reading from ArrayElement generates a taint step, but storing into ArrayElement does not.

The rule was there in order to be consistent with how things worked with the old data flow library, where arrays were generally considered tainted if they contained a tainted element, but we don't really do that anymore. This PR makes a couple of semi-related changes in order to recover from observed FPs/FNs resulting from the first commit.

The DCA run looks good:

  • 4 fixed FPs
  • 1 lost result from ExceptionXss which is due to that query having an AP limit of 1, which is no longer enough to find the flow. The result was not exploitable; it seemed like the sort of FP some users might have appreciated and fixed anyway, but on the whole it's not critical to preserve it.
  • A minor but clean speedup: Average of 3.5% speedup with 8% and 9% speedup on the two slowest projects.

@github-actions github-actions bot added the JS label Feb 14, 2025
No changes in actual alerts
This flow was lost since the existing model of concat() boxes its return value in ArrayElement. There is no explicit model of Buffer.concat.
Not all of lodash, just the callbacks we already modeled plus a few easy ones
getAPropertyWrite() contains getALocalSource() under the the hood. Don't rely on that to find the successor of a mutation.
@asgerf asgerf changed the title JS: Do not store into arrays implicitly JS: Do not taint whole array when storing into ArrayElement Feb 18, 2025
@asgerf asgerf marked this pull request as ready for review February 18, 2025 08:44
@asgerf asgerf requested a review from a team as a code owner February 18, 2025 08:44
erik-krogh
erik-krogh previously approved these changes Feb 18, 2025
Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just two minor questions.

}

class ToString extends SummarizedCallable {
ToString() { this = "Object#toString / Array#toString" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a model of Object#toString? It only handles array elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit torn about what to name the summary. It applies to all method calls to .toString() so I thought it would look confusing if someone was to look at the call graph and see all toString calls resolving to Array#toString. It makes it clear that this summary is responsible for toString calls in general.

The "correct" model for Object#toString is an empty set of flow summaries so there's just nothing to do for that case.


override predicate propagatesFlow(string input, string output, boolean preservesValue) {
input = "Argument[0].ArrayElement" and
output = ["Argument[1].Parameter[1]", "ReturnValue"] and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
output = ["Argument[1].Parameter[1]", "ReturnValue"] and
output = ["Argument[1].Parameter[0]", "ReturnValue"] and

I found this usage example: _.sortBy(users, [function(o) { return o.user; }]);
Seems the relevant value is the first parameter.

Also, you can specify an array of functions, but I'm not sure we need to model that.

Copy link
Contributor Author

@asgerf asgerf Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed the parameter index. I was just trying to reach parity with the old model, but without jump steps.

We need a more scalable way to make tests for libraries like lodash. I actually started writing tests for these but realised I was basically writing the same code twice in two different forms and would be likely to repeat the same mistakes in both places. I tried a rather simple use of copilot auto completion to generate tests, but it did not work well.

@asgerf asgerf merged commit 58c8b5f into github:main Feb 19, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants