Skip to content

Conversation

@tausbn
Copy link
Contributor

@tausbn tausbn commented Oct 31, 2025

⚠️ Should be reviewed commit-by-commit! ⚠️

This PR removes all points-to related imports from python.qll. For now, the imports have just been made private, on the off-chance that actual removal would affect the behaviour of the optimiser. At a later date, these can be removed fully.

The remaining uses of LegacyPointsTo inside the python/ql/lib directory are in the following files:

ql/lib/analysis/DefinitionTracking.qll
ql/lib/semmle/python/Metrics.qll
ql/lib/semmle/python/SelfAttribute.qll
ql/lib/semmle/python/SpecialMethods.qll
ql/lib/semmle/python/dataflow/old/Configuration.qll
ql/lib/semmle/python/dataflow/old/Files.qll
ql/lib/semmle/python/dataflow/old/Implementation.qll
ql/lib/semmle/python/dataflow/old/StateTracking.qll
ql/lib/semmle/python/dataflow/old/TaintTracking.qll
ql/lib/semmle/python/dependencies/Dependencies.qll
ql/lib/semmle/python/dependencies/DependencyKind.qll
ql/lib/semmle/python/dependencies/TechInventory.qll
ql/lib/semmle/python/libraries/Zope.qll
ql/lib/semmle/python/objects/ObjectAPI.qll
ql/lib/semmle/python/types/Builtins.qll
ql/lib/semmle/python/types/ClassObject.qll
ql/lib/semmle/python/types/Exceptions.qll
ql/lib/semmle/python/types/FunctionObject.qll
ql/lib/semmle/python/types/Object.qll
ql/lib/semmle/python/types/Properties.qll
ql/lib/semmle/python/values/StringAttributes.qll

of these, the ones in /types and /object form the main points-to analysis itself, and /dataflow/old is long since deprecated. The remaining modules have a fairly deep use of points-to, and so removal was not feasible (but none of them are imported from python.qll at this point, so leaving them in place should be fine).

not exists(ExprWithPointsTo left, ExprWithPointsTo right, Value val |
comp.compares(left, op, right) and
exists(ImmutableLiteral il | il.getLiteralValue() = val)
exists(ImmutableLiteral il | il = val.(ConstantObjectInternal).getLiteral())

Check warning

Code scanning / CodeQL

Expression can be replaced with a cast Warning

The assignment in the exists(..) is redundant.
@tausbn tausbn force-pushed the tausbn/python-remove-top-level-points-to-imports branch from 685b2db to d643079 Compare November 25, 2025 16:08
@tausbn tausbn force-pushed the tausbn/python-remove-top-level-points-to-imports branch 3 times, most recently from 6f767db to e532fb6 Compare November 25, 2025 16:55
For now, these have just been made into `private` imports. After doing
this, I went through all of the (now not compiling) files and added in
private imports to the modules that they actually depended on.

I also added an explicit import of `LegacyPointsTo` (even though it may
be unnecessary) in cases where the points-to dependency was somewhat
surprising (and one we want to get rid of). This was primarily inside
the various SSA layers.

For modules inside `semmle.python.{types, objects, pointsto}` I did not
bother, as these are fairly clearly related to points-to.
This frees `Class.qll`, `Exprs.qll`, and `Function.qll` from the
clutches of points-to. For the somewhat complicated setup with
`getLiteralObject` (an abstract method), I opted for a slightly ugly but
workable solution of just defining a predicate on `ImmutableLiteral`
that inlines each predicate body, special-cased to the specific instance
to which it applies.
These methods were in `pointsto.Base` but did not actually interact with
the points-to machinery directly, so they were easy to move out.
One might argue that these should be rewritten entirely to use more
modern APIs, but for now I'll be content with just having them compile
properly.
Mostly just adding `private import LegacyPointsTo`. Sometimes getting
rid of other imports that are superceded by that module.
Gets rid of a bunch of predicates relating to reachability (which
depended on the modelling of exceptions, which uses points-to), moving
them to `LegacyPointsTo`. In the process, we gained a new class
`BasicBlockWithPointsTo`.
Turns out the `ImportTime` module (despite living in
`semmle.python.types` does not actually depend on points-to, so some of
the `LegacyPointsTo` imports could be replaced or removed.
For whatever reason, the CFG node for exceptions and exception groups
was placed with the points-to code. (Probably because a lot of the
predicates depended on points-to.)

However, as it turned out, two of the SSA modules only depended on
non-points-to properties of these nodes, and so it was fairly
straightforward to remove the imports of `LegacyPointsTo` for those
modules.

In the process, I moved the aforementioned CFG node types into
`Flow.qll`, and changed the classes in the `Exceptions` module to the
`...WithPointsTo` form that we introduced elsewhere.
The `Builtins` module is deeply entwined with points-to, so it would be
nice to not have this dependence. Happily, the only thing we used
`Builtin` for was to get the names of known builtins, and for this we
already maintain such a set of names in
`dataflow.new.internal.Builtins`.
@tausbn tausbn force-pushed the tausbn/python-remove-top-level-points-to-imports branch from e532fb6 to 5b47fcb Compare November 26, 2025 12:30
Happily, this was not as deeply entwined as it looked at first glance.
Gets rid of the `getMetrics` methods on the `Function`, `Class`, and
`Module` classes. To access the metrics, one must first import the
`LegacyPointsTo` module, and then either change the type to
`{Function,Class,Module}Metrics` or cast to the appropriate type.
@tausbn tausbn force-pushed the tausbn/python-remove-top-level-points-to-imports branch from 9358997 to e490cb6 Compare November 26, 2025 17:07
In hindsight, having a `.getMetrics()` method that just returns `this`
is somewhat weird. It's possible that it predates the existence of the
inline cast, however.
@tausbn tausbn force-pushed the tausbn/python-remove-top-level-points-to-imports branch from e490cb6 to c6ad438 Compare November 26, 2025 21:58
@tausbn tausbn marked this pull request as ready for review November 27, 2025 12:32
@tausbn tausbn requested a review from a team as a code owner November 27, 2025 12:32
Copilot AI review requested due to automatic review settings November 27, 2025 12:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes all points-to analysis related imports from the top-level python.qll module by making them private. Users who depend on points-to functionality must now explicitly import the new LegacyPointsTo module. This is a breaking change that affects how various classes and predicates are accessed - for example, Function.getFunctionObject() is no longer directly available and requires using FunctionWithPointsTo from the LegacyPointsTo module.

Key Changes:

  • Made points-to related imports private in python.qll
  • Created new LegacyPointsTo module that exposes points-to functionality
  • Refactored test files, queries, and library code to import LegacyPointsTo explicitly
  • Moved predicates and methods from base classes into specialized extension classes (e.g., FunctionWithPointsTo, SsaVariableWithPointsTo)
  • Restructured code organization to separate points-to dependent functionality

Reviewed changes

Copilot reviewed 284 out of 284 changed files in this pull request and generated no comments.

Show a summary per file
File Description
python/ql/lib/python.qll Made points-to and related imports private
python/ql/lib/LegacyPointsTo.qll New module exposing points-to functionality with extension classes
python/ql/lib/semmle/python/*.qll Removed points-to methods from base classes, moved to extension classes
python/ql/lib/semmle/python/types/*.qll Added necessary imports, removed dependencies from exposed modules
python/ql/lib/semmle/python/essa/*.qll Moved helper predicates to appropriate modules
python/ql/test/**/*.ql Added private import LegacyPointsTo to all affected test queries
python/ql/src/**/*.ql Added private import LegacyPointsTo to all affected source queries
python/ql/examples/**/*.ql Added private import LegacyPointsTo to example snippets

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for taking this on, Taus! To my untrained snake eyes this looks good, but it would probably be good to have another pair of snake eyes on it as well before we merge it. I guess we have to wait for DCA to stop misbehaving anyway

tausbn and others added 2 commits November 27, 2025 16:47
This makes things a bit cleaner.

After this, the only non-private (and non-`LegacyPointsTo`) imports of
`semmle.python.{types,objects,pointsto}.*` are in
`semmle.python.objects.ObjectInternal`, which is reasonable, as that is
the entry point for the entire internal object API.
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
@tausbn
Copy link
Contributor Author

tausbn commented Dec 2, 2025

There doesn't seem to be any impact where performance is concerned.

Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, congratulations on getting this far!
I had one question, but it is not blocking.
I think the solution for ImmutableLiterals is perfectly acceptable.

private import semmle.python.dataflow.new.internal.ImportStar
private import semmle.python.dataflow.new.TypeTracking
private import semmle.python.dataflow.new.internal.DataFlowPrivate
private import semmle.python.essa.SsaDefinitions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this necessary now and not before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this is not entirely obvious!

That module was reexported by semmle.python.pointsto.Base (see below) which was imported in python.qll, but now that import has (of course) been made private.

@tausbn tausbn merged commit 1b51938 into main Dec 5, 2025
19 checks passed
@tausbn tausbn deleted the tausbn/python-remove-top-level-points-to-imports branch December 5, 2025 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants