Commit ceb316d

Merge pull request #527 from markshannon/python-security-change-note
Collated python change notes
2 parents 61f5c2e + 4f5cfbc commit ceb316d

1 file changed: 94 additions & 0 deletions

@@ -0,0 +1,94 @@
# Improvements to Python analysis

## General improvements

> Changes that affect alerts in many files or from many queries
> For example, changes to file classification

### Representation of the control flow graph

The representation of the control flow graph (CFG) has been modified to better reflect the semantics of Python.

The following statement types no longer have a CFG node for the statement itself, as their sub-expressions already contain all the semantically significant information:

* `ExprStmt`
* `If`
* `Assign`
* `Import`

For example, the CFG for the statement `if cond: foo` with `else: bar` now starts with the CFG node for `cond`, rather than with a node for the `if` statement itself.

For the following statement types, the CFG node for the statement now follows the CFG nodes of its sub-expressions to better reflect the semantics:

* `Print`
* `TemplateWrite`
* `ImportStar`

For example, the CFG for `print foo` (in Python 2) has changed from `print -> foo` to `foo -> print`, better reflecting the runtime behavior.

The CFG for the `with` statement has been re-ordered to more closely reflect the semantics.
For the `with` statement:
```python
with cm as var:
    body
```
The order of the CFG changes from:

    <with>
    cm
    var
    body

to:

    cm
    <with>
    var
    body

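The new ordering mirrors what happens at runtime: the context-manager expression is evaluated first, then the manager is entered, then the result is bound to the target, and only then does the body run. The following is a minimal sketch that records this order in plain Python. The `Recorder` class, `make_cm` function, and `events` list are illustrative names and are not part of the analysis library.

```python
events = []


class Recorder:
    """Hypothetical context manager that records when it is entered and exited."""

    def __enter__(self):
        events.append("<with>")  # the with statement itself takes effect here
        return "resource"        # this value is what gets bound to `var`

    def __exit__(self, exc_type, exc_value, traceback):
        events.append("__exit__")
        return False  # do not suppress exceptions


def make_cm():
    events.append("cm")  # the context-manager expression is evaluated first
    return Recorder()


def run():
    del events[:]  # start from a clean log on every call
    with make_cm() as var:
        events.append("body sees var=" + var)
    return list(events)


print(run())
```

The recorded events come out as `cm`, `<with>`, the body (which already sees the bound `var`), and finally `__exit__`, matching the re-ordered CFG.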
A new predicate `Stmt.getAnEntryNode()` has been added to make it easier to write reachability queries involving statements.

## New queries

| **Query** | **Tags** | **Purpose** |
|-----------------------------|-----------|--------------------------------------------------------------------|
| Assert statement tests the truth value of a literal constant (`py/assert-literal-constant`) | reliability, correctness | Checks whether an assert statement is testing the truth of a literal constant value. Not shown by default. |
| Information exposure through an exception (`py/stack-trace-exposure`) | security, external/cwe/cwe-209, external/cwe/cwe-497 | Finds instances where information about an exception may be leaked to an external user. Enabled on LGTM by default. |

## Changes to existing queries

All taint-tracking queries now support visualization of paths in QL for Eclipse.
Most security alerts are now visible on LGTM by default.

| **Query** | **Expected impact** | **Change** |
|----------------------------|------------------------|------------------------------------------------------------------|
| Code injection (`py/code-injection`) | Supports path visualization and is now visible on LGTM by default | No change to expected results |
| Deserializing untrusted input (`py/unsafe-deserialization`) | Supports path visualization | No change to expected results |
| Encoding error (`py/encoding-error`) | Better alert location | Alert is now shown at the position of the first offending character, rather than at the top of the file. |
| Missing call to \_\_init\_\_ during object initialization (`py/missing-call-to-init`) | Fewer false positive results | Results where it is likely that the full call chain has not been analyzed are no longer reported. |
| Reflected server-side cross-site scripting (`py/reflective-xss`) | Supports path visualization and is now visible on LGTM by default | No change to expected results |
| SQL query built from user-controlled sources (`py/sql-injection`) | Supports path visualization and is now visible on LGTM by default | No change to expected results |
| Uncontrolled data used in path expression (`py/path-injection`) | Supports path visualization and is now visible on LGTM by default | No change to expected results |
| Uncontrolled command line (`py/command-line-injection`) | Supports path visualization and is now visible on LGTM by default | No change to expected results |
| URL redirection from remote source (`py/url-redirection`) | Fewer false positive results and now supports path visualization | Taint is no longer tracked from the right-hand side of binary expressions. In other words, `SAFE + TAINTED` is now treated as safe. |

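
The reasoning behind treating `SAFE + TAINTED` as safe for `py/url-redirection` can be illustrated at runtime: when a trusted prefix that already fixes the scheme and host comes first, appended data does not change the host. The following is a minimal sketch under stated assumptions. `SAFE_PREFIX` and both helper functions are hypothetical names, and the prefix is assumed to end with `/`, which is what pins the host.

```python
from urllib.parse import urlsplit

# Assumption: a trusted constant that fixes scheme and host and ends with "/".
SAFE_PREFIX = "https://example.com/"


def host_when_safe_first(tainted):
    """Host of SAFE + TAINTED: user data can only extend the path."""
    return urlsplit(SAFE_PREFIX + tainted).netloc


def host_when_tainted_first(tainted):
    """Host of TAINTED + SAFE: user data controls the start of the URL."""
    return urlsplit(tainted + "/home").netloc


print(host_when_safe_first("//evil.example/x"))
print(host_when_tainted_first("https://evil.example"))
```

In the first call the host stays `example.com`. In the second call the attacker-supplied value decides the host, which is why only the left-hand operand of the concatenation continues to carry taint.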
## Changes to code extraction

* Improved scalability: scaling is near linear up to at least 20 CPU cores.
* Five levels of logging can be selected: `ERROR`, `WARN`, `INFO`, `DEBUG` and `TRACE`. `WARN` is the stand-alone default, but `INFO` will be used when run by LGTM.
* The `-v` flag can be specified multiple times to increase the logging level by one per `-v`.
* The `-q` flag has been added and can be specified multiple times to reduce the logging level by one per `-q`.
* Log lines are now in the `[SEVERITY] message` style and never overlap.
* The extractor now outputs the location of the first offending character when an `EncodingError` is encountered.

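The interaction between the default level and the `-v` and `-q` flags described above can be sketched as follows. This is not the extractor's code: the `resolve_level` function is a hypothetical helper, and two behaviors are assumptions that the notes do not state, namely that `-v` and `-q` offset each other and that the result is clamped to the `ERROR`..`TRACE` range.

```python
import argparse

# Levels from least to most verbose, in the order listed in the notes.
LEVELS = ["ERROR", "WARN", "INFO", "DEBUG", "TRACE"]


def resolve_level(argv, default="WARN"):
    """Return the level name after applying each -v (+1) and -q (-1)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-v", action="count", default=0)
    parser.add_argument("-q", action="count", default=0)
    args, _unknown = parser.parse_known_args(argv)
    index = LEVELS.index(default) + args.v - args.q
    # Assumption: out-of-range requests are clamped rather than rejected.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


print(resolve_level(["-v", "-v"]))             # stand-alone default is WARN
print(resolve_level(["-q"], default="INFO"))   # INFO is the default under LGTM
```

Starting from the stand-alone default of `WARN`, two `-v` flags select `DEBUG`. Starting from `INFO`, one `-q` selects `WARN`.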
## Changes to QL libraries

* Taint tracking analysis now understands HTTP requests in the `twisted` library.
* The analysis now handles `isinstance` and `issubclass` tests involving the basic abstract base classes better. For example, the test `issubclass(list, collections.Sequence)` is now understood to be `True`.
* Taint tracking automatically tracks tainted mappings and collections, without you having to add additional taint kinds. This means that custom taints are tracked from `x` to `y` in the following flow: `l = [x]; y = l[0]`.
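
The runtime facts that the analysis now models for abstract base classes can be checked directly in Python. The following is a small sketch. It uses `collections.abc.Sequence`, because the `collections.Sequence` alias used in the note above was removed in Python 3.10. The `runtime_facts` dictionary is only an illustrative name.

```python
from collections.abc import Mapping, Sequence

# Each entry pairs a test expression with the value Python computes for it.
runtime_facts = {
    "issubclass(list, Sequence)": issubclass(list, Sequence),
    "issubclass(dict, Mapping)": issubclass(dict, Mapping),
    "isinstance((), Sequence)": isinstance((), Sequence),
    "issubclass(set, Sequence)": issubclass(set, Sequence),
}

for expression, value in runtime_facts.items():
    print(expression, "->", value)
```

The first three tests are `True`. The last one is `False`, since sets are not sequences.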
