Skip to content

Commit 0ae98e7

Browse files
author
Dave Bartolomeo
committed
Merge remote-tracking branch 'github/master' into github/codeql-c-analysis-team/69_union
2 parents 94c2bba + 398678a commit 0ae98e7

File tree

136 files changed

+2537
-257
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

136 files changed

+2537
-257
lines changed

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
We welcome contributions to our CodeQL libraries and queries. Got an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Contributions to this project are [released](https://help.github.com/articles/github-terms-of-service/#6-contributions-under-repository-license) to the public under the [project's open source license](LICENSE).
44

5-
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [Writing CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
5+
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
66

77

88
## Submitting a new experimental query
@@ -32,7 +32,7 @@ If you have an idea for a query that you would like to share with other CodeQL u
3232

3333
For details, see the [guide on query metadata](docs/query-metadata-style-guide.md).
3434

35-
Make sure the `select` statement is compatible with the query `@kind`. See [Introduction to query files](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
35+
Make sure the `select` statement is compatible with the query `@kind`. See [About CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
3636

3737
3. **Formatting**
3838

change-notes/1.25/analysis-javascript.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@
4343
| Client-side URL redirect (`js/client-side-unvalidated-url-redirection`) | Fewer results | This query now recognizes additional safe patterns of doing URL redirects. |
4444
| Code injection (`js/code-injection`) | More results | More potential vulnerabilities involving NoSQL code operators are now recognized. |
4545
| Expression has no effect (`js/useless-expression`) | Fewer results | This query no longer flags an expression when that expression is the only content of the containing file. |
46+
| Hard-coded credentials (`js/hardcoded-credentials`) | More results | This query now recognizes hard-coded credentials sent via HTTP authorization headers. |
4647
| Incomplete URL scheme check (`js/incomplete-url-scheme-check`) | More results | This query now recognizes additional url scheme checks. |
4748
| Misspelled variable name (`js/misspelled-variable-name`) | Message changed | The message for this query now correctly identifies the misspelled variable in additional cases. |
4849
| Prototype pollution in utility function (`js/prototype-pollution-utility`) | More results | This query now recognizes additional utility functions as vulnerable to prototype polution. |
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
# Improvements to Python analysis
2+
3+
The following changes in version 1.25 affect Python analysis in all applications.
4+
5+
## General improvements
6+
7+
8+
## New queries
9+
10+
| **Query** | **Tags** | **Purpose** |
11+
|-----------------------------|-----------|--------------------------------------------------------------------|
12+
13+
14+
## Changes to existing queries
15+
16+
| **Query** | **Expected impact** | **Change** |
17+
|----------------------------|------------------------|------------------------------------------------------------------|
18+
19+
20+
## Changes to libraries
21+
22+
* Importing `semmle.python.web.HttpRequest` will no longer import `UntrustedStringKind` transitively. `UntrustedStringKind` is the most commonly used non-abstract subclass of `ExternalStringKind`. If not imported (by one mean or another), taint-tracking queries that concern `ExternalStringKind` will not produce any results. Please ensure such queries contain an explicit import (`import semmle.python.security.strings.Untrusted`).

config/sync-files.py

Lines changed: 18 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -59,29 +59,41 @@ def file_checksum(filename):
5959
return hashlib.sha1(file_handle.read()).hexdigest()
6060

6161
def check_group(group_name, files, master_file_picker, emit_error):
62-
checksums = {file_checksum(f) for f in files}
62+
extant_files = [f for f in files if path.isfile(f)]
63+
if len(extant_files) == 0:
64+
emit_error(__file__, 0, "No files found from group '" + group_name + "'.")
65+
emit_error(__file__, 0,
66+
"Create one of the following files, and then run this script with "
67+
"the --latest switch to sync it to the other file locations.")
68+
for filename in files:
69+
emit_error(__file__, 0, " " + filename)
70+
return
71+
72+
checksums = {file_checksum(f) for f in extant_files}
6373

64-
if len(checksums) == 1:
74+
if len(checksums) == 1 and len(extant_files) == len(files):
75+
# All files are present and identical.
6576
return
6677

67-
master_file = master_file_picker(files)
78+
master_file = master_file_picker(extant_files)
6879
if master_file is None:
6980
emit_error(__file__, 0,
7081
"Files from group '"+ group_name +"' not in sync.")
7182
emit_error(__file__, 0,
7283
"Run this script with a file-name argument among the "
7384
"following to overwrite the remaining files with the contents "
74-
"of that file or run with the --latest switch to update each "
85+
"of that file, or run with the --latest switch to update each "
7586
"group of files from the most recently modified file in the group.")
76-
for filename in files:
87+
for filename in extant_files:
7788
emit_error(__file__, 0, " " + filename)
7889
else:
7990
print(" Syncing others from", master_file)
8091
for filename in files:
8192
if filename == master_file:
8293
continue
8394
print(" " + filename)
84-
os.replace(filename, filename + '~')
95+
if path.isfile(filename):
96+
os.replace(filename, filename + '~')
8597
shutil.copy(master_file, filename)
8698
print(" Backups written with '~' appended to file names")
8799

cpp/ql/src/semmle/code/cpp/ir/dataflow/DefaultTaintTracking.qll

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,9 @@ private DataFlow::Node getNodeForSource(Expr source) {
6767
// to `gets`. It's impossible here to tell which is which, but the "access
6868
// to argv" source is definitely not intended to match an output argument,
6969
// and it causes false positives if we let it.
70+
//
71+
// This case goes together with the similar (but not identical) rule in
72+
// `nodeIsBarrierIn`.
7073
result = DataFlow::definitionByReferenceNode(source) and
7174
not argv(source.(VariableAccess).getTarget())
7275
)
@@ -202,7 +205,13 @@ private predicate nodeIsBarrier(DataFlow::Node node) {
202205

203206
private predicate nodeIsBarrierIn(DataFlow::Node node) {
204207
// don't use dataflow into taint sources, as this leads to duplicate results.
205-
node = getNodeForSource(any(Expr e))
208+
exists(Expr source | isUserInput(source, _) |
209+
node = DataFlow::exprNode(source)
210+
or
211+
// This case goes together with the similar (but not identical) rule in
212+
// `getNodeForSource`.
213+
node = DataFlow::definitionByReferenceNode(source)
214+
)
206215
}
207216

208217
cached

cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowDispatch.qll

Lines changed: 21 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -82,47 +82,45 @@ private module VirtualDispatch {
8282
exists(LoadInstruction load, GlobalOrNamespaceVariable var |
8383
var = src.asVariable() and
8484
other.asInstruction() = load and
85+
addressOfGlobal(load.getSourceAddress(), var) and
8586
// The `allowFromArg` concept doesn't play a role when `src` is a
8687
// global variable, so we just set it to a single arbitrary value for
8788
// performance.
8889
allowFromArg = true
89-
|
90-
// Load directly from the global variable
91-
load.getSourceAddress().(VariableAddressInstruction).getASTVariable() = var
92-
or
93-
// Load from a field on a global union
94-
exists(FieldAddressInstruction fa |
95-
fa = load.getSourceAddress() and
96-
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
97-
fa.getField().getDeclaringType() instanceof Union
98-
)
9990
)
10091
or
101-
// Flow from store to global variable. These cases are similar to the
102-
// above but have `StoreInstruction` instead of `LoadInstruction` and
103-
// have the roles swapped between `other` and `src`.
92+
// Flow from store to global variable.
10493
exists(StoreInstruction store, GlobalOrNamespaceVariable var |
10594
var = other.asVariable() and
10695
store = src.asInstruction() and
96+
storeIntoGlobal(store, var) and
10797
// Setting `allowFromArg` to `true` like in the base case means we
10898
// treat a store to a global variable like the dispatch itself: flow
10999
// may come from anywhere.
110100
allowFromArg = true
111-
|
112-
// Store directly to the global variable
113-
store.getDestinationAddress().(VariableAddressInstruction).getASTVariable() = var
114-
or
115-
// Store to a field on a global union
116-
exists(FieldAddressInstruction fa |
117-
fa = store.getDestinationAddress() and
118-
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
119-
fa.getField().getDeclaringType() instanceof Union
120-
)
121101
)
122102
)
123103
}
124104
}
125105

106+
pragma[noinline]
107+
private predicate storeIntoGlobal(StoreInstruction store, GlobalOrNamespaceVariable var) {
108+
addressOfGlobal(store.getDestinationAddress(), var)
109+
}
110+
111+
/** Holds if `addressInstr` is an instruction that produces the address of `var`. */
112+
private predicate addressOfGlobal(Instruction addressInstr, GlobalOrNamespaceVariable var) {
113+
// Access directly to the global variable
114+
addressInstr.(VariableAddressInstruction).getASTVariable() = var
115+
or
116+
// Access to a field on a global union
117+
exists(FieldAddressInstruction fa |
118+
fa = addressInstr and
119+
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
120+
fa.getField().getDeclaringType() instanceof Union
121+
)
122+
}
123+
126124
/**
127125
* A ReturnNode with its ReturnKind and its enclosing callable.
128126
*

cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,16 +8,28 @@ private import DataFlowDispatch
88
* to the callable. Instance arguments (`this` pointer) are also included.
99
*/
1010
class ArgumentNode extends InstructionNode {
11-
ArgumentNode() { exists(CallInstruction call | this.getInstruction() = call.getAnArgument()) }
11+
ArgumentNode() {
12+
exists(CallInstruction call |
13+
instr = call.getAnArgument()
14+
or
15+
instr.(ReadSideEffectInstruction).getPrimaryInstruction() = call
16+
)
17+
}
1218

1319
/**
1420
* Holds if this argument occurs at the given position in the given call.
1521
* The instance argument is considered to have index `-1`.
1622
*/
1723
predicate argumentOf(DataFlowCall call, int pos) {
18-
this.getInstruction() = call.getPositionalArgument(pos)
24+
instr = call.getPositionalArgument(pos)
1925
or
20-
this.getInstruction() = call.getThisArgument() and pos = -1
26+
instr = call.getThisArgument() and pos = -1
27+
or
28+
exists(ReadSideEffectInstruction read |
29+
read = instr and
30+
read.getPrimaryInstruction() = call and
31+
pos = getArgumentPosOfSideEffect(read.getIndex())
32+
)
2133
}
2234

2335
/** Gets the call in which this node is an argument. */

cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll

Lines changed: 99 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ class Node extends TIRDataFlowNode {
5454
/** Gets the argument that defines this `DefinitionByReferenceNode`, if any. */
5555
Expr asDefiningArgument() { result = this.(DefinitionByReferenceNode).getArgument() }
5656

57-
/** Gets the parameter corresponding to this node, if any. */
57+
/** Gets the positional parameter corresponding to this node, if any. */
5858
Parameter asParameter() { result = this.(ExplicitParameterNode).getParameter() }
5959

6060
/**
@@ -156,44 +156,90 @@ class ExprNode extends InstructionNode {
156156
}
157157

158158
/**
159-
* A node representing a `Parameter`. This includes both explicit parameters such
160-
* as `x` in `f(x)` and implicit parameters such as `this` in `x.f()`
159+
* INTERNAL: do not use. Translates a parameter/argument index into a negative
160+
* number that denotes the index of its side effect (pointer indirection).
161+
*/
162+
bindingset[index]
163+
int getArgumentPosOfSideEffect(int index) {
164+
// -1 -> -2
165+
// 0 -> -3
166+
// 1 -> -4
167+
// ...
168+
result = -3 - index
169+
}
170+
171+
/**
172+
* The value of a parameter at function entry, viewed as a node in a data
173+
* flow graph. This includes both explicit parameters such as `x` in `f(x)`
174+
* and implicit parameters such as `this` in `x.f()`.
175+
*
176+
* To match a specific kind of parameter, consider using one of the subclasses
177+
* `ExplicitParameterNode`, `ThisParameterNode`, or
178+
* `ParameterIndirectionNode`.
161179
*/
162180
class ParameterNode extends InstructionNode {
163-
override InitializeParameterInstruction instr;
181+
ParameterNode() {
182+
// To avoid making this class abstract, we enumerate its values here
183+
instr instanceof InitializeParameterInstruction
184+
or
185+
instr instanceof InitializeIndirectionInstruction
186+
}
164187

165188
/**
166-
* Holds if this node is the parameter of `c` at the specified (zero-based)
167-
* position. The implicit `this` parameter is considered to have index `-1`.
189+
* Holds if this node is the parameter of `f` at the specified position. The
190+
* implicit `this` parameter is considered to have position `-1`, and
191+
* pointer-indirection parameters are at further negative positions.
168192
*/
169-
predicate isParameterOf(Function f, int i) { none() } // overriden by subclasses
193+
predicate isParameterOf(Function f, int pos) { none() } // overridden by subclasses
170194
}
171195

172-
/**
173-
* The value of a parameter at function entry, viewed as a node in a data
174-
* flow graph.
175-
*/
196+
/** An explicit positional parameter, not including `this` or `...`. */
176197
private class ExplicitParameterNode extends ParameterNode {
198+
override InitializeParameterInstruction instr;
199+
177200
ExplicitParameterNode() { exists(instr.getParameter()) }
178201

179-
override predicate isParameterOf(Function f, int i) { f.getParameter(i) = instr.getParameter() }
202+
override predicate isParameterOf(Function f, int pos) {
203+
f.getParameter(pos) = instr.getParameter()
204+
}
180205

181-
/** Gets the parameter corresponding to this node. */
206+
/** Gets the `Parameter` associated with this node. */
182207
Parameter getParameter() { result = instr.getParameter() }
183208

184209
override string toString() { result = instr.getParameter().toString() }
185210
}
186211

187-
private class ThisParameterNode extends ParameterNode {
212+
/** An implicit `this` parameter. */
213+
class ThisParameterNode extends ParameterNode {
214+
override InitializeParameterInstruction instr;
215+
188216
ThisParameterNode() { instr.getIRVariable() instanceof IRThisVariable }
189217

190-
override predicate isParameterOf(Function f, int i) {
191-
i = -1 and instr.getEnclosingFunction() = f
218+
override predicate isParameterOf(Function f, int pos) {
219+
pos = -1 and instr.getEnclosingFunction() = f
192220
}
193221

194222
override string toString() { result = "this" }
195223
}
196224

225+
/** A synthetic parameter to model the pointed-to object of a pointer parameter. */
226+
class ParameterIndirectionNode extends ParameterNode {
227+
override InitializeIndirectionInstruction instr;
228+
229+
override predicate isParameterOf(Function f, int pos) {
230+
exists(int index |
231+
f.getParameter(index) = instr.getParameter()
232+
or
233+
index = -1 and
234+
instr.getIRVariable().(IRThisVariable).getEnclosingFunction() = f
235+
|
236+
pos = getArgumentPosOfSideEffect(index)
237+
)
238+
}
239+
240+
override string toString() { result = "*" + instr.getIRVariable().toString() }
241+
}
242+
197243
/**
198244
* DEPRECATED: Data flow was never an accurate way to determine what
199245
* expressions might be uninitialized. It errs on the side of saying that
@@ -341,6 +387,18 @@ class DefinitionByReferenceNode extends InstructionNode {
341387
}
342388
}
343389

390+
/**
391+
* A node representing the memory pointed to by a function argument.
392+
*
393+
* This class exists only in order to override `toString`, which would
394+
* otherwise be the default implementation inherited from `InstructionNode`.
395+
*/
396+
private class ArgumentIndirectionNode extends InstructionNode {
397+
override ReadSideEffectInstruction instr;
398+
399+
override string toString() { result = "Argument " + instr.getIndex() + " indirection" }
400+
}
401+
344402
/**
345403
* A `Node` corresponding to a variable in the program, as opposed to the
346404
* value of that variable at some particular point. This can be used for
@@ -442,6 +500,31 @@ private predicate simpleInstructionLocalFlowStep(Instruction iFrom, Instruction
442500
or
443501
iTo.(PhiInstruction).getAnOperand().getDef() = iFrom
444502
or
503+
// A read side effect is almost never exact since we don't know exactly how
504+
// much memory the callee will read.
505+
iTo.(ReadSideEffectInstruction).getSideEffectOperand().getAnyDef() = iFrom and
506+
not iFrom.isResultConflated()
507+
or
508+
// Loading a single `int` from an `int *` parameter is not an exact load since
509+
// the parameter may point to an entire array rather than a single `int`. The
510+
// following rule ensures that any flow going into the
511+
// `InitializeIndirectionInstruction`, even if it's for a different array
512+
// element, will propagate to a load of the first element.
513+
//
514+
// Since we're linking `InitializeIndirectionInstruction` and
515+
// `LoadInstruction` together directly, this rule will break if there's any
516+
// reassignment of the parameter indirection, including a conditional one that
517+
// leads to a phi node.
518+
exists(InitializeIndirectionInstruction init |
519+
iFrom = init and
520+
iTo.(LoadInstruction).getSourceValueOperand().getAnyDef() = init and
521+
// Check that the types match. Otherwise we can get flow from an object to
522+
// its fields, which leads to field conflation when there's flow from other
523+
// fields to the object elsewhere.
524+
init.getParameter().getType().getUnspecifiedType().(DerivedType).getBaseType() =
525+
iTo.getResultType().getUnspecifiedType()
526+
)
527+
or
445528
// Treat all conversions as flow, even conversions between different numeric types.
446529
iTo.(ConvertInstruction).getUnary() = iFrom
447530
or

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/ValueNumbering.qll

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,7 +56,7 @@ class ValueNumber extends TValueNumber {
5656
or
5757
this instanceof TInitializeParameterValueNumber and result = "InitializeParameter"
5858
or
59-
this instanceof TInitializeThisValueNumber and result = "InitializeThis"
59+
this instanceof TConstantValueNumber and result = "Constant"
6060
or
6161
this instanceof TStringConstantValueNumber and result = "StringConstant"
6262
or

0 commit comments

Comments
 (0)