Skip to content

Commit 10166be

Browse files
author
Max Schaefer
committed
JavaScript: Add new query DoubleEscaping.
1 parent 1c53222 commit 10166be

File tree

12 files changed

+332
-3
lines changed

12 files changed

+332
-3
lines changed

change-notes/1.20/analysis-javascript.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,9 +4,10 @@
44

55
## New queries
66

7-
| **Query** | **Tags** | **Purpose** |
8-
|-----------|----------|-------------|
9-
| | | |
7+
| **Query** | **Tags** | **Purpose** |
8+
|-----------------------------------------------|------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
9+
| Double escaping or unescaping (`js/double-escaping') | correctness, security, external/cwe/cwe-116 | Highlights potential double escaping or unescaping of special characters, indicating a possible violation of [CWE-116](https://cwe.mitre.org/data/definitions/116.html). Results are shown on LGTM by default. |
10+
1011

1112
## Changes to existing queries
1213

javascript/config/suites/javascript/security

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
+ semmlecode-javascript-queries/Security/CWE-089/SqlInjection.ql: /Security/CWE/CWE-089
99
+ semmlecode-javascript-queries/Security/CWE-094/CodeInjection.ql: /Security/CWE/CWE-094
1010
+ semmlecode-javascript-queries/Security/CWE-116/IncompleteSanitization.ql: /Security/CWE/CWE-116
11+
+ semmlecode-javascript-queries/Security/CWE-116/DoubleEscaping.ql: /Security/CWE/CWE-116
1112
+ semmlecode-javascript-queries/Security/CWE-134/TaintedFormatString.ql: /Security/CWE/CWE-134
1213
+ semmlecode-javascript-queries/Security/CWE-209/StackTraceExposure.ql: /Security/CWE/CWE-209
1314
+ semmlecode-javascript-queries/Security/CWE-312/CleartextStorage.ql: /Security/CWE/CWE-312
Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
6+
<overview>
7+
<p>
8+
Escaping meta-characters in untrusted input is an important technique for preventing injection
9+
attacks such as cross-site scripting. One particular example of this is HTML entity encoding,
10+
where HTML special characters are replaced by HTML character entities to prevent them from being
11+
interpreted as HTML markup. For example, the less-than character is encoded as <code>&amp;lt;</code>
12+
and the double-quote character as <code>&amp;quot;</code>.
13+
Other examples include backslash-escaping for including untrusted data in string literals and
14+
percent-encoding for URI components.
15+
</p>
16+
<p>
17+
The reverse process of replacing escape sequences with the characters they represent is known as
18+
unescaping.
19+
</p>
20+
<p>
21+
Note that the escape characters themselves (such as ampersand in the case of HTML encoding) play
22+
a special role during escaping and unescaping: they are themselves escaped, but also form part
23+
of the escaped representations of other characters. Hence care must be taken to avoid double escaping
24+
and unescaping: when escaping, the escape character must be escaped first, when unescaping it has
25+
to be unescaped last.
26+
</p>
27+
<p>
28+
If used in the context of sanitization, double unescaping may render the sanitization ineffective.
29+
Even if it is not used in a security-critical context, it may still result in confusing
30+
or garbled output.
31+
</p>
32+
</overview>
33+
34+
<recommendation>
35+
<p>
36+
Use a (well-tested) sanitization library if at all possible. These libraries are much more
37+
likely to handle corner cases correctly than a custom implementation. For URI encoding,
38+
you can use the standard `encodeURIComponent` and `decodeURIComponent` functions.
39+
</p>
40+
<p>
41+
Otherwise, make sure to always escape the escape character first, and unescape it last.
42+
</p>
43+
</recommendation>
44+
45+
<example>
46+
<p>
47+
The following example shows a pair of hand-written HTML encoding and decoding functions:
48+
</p>
49+
50+
<sample src="examples/DoubleEscaping.js" />
51+
52+
<p>
53+
The encoding function correctly handles ampersand before the other characters. For example,
54+
the string <code>me &amp; "you"</code> is encoded as <code>me &amp;amp; &amp;quot;you&amp;quot;</code>,
55+
and the string <code>&quot;</code> is encoded as <code>&amp;quot;</code>.
56+
</p>
57+
58+
<p>
59+
The decoding function, however, incorrectly decodes <code>&amp;amp;</code> into <code>&amp;</code>
60+
before handling the other characters. So while it correctly decodes the first example above,
61+
it decodes the second example (<code>&amp;quot;</code>) to <code>&quot;</code> (a single double quote),
62+
which is not correct.
63+
</p>
64+
65+
<p>
66+
Instead, the decoding function should decode the ampersand last:
67+
</p>
68+
69+
<sample src="examples/DoubleEscapingGood.js" />
70+
</example>
71+
72+
<references>
73+
<li>OWASP Top 10: <a href="https://www.owasp.org/index.php/Top_10-2017_A1-Injection">A1 Injection</a>.</li>
74+
<li>npm: <a href="https://www.npmjs.com/package/html-entities">html-entities</a> package.</li>
75+
<li>npm: <a href="https://www.npmjs.com/package/js-string-escape">js-string-escape</a> package.</li>
76+
</references>
77+
</qhelp>
Lines changed: 159 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,159 @@
1+
/**
2+
* @name Double escaping or unescaping
3+
* @description When escaping special characters using a meta-character like backslash or
4+
* ampersand, the meta-character has to be escaped first to avoid double-escaping,
5+
* and conversely it has to be unescaped last to avoid double-unescaping.
6+
* @kind problem
7+
* @problem.severity warning
8+
* @precision high
9+
* @id js/double-escaping
10+
* @tags correctness
11+
* security
12+
* external/cwe/cwe-116
13+
* external/cwe/cwe-20
14+
*/
15+
16+
import javascript
17+
18+
/**
19+
* Holds if `rl` is a simple constant, which is bound to the result of the predicate.
20+
*
21+
* For example, `/a/g` has string value `"a"` and `/abc/` has string value `"abc"`,
22+
* while `/ab?/` and `/a(?=b)/` do not have a string value.
23+
*
24+
* Flags are ignored, so `/a/i` is still considered to have string value `"a"`,
25+
* even though it also matches `"A"`.
26+
*
27+
* Note the somewhat subtle use of monotonic aggregate semantics, which makes the
28+
* `strictconcat` fail if one of the children of the root is not a constant (legacy
29+
* semantics would simply skip such children).
30+
*/
31+
language[monotonicAggregates]
32+
string getStringValue(RegExpLiteral rl) {
33+
exists (RegExpTerm root | root = rl.getRoot() |
34+
result = root.(RegExpConstant).getValue()
35+
or
36+
result = strictconcat(RegExpTerm ch, int i |
37+
ch = root.(RegExpSequence).getChild(i) |
38+
ch.(RegExpConstant).getValue() order by i
39+
)
40+
)
41+
}
42+
43+
/**
44+
* Gets a predecessor of `nd` that is not an SSA phi node.
45+
*/
46+
DataFlow::Node getASimplePredecessor(DataFlow::Node nd) {
47+
result = nd.getAPredecessor() and
48+
not nd.(DataFlow::SsaDefinitionNode).getSsaVariable().getDefinition() instanceof SsaPhiNode
49+
}
50+
51+
/**
52+
* Holds if `metachar` is a meta-character that is used to escape special characters
53+
* into a form described by regular expression `regex`.
54+
*/
55+
predicate escapingScheme(string metachar, string regex) {
56+
metachar = "&" and regex = "&.*;"
57+
or
58+
metachar = "%" and regex = "%.*"
59+
or
60+
metachar = "\\" and regex = "\\\\.*"
61+
}
62+
63+
/**
64+
* A call to `String.prototype.replace` that replaces all instances of a pattern.
65+
*/
66+
class Replacement extends DataFlow::Node {
67+
RegExpLiteral pattern;
68+
69+
Replacement() {
70+
exists (DataFlow::MethodCallNode mcn | this = mcn |
71+
mcn.getMethodName() = "replace" and
72+
mcn.getArgument(0).asExpr() = pattern and
73+
mcn.getNumArgument() = 2 and
74+
pattern.isGlobal()
75+
)
76+
}
77+
78+
/**
79+
* Holds if this replacement replaces the string `input` with `output`.
80+
*/
81+
predicate replaces(string input, string output) {
82+
exists (DataFlow::MethodCallNode mcn |
83+
mcn = this and
84+
input = getStringValue(pattern) and
85+
output = mcn.getArgument(1).asExpr().getStringValue()
86+
)
87+
}
88+
89+
/**
90+
* Holds if this replacement escapes `char` using `metachar`.
91+
*
92+
* For example, during HTML entity escaping `<` is escaped (to `&lt;`)
93+
* using `&`.
94+
*/
95+
predicate escapes(string char, string metachar) {
96+
exists (string regexp, string repl |
97+
escapingScheme(metachar, regexp) and
98+
replaces(char, repl) and
99+
repl.regexpMatch(regexp)
100+
)
101+
}
102+
103+
/**
104+
* Holds if this replacement unescapes `char` using `metachar`.
105+
*
106+
* For example, during HTML entity unescaping `<` is unescaped (from
107+
* `&lt;`) using `<`.
108+
*/
109+
predicate unescapes(string metachar, string char) {
110+
exists (string regexp, string orig |
111+
escapingScheme(metachar, regexp) and
112+
replaces(orig, char) and
113+
orig.regexpMatch(regexp)
114+
)
115+
}
116+
117+
/**
118+
* Gets the previous replacement in this chain of replacements.
119+
*/
120+
Replacement getPreviousReplacement() {
121+
result = getASimplePredecessor*(this.(DataFlow::MethodCallNode).getReceiver())
122+
}
123+
124+
/**
125+
* Gets an earlier replacement in this chain of replacements that
126+
* performs an escaping.
127+
*/
128+
Replacement getAnEarlierEscaping(string metachar) {
129+
exists (Replacement pred | pred = this.getPreviousReplacement() |
130+
if pred.escapes(_, metachar) then
131+
result = pred
132+
else
133+
result = pred.getAnEarlierEscaping(metachar)
134+
)
135+
}
136+
137+
/**
138+
* Gets an earlier replacement in this chain of replacements that
139+
* performs a unescaping.
140+
*/
141+
Replacement getALaterUnescaping(string metachar) {
142+
exists (Replacement succ | this = succ.getPreviousReplacement() |
143+
if succ.unescapes(metachar, _) then
144+
result = succ
145+
else
146+
result = succ.getALaterUnescaping(metachar)
147+
)
148+
}
149+
}
150+
151+
from Replacement primary, Replacement supplementary, string message, string metachar
152+
where primary.escapes(metachar, _) and
153+
supplementary = primary.getAnEarlierEscaping(metachar) and
154+
message = "may double-escape '" + metachar + "' characters from $@"
155+
or
156+
primary.unescapes(_, metachar) and
157+
supplementary = primary.getALaterUnescaping(metachar) and
158+
message = "may produce '" + metachar + "' characters that are double-unescaped $@"
159+
select primary, "This replacement " + message + ".", supplementary, "here"
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
module.exports.encode = function(s) {
2+
return s.replace(/&/g, "&amp;")
3+
.replace(/"/g, "&quot;")
4+
.replace(/'/g, "&apos;");
5+
};
6+
7+
module.exports.decode = function(s) {
8+
return s.replace(/&amp;/g, "&")
9+
.replace(/&quot;/g, "\"")
10+
.replace(/&apos;/g, "'");
11+
};
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
module.exports.encode = function(s) {
2+
return s.replace(/&/g, "&amp;")
3+
.replace(/"/g, "&quot;")
4+
.replace(/'/g, "&apos;");
5+
};
6+
7+
module.exports.decode = function(s) {
8+
return s.replace(/&quot;/g, "\"")
9+
.replace(/&apos;/g, "'")
10+
.replace(/&amp;/g, "&");
11+
};
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
| tst.js:2:10:4:33 | s.repla ... &amp;") | This replacement may double-escape '&' characters from $@. | tst.js:2:10:3:34 | s.repla ... apos;") | here |
2+
| tst.js:20:10:20:33 | s.repla ... g, "&") | This replacement may produce '&' characters that are double-unescaped $@. | tst.js:20:10:21:35 | s.repla ... , "\\"") | here |
3+
| tst.js:30:10:30:33 | s.repla ... g, "&") | This replacement may produce '&' characters that are double-unescaped $@. | tst.js:30:10:32:34 | s.repla ... g, "'") | here |
4+
| tst.js:47:7:47:30 | s.repla ... g, "&") | This replacement may produce '&' characters that are double-unescaped $@. | tst.js:48:7:48:32 | s.repla ... , "\\"") | here |
5+
| tst.js:53:10:53:33 | s.repla ... , '\\\\') | This replacement may produce '\\' characters that are double-unescaped $@. | tst.js:53:10:54:33 | s.repla ... , '\\'') | here |
6+
| tst.js:60:7:60:28 | s.repla ... '%25') | This replacement may double-escape '%' characters from $@. | tst.js:59:7:59:28 | s.repla ... '%26') | here |
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Security/CWE-116/DoubleEscaping.ql
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
function badEncode(s) {
2+
return s.replace(/"/g, "&quot;")
3+
.replace(/'/g, "&apos;")
4+
.replace(/&/g, "&amp;");
5+
}
6+
7+
function goodEncode(s) {
8+
return s.replace(/&/g, "&amp;")
9+
.replace(/"/g, "&quot;")
10+
.replace(/'/g, "&apos;");
11+
}
12+
13+
function goodDecode(s) {
14+
return s.replace(/&quot;/g, "\"")
15+
.replace(/&apos;/g, "'")
16+
.replace(/&amp;/g, "&");
17+
}
18+
19+
function badDecode(s) {
20+
return s.replace(/&amp;/g, "&")
21+
.replace(/&quot;/g, "\"")
22+
.replace(/&apos;/g, "'");
23+
}
24+
25+
function cleverEncode(code) {
26+
return code.replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/&(?![\w\#]+;)/g, '&amp;');
27+
}
28+
29+
function badDecode2(s) {
30+
return s.replace(/&amp;/g, "&")
31+
.replace(/s?ome|thin*g/g, "else")
32+
.replace(/&apos;/g, "'");
33+
}
34+
35+
function goodDecodeInLoop(ss) {
36+
var res = [];
37+
for (var s of ss) {
38+
s = s.replace(/&quot;/g, "\"")
39+
.replace(/&apos;/g, "'")
40+
.replace(/&amp;/g, "&");
41+
res.push(s);
42+
}
43+
return res;
44+
}
45+
46+
function badDecode3(s) {
47+
s = s.replace(/&amp;/g, "&");
48+
s = s.replace(/&quot;/g, "\"");
49+
return s.replace(/&apos;/g, "'");
50+
}
51+
52+
function badUnescape(s) {
53+
return s.replace(/\\\\/g, '\\')
54+
.replace(/\\'/g, '\'')
55+
.replace(/\\"/g, '\"');
56+
}
57+
58+
function badPercentEscape(s) {
59+
s = s.replace(/&/g, '%26');
60+
s = s.replace(/%/g, '%25');
61+
return s;
62+
}

javascript/ql/test/query-tests/Security/CWE-116/IncompleteSanitization.expected renamed to javascript/ql/test/query-tests/Security/CWE-116/IncompleteSanitization/IncompleteSanitization.expected

File renamed without changes.

0 commit comments

Comments
 (0)