Skip to content

Commit 04b5a79

Browse files
authored
Merge pull request github#3313 from esbena/js/typical-bad-sanitizer
New query: Incomplete HTML attribute sanitization
2 parents d28c4fb + c025089 commit 04b5a79

14 files changed

+629
-0
lines changed

change-notes/1.25/analysis-javascript.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@
1111
| **Query** | **Tags** | **Purpose** |
1212
|---------------------------------------------------------------------------------|-------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1313
| Cross-site scripting through DOM (`js/xss-through-dom`) | security, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities where existing text from the DOM is used as HTML. Results are not shown on LGTM by default. |
14+
| Incomplete HTML attribute sanitization (`js/incomplete-html-attribute-sanitization`) | security, external/cwe/cwe-20, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities due to incomplete sanitization of HTML meta-characters. Results are shown on LGTM by default. |
1415

1516
## Changes to existing queries
1617

javascript/config/suites/javascript/security

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@
1919
+ semmlecode-javascript-queries/Security/CWE-094/CodeInjection.ql: /Security/CWE/CWE-094
2020
+ semmlecode-javascript-queries/Security/CWE-094/UnsafeDynamicMethodAccess.ql: /Security/CWE/CWE-094
2121
+ semmlecode-javascript-queries/Security/CWE-116/IncompleteSanitization.ql: /Security/CWE/CWE-116
22+
+ semmlecode-javascript-queries/Security/CWE-116/IncompleteHtmlAttributeSanitization.ql: /Security/CWE/CWE-116
2223
+ semmlecode-javascript-queries/Security/CWE-116/DoubleEscaping.ql: /Security/CWE/CWE-116
2324
+ semmlecode-javascript-queries/Security/CWE-134/TaintedFormatString.ql: /Security/CWE/CWE-134
2425
+ semmlecode-javascript-queries/Security/CWE-201/PostMessageStar.ql: /Security/CWE/CWE-201
Lines changed: 90 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,90 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
6+
<overview>
7+
<p>
8+
9+
Sanitizing untrusted input for HTML meta-characters is an important
10+
technique for preventing cross-site scripting attacks. Usually, this
11+
is done by escaping <code>&lt;</code>, <code>&gt;</code>,
12+
<code>&amp;</code> and <code>&quot;</code>. However, the context in which
13+
the sanitized value is used decides the characters that
14+
need to be sanitized.
15+
16+
</p>
17+
18+
<p>
19+
20+
As a consequence, some programs only sanitize
21+
<code>&lt;</code> and <code>&gt;</code> since those are the most
22+
common dangerous characters. The lack of sanitization for
23+
<code>&quot;</code> is problematic when an incompletely sanitized
24+
value is used as an HTML attribute in a string that
25+
later is parsed as HTML.
26+
27+
</p>
28+
29+
</overview>
30+
31+
<recommendation>
32+
33+
<p>
34+
35+
Sanitize all relevant HTML meta-characters when
36+
constructing HTML dynamically, and pay special attention to where the
37+
sanitized value is used.
38+
39+
</p>
40+
41+
</recommendation>
42+
43+
<example>
44+
45+
<p>
46+
47+
The following example code writes part of an HTTP request (which is
48+
controlled by the user) to an HTML attribute of the server response.
49+
50+
The user-controlled value is, however, not sanitized for
51+
<code>&quot;</code>. This leaves the website vulnerable to cross-site
52+
scripting since an attacker can use a string like <code>"
53+
onclick="alert(42)</code> to inject JavaScript code into the response.
54+
55+
</p>
56+
<sample src="examples/IncompleteHtmlAttributeSanitization.js" />
57+
58+
59+
<p>
60+
61+
Sanitizing the user-controlled data for
62+
<code>&quot;</code> helps prevent the vulnerability:
63+
64+
</p>
65+
66+
<sample src="examples/IncompleteHtmlAttributeSanitizationGood.js" />
67+
68+
</example>
69+
70+
<references>
71+
<li>
72+
OWASP:
73+
<a href="https://cheatsheetseries.owasp.org/cheatsheets/DOM_based_XSS_Prevention_Cheat_Sheet.html">DOM based
74+
XSS Prevention Cheat Sheet</a>.
75+
</li>
76+
<li>
77+
OWASP:
78+
<a href="https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html">XSS
79+
(Cross Site Scripting) Prevention Cheat Sheet</a>.
80+
</li>
81+
<li>
82+
OWASP
83+
<a href="https://owasp.org/www-community/Types_of_Cross-Site_Scripting">Types of Cross-Site</a>.
84+
</li>
85+
<li>
86+
Wikipedia: <a href="http://en.wikipedia.org/wiki/Cross-site_scripting">Cross-site scripting</a>.
87+
</li>
88+
</references>
89+
90+
</qhelp>
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
/**
2+
* @name Incomplete HTML attribute sanitization
3+
* @description Writing incompletely sanitized values to HTML
4+
* attribute strings can lead to a cross-site
5+
* scripting vulnerability.
6+
* @kind path-problem
7+
* @problem.severity warning
8+
* @precision high
9+
* @id js/incomplete-html-attribute-sanitization
10+
* @tags security
11+
* external/cwe/cwe-079
12+
* external/cwe/cwe-116
13+
* external/cwe/cwe-20
14+
*/
15+
16+
import javascript
17+
import DataFlow::PathGraph
18+
import semmle.javascript.security.dataflow.IncompleteHtmlAttributeSanitization::IncompleteHtmlAttributeSanitization
19+
import semmle.javascript.security.IncompleteBlacklistSanitizer
20+
21+
/**
22+
* Gets a pretty string of the dangerous characters for `sink`.
23+
*/
24+
string prettyPrintDangerousCharaters(Sink sink) {
25+
result =
26+
strictconcat(string s |
27+
s = describeCharacters(sink.getADangerousCharacter())
28+
|
29+
s, ", " order by s
30+
).regexpReplaceAll(",(?=[^,]+$)", " or")
31+
}
32+
33+
from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
34+
where cfg.hasFlowPath(source, sink)
35+
select sink.getNode(), source, sink,
36+
// this message is slightly sub-optimal as we do not have an easy way
37+
// to get the flow labels that reach the sink, so the message includes
38+
// all of them in a disjunction
39+
"Cross-site scripting vulnerability as the output of $@ may contain " +
40+
prettyPrintDangerousCharaters(sink.getNode()) + " when it reaches this attribute definition.",
41+
source.getNode(), "this final HTML sanitizer step"
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
var app = require('express')();
2+
3+
app.get('/user/:id', function(req, res) {
4+
let id = req.params.id;
5+
id = id.replace(/<|>/g, ""); // BAD
6+
let userHtml = `<div data-id="${id}">${getUserName(id) || "Unknown name"}</div>`;
7+
// ...
8+
res.send(prefix + userHtml + suffix);
9+
});
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
var app = require('express')();
2+
3+
app.get('/user/:id', function(req, res) {
4+
let id = req.params.id;
5+
id = id.replace(/<|>|&|"/g, ""); // GOOD
6+
let userHtml = `<div data-id="${id}">${getUserName(id) || "Unknown name"}</div>`;
7+
// ...
8+
res.send(prefix + userHtml + suffix);
9+
});
Lines changed: 157 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,157 @@
1+
/**
2+
* Provides classes and predicates for working with incomplete blacklist sanitizers.
3+
*/
4+
5+
import javascript
6+
7+
/**
8+
* An incomplete black-list sanitizer.
9+
*/
10+
abstract class IncompleteBlacklistSanitizer extends DataFlow::Node {
11+
/**
12+
* Gets a relevant character that is not sanitized by this sanitizer.
13+
*/
14+
abstract string getAnUnsanitizedCharacter();
15+
16+
/**
17+
* Gets the kind of sanitization this sanitizer performs.
18+
*/
19+
abstract string getKind();
20+
}
21+
22+
/**
23+
* Describes the characters represented by `rep`.
24+
*/
25+
string describeCharacters(string rep) {
26+
rep = "\"" and result = "quotes"
27+
or
28+
rep = "&" and result = "ampersands"
29+
or
30+
rep = "<" and result = "less-thans"
31+
or
32+
rep = ">" and result = "greater-thans"
33+
}
34+
35+
/**
36+
* A local sequence of calls to `String.prototype.replace`,
37+
* represented by the last call.
38+
*/
39+
class StringReplaceCallSequence extends DataFlow::CallNode {
40+
StringReplaceCallSequence() {
41+
this instanceof StringReplaceCall and
42+
not exists(getAStringReplaceMethodCall(this)) // terminal
43+
}
44+
45+
/**
46+
* Gets a member of this sequence.
47+
*/
48+
StringReplaceCall getAMember() { this = getAStringReplaceMethodCall*(result) }
49+
50+
/** Gets a string that is the replacement of this call. */
51+
string getAReplacementString() {
52+
// this is more restrictive than `StringReplaceCall::replaces/2`, in the name of precision
53+
getAMember().getRawReplacement().getStringValue() = result
54+
}
55+
56+
/** Gets a string that is being replaced by this call. */
57+
string getAReplacedString() { getAMember().getAReplacedString() = result }
58+
}
59+
60+
/**
61+
* A specialized version of `DataFlow::Node::getAMethodCall` that is
62+
* restricted to `StringReplaceCall`-nodes.
63+
*/
64+
private StringReplaceCall getAStringReplaceMethodCall(StringReplaceCall n) {
65+
n.getAMethodCall() = result
66+
}
67+
68+
/**
69+
* Provides predicates and classes for reasoning about HTML sanitization.
70+
*/
71+
module HtmlSanitization {
72+
private predicate fixedGlobalReplacement(StringReplaceCallSequence chain) {
73+
forall(StringReplaceCall member | member = chain.getAMember() |
74+
member.isGlobal() and member.getArgument(0) instanceof DataFlow::RegExpLiteralNode
75+
)
76+
}
77+
78+
/**
79+
* Gets a HTML-relevant character that is replaced by `chain`.
80+
*/
81+
private string getALikelyReplacedCharacter(StringReplaceCallSequence chain) {
82+
result = "\"" and
83+
(
84+
chain.getAReplacedString() = result or
85+
chain.getAReplacementString() = "&quot;" or
86+
chain.getAReplacementString() = "&#34;"
87+
)
88+
or
89+
result = "&" and
90+
(
91+
chain.getAReplacedString() = result or
92+
chain.getAReplacementString() = "&amp;" or
93+
chain.getAReplacementString() = "&#40;"
94+
)
95+
or
96+
result = "<" and
97+
(
98+
chain.getAReplacedString() = result or
99+
chain.getAReplacementString() = "&lt;" or
100+
chain.getAReplacementString() = "&#60;"
101+
)
102+
or
103+
result = ">" and
104+
(
105+
chain.getAReplacedString() = result or
106+
chain.getAReplacementString() = "&gt;" or
107+
chain.getAReplacementString() = "&#62;"
108+
)
109+
}
110+
111+
/**
112+
* An incomplete sanitizer for HTML-relevant characters.
113+
*/
114+
class IncompleteSanitizer extends IncompleteBlacklistSanitizer {
115+
StringReplaceCallSequence chain;
116+
string unsanitized;
117+
118+
IncompleteSanitizer() {
119+
this = chain and
120+
fixedGlobalReplacement(chain) and
121+
not getALikelyReplacedCharacter(chain) = unsanitized and
122+
(
123+
// replaces `<` and `>`
124+
getALikelyReplacedCharacter(chain) = "<" and
125+
getALikelyReplacedCharacter(chain) = ">" and
126+
(
127+
unsanitized = "\""
128+
or
129+
unsanitized = "&"
130+
)
131+
or
132+
// replaces '&' and either `<` or `>`
133+
getALikelyReplacedCharacter(chain) = "&" and
134+
(
135+
getALikelyReplacedCharacter(chain) = ">" and
136+
unsanitized = ">"
137+
or
138+
getALikelyReplacedCharacter(chain) = "<" and
139+
unsanitized = "<"
140+
)
141+
) and
142+
// does not replace special characters that the browser doesn't care for
143+
not chain.getAReplacedString() = ["!", "#", "*", "?", "@", "|", "~"] and
144+
/// only replaces explicit characters: exclude character ranges and negated character classes
145+
not exists(RegExpTerm t | t = chain.getAMember().getRegExp().getRoot().getAChild*() |
146+
t.(RegExpCharacterClass).isInverted() or
147+
t instanceof RegExpCharacterRange
148+
) and
149+
// the replacements are either empty or HTML entities
150+
chain.getAReplacementString().regexpMatch("(?i)(|(&[#a-z0-9]+;))")
151+
}
152+
153+
override string getKind() { result = "HTML" }
154+
155+
override string getAnUnsanitizedCharacter() { result = unsanitized }
156+
}
157+
}
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
/**
2+
* Provides a taint tracking configuration for reasoning about
3+
* incomplete HTML sanitization vulnerabilities.
4+
*
5+
* Note, for performance reasons: only import this file if
6+
* `IncompleteHtmlAttributeSanitization::Configuration` is needed, otherwise
7+
* `IncompleteHtmlAttributeSanitizationCustomizations` should be imported instead.
8+
*/
9+
10+
import javascript
11+
12+
module IncompleteHtmlAttributeSanitization {
13+
import IncompleteHtmlAttributeSanitizationCustomizations::IncompleteHtmlAttributeSanitization
14+
15+
private module Label {
16+
class Quote extends DataFlow::FlowLabel {
17+
Quote() { this = "\"" }
18+
}
19+
20+
class Ampersand extends DataFlow::FlowLabel {
21+
Ampersand() { this = "&" }
22+
}
23+
24+
DataFlow::FlowLabel characterToLabel(string c) { c = result }
25+
}
26+
27+
/**
28+
* A taint-tracking configuration for reasoning about incomplete HTML sanitization vulnerabilities.
29+
*/
30+
class Configuration extends TaintTracking::Configuration {
31+
Configuration() { this = "IncompleteHtmlAttributeSanitization" }
32+
33+
override predicate isSource(DataFlow::Node source, DataFlow::FlowLabel label) {
34+
label = Label::characterToLabel(source.(Source).getAnUnsanitizedCharacter())
35+
}
36+
37+
override predicate isSink(DataFlow::Node sink, DataFlow::FlowLabel label) {
38+
label = Label::characterToLabel(sink.(Sink).getADangerousCharacter())
39+
}
40+
41+
override predicate isAdditionalFlowStep(
42+
DataFlow::Node src, DataFlow::Node dst, DataFlow::FlowLabel srclabel,
43+
DataFlow::FlowLabel dstlabel
44+
) {
45+
super.isAdditionalFlowStep(src, dst) and srclabel = dstlabel
46+
}
47+
48+
override predicate isLabeledBarrier(DataFlow::Node node, DataFlow::FlowLabel lbl) {
49+
lbl = Label::characterToLabel(node.(StringReplaceCall).getAReplacedString()) or
50+
isSanitizer(node)
51+
}
52+
53+
override predicate isSanitizer(DataFlow::Node n) {
54+
n instanceof Sanitizer or
55+
super.isSanitizer(n)
56+
}
57+
}
58+
}

0 commit comments

Comments
 (0)