Skip to content

Commit cd20163

Browse files
authored
Merge pull request github#4676 from MathiasVP/untrusted-dataflow-to-external-api-query
C++: Untrusted data used in external APIs
2 parents 2945ead + 715f233 commit cd20163

23 files changed

+737
-10
lines changed

config/identical-files.json

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -358,6 +358,14 @@
358358
"cpp/ql/test/TestUtilities/InlineExpectationsTest.qll",
359359
"python/ql/test/TestUtilities/InlineExpectationsTest.qll"
360360
],
361+
"C++ ExternalAPIs": [
362+
"cpp/ql/src/Security/CWE/CWE-020/ExternalAPIs.qll",
363+
"cpp/ql/src/Security/CWE/CWE-020/ir/ExternalAPIs.qll"
364+
],
365+
"C++ SafeExternalAPIFunction": [
366+
"cpp/ql/src/Security/CWE/CWE-020/SafeExternalAPIFunction.qll",
367+
"cpp/ql/src/Security/CWE/CWE-020/ir/SafeExternalAPIFunction.qll"
368+
],
361369
"XML": [
362370
"cpp/ql/src/semmle/code/cpp/XML.qll",
363371
"csharp/ql/src/semmle/code/csharp/XML.qll",
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
<overview>
6+
<p>Using unsanitized untrusted data in an external API can cause a variety of security issues. This query reports
7+
all external APIs that are used with untrusted data, along with how frequently the API is used, and how many
8+
unique sources of untrusted data flow to this API. This query is designed primarily to help identify which APIs
9+
may be relevant for security analysis of this application.</p>
10+
11+
<p>An external API is defined as a call to a function that is not defined in the source code, and is not
12+
modeled as a taint step in the default taint library. External APIs may be from the C++ standard library,
13+
third party dependencies or from internal dependencies. The query will report the function name, along with
14+
either <code>[param x]</code>, where <code>x</code> indicates the position of the parameter receiving the
15+
untrusted data or <code>[qualifier]</code> indicating the untrusted data is used as the qualifier to the
16+
function call.</p>
17+
18+
</overview>
19+
<recommendation>
20+
21+
<p>For each result:</p>
22+
23+
<ul>
24+
<li>If the result highlights a known sink, no action is required.</li>
25+
<li>If the result highlights an unknown sink for a problem, then add modeling for the sink to the relevant query.</li>
26+
<li>If the result represents a call to an external API which transfers taint, add the appropriate modeling, and
27+
re-run the query to determine what new results have appeared due to this additional modeling.</li>
28+
</ul>
29+
30+
<p>Otherwise, the result is likely uninteresting. Custom versions of this query can extend the <code>SafeExternalAPIFunction</code>
31+
class to exclude known safe external APIs from future analysis.</p>
32+
33+
</recommendation>
34+
<example>
35+
36+
<p>If the query were to return the API <code>fputs [param 1]</code>
37+
then we should first consider whether this a security relevant sink. In this case, this is writing to a <code>FILE*</code>, so we should
38+
consider whether this is an XSS sink. If it is, we should confirm that it is handled by the XSS query.</p>
39+
40+
<p>If the query were to return the API <code>strcat [param 1]</code>, then this should be
41+
reviewed as a possible taint step, because tainted data would flow from the 1st argument to the 0th argument of the call.</p>
42+
43+
<p>Note that both examples are correctly handled by the standard taint tracking library and XSS query.</p>
44+
</example>
45+
<references>
46+
47+
</references>
48+
</qhelp>
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
/**
2+
* @name Frequency counts for external APIs that are used with untrusted data
3+
* @description This reports the external APIs that are used with untrusted data, along with how
4+
* frequently the API is called, and how many unique sources of untrusted data flow
5+
* to it.
6+
* @id cpp/count-untrusted-data-external-api
7+
* @kind table
8+
* @tags security external/cwe/cwe-20
9+
*/
10+
11+
import cpp
12+
import ExternalAPIs
13+
14+
from ExternalAPIUsedWithUntrustedData externalAPI
15+
select externalAPI, count(externalAPI.getUntrustedDataNode()) as numberOfUses,
16+
externalAPI.getNumberOfUntrustedSources() as numberOfUntrustedSources order by
17+
numberOfUntrustedSources desc
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
#include <cstdio>
2+
3+
void do_get(FILE* request, FILE* response) {
4+
char page[1024];
5+
fgets(page, 1024, request);
6+
7+
char buffer[1024];
8+
strcat(buffer, "The page \"");
9+
strcat(buffer, page);
10+
strcat(buffer, "\" was not found.");
11+
12+
fputs(buffer, response);
13+
}
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
#include <cstdio>
2+
3+
void do_get(FILE* request, FILE* response) {
4+
char user_id[1024];
5+
fgets(user_id, 1024, request);
6+
7+
char buffer[1024];
8+
strcat(buffer, "SELECT * FROM user WHERE user_id='");
9+
strcat(buffer, user_id);
10+
strcat(buffer, "'");
11+
12+
// ...
13+
}
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
/**
2+
* Definitions for reasoning about untrusted data used in APIs defined outside the
3+
* database.
4+
*/
5+
6+
private import cpp
7+
private import semmle.code.cpp.models.interfaces.DataFlow
8+
private import semmle.code.cpp.models.interfaces.Taint
9+
import ExternalAPIsSpecific
10+
11+
/** A node representing untrusted data being passed to an external API. */
12+
class UntrustedExternalAPIDataNode extends ExternalAPIDataNode {
13+
UntrustedExternalAPIDataNode() { any(UntrustedDataToExternalAPIConfig c).hasFlow(_, this) }
14+
15+
/** Gets a source of untrusted data which is passed to this external API data node. */
16+
DataFlow::Node getAnUntrustedSource() {
17+
any(UntrustedDataToExternalAPIConfig c).hasFlow(result, this)
18+
}
19+
}
20+
21+
private newtype TExternalAPI =
22+
TExternalAPIParameter(Function f, int index) {
23+
exists(UntrustedExternalAPIDataNode n |
24+
f = n.getExternalFunction() and
25+
index = n.getIndex()
26+
)
27+
}
28+
29+
/** An external API which is used with untrusted data. */
30+
class ExternalAPIUsedWithUntrustedData extends TExternalAPI {
31+
/** Gets a possibly untrusted use of this external API. */
32+
UntrustedExternalAPIDataNode getUntrustedDataNode() {
33+
this = TExternalAPIParameter(result.getExternalFunction(), result.getIndex())
34+
}
35+
36+
/** Gets the number of untrusted sources used with this external API. */
37+
int getNumberOfUntrustedSources() {
38+
result = strictcount(getUntrustedDataNode().getAnUntrustedSource())
39+
}
40+
41+
/** Gets a textual representation of this element. */
42+
string toString() {
43+
exists(Function f, int index, string indexString |
44+
if index = -1 then indexString = "qualifier" else indexString = "param " + index
45+
|
46+
this = TExternalAPIParameter(f, index) and
47+
result = f.toString() + " [" + indexString + "]"
48+
)
49+
}
50+
}
Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
/**
2+
* Provides AST-specific definitions for use in the `ExternalAPI` library.
3+
*/
4+
5+
import semmle.code.cpp.dataflow.TaintTracking
6+
import semmle.code.cpp.models.interfaces.FlowSource
7+
import semmle.code.cpp.models.interfaces.DataFlow
8+
import SafeExternalAPIFunction
9+
10+
/** A node representing untrusted data being passed to an external API. */
11+
class ExternalAPIDataNode extends DataFlow::Node {
12+
Call call;
13+
int i;
14+
15+
ExternalAPIDataNode() {
16+
// Argument to call to a function
17+
(
18+
this.asExpr() = call.getArgument(i)
19+
or
20+
i = -1 and this.asExpr() = call.getQualifier()
21+
) and
22+
exists(Function f |
23+
f = call.getTarget() and
24+
// Defined outside the source archive
25+
not f.hasDefinition() and
26+
// Not already modeled as a dataflow or taint step
27+
not f instanceof DataFlowFunction and
28+
not f instanceof TaintFunction and
29+
// Not a call to a known safe external API
30+
not f instanceof SafeExternalAPIFunction
31+
)
32+
}
33+
34+
/** Gets the called API `Function`. */
35+
Function getExternalFunction() { result = call.getTarget() }
36+
37+
/** Gets the index which is passed untrusted data (where -1 indicates the qualifier). */
38+
int getIndex() { result = i }
39+
40+
/** Gets the description of the function being called. */
41+
string getFunctionDescription() { result = getExternalFunction().toString() }
42+
}
43+
44+
/** A configuration for tracking flow from `RemoteFlowSource`s to `ExternalAPIDataNode`s. */
45+
class UntrustedDataToExternalAPIConfig extends TaintTracking::Configuration {
46+
UntrustedDataToExternalAPIConfig() { this = "UntrustedDataToExternalAPIConfig" }
47+
48+
override predicate isSource(DataFlow::Node source) {
49+
exists(RemoteFlowFunction remoteFlow |
50+
remoteFlow = source.asExpr().(Call).getTarget() and
51+
remoteFlow.hasRemoteFlowSource(_, _)
52+
)
53+
}
54+
55+
override predicate isSink(DataFlow::Node sink) { sink instanceof ExternalAPIDataNode }
56+
}
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
<overview>
6+
<p>Using unsanitized untrusted data in an external API can cause a variety of security issues. This query reports
7+
all external APIs that are used with untrusted data, along with how frequently the API is used, and how many
8+
unique sources of untrusted data flow to this API. This query is designed primarily to help identify which APIs
9+
may be relevant for security analysis of this application.</p>
10+
11+
<p>An external API is defined as a call to a function that is not defined in the source code, and is not
12+
modeled as a taint step in the default taint library. External APIs may be from the C++ standard library,
13+
third party dependencies or from internal dependencies. The query will report the function name, along with
14+
either <code>[param x]</code>, where <code>x</code> indicates the position of the parameter receiving the
15+
untrusted data or <code>[qualifier]</code> indicating the untrusted data is used as the qualifier to the
16+
function call.</p>
17+
18+
</overview>
19+
<recommendation>
20+
21+
<p>For each result:</p>
22+
23+
<ul>
24+
<li>If the result highlights a known sink, no action is required.</li>
25+
<li>If the result highlights an unknown sink for a problem, then add modeling for the sink to the relevant query.</li>
26+
<li>If the result represents a call to an external API which transfers taint, add the appropriate modeling, and
27+
re-run the query to determine what new results have appeared due to this additional modeling.</li>
28+
</ul>
29+
30+
<p>Otherwise, the result is likely uninteresting. Custom versions of this query can extend the <code>SafeExternalAPIFunction</code>
31+
class to exclude known safe external APIs from future analysis.</p>
32+
33+
</recommendation>
34+
<example>
35+
36+
<p>If the query were to return the API <code>fputs [param 1]</code>
37+
then we should first consider whether this a security relevant sink. In this case, this is writing to a <code>FILE*</code>, so we should
38+
consider whether this is an XSS sink. If it is, we should confirm that it is handled by the XSS query.</p>
39+
40+
<p>If the query were to return the API <code>strcat [param 1]</code>, then this should be
41+
reviewed as a possible taint step, because tainted data would flow from the 1st argument to the 0th argument of the call.</p>
42+
43+
<p>Note that both examples are correctly handled by the standard taint tracking library and XSS query.</p>
44+
</example>
45+
<references>
46+
47+
</references>
48+
</qhelp>
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
/**
2+
* @name Frequency counts for external APIs that are used with untrusted data
3+
* @description This reports the external APIs that are used with untrusted data, along with how
4+
* frequently the API is called, and how many unique sources of untrusted data flow
5+
* to it.
6+
* @id cpp/count-untrusted-data-external-api-ir
7+
* @kind table
8+
* @tags security external/cwe/cwe-20
9+
*/
10+
11+
import cpp
12+
import ir.ExternalAPIs
13+
14+
from ExternalAPIUsedWithUntrustedData externalAPI
15+
select externalAPI, count(externalAPI.getUntrustedDataNode()) as numberOfUses,
16+
externalAPI.getNumberOfUntrustedSources() as numberOfUntrustedSources order by
17+
numberOfUntrustedSources desc
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
<overview>
6+
<p>Using unsanitized untrusted data in an external API can cause a variety of security issues. This query reports
7+
external APIs that use untrusted data. The results are not filtered, so you can audit all examples.
8+
The query provides data for security reviews of the application and you can also use it to identify external APIs
9+
that should be modeled as either taint steps, or sinks for specific problems.</p>
10+
11+
<p>An external API is defined as a call to a function that is not defined in the source code, and is not modeled
12+
as a taint step in the default taint library. External APIs may be from the
13+
C++ standard library, third-party dependencies or from internal dependencies. The query reports uses of
14+
untrusted data in either the qualifier or as one of the arguments of external APIs.</p>
15+
16+
</overview>
17+
<recommendation>
18+
19+
<p>For each result:</p>
20+
21+
<ul>
22+
<li>If the result highlights a known sink, confirm that the result is reported by the relevant query, or
23+
that the result is a false positive because this data is sanitized.</li>
24+
<li>If the result highlights an unknown sink for a problem, then add modeling for the sink to the relevant query,
25+
and confirm that the result is either found, or is safe due to appropriate sanitization.</li>
26+
<li>If the result represents a call to an external API that transfers taint, add the appropriate modeling, and
27+
re-run the query to determine what new results have appeared due to this additional modeling.</li>
28+
</ul>
29+
30+
<p>Otherwise, the result is likely uninteresting. Custom versions of this query can extend the <code>SafeExternalAPIFunction</code>
31+
class to exclude known safe external APIs from future analysis.</p>
32+
33+
</recommendation>
34+
<example>
35+
36+
<p>In this first example, input is read from <code>fgets</code> and then ultimately used in a call to the
37+
<code>fputs</code> external API:</p>
38+
39+
<sample src="ExternalAPISinkExample.cpp" />
40+
41+
<p>This is an XSS sink. The XSS query should therefore be reviewed to confirm that this sink is appropriately modeled,
42+
and if it is, to confirm that the query reports this particular result, or that the result is a false positive due to
43+
some existing sanitization.</p>
44+
45+
<p>In this second example, again a request parameter is read from <code>fgets</code>.</p>
46+
47+
<sample src="ExternalAPITaintStepExample.cpp" />
48+
49+
<p>If the query reported the call to <code>strcat</code> on line 9, this would suggest that this external API is
50+
not currently modeled as a taint step in the taint tracking library. The next step would be to model this as a taint step, then
51+
re-run the query to determine what additional results might be found. In this example, it seems likely that <code>buffer</code>
52+
will be executed as an SQL query, potentially leading to an SQL injection vulnerability.</p>
53+
54+
<p>Note that both examples are correctly handled by the standard taint tracking library and XSS query.</p>
55+
</example>
56+
<references>
57+
58+
</references>
59+
</qhelp>

0 commit comments

Comments
 (0)