Skip to content

Commit d433847

Browse files
committed
C++: Enforce unique enclosing callable
Every data-flow node should have a unique enclosing function (_callable_ in the terminology of the data-flow library), but this was not evident for the optimizer, and it led to a bad join order in `pathStep`. This commit fixes the join order for C++ AST data flow. All other copies of data flow seem to be fine. These are the tuple counts for OpenJDK before this commit: (231s) Tuple counts for DataFlowImplLocal::pathStep#fffff#cur_delta: 5882 ~0% {6} r1 = SCAN DataFlowImplLocal::PathNodeMid#class#ffffff#prev_delta AS I OUTPUT I.<2>, I.<0>, I.<1>, I.<3>, I.<4>, I.<5> 1063406780 ~0% {7} r2 = JOIN r1 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 1 OUTPUT r1.<2>, R.<1>, r1.<1>, r1.<0>, r1.<3>, r1.<4>, r1.<5> 5882 ~1% {6} r3 = JOIN r2 WITH DataFlowUtil::Node::getFunction_dispred#ff AS R ON FIRST 2 OUTPUT r2.<0>, r2.<6>, r2.<2>, r2.<3>, r2.<4>, r2.<5> 105 ~0% {5} r4 = JOIN r3 WITH project#DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_021#join_rhs AS R ON FIRST 2 OUTPUT r3.<2>, r3.<3>, r3.<4>, r3.<5>, R.<2> 5882 ~1% {6} r5 = JOIN r2 WITH DataFlowUtil::Node::getFunction_dispred#ff AS R ON FIRST 2 OUTPUT r2.<5>, r2.<2>, r2.<0>, r2.<3>, r2.<4>, r2.<6> 5882 ~0% {6} r6 = JOIN r5 WITH DataFlowImplLocal::TNil#ff_1#join_rhs AS R ON FIRST 1 OUTPUT r5.<2>, false, r5.<5>, r5.<1>, r5.<3>, r5.<4> 0 ~0% {5} r7 = JOIN r6 WITH DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_02413#join_rhs AS R ON FIRST 3 OUTPUT R.<4>, r6.<3>, r6.<4>, r6.<5>, R.<3> 0 ~0% {5} r8 = JOIN r7 WITH DataFlowImplLocal::TNil#ff AS R ON FIRST 1 OUTPUT r7.<1>, r7.<2>, r7.<3>, R.<1>, r7.<4> 105 ~0% {5} r9 = r4 \/ r8 The problem is that `DataFlowUtil::Node::getFunction_dispred#ff` (`getEnclosingCallable`) is joined too late. After this commit, the tuple counts look like this: (13s) Tuple counts for DataFlowImplLocal::pathStep#fffff#cur_delta: 5882 ~1% {6} r1 = SCAN DataFlowImplLocal::PathNodeMid#class#ffffff#prev_delta AS I OUTPUT I.<1>, I.<0>, I.<2>, I.<3>, I.<4>, I.<5> 5882 ~3% {7} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 1 OUTPUT r1.<2>, R.<1>, r1.<1>, r1.<0>, r1.<3>, r1.<4>, r1.<5> 5882 ~1% {6} r3 = JOIN r2 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 2 OUTPUT r2.<3>, r2.<6>, r2.<2>, r2.<0>, r2.<4>, r2.<5> 105 ~0% {5} r4 = JOIN r3 WITH project#DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_021#join_rhs AS R ON FIRST 2 OUTPUT r3.<2>, r3.<3>, r3.<4>, r3.<5>, R.<2> 5882 ~1% {6} r5 = JOIN r2 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 2 OUTPUT r2.<5>, r2.<2>, r2.<3>, r2.<0>, r2.<4>, r2.<6> 5882 ~0% {6} r6 = JOIN r5 WITH DataFlowImplLocal::TNil#ff_1#join_rhs AS R ON FIRST 1 OUTPUT r5.<2>, false, r5.<5>, r5.<1>, r5.<3>, r5.<4> 0 ~0% {5} r7 = JOIN r6 WITH DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_02413#join_rhs AS R ON FIRST 3 OUTPUT R.<4>, r6.<3>, r6.<4>, r6.<5>, R.<3> 0 ~0% {5} r8 = JOIN r7 WITH DataFlowImplLocal::TNil#ff AS R ON FIRST 1 OUTPUT r7.<1>, r7.<2>, r7.<3>, R.<1>, r7.<4> 105 ~0% {5} r9 = r4 \/ r8 There is a slight slowdown coming from the introduction of a new predicate `DataFlowImplLocal::pathStep#fffff#join_rhs`, which is used only in the standard order: (12s) Tuple counts for DataFlowImplLocal::pathStep#fffff#join_rhs: 282057 ~0% {2} r1 = SCAN DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS I OUTPUT I.<1>, I.<0> 9159890 ~1% {2} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r1.<1> return r2 The evaluation of `unique` is cheap but not free: DataFlowUtil::Node::getEnclosingCallable_dispred#ff .............. 3.9s DataFlowUtil::Node::getEnclosingCallable_dispred#ff_10#join_rhs .. 3.5s The first of these two predicates evaluates `unique`, and the second simply reorders columns. They take about the same time, which suggests that `unique` is about as fast as it can be, given the number of tuples it needs to push around. Note that the column reordering predicate is only needed because of the standard order.
1 parent 46fc913 commit d433847

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

cpp/ql/src/semmle/code/cpp/dataflow/internal/DataFlowUtil.qll

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,7 @@ class Node extends TNode {
4343
/**
4444
* INTERNAL: Do not use. Alternative name for `getFunction`.
4545
*/
46-
Function getEnclosingCallable() { result = this.getFunction() }
46+
final Function getEnclosingCallable() { result = unique( | | this.getFunction()) }
4747

4848
/** Gets the type of this node. */
4949
Type getType() { none() } // overridden in subclasses

0 commit comments

Comments
 (0)