|
| 1 | +# CodeQL Design Patterns |
| 2 | + |
| 3 | +A list of design patterns you are recommended to follow. |
| 4 | + |
| 5 | +## `::Range` for extensibility and refinement |
| 6 | + |
| 7 | +To allow both extensibility and refinement of classes, we use what is commonly referred to as the `::Range` pattern (since https://github.com/github/codeql/pull/727), but the actual implementation can use different names. |
| 8 | + |
| 9 | +This pattern should be used when you want to model a user-extensible set of values ("extensibility"), while allowing restrictive subclasses, typically for the purposes of overriding predicates ("refinement"). Using a simple `abstract` class gives you the former, but makes it impossible to create overriding methods for all contributing extensions at once. Using a non-`abstract` class provides refinement-based overriding, but requires the original class to range over a closed, non-extensible set. |
| 10 | +<details> |
| 11 | +<summary>Generic example of how to define classes with ::Range</summary> |
| 12 | + |
| 13 | +Using a single `abstract` class looks like this: |
| 14 | +```ql |
| 15 | +/** <QLDoc...> */ |
| 16 | +abstract class MySpecialExpr extends Expr { |
| 17 | + /** <QLDoc...> */ |
| 18 | + abstract int memberPredicate(); |
| 19 | +} |
| 20 | +class ConcreteSubclass extends MySpecialExpr { ... } |
| 21 | +``` |
| 22 | + |
| 23 | +While this allows users of the library to add new types of `MySpecialExpr` (like, in this case, `ConcreteSubclass`), there is no way to override the implementations of `memberPredicate` of all extensions at once. |
| 24 | + |
| 25 | +Applying the `::Range` pattern yields the following: |
| 26 | + |
| 27 | +```ql |
| 28 | +/** |
| 29 | + * <QLDoc...> |
| 30 | + * |
| 31 | + * Extend this class to refine existing API models. If you want to model new APIs, |
| 32 | + * extend `MySpecialExpr::Range` instead. |
| 33 | + */ |
| 34 | +class MySpecialExpr extends Expr { |
| 35 | + MySpecialExpr::Range range; |
| 36 | +
|
| 37 | + MySpecialExpr() { this = range } |
| 38 | +
|
| 39 | + /** <QLDoc...> */ |
| 40 | + int memberPredicate() { result = range.memberPredicate() } |
| 41 | +} |
| 42 | +
|
| 43 | +/** Provides a class for modeling new <...> APIs. */ |
| 44 | +module MySpecialExpr { |
| 45 | + /** |
| 46 | + * <QLDoc...> |
| 47 | + * |
| 48 | + * Extend this class to model new APIs. If you want to refine existing API models, |
| 49 | + * extend `MySpecialExpr` instead. |
| 50 | + */ |
| 51 | + abstract class Range extends Expr { |
| 52 | + /** <QLDoc...> */ |
| 53 | + abstract int memberPredicate(); |
| 54 | + } |
| 55 | +} |
| 56 | +``` |
| 57 | +Now, a concrete subclass can derive from `MySpecialExpr::Range` if it wants to extend the set of values in `MySpecialExpr`, and it will be required to implement the abstract `memberPredicate()`. Conversely, if it wants to refine `MySpecialExpr` and override `memberPredicate` for all extensions, it can do so by deriving from `MySpecialExpr` directly. |
| 58 | + |
| 59 | +The key element of the pattern is to provide a field of type `MySpecialExpr::Range`, equating it to `this` in the characteristic predicate of `MySpecialExpr`. In member predicates, we can use either `this` or `range`, depending on which type has the API we need. |
| 60 | + |
| 61 | +</details> |
| 62 | + |
| 63 | +Note that in some libraries, the `range` field is in fact called `self`. While we do recommend using `range` for consistency, the name of the field does not matter (and using `range` avoids confusion in contexts like Python analysis that has strong usage of `self`). |
| 64 | + |
| 65 | +### Rationale |
| 66 | + |
| 67 | +Let's use an example from the Go libraries: https://github.com/github/codeql-go/blob/2ba9bbfd8ba1818b5ee9f6009c86a605189c9ef3/ql/src/semmle/go/Concepts.qll#L119-L157 |
| 68 | + |
| 69 | +`EscapeFunction`, as the name suggests, models various APIs that escape meta-characters. It has a member-predicate `kind()` that tells you what sort of escaping the modelled function does. For example, if the result of that predicate is `"js"`, then this means that the escaping function is meant to make things safe to embed inside JavaScript. |
| 70 | +`EscapeFunction::Range` is subclassed to model various APIs, and `kind()` is implemented accordingly. |
| 71 | +But we can also subclass `EscapeFunction` to, as in the above example, talk about all JS-escaping functions. |
| 72 | + |
| 73 | +You can, of course, do the same without the `::Range` pattern, but it's a little cumbersome: |
| 74 | +If you only had an `abstract class EscapeFunction { ... }`, then `JsEscapeFunction` would need to be implemented in a slightly tricky way to prevent it from extending `EscapeFunction` (instead of refining it). You would have to give it a charpred `this instanceof EscapeFunction`, which looks useless but isn't. And additionally, you'd have to provide trivial `none()` overrides of all the abstract predicates defined in `EscapeFunction`. This is all pretty awkward, and we can avoid it by distinguishing between `EscapeFunction` and `EscapeFunction::Range`. |
| 75 | + |
| 76 | + |
| 77 | +## Importing all subclasses of a class |
| 78 | + |
| 79 | +Importing new files can modify the behaviour of the standard library, by introducing new subtypes of `abstract` classes, by introducing new multiple inheritance relationships, or by overriding predicates. This can change query results and force evaluator cache misses. |
| 80 | + |
| 81 | +Therefore, unless you have good reason not to, you should ensure that all subclasses are included when the base-class is (to the extent possible). |
| 82 | + |
| 83 | +One example where this _does not_ apply: `DataFlow::Configuration` and its variants are meant to be subclassed, but we generally do not want to import all configurations into the same scope at once. |
| 84 | + |
| 85 | + |
| 86 | +## Abstract classes as open or closed unions |
| 87 | + |
| 88 | +A class declared as `abstract` in QL represents a union of its direct subtypes (restricted by the intersections of its supertypes and subject to its characteristic predicate). Depending on context, we may want this union to be considered "open" or "closed". |
| 89 | + |
| 90 | +An open union is generally used for extensibility. For example, the abstract classes suggested by the `::Range` design pattern are explicitly intended as extension hooks. As another example, the `DataFlow::Configuration` design pattern provides an abstract class that is intended to be subclassed as a configuration mechanism. |
| 91 | + |
| 92 | +A closed union is a class for which we do not expect users of the library to add more values. Historically, we have occasionally modelled this as `abstract` classes in QL, but these days that would be considered an anti-pattern: Abstract classes that are intended to be closed behave in surprising ways when subclassed by library users, and importing libraries that include derived classes can invalidate compilation caches and subvert the meaning of the program. |
| 93 | + |
| 94 | +As an example, suppose we want to define a `BinaryExpr` class, which has subtypes of `PlusExpr`, `MinusExpr`, and so on. Morally, this represents a closed union: We do not anticipate new kinds of `BinaryExpr` being added. Therefore, it would be undesirable to model it as an abstract class: |
| 95 | + |
| 96 | +```ql |
| 97 | +/** ANTI-PATTERN */ |
| 98 | +abstract class BinaryExpr extends Expr { |
| 99 | + Expr getLhs() { result = this.getChild(0) } |
| 100 | + Expr getRight() { result = this.getChild(1) } |
| 101 | +} |
| 102 | +
|
| 103 | +class PlusExpr extends BinaryExpr {} |
| 104 | +class MinusExpr extends BinaryExpr {} |
| 105 | +... |
| 106 | +``` |
| 107 | + |
| 108 | +Instead, the `BinaryExpr` class should be non-`abstract`, and we have the following options for specifying its extent: |
| 109 | + |
| 110 | +- Define a dbscheme type `@binary_expr = @plus_expr | @minus_expr | ...` and add it as an additional super-class for `BinaryExpr`. |
| 111 | +- Define a type alias `class RawBinaryExpr = @plus_expr | @minus_expr | ...` and add it as an additional super-class for `BinaryExpr`. |
| 112 | +- Add a characteristic predicate of `BinaryExpr() { this instanceof PlusExpr or this instanceof MinusExpr or ... }`. |
0 commit comments