Commit eccaccf

yoff and Copilot committed
Python: visit function parameter and return annotations in new CFG
The new (shared-CFG-based) Python control flow graph in `semmle.python.controlflow.internal.Cfg` previously did not emit CFG nodes for parameter type annotations (`def f(x: T): ...`) or for the return type annotation (`-> T`). The legacy CFG emitted both, and a small number of framework models rely on this: `LocalSources.qll`'s `annotatedInstance` walks the parameter annotation expression by way of its CFG node to track that a parameter receives an instance of the annotated class.

After the dataflow flip to the new CFG/SSA, this regression manifested as lost flows in any test exercising annotation-based parameter tracking: FastAPI `Depends()` receivers, Pydantic request bodies, Starlette `WebSocket`, the call-graph type-annotation test, and so on.

Extend `FunctionDefExpr` to visit each annotation as a child of the function-def expression, in CPython evaluation order: positional parameter annotations, `*args` annotation, keyword-only parameter annotations, `**kwargs` annotation, then the return annotation. (Lambda expressions have no annotations in Python syntax, so `LambdaExpr` is unchanged.)

PEP 695 type parameters remain out of scope; they belong to the inner annotation scope, not the enclosing CFG.

Restored test results across `framework/aiohttp`, `framework/fastapi`, `framework/lxml`, the `CallGraph-type-annotations` test, and `CWE-022-PathInjection`. Two FastAPI list-comprehension MISSING markers become positive (`taint_test.py:41,55`). CPython CFG consistency remains clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
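The ordering described in the commit message can be checked against a running interpreter. The following is a minimal sketch, not part of the commit: `note`, `log`, and `handler` are hypothetical names introduced only for illustration. It records when each default and annotation expression runs. It reads `__annotations__` after the definition because interpreters that defer annotation evaluation (Python 3.14 and later, per PEP 649) only run annotation expressions on first access. Positional-only parameters are deliberately left out of the sketch.

```python
log = []


def note(label, value=None):
    """Record that an expression was evaluated, then return its value."""
    log.append(label)
    return value


def handler(
    a: note("ann:a", int) = note("default:a", 0),
    *args: note("ann:*args", int),
    key: note("ann:key", str) = note("kwdefault:key", ""),
    **kwargs: note("ann:**kwargs", object),
) -> note("ann:return", bool):
    return True


# Force evaluation on interpreters that evaluate annotations lazily;
# on older interpreters this is a plain attribute read.
annotation_names = list(handler.__annotations__)

for entry in log:
    print(entry)
```

The recorded log lists the positional default, then the keyword-only default, then the annotations in the order the commit message gives: positional parameters, `*args`, keyword-only parameters, `**kwargs`, and finally the return annotation.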
1 parent cd59431 commit eccaccf

2 files changed

Lines changed: 63 additions & 4 deletions

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+category: minorAnalysis
+---
+* The new (shared-CFG-based) Python control flow graph now visits parameter and return type annotations as CFG nodes for function definitions, matching the legacy CFG. This restores annotation-based type tracking through framework models such as FastAPI's `Depends()`, Pydantic request models, Starlette `WebSocket` handlers, and any other models that flow a class reference through `Parameter.getAnnotation()` to identify instances of the annotated class.
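To make the pattern in this change note concrete, here is a toy sketch of the kind of Python code whose analysis depends on annotation nodes. It is an illustration under stated assumptions, not how FastAPI or any real framework is implemented: `Session`, `handler`, and `call_with_annotated_instances` are hypothetical names, and the "framework" simply instantiates whatever class each parameter is annotated with.

```python
import inspect


class Session:
    """Hypothetical class standing in for a framework-provided type."""


def handler(session: Session):
    # The parameter receives an instance of the class named in its annotation.
    return type(session).__name__


def call_with_annotated_instances(func):
    """Toy stand-in for a framework: build each argument from its annotation."""
    params = inspect.signature(func).parameters
    kwargs = {name: param.annotation() for name, param in params.items()}
    return func(**kwargs)


result = call_with_annotated_instances(handler)
print(result)
```

The only link between `Session` and the value bound to `session` is the annotation expression. A static analysis that has no node for that expression has nothing to follow from the class reference to the parameter.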

python/ql/lib/semmle/python/controlflow/internal/AstNodeImpl.qll

Lines changed: 59 additions & 4 deletions
@@ -1463,10 +1463,19 @@ module Ast implements AstSig<Py::Location> {

   /**
    * A function definition expression (visits positional and keyword
-   * defaults, but NOT PEP 695 type parameters — those bind in an
-   * annotation scope that nests the function body, so they belong to
-   * the inner scope's CFG, not the enclosing scope's; the legacy CFG
-   * also omitted them).
+   * defaults followed by parameter and return type annotations, but NOT
+   * PEP 695 type parameters — those bind in an annotation scope that
+   * nests the function body, so they belong to the inner scope's CFG,
+   * not the enclosing scope's; the legacy CFG also omitted them).
+   *
+   * Evaluation order follows CPython: defaults are pushed first, then
+   * keyword-only defaults, then annotations (the `__annotations__` dict
+   * is built last, before `MAKE_FUNCTION`). Annotations are emitted as
+   * CFG nodes so that flows from a class reference into a parameter's
+   * type annotation are visible to dataflow (e.g. so that framework
+   * models like FastAPI's `Depends()` can use a parameter's type hint
+   * to track that the parameter receives an instance of the annotated
+   * class — see `LocalSources::annotatedInstance`).
    */
   additional class FunctionDefExpr extends Expr {
     private Py::FunctionExpr funcExpr;
@@ -1490,15 +1499,61 @@ module Ast implements AstSig<Py::Location> {
         rank[n + 1](Py::Expr d, int i | d = funcExpr.getArgs().getKwDefault(i) | d order by i)
     }

+    /**
+     * Gets the `n`th annotation expression, in CPython evaluation
+     * order: positional parameter annotations (by argument position),
+     * `*args` annotation, keyword-only parameter annotations (by
+     * argument position), `**kwargs` annotation, then the return
+     * annotation. Each annotation appears at most once.
+     */
+    Expr getAnnotation(int n) {
+      result.asExpr() =
+        rank[n + 1](Py::Expr a, int subOrder, int subIndex |
+          functionAnnotation(funcExpr, a, subOrder, subIndex)
+        |
+          a order by subOrder, subIndex
+        )
+    }
+
     int getNumberOfDefaults() { result = count(funcExpr.getArgs().getADefault()) }

+    int getNumberOfKwDefaults() { result = count(funcExpr.getArgs().getAKwDefault()) }
+
+    int getNumberOfAnnotations() {
+      result = count(Py::Expr a | functionAnnotation(funcExpr, a, _, _))
+    }
+
     override AstNode getChild(int index) {
       result = this.getDefault(index)
       or
       result = this.getKwDefault(index - this.getNumberOfDefaults())
+      or
+      result = this.getAnnotation(index - this.getNumberOfDefaults() - this.getNumberOfKwDefaults())
     }
   }

+  /**
+   * Holds if `a` is an annotation of `funcExpr` in slot
+   * `(subOrder, subIndex)`. Slots are CPython evaluation order:
+   * positional param annotations (subOrder 0, subIndex = argument
+   * position), `*args` annotation (1, 0), keyword-only annotations
+   * (2, position), `**kwargs` annotation (3, 0), return annotation
+   * (4, 0).
+   */
+  private predicate functionAnnotation(
+    Py::FunctionExpr funcExpr, Py::Expr a, int subOrder, int subIndex
+  ) {
+    a = funcExpr.getArgs().getAnnotation(subIndex) and subOrder = 0
+    or
+    a = funcExpr.getArgs().getVarargannotation() and subOrder = 1 and subIndex = 0
+    or
+    a = funcExpr.getArgs().getKwAnnotation(subIndex) and subOrder = 2
+    or
+    a = funcExpr.getArgs().getKwargannotation() and subOrder = 3 and subIndex = 0
+    or
+    a = funcExpr.getReturns() and subOrder = 4 and subIndex = 0
+  }
+
   /** A lambda expression (has default args evaluated at definition time). */
   additional class LambdaExpr extends Expr {
     private Py::Lambda lambda;
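The slot-and-rank scheme in `functionAnnotation` and `getAnnotation`, and the child layout in `getChild`, can be mirrored outside QL. The sketch below is a rough Python analogue built on the standard `ast` module, not a translation of the library code: `annotation_slots`, `ranked_annotations`, and `children` are hypothetical helpers, and treating positional-only and regular parameters as one positional list is an assumption of the sketch.

```python
import ast


def annotation_slots(func):
    """Yield (sub_order, sub_index, expr) for each annotation present."""
    args = func.args
    # Assumption of this sketch: one combined list of positional parameters.
    for index, param in enumerate(args.posonlyargs + args.args):
        if param.annotation is not None:
            yield (0, index, param.annotation)
    if args.vararg is not None and args.vararg.annotation is not None:
        yield (1, 0, args.vararg.annotation)
    for index, param in enumerate(args.kwonlyargs):
        if param.annotation is not None:
            yield (2, index, param.annotation)
    if args.kwarg is not None and args.kwarg.annotation is not None:
        yield (3, 0, args.kwarg.annotation)
    if func.returns is not None:
        yield (4, 0, func.returns)


def ranked_annotations(func):
    """Sort by slot, giving dense 0-based positions like `rank[n + 1]`."""
    slots = sorted(annotation_slots(func), key=lambda slot: slot[:2])
    return [expr for _, _, expr in slots]


def children(func):
    """Defaults, then keyword-only defaults, then annotations."""
    kw_defaults = [d for d in func.args.kw_defaults if d is not None]
    return list(func.args.defaults) + kw_defaults + ranked_annotations(func)


SOURCE = "def f(a, b: int = 1, *args: str, k: float = 2, **kw: bytes) -> bool: pass"
func = ast.parse(SOURCE).body[0]

annotation_texts = [ast.unparse(expr) for expr in ranked_annotations(func)]
child_texts = [ast.unparse(expr) for expr in children(func)]
print(annotation_texts)
print(child_texts)
```

Parameter `a` has no annotation, so the annotation of `b` sits in slot `(0, 1)` yet is ranked first. Ranking the sparse slots is what keeps the child indices contiguous, which is why `getChild` can subtract the number of defaults and keyword-only defaults and pass the remainder straight to `getAnnotation`.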
