@@ -1151,6 +1151,40 @@ predicate normalCallArg(CallNode call, Node arg, ArgumentPosition apos) {
11511151 *
11521152 * Note: If `Bar.meth` and `Foo.meth` resolves to the same function, we will end up
11531153 * sending both `self` arguments to that function, which is by definition the right thing to do.
1154+ *
1155+ * ### Bound methods
1156+ *
1157+ * For bound methods, such as `bm = x.m; bm()`, it's a little unclear whether we should
1158+ * still use the object in the attribute lookup (`x.m`) as the self argument in the
1159+ * call (`bm()`). We currently do this, but there might also be cases where we don't
1160+ * want to do this.
1161+ *
1162+ * In the example below, we want to clear taint from the list before it reaches the
1163+ * sink, but because we don't have a use of `l` in the `clear()` call, we currently
1164+ * don't have any way to achieve our goal. (Note that this is a contrived example)
1165+ *
1166+ * ```py
1167+ * l = list()
1168+ * clear = l.clear
1169+ * l.append(tainted)
1170+ * clear()
1171+ * sink(l)
1172+ * ```
1173+ *
1174+ * To make the above even worse, bound-methods have a `__self__` property that refers to
1175+ * the object of the bound-method, so we can re-write the code as:
1176+ *
1177+ * ```py
1178+ * l = list()
1179+ * clear = l.clear
1180+ * clear.__self__.append(tainted)
1181+ * clear()
1182+ * sink(l)
1183+ * ```
1184+ *
1185+ * One idea to solve this is to track the object in a synthetic data-flow node every
1186+ * time the bound method is used, such that the `clear()` call would essentially be
1187+ * translated into `l.clear()`, and we can still have use-use flow.
11541188 */
11551189cached
11561190predicate getCallArg ( CallNode call , Function target , CallType type , Node arg , ArgumentPosition apos ) {
@@ -1160,16 +1194,24 @@ predicate getCallArg(CallNode call, Function target, CallType type, Node arg, Ar
11601194 type instanceof CallTypePlainFunction and
11611195 normalCallArg ( call , arg , apos )
11621196 or
1163- // self argument for normal method calls
1197+ // self argument for normal method calls -- see note above about bound methods
11641198 type instanceof CallTypeNormalMethod and
11651199 apos .isSelf ( ) and
11661200 resolveMethodCall ( call , target , type , arg ) and
1167- // dataflow lib has requirement that arguments and calls are in same enclosing callable.
1201+ // dataflow lib has requirement that arguments and calls are in same enclosing
1202+ // callable. This requirement would be broken if we used `my_obj` as the self
1203+ // argument in the `f()` call in the example below:
1204+ // ```py
1205+ // def call_func(f):
1206+ // f()
1207+ //
1208+ // call_func(my_obj.some_method)
1209+ // ```
11681210 exists ( CfgNode cfgNode | cfgNode .getNode ( ) = call |
11691211 cfgNode .getEnclosingCallable ( ) = arg .getEnclosingCallable ( )
11701212 )
11711213 or
1172- // cls argument for classmethod calls
1214+ // cls argument for classmethod calls -- see ntoe above about bound methods
11731215 type instanceof CallTypeClassMethod and
11741216 apos .isSelf ( ) and
11751217 resolveMethodCall ( call , target , type , arg ) and
0 commit comments