Skip to content

Commit 47c1281

Browse files
committed
C++: IR DataFlowUtil::modelFlow join order fix
We had these tuple counts on ElektraInitiative/libelektra (note that the `modelFlow` predicate got inlined into `simpleInstructionLocalFlowStep`): (652s) Tuple counts for DataFlowUtil::simpleInstructionLocalFlowStep#ff: ... 19701 ~1% {4} r27 = JOIN r26 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r26.<2>, r26.<1>, r26.<0> 7908 ~0% {3} r28 = JOIN r27 WITH SSAConstruction::Cached::getInstructionIndex#ff@staged_ext AS R ON FIRST 2 OUTPUT r27.<0>, r27.<2>, r27.<3> 4023 ~0% {3} r29 = JOIN r28 WITH Instruction::WriteSideEffectInstruction#class#ff AS R ON FIRST 1 OUTPUT r28.<1>, r28.<2>, r28.<0> ... 1060807009 ~3% {3} r34 = JOIN r33 WITH SSAConstruction::Cached::getInstructionIndex#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r33.<1>, r33.<2> 15670 ~5% {2} r35 = JOIN r34 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 2 OUTPUT r34.<0>, r34.<2> 7973 ~0% {2} r36 = JOIN r35 WITH Instruction::ReadSideEffectInstruction::getSideEffectOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r35.<1> ... In this predicate there are two cases (`WriteSideEffectInstruction` and `ReadSideEffectInstruction`) where we need to join on both the call and the argument index of a side effect. It works well enough for the first case, `WriteSideEffectInstruction`, where the call is joined on before the index, but it explodes in the second case, `ReadSideEffectInstruction`, where the index is joined first. To fix the second case, and to guard against future optimizer accidents in the first case, this commit changes both of those cases to use a new helper predicate that makes it possible to join on both columns at once. The resulting tuple counts are: (3s) Tuple counts for DataFlowUtil::simpleInstructionLocalFlowStep#ff: ... 7908 ~0% {3} r27 = JOIN r26 WITH DataFlowUtil::getSideEffectFor#fff AS R ON FIRST 2 OUTPUT R.<2>, r26.<2>, r26.<0> 4023 ~0% {3} r28 = JOIN r27 WITH Instruction::WriteSideEffectInstruction#class#ff AS R ON FIRST 1 OUTPUT r27.<1>, r27.<2>, r27.<0> ... 15670 ~5% {2} r33 = JOIN r32 WITH DataFlowUtil::getSideEffectFor#fff AS R ON FIRST 2 OUTPUT R.<2>, r32.<2> 7973 ~0% {2} r34 = JOIN r33 WITH Instruction::ReadSideEffectInstruction::getSideEffectOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r33.<1> ... The bulge is now limited to a factor of two, and that's just because I didn't write separate versions of `getSideEffectFor` for `ReadSideEffectInstruction` and `WriteSideEffectInstruction`.
1 parent 09960e0 commit 47c1281

File tree

1 file changed

+16
-5
lines changed

1 file changed

+16
-5
lines changed

cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll

Lines changed: 16 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -365,10 +365,10 @@ private predicate modelFlow(Instruction iFrom, Instruction iTo) {
365365
modelOut.isReturnValueDeref() and
366366
iTo = call
367367
or
368-
exists(WriteSideEffectInstruction outNode |
369-
modelOut.isParameterDeref(outNode.getIndex()) and
368+
exists(int index, WriteSideEffectInstruction outNode |
369+
modelOut.isParameterDeref(index) and
370370
iTo = outNode and
371-
outNode.getPrimaryInstruction() = call
371+
outNode = getSideEffectFor(call, index)
372372
)
373373
// TODO: add write side effects for qualifiers
374374
) and
@@ -380,8 +380,7 @@ private predicate modelFlow(Instruction iFrom, Instruction iTo) {
380380
or
381381
exists(int index, ReadSideEffectInstruction read |
382382
modelIn.isParameterDeref(index) and
383-
read.getIndex() = index and
384-
read.getPrimaryInstruction() = call and
383+
read = getSideEffectFor(call, index) and
385384
iFrom = read.getSideEffectOperand().getAnyDef()
386385
)
387386
or
@@ -392,6 +391,18 @@ private predicate modelFlow(Instruction iFrom, Instruction iTo) {
392391
)
393392
}
394393

394+
/**
395+
* Holds if the result is a side effect for instruction `call` on argument
396+
* index `argument`. This helper predicate makes it easy to join on both of
397+
* these columns at once, avoiding pathological join orders in case the
398+
* argument index should get joined first.
399+
*/
400+
pragma[noinline]
401+
SideEffectInstruction getSideEffectFor(CallInstruction call, int argument) {
402+
call = result.getPrimaryInstruction() and
403+
argument = result.(IndexedInstruction).getIndex()
404+
}
405+
395406
/**
396407
* Holds if data flows from `source` to `sink` in zero or more local
397408
* (intra-procedural) steps.

0 commit comments

Comments
 (0)