前言

这是南大软件分析课程的第六个实验,需要在实验A5的基础上完成上下文敏感的过程间指针分析。详细实验说明见官方网站。涉及到的主要课程知识如下:

这个实验包含两部分,首先实现的是上下文敏感指针分析的主要逻辑,然后再实现课上讲到的三种不同的上下文敏感策略。个人感觉前半部分不是很难,只需要在上一节代码的基础上根据算法老老实实添加对应的上下文即可。在后半部分,我一开始倒是迷糊了好久:call-site sensitivity的实现不难,但是object sensitivity和type sensitivity的实现一直存在问题。经过一段时间的调试摸索,我解决了k=1的实现问题。在参考了相关博客GitHub仓库后,我完成了正确的实现,并通过了本地和OJ平台的所有测试用例。此外,搜索过程中发现的另一个GitHub仓库也不错,针对八个实验分别给出了有用的提示。

与上一节类似,本节实验也新增了对静态字段、数组索引和静态方法调用的支持。

实验用到的过程间指针分析算法如下图所示:

image-20230409161534157

新增的分析规则

我们在第12课中的分析规则表的基础上新增针对静态字段、数组索引和静态方法调用的规则,在这里给出一张完整的表,方便参考:

种类 语句 规则 PFG边
New i: x = new T() $\frac{}{c:o_{i} \in pt(c:x)}$ N/A
Assign x = y $\frac{c’:o_{i} \in pt(c:y)}{c’:o_{i} \in pt(c:x)}$ $c:x \leftarrow c:y$
Store x.f = y $\frac{c’:o_{i} \in pt(c:x),\ c’’:o_{j} \in pt(c:y)}{c’’:o_{j} \in pt(c’:o_{i}.f)}$ $c’:o_{i}.f \leftarrow c:y$
Load y = x.f $\frac{c’:o_{i} \in pt(c:x),\ c’’:o_{j} \in pt(c’:o_{i}.f)}{c’’:o_{j} \in pt(c:y)}$ $c:y \leftarrow c’:o_{i}.f$
Call l: r = x.k(a1, ..., an) image-20230406215641893 $c:a1 \rightarrow c^{t}:m_{p1} \\ … \\ c:an \rightarrow c^{t}:m_{pn} \\ c:r \leftarrow c^{t}:m_{ret}$
Static Store T.f = y $\frac{c’:o_{i} \in pt(c:y)}{c’:o_{i} \in pt(T.f)}$ $T.f \leftarrow c:y$
Static Load y = T.f $\frac{c’:o_{i} \in pt(T.f)}{c’:o_{i} \in pt(c:y)}$ $c:y \leftarrow T.f$
Array Store x[i] = y $\frac{c’:o_{u} \in pt(c:x),\ c’’:o_{v} \in pt(c:y)}{c’’:o_{v} \in pt(c’:o_{u}[*])}$ $c’:o_{u}[*] \leftarrow c:y$
Array Load y = x[i] $\frac{c’:o_{u} \in pt(c:x),\ c’’:o_{v} \in pt(c’:o_{u}[*])}{c’’:o_{v} \in pt(c:y)}$ $c:y \leftarrow c’:o_{u}[*]$
Static Call l: r = T.m(a1, ..., an) image-20240412152442573 image-20240412152703541

实现上下文敏感指针分析

接下来,结合算法和实验手册分步实现即可。

void addReachable(CSMethod)

与上节实验不同的是,我们要向StmtProcessor中传入上下文信息,需要为每个可达方法创建一个独立的StmtProcessor实例:

private void addReachable(CSMethod csMethod) {
    if (callGraph.contains(csMethod)) {
        return;
    }
    callGraph.addReachableMethod(csMethod);
    StmtProcessor stmtProcessor = new StmtProcessor(csMethod);
    csMethod.getMethod().getIR().forEach(stmt -> {
        stmt.accept(stmtProcessor);
    });
}

语句处理的核心逻辑实现代码位于StmtProcessor中,注意设置好上下文即可:

private class StmtProcessor implements StmtVisitor<Void> {
    private final CSMethod csMethod;
    private final Context context;
    private StmtProcessor(CSMethod csMethod) {
        this.csMethod = csMethod;
        this.context = csMethod.getContext();
    }
    public Void visit(Invoke stmt) {
        if (stmt.isStatic()) { // static invocation
            JMethod m = resolveCallee(null, stmt);
            CSCallSite csCallSite = csManager.getCSCallSite(context, stmt);
            Context ctx = contextSelector.selectContext(csCallSite, m); // c_t
            CSMethod targetMethod = csManager.getCSMethod(ctx, m);
            Edge<CSCallSite, CSMethod> edge = new Edge<>(CallKind.STATIC, csCallSite, targetMethod);
            if (callGraph.addEdge(edge)) { // the call graph changes
                addReachable(targetMethod);
                for (int i = 0; i < m.getParamCount(); i++) {
                    CSVar arg = csManager.getCSVar(context, stmt.getRValue().getArg(i));
                    CSVar param = csManager.getCSVar(ctx, m.getIR().getParam(i));
                    addPFGEdge(arg, param);
                }
                Var lVar = stmt.getLValue();
                if (lVar != null) {
                    CSVar lv = csManager.getCSVar(context, lVar);
                    for (Var ret: m.getIR().getReturnVars()) {
                        CSVar csRet = csManager.getCSVar(ctx, ret);
                        addPFGEdge(csRet, lv);
                    }
                }
            }
        }
        return null;
    }
    public Void visit(New stmt) {
        CSVar csVar = csManager.getCSVar(context, stmt.getLValue());
        Obj o = heapModel.getObj(stmt);
        Context ctx = contextSelector.selectHeapContext(csMethod, o);
        CSObj csObj = csManager.getCSObj(ctx, o);
        PointsToSet pts = PointsToSetFactory.make(csObj);
        workList.addEntry(csVar, pts);
        return null;
    }
    public Void visit(Copy stmt) {
        CSVar from = csManager.getCSVar(context, stmt.getRValue());
        CSVar to = csManager.getCSVar(context, stmt.getLValue());
        addPFGEdge(from, to);
        return null;
    }
    public Void visit(StoreField stmt) {
        JField field = stmt.getFieldRef().resolve();
        if (field.isStatic()) {
            CSVar from = csManager.getCSVar(context, stmt.getRValue());
            StaticField to = csManager.getStaticField(field);
            addPFGEdge(from, to);
        }
        return null;
    }
    public Void visit(LoadField stmt) {
        JField field = stmt.getFieldRef().resolve();
        if (field.isStatic()) {
            CSVar to = csManager.getCSVar(context, stmt.getLValue());
            StaticField from = csManager.getStaticField(field);
            addPFGEdge(from, to);
        }
        return null;
    }
    public Void visit(StoreArray stmt) {
        return StmtVisitor.super.visit(stmt);
    }
    public Void visit(LoadArray stmt) {
        return StmtVisitor.super.visit(stmt);
    }
}

void addPFGEdge(Pointer, Pointer)

对应算法的AddEdge函数,比较简单:

private void addPFGEdge(Pointer source, Pointer target) {
    if (pointerFlowGraph.addEdge(source, target)) {
        if (!source.getPointsToSet().isEmpty()) {
            workList.addEntry(target, source.getPointsToSet());
        }
    }
}

void analyze()

对应Solve函数的主要部分——while循环:

private void analyze() {
    while (!workList.isEmpty()) {
        WorkList.Entry entry = workList.pollEntry();
        PointsToSet delta = propagate(entry.pointer(), entry.pointsToSet());
        if (entry.pointer() instanceof CSVar csVar) {
            Var var = csVar.getVar();
            delta.forEach(csObj -> {
                var.getStoreFields().forEach(stmt -> {
                    CSVar from = csManager.getCSVar(csVar.getContext(), stmt.getRValue());
                    JField f = stmt.getFieldRef().resolve();
                    InstanceField to = csManager.getInstanceField(csObj, f);
                    addPFGEdge(from, to);
                });
                var.getLoadFields().forEach(stmt -> {
                    CSVar to = csManager.getCSVar(csVar.getContext(), stmt.getLValue());
                    JField f = stmt.getFieldRef().resolve();
                    InstanceField from = csManager.getInstanceField(csObj, f);
                    addPFGEdge(from, to);
                });
                var.getStoreArrays().forEach(stmt -> {
                    CSVar from = csManager.getCSVar(csVar.getContext(), stmt.getRValue());
                    ArrayIndex to = csManager.getArrayIndex(csObj);
                    addPFGEdge(from, to);
                });
                var.getLoadArrays().forEach(stmt -> {
                    CSVar to = csManager.getCSVar(csVar.getContext(), stmt.getLValue());
                    ArrayIndex from = csManager.getArrayIndex(csObj);
                    addPFGEdge(from, to);
                });
                processCall(csVar, csObj);
            });
        }
    }
}

PointsToSet propagate(Pointer, PointsToSet)

这里对应算法的Propagate函数:

private PointsToSet propagate(Pointer pointer, PointsToSet pointsToSet) {
    PointsToSet delta = PointsToSetFactory.make();
    pointsToSet.forEach(csobj -> {
        if (!pointer.getPointsToSet().contains(csobj)) {
            delta.addObject(csobj);
        }
    });
    if (!delta.isEmpty()) {
        delta.forEach(csobj -> {
            pointer.getPointsToSet().addObject(csobj);
        });
        Set<Pointer> succ = pointerFlowGraph.getSuccsOf(pointer);
        succ.forEach(pointer1 -> {
            workList.addEntry(pointer1, delta);
        });
    }
    return delta;
}

void processCall(CSVar, CSObj)

这里是算法中ProcessCall函数的实现:

private void processCall(CSVar recv, CSObj recvObj) {
    Var var = recv.getVar();
    var.getInvokes().forEach(stmt -> {
        JMethod m = resolveCallee(recvObj, stmt);
        CSCallSite csCallSite = csManager.getCSCallSite(recv.getContext(), stmt);
        Context ctx = contextSelector.selectContext(csCallSite, recvObj, m);
        CSMethod targetMethod = csManager.getCSMethod(ctx, m);
        CSVar mThis = csManager.getCSVar(ctx, m.getIR().getThis());
        workList.addEntry(mThis, PointsToSetFactory.make(recvObj));
        Edge<CSCallSite, CSMethod> edge = null;
        if (stmt.isDynamic()) {
            edge = new Edge<>(CallKind.DYNAMIC, csCallSite, targetMethod);
        } else if (stmt.isInterface()) {
            edge = new Edge<>(CallKind.INTERFACE, csCallSite, targetMethod);
        } else if (stmt.isSpecial()) {
            edge = new Edge<>(CallKind.SPECIAL, csCallSite, targetMethod);
        } else if (stmt.isVirtual()) {
            edge = new Edge<>(CallKind.VIRTUAL, csCallSite, targetMethod);
        }
        if (callGraph.addEdge(edge)) {
            addReachable(targetMethod);
            for (int i = 0; i < m.getParamCount(); i++) {
                CSVar arg = csManager.getCSVar(recv.getContext(), stmt.getRValue().getArg(i));
                CSVar param = csManager.getCSVar(ctx, m.getIR().getParam(i));
                addPFGEdge(arg, param);
            }
            Var lVar = stmt.getLValue();
            if (lVar != null) {
                CSVar lv = csManager.getCSVar(recv.getContext(), lVar);
                for (Var ret: m.getIR().getReturnVars()) {
                    CSVar csRet = csManager.getCSVar(ctx, ret);
                    addPFGEdge(csRet, lv);
                }
            }
        }
    });
}

实现常见的上下文敏感策略

这部分内容包含call-site sensitivity、object sensitivity和type sensitivity的上下文选择器的代码实现。_1开头的文件是1层上下文的对应实现,_2开头的文件是2层上下文的对应实现。

_1CallSelector.java

下面是1层call-site sensitivity的实现:

public class _1CallSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        return ListContext.make(callSite.getCallSite());
    }
    @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        return ListContext.make(callSite.getCallSite());
    } 
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        return getEmptyContext();
    }
}

_1ObjSelector.java

下面是1层object sensitivity的实现:

public class _1ObjSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    }
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        return callSite.getContext();
    }
    @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        return ListContext.make(recv.getObject());
    }
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        return getEmptyContext();
    }
}

_1TypeSelector.java

下面是1层type sensitivity的实现:

public class _1TypeSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        return callSite.getContext();
    }
    @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
       return ListContext.make(recv.getObject().getContainerType());
    } 
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        return getEmptyContext();
    }
}

一开始我没有弄清楚一些方法的意义,在处理实例方法的selectContext时是这样写的:

return ListContext.make(recv.getObject().getAllocation().getClass());

在打印输出、断点调试了很久之后,终于锁定到这个语句——为了获取object的allocation-site所在class,应该调用.getObject().getContainerType(),而非.getObject().getAllocation().getClass()

_2CallSelector.java

下面是2层call-site sensitivity的实现:

public class _2CallSelector implements ContextSelector {
    public Context getEmptyContext() {
        return ListContext.make();
    }
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        int length = callSite.getContext().getLength();
        if (length > 0) {
            return ListContext.make(callSite.getContext().getElementAt(length - 1), callSite.getCallSite());
        } else {
            return ListContext.make(callSite.getCallSite());
        }
    }
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        int length = callSite.getContext().getLength();
        if (length > 0) {
            return ListContext.make(callSite.getContext().getElementAt(length - 1), callSite.getCallSite());
        } else {
            return ListContext.make(callSite.getCallSite());
        }
    }
    public Context selectHeapContext(CSMethod method, Obj obj) {
        int length = method.getContext().getLength();
        if (length > 0) {
            return ListContext.make(method.getContext().getElementAt(length - 1));
        } else {
            return getEmptyContext();
        }
    }
}

_2ObjSelector.java

这里的实现花了一些时间。一开始我的思维还受callsite的影响,实现了下面这样的错误代码:

public class _2ObjSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        return callSite.getContext();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        int length = callSite.getContext().getLength();
        if (length > 0) {
            return ListContext.make(callSite.getContext().getElementAt(length - 1), recv.getObject());
        } else {
            return ListContext.make(recv.getObject());
        }
    } 
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        return ListContext.make(obj);
    }
}

后来在参考了别人的实现后,我发现自己还是没有真正领会算法中Select的意义。不同的上下文敏感策略依赖的Select的参数不同!下面是修正后的版本:

public class _2ObjSelector implements ContextSelector { 
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        // TODO - finish me - Done
        int length = callSite.getContext().getLength();
        if (length > 0) {
            if (length == 1) {
                return ListContext.make(callSite.getContext().getElementAt(0));
            } else {
                return ListContext.make(callSite.getContext().getElementAt(length - 2), callSite.getContext().getElementAt(length - 1));
            }
        }
        return getEmptyContext();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        int length = recv.getContext().getLength();
        if (length > 0) {
            return ListContext.make(recv.getContext().getElementAt(length - 1), recv.getObject());
        }
        return ListContext.make(recv.getObject());
    } 
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        int length = method.getContext().getLength();
        if (length > 0) {
            return ListContext.make(method.getContext().getElementAt(length - 1));
        }
        return getEmptyContext();
    }
}

_2TypeSelector.java

_2ObjSelector.java中的问题一样,一开始我的实现是错误的:

public class _2TypeSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    }
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        return callSite.getContext();
    }
     @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
        int length = callSite.getContext().getLength();
        if (length > 0) {
            return ListContext.make(callSite.getContext().getElementAt(length - 1), recv.getObject().getContainerType());
        } else {
            return ListContext.make(recv.getObject().getContainerType());
        }
    }
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        return ListContext.make(obj.getContainerType());
    }
}

修正后的版本如下:

public class _2TypeSelector implements ContextSelector {
    @Override
    public Context getEmptyContext() {
        return ListContext.make();
    } 
    @Override
    public Context selectContext(CSCallSite callSite, JMethod callee) {
        int length = callSite.getContext().getLength();
        if (length > 0) {
            if (length == 1) {
                return ListContext.make(callSite.getContext().getElementAt(0));
            } else {
                return ListContext.make(callSite.getContext().getElementAt(length - 2), callSite.getContext().getElementAt(length - 1));
            }
        }
        return getEmptyContext();
    }
     @Override
    public Context selectContext(CSCallSite callSite, CSObj recv, JMethod callee) {
         int length = recv.getContext().getLength();
        if (length > 0) {
            return ListContext.make(recv.getContext().getElementAt(length - 1), recv.getObject().getContainerType());
        }
        return ListContext.make(recv.getObject().getContainerType());
    }
    @Override
    public Context selectHeapContext(CSMethod method, Obj obj) {
        int length = method.getContext().getLength();
        if (length > 0) {
            return ListContext.make(method.getContext().getElementAt(length - 1));
        }
        return getEmptyContext();
    }
}

总结与思考

纸上得来终觉浅,绝知此事要躬行!