
How to keep track of a variable with Clang's static analyzer?

Suppose I'm working with the following C snippet:

void inc(int *num) { (*num)++; }
void dec(int *num) { (*num)--; }

void f(int var) {
    inc(&var);
    dec(&var);
}

Using a static analyzer, I want to be able to tell whether the value of var is unchanged after the function executes. I know I have to keep its state on my own (that's the point of writing a Clang checker), but I'm having trouble getting a unique reference to this variable.

For example: if I use the following API

void MySimpleChecker::checkPostCall(const CallEvent &Call,
                                    CheckerContext &C) const {
    SymbolRef MyArg = Call.getArgSVal(0).getAsSymbol();
}

I'd expect it to return a pointer to this symbol's representation in my checker's context. However, MyArg is always null (0) when I use it this way. This happens for both inc and dec, in both the pre-call and post-call callbacks.

What am I missing here? What concepts did I get wrong?

Note: I'm currently reading the Clang CFE Internals Manual and I've read the excellent How to Write a Checker in 24 Hours material. I haven't found an answer so far.

ivarec asked May 03 '14 18:05


1 Answer

Interpretation of question

Specifically, you want to count the calls to inc and dec applied to each variable and report when they do not balance for some path in a function.

Generally, you want to know how to associate an abstract value, here a number, with a program variable, and be able to update and query that value along each execution path.

High-level answer

Whereas the tutorial checker SimpleStreamChecker.cpp associates an abstract value with the value stored in a variable, here we want to associate an abstract value with the variable itself. That is what IteratorChecker.cpp does when tracking containers, so I based my solution on it.

Within the static analyzer's abstract state, each variable is represented by a MemRegion object. So the first step is to make a map where MemRegion is the key:

REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)

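Next, when we have an SVal that corresponds to a pointer to a variable, we can use SVal::getAsRegion to get the corresponding MemRegion. (This is also why getAsSymbol returned null in the question: the argument &var evaluates to the concrete region of var, not to a symbolic value.) For instance, given a CallEvent, call, whose first argument is a pointer, we can do: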

    if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {

to get the region that the pointer points at.

Then, we can access our map using that region as its key:

      state = state->set<TrackVarMap>(region, newValue);
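
The analyzer's program state is immutable: set returns a new state and leaves the old one untouched, which is what lets each execution path carry its own counts. Here is a toy model of that idea in plain C++. It is only a sketch and does not use the Clang API: Region, State, and bump are hypothetical names, and a variable's address stands in for its MemRegion.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for `MemRegion const *`: the tracked variable's address.
using Region = const void *;

// Hypothetical stand-in for ProgramState: a value that is never mutated in place.
struct State {
  std::map<Region, int> trackVarMap;

  // Look up the delta recorded for a region; nullptr if there is none.
  const int *get(Region r) const {
    auto it = trackVarMap.find(r);
    return it == trackVarMap.end() ? nullptr : &it->second;
  }

  // Return a copy with the entry updated; *this is left unchanged.
  State set(Region r, int value) const {
    State next = *this;
    next.trackVarMap[r] = value;
    return next;
  }
};

// Same update pattern as the checker: read the current delta
// (defaulting to 0), add the change, and store the result.
State bump(const State &s, Region r, int delta) {
  const int *cur = s.get(r);
  return s.set(r, (cur ? *cur : 0) + delta);
}
```

Calling bump(s0, &var, +1) and then bump(s1, &var, -1) yields a state whose entry for var is 0, while the earlier states still hold their own values.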

Finally, in checkDeadSymbols, we use SymbolReaper::isLiveRegion to detect when a region (variable) is going out of scope:

  const TrackVarMapTy &Map = state->get<TrackVarMap>();
  for (auto const &I : Map) {
    MemRegion const *region = I.first;
    int delta = I.second;
    if (SymReaper.isLiveRegion(region) || (delta==0))
      continue;              // Not dead, or unchanged; skip.
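
The filtering logic in that loop can be tried outside the analyzer. The sketch below is plain C++, not the Clang API: findUnbalanced is a hypothetical helper, a set of live addresses stands in for SymbolReaper::isLiveRegion, and variable addresses stand in for regions.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for `MemRegion const *`.
using Region = const void *;

// Return the regions that are no longer live and whose delta is non-zero.
// These are the entries the checker would report.
std::vector<Region> findUnbalanced(const std::map<Region, int> &deltas,
                                   const std::set<Region> &live) {
  std::vector<Region> out;
  for (const auto &entry : deltas) {
    Region region = entry.first;
    int delta = entry.second;
    if (live.count(region) != 0 || delta == 0)
      continue;  // Not dead, or unchanged; skip.
    out.push_back(region);
  }
  return out;
}
```

A region with a non-zero delta is skipped while it is still live, so the report is deferred until the variable actually goes away.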

Complete example

To demonstrate, here is a complete checker that reports unbalanced use of inc and dec:

// TrackVarChecker.cpp
// https://stackoverflow.com/questions/23448540/how-to-keep-track-of-a-variable-with-clangs-static-analyzer

#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"

using namespace clang;
using namespace ento;

namespace {
class TrackVarChecker
  : public Checker< check::PostCall,
                    check::DeadSymbols >
{
  mutable IdentifierInfo *II_inc, *II_dec;
  mutable std::unique_ptr<BuiltinBug> BT_modified;

public:
  TrackVarChecker() : II_inc(nullptr), II_dec(nullptr) {}

  void checkPostCall(CallEvent const &Call, CheckerContext &C) const;
  void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;
};
} // end anonymous namespace

// Map from memory region corresponding to a variable (that is, the
// variable itself, not its current value) to the difference between its
// current and original value.
REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)

void TrackVarChecker::checkPostCall(CallEvent const &call, CheckerContext &C) const
{
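  // getDecl() can return null (e.g. for some indirect calls), and
  // dyn_cast asserts on a null argument, so use the null-tolerant form.
  const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(call.getDecl());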
  if (!FD || FD->getKind() != Decl::Function) {
    return;
  }

  ASTContext &Ctx = C.getASTContext();
  if (!II_inc) {
    II_inc = &Ctx.Idents.get("inc");
  }
  if (!II_dec) {
    II_dec = &Ctx.Idents.get("dec");
  }

  if (FD->getIdentifier() == II_inc || FD->getIdentifier() == II_dec) {
    // We expect the argument to be a pointer.  Get the memory region
    // that the pointer points at.
    if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {
      // Increment the associated value, creating it first if needed.
      ProgramStateRef state = C.getState();
      int delta = (FD->getIdentifier() == II_inc)? +1 : -1;
      int const *curp = state->get<TrackVarMap>(region);
      int newValue = (curp? *curp : 0) + delta;
      state = state->set<TrackVarMap>(region, newValue);
      C.addTransition(state);
    }
  }
}

void TrackVarChecker::checkDeadSymbols(
  SymbolReaper &SymReaper, CheckerContext &C) const
{
  ProgramStateRef state = C.getState();
  const TrackVarMapTy &Map = state->get<TrackVarMap>();
  for (auto const &I : Map) {
    // Check for a memory region (variable) going out of scope that has
    // a non-zero delta.
    MemRegion const *region = I.first;
    int delta = I.second;
    if (SymReaper.isLiveRegion(region) || (delta==0)) {
      continue;              // Not dead, or unchanged; skip.
    }

    //llvm::errs() << region << " dead with delta " << delta << "\n";
    if (ExplodedNode *N = C.generateNonFatalErrorNode()) {
      if (!BT_modified) {
        BT_modified.reset(
          new BuiltinBug(this, "Delta not zero",
                         "Variable changed from its original value."));
      }
      C.emitReport(llvm::make_unique<BugReport>(
        *BT_modified, BT_modified->getDescription(), N));
    }
  }
}

void ento::registerTrackVarChecker(CheckerManager &mgr) {
  mgr.registerChecker<TrackVarChecker>();
}

bool ento::shouldRegisterTrackVarChecker(const LangOptions &LO) {
  return true;
}

To hook this into the rest of Clang, add entries to:

  • clang/include/clang/StaticAnalyzer/Checkers/Checkers.td and
  • clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt

Example input to test it:

// trackvar.c
// Test for TrackVarChecker.

// The behavior of these functions is hardcoded in the checker.
void inc(int *num);
void dec(int *num);

void call_inc(int var) {
  inc(&var);
} // reported

void call_inc_dec(int var) {
  inc(&var);
  dec(&var);
} // NOT reported

void if_inc(int var) {
  if (var > 2) {
    inc(&var);
  }
} // reported

void indirect_inc(int val) {
  int *p = &val;
  inc(p);
} // reported
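
The "reported" comments above can be checked against a small model of path-sensitive counting. This is a sketch in plain C++, not analyzer code: Path, netDelta, and reported are hypothetical names. Each path is the sequence of +1 (inc) and -1 (dec) calls applied to one variable, and a function is reported if any of its paths ends with a non-zero total.

```cpp
#include <cassert>
#include <vector>

// One execution path: +1 for each inc call, -1 for each dec call,
// all applied to the same variable.
using Path = std::vector<int>;

// Net change along a single path.
int netDelta(const Path &path) {
  int total = 0;
  for (int d : path)
    total += d;
  return total;
}

// A function is reported if at least one path leaves a non-zero delta.
bool reported(const std::vector<Path> &paths) {
  for (const Path &p : paths)
    if (netDelta(p) != 0)
      return true;
  return false;
}
```

In this model if_inc has two paths, one with a single inc and one with no calls, so it is reported because of the first path. indirect_inc behaves like call_inc because p points at the same variable as &val.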

Sample run:

$ gcc -E -o trackvar.i trackvar.c
$ ~/bld/llvm-project/build/bin/clang -cc1 -analyze -analyzer-checker=alpha.core.TrackVar trackvar.i
trackvar.c:10:1: warning: Variable changed from its original value
}
^
trackvar.c:21:1: warning: Variable changed from its original value
}
^
trackvar.c:26:1: warning: Variable changed from its original value
}
^
3 warnings generated.
Scott McPeak answered Oct 31 '22 13:10