SMTChecker toward completeness

The SMTChecker current model checking engine is safe but not complete. This means that assertions and other verification targets reported as safe should be safe — unless there’s a bug in the SMTChecker or backend solver — but false positives might be spurious due abstractions.

Two of the main sources of false positives in the counterexamples given by the current engine are

  • loops are handled in a bounded way (only one iteration is considered);
  • functions are analyzed separately in their own contexts.

Thus, contracts that either rely on loops for program logic or use state variables in different functions will likely lead to false positives when analyzed by the SMTChecker.

In order to increase the SMTChecker’s proving power and reduce false positives, we recently implemented a new model checking engine based on systems of Horn clauses capable of reasoning about loops and relations between state variables. We use SPACER as the backend Horn solver, which is integrated into Z3 and also used by Seahorn. Note that the life cycle of a smart contract can also be modeled as the following loop, where each loop iteration is a transaction:

constructor(...);
while(true)
random_public_function(...);

The goal of the new engine is to automatically infer both loop and state invariants while trying to prove safety properties, removing the need for extra assumptions, as written before. More technical details can be found in an upcoming paper which will be linked here when ready.

We can immediately see the results by analyzing the same contract from the previous post without the extra assumption as require :

pragma experimental SMTChecker;
contract C {
uint sum;
uint count;
 function () external payable {
require(msg.value > 0);
// Avoid overflow.
require(sum + msg.value >= sum);
require(count + 1 > count);
sum += msg.value;
++count;
}
 function average() public returns (uint) {
require(count > 0);
uint avg = sum / count;
assert(avg > 0);
return avg;
}
}

The SMTChecker now tries to prove the assertion considering the entire contract and an unbounded number of transactions, instead of one execution of only function average . It automatically learns the contract invariant (count > 0) => ((sum / count) > 0) which holds at the beginning and end of every function and is decisive to prove the desired property. The check takes 0.16s on my laptop.

If we replace the division by uint avg = count / sum , which clearly breaks the assertion, we get the following counterexample:

(and 
(function_average_68_0 2 1 0 0)
(interface_C_69_0 2 1)
(fallback_42_0 0 0 2)
(interface_C_69_0 0 0)
)

This is an internal representation of the multi-transaction bottom-up path that leads to the assertion failure where

  • (interface_C_69 <sum> <count>)is the contract’s idle state and the following numbers are the current values of state variables sum and count, respectively.
  • (fallback_42_0 <sum> <count> <msg.value>) is a call to the fallback function.
  • (function_average_60_8 <sum> <count> <return_value> <avg>) is a call to function average.

Note that this format is internal only and transforming it into a more readable version is under development.

After contract construction both sum and count are 0. The first transaction calls the fallback function with msg.value = 2 , which leads to sum = 2 and count = 1. The second transaction calls function average where (count / sum) = 0. The check takes 0.14s.

Besides contract invariants the engine also tries to summarize while and for loops inside functions. It can prove simple properties regarding loop behavior quite quickly, but the problem can also easily become too hard. Remember, this is an undecidable problem 🙂

function f(uint x) public pure {
uint y;
require(x > 0);
while (y < x)
++y;
assert(y == x);
}

The assertion above is proven in 0.05s.

function f() public pure {
uint x = 10;
uint y;
while (y < x)
{
++y;
x = 0;
while (x < 10)
++x;
assert(x == 10);
}
assert(y == x);
}

The assertions above involve a more complex structure with nested loops and are proven in 0.375s.

uint[] a;
function f(uint n) public {
uint i;
while (i < n)
{
a[i] = i;
++i;
}
require(n > 1);
assert(a[n-1] > a[n-2]);
}

Even though the code above also uses an array the assertion concerns only two elements of the array and the check takes just 0.16s.

function max(uint n, uint[] memory _a) public pure returns (uint) {
require(n > 0);
uint i;
uint m;
while (i < n) {
if (_a[i] > m)
m = _a[i];
++i;
}
i = 0;
while (i < n) {
assert(m >= _a[i]);
++i;
}
return m;
}

The function above computes and returns the max value of an array. The length of the array is given as a parameter since .length is not supported yet. This check is a lot more complex, since it checks that the computed max value is indeed greater or equal all the elements of an unbounded array.
EDIT: When writing this post, this sequence of assertions could not be proven after an 1 hour timeout. After tweaking some quantifier solver parameters the check is successful after 0.13s!
If we change the assertion to assert(m > _a[i]) the given counterexample is the array [0, 0].

Another example more similar to a state machine is the following Crowdsale skeleton:

pragma experimental SMTChecker;
contract SimpleCrowdsale {
enum State { INIT, STARTED, FINISHED }
State state = State.INIT;
 uint weiRaised;
uint cap;
 function setCap(uint _cap) public {
require(state == State.INIT);
require(_cap > 0);
cap = _cap;
state = State.STARTED;
}
 function () external payable {
require(state == State.STARTED);
require(msg.value > 0);
uint newWeiRaised = weiRaised + msg.value;
// Avoid overflow.
require(newWeiRaised > weiRaised);
// Would need to return the difference to the sender or revert.
if (newWeiRaised > cap)
newWeiRaised = cap;
weiRaised = newWeiRaised;
}
 function finalize() public {
require(state == State.STARTED);
assert(weiRaised <= cap);
state = State.FINISHED;
}
}

Function setCap can be called only once, when state is INIT. The fallback function accepts multiple deposits (without issuing tokens in this mock) while state is STARTED . The Crowdsale may be finalized in function finalize , where it makes sure that the cap was not breached.
The analysis automatically learns the invariant weiRaised <= cap which proves the assertion. The check takes 0.09s.

If we change the correct assertion to assert(weiRaised < cap); we get the following 3-transactions counterexample:

(and
(function_finalize_107_0 1 1 1)
(interface_SimpleCrowdsale_108_0 1 1 1)
(fallback_85_0 1 0 1 0)
(interface_SimpleCrowdsale_108_0 1 0 1)
(function_setCap_42_0 0 0 0 1)
(interface_SimpleCrowdsale_108_0 0 0 0)
)

This bottom-up internal representation follows the same format as the one presented in the first example. All state variables start as 0. The first tx calls setCap(1) which leads to state = State.START and cap = 1. The second tx calls the fallback function with msg.value = 1 which modifies weiRaised to 1. Finally, the third tx calls finalize and the assertion fails. The check takes 0.08s.

There are still features missing for the engine to be considered ready, and we are constantly increasing the supported subset of Solidity, but these preliminary experiments show that the new engine might be able to automatically prove complex properties that involve multi-tx behavior reasonably quickly. For the next set of experiments I hope to verify some real world contracts that were already verified by other tools.

The code is still under development and the new SMTChecker engine should be released as part of one of the next 2 Solidity releases.

—Source link—

What do you think?

10 lessons from a serial entrepreneur – Justin Kan, Atrium, YC, and Twitch

The Coinbase Mission, Vision