Is there a way to specify the domain of a variable when defining it using an array?

My program reads the constraints from an smt2 file, and all the variables are defined as arrays. For example:

(declare-fun x () (Array (_ BitVec 32) (_ BitVec 8) ) )
(declare-fun y () (Array (_ BitVec 32) (_ BitVec 8) ) )
(assert (bvslt  (concat  (select  x (_ bv3 32) ) (concat  (select  x (_ bv2 32) ) (concat  (select  x (_ bv1 32) ) (select  x (_ bv0 32) ) ) ) ) (concat  (select  y (_ bv3 32) ) (concat  (select  y (_ bv2 32) ) (concat  (select  y (_ bv1 32) ) (select  y (_ bv0 32) ) ) ) ) ) )
(check-sat)
(exit)

Some other constraints are omitted. Sometimes the solver gives a value of x as:

(store (store (store ((as const (Array (_ BitVec 32) (_ BitVec 8))) #xfe)
                     #x00000002
                     #x00)
              #x00000001
              #xff)
       #x00000003
       #x80)

According to the definition, each element of the array is an 8-bit value, so the value should be 0x8000fffe. This value is beyond the upper bound of a signed 32-bit int in C++; when I convert it back to int, it becomes negative. So I guess Z3 treats all variables defined by an array as unsigned int. For example, if the constraint is x > y, the solver may give x = 0x8000fffe and y = 0x00000001. These values satisfy the constraint under an unsigned comparison, but under a signed comparison x is negative and y is positive, so the result is wrong. Is there a way to tell the solver that the numbers are signed when defining them as an array?

Added 22:26:43 09/14/2019: I have two smt2 files. The first is:

(set-logic QF_AUFBV )
(declare-fun x () (Array (_ BitVec 32) (_ BitVec 8) ) )
(declare-fun y () (Array (_ BitVec 32) (_ BitVec 8) ) )
(assert (bvslt  (concat  (select  x (_ bv3 32) ) (concat  (select  x (_ bv2 32) ) (concat  (select  x (_ bv1 32) ) (select  x (_ bv0 32) ) ) ) ) (concat  (select  y (_ bv3 32) ) (concat  (select  y (_ bv2 32) ) (concat  (select  y (_ bv1 32) ) (select  y (_ bv0 32) ) ) ) ) ) )
(check-sat)
(exit)

The constraint is simply x < y. The other one is:

(set-logic QF_AUFBV )
(declare-fun x () (Array (_ BitVec 32) (_ BitVec 8) ) )
(declare-fun y () (Array (_ BitVec 32) (_ BitVec 8) ) )
(assert (let ( (?B1 (concat  (select  y (_ bv3 32) ) (concat  (select  y (_ bv2 32) ) (concat  (select  y (_ bv1 32) ) (select  y (_ bv0 32) ) ) ) ) ) (?B2 (concat  (select  x (_ bv3 32) ) (concat  (select  x (_ bv2 32) ) (concat  (select  x (_ bv1 32) ) (select  x (_ bv0 32) ) ) ) ) ) ) (let ( (?B3 (bvsub  ?B1 ?B2 ) ) ) (and  (and  (and  (and  (and  (=  false (=  (_ bv0 32) ?B2 ) ) (=  false (=  (_ bv0 32) ?B1 ) ) ) (=  false (bvslt  ?B1 ?B2 ) ) ) (=  false (=  (_ bv0 32) ?B3 ) ) ) (=  false (bvslt  ?B3 ?B2 ) ) ) (=  (_ bv0 32) (bvsub  ?B3 ?B2 ) ) ) ) ) )
(check-sat)
(exit)

which is equivalent to:

 [(! (0 == x)), 
   (! (0 == y)), 
   (! ( y < x)),
   (! (0 ==( y - x))), 
   (! (( y - x) < x)), 
   (0 ==(( y - x) - x)) ]

These smt2 files are generated by Klee. The solver gives:

x = (store (store (store ((as const (Array (_ BitVec 32) (_ BitVec 8))) #xfe)
                     #x00000002
                     #x00)
              #x00000001
              #xff)
       #x00000003
       #x80)
y = before minimize: (store (store (store ((as const (Array (_ BitVec 32) (_ BitVec 8))) #xfc)
                     #x00000002
                     #x01)
              #x00000001
              #xff)
       #x00000003
       #x00)

so x = 0x8000fffe and y = 0x0001fffc. Converted to decimal, x = 2147549182 and y = 131068. So y - x - x is -4294967296, not 0. The solver thinks the constraint is satisfied because 4294967296 is

1 00000000 00000000 00000000 00000000

in binary, where the leading "1" is the 33rd bit and gets dropped. So -4294967296 is treated as 0 in a 32-bit word. This is the reason I asked this question. x and y should be signed integers, so 0x8000fffe is -0x0000fffe, aka -65534, and y is 131068, so y - x - x is clearly not 0. In terms of integers, the values don't satisfy the constraints. The expression y - x - x seems to be computed under unsigned rules.

z3
smt
asked on Stack Overflow Sep 14, 2019 by tjfy1992 • edited Aug 6, 2020 by HoldOffHunger

1 Answer

Bit-vectors have no signs

There's no notion of signed or unsigned bit-vector in SMTLib. A bit-vector is simply a sequence of bits, without any attached semantics as to how to treat it as a number.

It is the operations, however, that distinguish signedness. This is why you have both bvslt and bvult, for signed and unsigned less-than comparison respectively. You might want to read the logic description here: http://smtlib.cs.uiowa.edu/theories-FixedSizeBitVectors.shtml
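To make the distinction concrete, here is a minimal plain-Python sketch of what the two comparisons mean on raw 32-bit patterns. The helper names (`to_signed32`, `bvult`, `bvslt`) are illustrative only; they borrow the SMT-LIB operator names but are not part of any solver API, and the values are the ones from the question's model.

```python
MASK32 = 0xFFFFFFFF  # all 32 bits set


def to_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def bvult(a: int, b: int) -> bool:
    """Unsigned less-than on 32-bit patterns (models SMT-LIB bvult)."""
    return (a & MASK32) < (b & MASK32)


def bvslt(a: int, b: int) -> bool:
    """Signed less-than on 32-bit patterns (models SMT-LIB bvslt)."""
    return to_signed32(a) < to_signed32(b)


x = 0x8000FFFE  # bit pattern the solver reported for x
y = 0x0001FFFC  # bit pattern the solver reported for y

print(bvult(x, y))  # False: as unsigned words, x is the larger one
print(bvslt(x, y))  # True: as signed words, x is negative
```

The same two bit patterns give opposite answers depending on which comparison is applied, which is why the choice lives in the operator rather than in the declaration.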

Long story short, all the solver is telling you is that the result contains these bits; whether you interpret them as an unsigned word or as a signed 2's complement number is totally up to you. Note that this perfectly matches how machine arithmetic is done in hardware, where you simply have registers that contain bit-sequences. It's the instructions that interpret the values according to whatever convention they choose.
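As a rough illustration, the sketch below rebuilds the 32-bit words from the array models in the question and reads them both ways, then redoes the subtraction with 32-bit wrap-around. It is plain Python, not solver code; `word_from_model`, `as_signed32` and `bvsub` are hypothetical helper names, and the byte order follows the `concat (select x 3) ... (select x 0)` pattern used in the smt2 files.

```python
def word_from_model(default: int, stores: dict) -> int:
    """Rebuild concat(a[3], a[2], a[1], a[0]) from a constant array plus stores."""
    word = 0
    for index in (3, 2, 1, 0):            # most significant byte first
        byte = stores.get(index, default)  # unset indices take the default value
        word = (word << 8) | (byte & 0xFF)
    return word


def as_signed32(bits: int) -> int:
    """Two's-complement reading of a 32-bit pattern."""
    return bits - (1 << 32) if bits & 0x80000000 else bits


def bvsub(a: int, b: int) -> int:
    """32-bit subtraction with wrap-around (models SMT-LIB bvsub)."""
    return (a - b) & 0xFFFFFFFF


x = word_from_model(0xFE, {2: 0x00, 1: 0xFF, 3: 0x80})
y = word_from_model(0xFC, {2: 0x01, 1: 0xFF, 3: 0x00})

print(hex(x), x, as_signed32(x))  # 0x8000fffe 2147549182 -2147418114
print(hex(y), y, as_signed32(y))  # 0x1fffc 131068 131068

# The constraint 0 == ((y - x) - x) holds once every step wraps at 32 bits.
print(bvsub(bvsub(y, x), x))      # 0
```

Note that the wrapped result is 0 regardless of whether the inputs are read as signed or unsigned, since bvsub produces the same bits in both readings.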

I hope that's clear; feel free to ask about a specific case; posting full programs is always helpful as well, so long as they abstract away from details and describe what you're trying to do.

Also see this earlier question that goes into a bit more detail: How to model signed integer with BitVector?

Avoiding overflow/underflow

You can ask z3 to avoid overflow/underflow during bit-vector arithmetic. However, this requires adding extra assertions for each operation you want to perform, so it can get rather messy. (Also, it looks like you're using Klee; I'm not sure whether Klee lets you do this in the first place.) The technique is explained in detail in this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/z3prefix.pdf

In particular, you want to read through Section 5.1, where the authors describe how to "annotate" each arithmetic operation and explicitly assert that it does not overflow. For instance, to make sure an unsigned addition doesn't overflow, you first zero-extend your bit-vectors from 32 bits to 33 bits, do the addition, and check whether the 33rd bit of the result is 1. To avoid overflow, you simply write an assertion saying that bit cannot be 1. Here's an example:

; Two 32-bit variables
(declare-fun x () (_ BitVec 32))
(declare-fun y () (_ BitVec 32))

; Zero-Extend them to 33-bits
(define-fun x33 () (_ BitVec 33) (concat #b0 x))
(define-fun y33 () (_ BitVec 33) (concat #b0 y))

; Add them
(define-fun extendedAdd () (_ BitVec 33) (bvadd x33 y33))

; Get the carry-out bit (bit 32 of the 33-bit sum)
(define-fun carryBit () (_ BitVec 1) ((_ extract 32 32) extendedAdd))

; Assert that the (unsigned) addition won't overflow:
(assert (= carryBit #b0))

; Regular addition result:
(define-fun addResult () (_ BitVec 32) ((_ extract 31 0) extendedAdd))

; Now you can use addResult as the result of x+y; and you'll
; be assured that this addition will never overflow

(check-sat)
(get-model)
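The same idea can be sketched in plain Python, which may help when checking a model outside the solver. This is only an illustration under stated assumptions: `uadd_overflows`, `checked_uadd` and `sadd_status` are hypothetical names, the unsigned check mirrors the 33-bit zero-extension trick above, and the signed variant simply uses Python's unbounded integers for a range check rather than bit extraction.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def uadd_overflows(x: int, y: int) -> bool:
    """True if the unsigned 32-bit sum x + y needs a 33rd bit."""
    extended = (x & MASK) + (y & MASK)      # plays the role of the 33-bit sum
    return ((extended >> WIDTH) & 1) == 1   # bit 32 is the carry-out


def checked_uadd(x: int, y: int) -> int:
    """Unsigned 32-bit addition that refuses to wrap around."""
    if uadd_overflows(x, y):
        raise OverflowError("unsigned 32-bit addition overflowed")
    return (x + y) & MASK


def to_signed(bits: int) -> int:
    """Two's-complement reading of a 32-bit pattern."""
    bits &= MASK
    return bits - (1 << WIDTH) if bits >> (WIDTH - 1) else bits


def sadd_status(x: int, y: int) -> str:
    """Classify a signed 32-bit addition as 'ok', 'overflow' or 'underflow'."""
    total = to_signed(x) + to_signed(y)     # exact, since Python ints are unbounded
    if total > (1 << (WIDTH - 1)) - 1:
        return "overflow"
    if total < -(1 << (WIDTH - 1)):
        return "underflow"
    return "ok"


print(uadd_overflows(0xFFFFFFFF, 1))        # True
print(checked_uadd(1, 2))                   # 3
print(sadd_status(0x7FFFFFFF, 1))           # overflow
print(sadd_status(0x80000000, 0xFFFFFFFF))  # underflow
```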

You'd also have to check for underflow at each operation, which adds further complexity.

As you can see, this can get very hairy, and the rules for multiplication are actually quite tricky. To simplify this, z3 provides built-in predicates for multiplication overflow-checking:

  • bvsmul_noovfl: True only if signed multiplication doesn't overflow
  • bvsmul_noudfl: True only if signed multiplication doesn't underflow
  • bvumul_noovfl: True only if unsigned multiplication doesn't overflow

There is no predicate for checking whether an unsigned multiplication can underflow, because that cannot happen. But the point remains: you have to annotate each operation and explicitly assert the relevant conditions. This is best done by a higher-level API during code generation, and some z3 bindings do support such operations. (For instance, see http://hackage.haskell.org/package/sbv-8.4/docs/Data-SBV-Tools-Overflow.html for how the Haskell layer on top of SMT solvers handles this.) If you do this at scale, you probably want to build some mechanism that generates these assertions for you automatically, as doing it manually would be extremely error-prone.
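For intuition about what the three multiplication predicates state, here is a plain-Python model of their meaning on 32-bit operands. It reuses the predicate names from the list above purely for readability; these functions are an assumption-laden sketch of the described semantics, not z3's implementation and not calls into any z3 binding.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1
SMAX = (1 << (WIDTH - 1)) - 1   # largest signed 32-bit value
SMIN = -(1 << (WIDTH - 1))      # smallest signed 32-bit value


def to_signed(bits: int) -> int:
    """Two's-complement reading of a 32-bit pattern."""
    bits &= MASK
    return bits - (1 << WIDTH) if bits >> (WIDTH - 1) else bits


def bvumul_noovfl(a: int, b: int) -> bool:
    """True only if the unsigned product fits in 32 bits."""
    return (a & MASK) * (b & MASK) <= MASK


def bvsmul_noovfl(a: int, b: int) -> bool:
    """True only if the signed product does not exceed the signed maximum."""
    return to_signed(a) * to_signed(b) <= SMAX


def bvsmul_noudfl(a: int, b: int) -> bool:
    """True only if the signed product does not fall below the signed minimum."""
    return to_signed(a) * to_signed(b) >= SMIN


print(bvumul_noovfl(0x10000, 0x10000))     # False: 2**32 needs 33 bits
print(bvsmul_noovfl(0x40000000, 2))        # False: 2**31 exceeds the signed maximum
print(bvsmul_noudfl(0x80000000, 2))        # False: -2**32 is below the signed minimum
print(bvsmul_noudfl(0xFFFFFFFF, 2))        # True: -1 * 2 = -2 is in range
```

A generator that emits one such side condition per arithmetic operation is the kind of mechanism the paragraph above has in mind.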

Or you can switch to the Int type, which never overflows! But then, of course, you're no longer modeling an actual running program but reasoning about mathematical integer values, which might be acceptable depending on your problem domain.

answered on Stack Overflow Sep 14, 2019 by alias • edited Sep 15, 2019 by alias

User contributions licensed under CC BY-SA 3.0