Doing so leads to thinking about tests not in terms of individual examples, but general properties, and from there to an approach called property-based testing or QuickCheck. QuickCheck generates tests from properties. It has its origins in functional programming and has been a crucial tool in eliminating bugs from many complex software projects. But QuickCheck does more than help find bugs: It encourages thinking about our API in terms of properties, and this style of thinking often leads to interesting domain insights and architectural improvements.
APIs and Specifications
Consider the following API for a simple “smart” shopping cart, written in F#:
    type Interface =
        abstract member Add: int -> Item -> unit
        abstract member Total: unit -> Price
The first method, Add, takes an integer representing a count, and an item to be added to the cart, and returns nothing. (unit is used in F# similarly to void in other languages.) The second method returns a total price for all items in the cart. (That’s why it’s “smart” – it knows the prices of the items in it.)
We add a convenience module with functions for calling the methods:
    module Interface =
        let add (cart: Interface) (count: int) (item: Item) = cart.Add count item
        let total (cart: Interface): Price = cart.Total ()
We also add a factory interface, creating a shopping-cart implementation from a catalog assigning a price to each item:
    type Catalog = Map<Item, Price>

    type Factory =
        abstract member make: Catalog -> Interface
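To make these types concrete, here is a minimal sketch of one possible in-memory implementation. The definitions of Item and Price, the InMemoryCart class, and the inMemoryFactory value are assumptions made for illustration; the article does not show them, and a real implementation might look quite different.

```fsharp
// Assumed definitions: the article does not show Item or Price.
type Item = Item of string
type Price = Price of int

type Interface =
    abstract member Add: int -> Item -> unit
    abstract member Total: unit -> Price

type Catalog = Map<Item, Price>

type Factory =
    abstract member make: Catalog -> Interface

// Hypothetical in-memory cart: keeps a count per item and prices it on demand.
// Map.find throws if an item is not in the catalog; this sketch does not handle that.
type InMemoryCart(catalog: Catalog) =
    let mutable counts: Map<Item, int> = Map.empty
    interface Interface with
        member this.Add count item =
            let current = defaultArg (Map.tryFind item counts) 0
            counts <- Map.add item (current + count) counts
        member this.Total () =
            counts
            |> Map.toSeq
            |> Seq.sumBy (fun (item, count) ->
                let (Price p) = Map.find item catalog
                p * count)
            |> Price

let inMemoryFactory =
    { new Factory with
        member this.make catalog = InMemoryCart(catalog) :> Interface }

// Usage: two Milk at 100 plus one Egg at 20.
let cart = inMemoryFactory.make (Map.ofList [Item "Milk", Price 100; Item "Egg", Price 20])
cart.Add 2 (Item "Milk")
cart.Add 1 (Item "Egg")
printfn "%A" (cart.Total ())   // prints: Price 220
```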
Now, the types give us some information about what kinds of values are input and output to the various operations, but not what those values actually are. How could we obtain that information? We could look at the implementation, but that might be messy and complex – databases and microservices might be involved. We could look at unit tests, which common development methodologies produce as a matter of course. These might look as follows, assuming there is a factory for shopping carts available as factory:
    let test =
        let cart1 = factory.make Map.empty
        assertEquals (Interface.total cart1) (Price 0)
        let cart2 = factory.make (Map.ofList [Item "Milk", Price 100; Item "Egg", Price 20])
        assertEquals (Interface.total cart2) (Price 0)

(We assume a function assertEquals from some test framework, and that items and prices are constructed with the Item and Price constructors.)
These are trivial tests, but they do make a point: The first one says that the total of a freshly created shopping cart is 0 – provided its catalog is empty. The second one says the same thing for a catalog containing only milk and eggs. The TDD community calls this an “executable specification” (or part of one), but there is still a lot missing. Looking at it, you might say “The total is 0 for all freshly created shopping carts, for all possible catalogs.” This is in fact a property of the shopping-cart factory, and, even though it may seem trivial, worth stating explicitly. This could look as follows:

    let total0Correct =
        Prop.forAll Arb.catalog (fun catalog ->
            let cart = factory.make catalog
            Interface.total cart .=. Price 0)
The Prop and Arb modules come from the FsCheck package. Prop.forAll returns a property (of type Property) that should hold for all values from a certain set. Arb.catalog means “arbitrary catalog”, and Prop.forAll accepts a function for naming that catalog. The body of that function is a boolean expression that first creates a cart associated with the catalog, and then checks whether the total reported by the cart is indeed 0. As opposed to the two “example unit tests” above, this states that the total is 0 for all catalogs.
FsCheck provides a function called quick that checks whether the property holds. In F# Interactive, it prints output as follows:

    quick total0Correct;;
    Ok, passed 100 tests.
FsCheck has just generated 100 tests from the description of the property. We can instrument total0Correct to print out the catalogs of those tests:
    map [(Item "HF", Price 1); (Item "dK", Price 2)]
    map [(Item "Bx", Price 1); (Item "QaY", Price 1); (Item "ca", Price 3)]
    map [(Item "Qhwu", Price 2); (Item "t", Price 3)]
    map [(Item "HS", Price 1); (Item "h", Price 3); (Item "uj", Price 1)]
    map [(Item "IXVgqo", Price 1); (Item "Rgr", Price 5); (Item "Zgnh", Price 3)]
    map [(Item "ZXhwG", Price 4); (Item "nwFUSd", Price 2)]
…
This means that FsCheck can generate many tests from a declarative description of a property. FsCheck’s underlying idea comes from one of the most influential works in functional programming, the QuickCheck paper by Koen Claessen and John Hughes. The QuickCheck technique is so pervasively useful that QuickCheck implementations exist for most programming languages, even non-functional ones.
The total0Correct property shows two advantages over example-based unit tests like the two above: It separates the general property from the specific examples, thus being a more informative part of an executable specification. Moreover, generating many test cases from a single property is much more likely to find bugs in the code.
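The idea of generating many test cases from one property can be sketched without any library. The following is a deliberately naive illustration in plain F#, not how FsCheck is implemented: it has no shrinking and only a crude generator. The names randomCatalog, checkForAll, and freshCartTotal are hypothetical, as are the Item and Price definitions.

```fsharp
// Hypothetical stand-ins for the article's types.
type Item = Item of string
type Price = Price of int
type Catalog = Map<Item, Price>

// Build a random catalog with up to 3 items, short lowercase names, and prices 1..5.
let randomCatalog (rng: System.Random) : Catalog =
    let randomName () =
        let length = rng.Next(1, 5)
        System.String(Array.init length (fun _ -> char (rng.Next(int 'a', int 'z' + 1))))
    List.init (rng.Next(0, 4)) (fun _ -> (Item (randomName ()), Price (rng.Next(1, 6))))
    |> Map.ofList

// Check a predicate against `count` generated catalogs.
// Returns the first counterexample, or None if every generated catalog passes.
let checkForAll (count: int) (seed: int) (property: Catalog -> bool) : Catalog option =
    let rng = System.Random(seed)
    Seq.init count (fun _ -> randomCatalog rng)
    |> Seq.tryFind (fun catalog -> not (property catalog))

// Hypothetical system under test: the total of a cart nothing was added to.
let freshCartTotal (_: Catalog) : Price = Price 0

let result = checkForAll 100 42 (fun catalog -> freshCartTotal catalog = Price 0)
printfn "%A" result   // None: no counterexample among the 100 generated catalogs
```

The property is a plain function from a generated input to a verdict, and the test cases themselves are never written down by hand.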
Of course, total0Correct is quite simplistic. Listing 1 shows a more involved property.

Listing 1

    let totalCorrect =
        Prop.forAll Arb.catalog (fun catalog ->
            let items = List.map fst (Map.toList catalog)
            Prop.forAll (Arb.list (Arb.pickOneOf items)) (fun items ->
                let cart = factory.make catalog
                List.iter (Interface.add cart 1) items
                let prices = List.map (Catalog.itemPrice catalog) items
                let total = List.fold Price.add (Price 0) prices
                Interface.total cart .=. total))

Again, this is a property that holds for all catalogs. For each catalog, the property extracts the available items from the catalog as items. It then goes on to say that the property must also hold for all Arb.list (Arb.pickOneOf items). Translated to English, this means “an arbitrary list of elements, each of which is picked from items” – or “an arbitrary list of items from the catalog”. The property then proceeds to construct a cart as before, and uses List.iter to add one of each item in the list to the cart. The next line looks up the price for each item with the provided Catalog.itemPrice function, calling the resulting list prices. The one after that uses Price.add to add up all those prices, yielding total, and the last line compares that total to the output of Interface.total.
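The expected-total side of this property can be worked through by hand for one concrete input. The sketch below assumes plausible definitions for Price.add and Catalog.itemPrice, which the article uses but does not show. The Item and Price types are likewise assumptions.

```fsharp
// Assumed definitions, not shown in the article.
type Item = Item of string
type Price = Price of int
type Catalog = Map<Item, Price>

module Price =
    // Add two prices by adding their underlying amounts.
    let add (Price a) (Price b) = Price (a + b)

module Catalog =
    // Look up an item's price; throws if the item is not in the catalog.
    let itemPrice (catalog: Catalog) (item: Item) : Price = Map.find item catalog

let catalog: Catalog = Map.ofList [Item "Milk", Price 100; Item "Egg", Price 20]
let items = [Item "Milk"; Item "Egg"; Item "Milk"]

// Same two steps as in Listing 1: look up each price, then fold them together.
let prices = List.map (Catalog.itemPrice catalog) items   // [Price 100; Price 20; Price 100]
let expectedTotal = List.fold Price.add (Price 0) prices  // Price 220
printfn "%A" expectedTotal
```

This calculation serves as a simple model of the cart: the property passes only if the real implementation agrees with it for every generated catalog and item list.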
Finding Bugs
We can use the totalCorrect property to test a faulty implementation of the shopping-cart interface, producing this output from FsCheck (Listing 2).
Listing 2

    quick totalCorrect;;
    Falsifiable, after 2 tests (3 shrinks) (StdGen (957853183,296699855)):
    Label of failing property: Price 1 = Price 2
    Original:
    map [(Item "t", Price 4)]
    [Item "t"; Item "t"]
    Shrunk:
    map [(Item "", Price 1)]
    [Item ""; Item ""]
This says that FsCheck found a counterexample that proves that the property does not hold. The counterexample consists of a catalog from the first Prop.forAll and a list of items from the second Prop.forAll – map [(Item "t", Price 4)] and [Item "t"; Item "t"], respectively.
Thus, if the catalog contains just a single item called t priced at 4, adding that one t twice in a row results in a test failure.
But FsCheck does more. It shrinks the counterexample it found to create a simpler one, and comes up with one that has a simpler name (the empty string instead of t) and a simpler price (1 instead of 4). As this is the simplest counterexample FsCheck could find, it means that just purchasing “one of the empty string” does not trigger the bug. This, together with the label Price 1 = Price 2, gives a clue as to the cause of the problem. In fact, when Interface.add is called twice for the same item, it forgets about the first item.
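One plausible way such a bug could arise is sketched below. The article does not show the faulty implementation, so this is only a guess at its shape. The functions addFaulty, addCorrect, and total are hypothetical and operate on a plain map of counts rather than on the cart interface.

```fsharp
type Item = Item of string
type Price = Price of int

// Hypothetical faulty update: Map.add replaces an existing entry,
// so a second add for the same item discards the earlier count.
let addFaulty (count: int) (item: Item) (counts: Map<Item, int>) : Map<Item, int> =
    Map.add item count counts

// Corrected update: accumulate onto whatever count is already stored.
let addCorrect (count: int) (item: Item) (counts: Map<Item, int>) : Map<Item, int> =
    let current = defaultArg (Map.tryFind item counts) 0
    Map.add item (current + count) counts

// Price all counted items against the catalog.
let total (catalog: Map<Item, Price>) (counts: Map<Item, int>) : Price =
    counts
    |> Map.toSeq
    |> Seq.sumBy (fun (item, count) ->
        let (Price p) = Map.find item catalog
        p * count)
    |> Price

// The shrunk counterexample from Listing 2: one item priced at 1, added twice.
let catalog = Map.ofList [Item "", Price 1]
let faulty = Map.empty |> addFaulty 1 (Item "") |> addFaulty 1 (Item "")
let correct = Map.empty |> addCorrect 1 (Item "") |> addCorrect 1 (Item "")

printfn "%A vs %A" (total catalog faulty) (total catalog correct)
// prints: Price 1 vs Price 2
```

Under this assumption, the failing label Price 1 = Price 2 is the faulty total on the left and the expected total on the right.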
Generating Test Data and Informing the Business
Consider extending the shopping cart with the idea of a discount. Imagine the business requirement is that a discount can be described as “buy X, only pay for Y”, with Y less than X. Such a discount could be described by the following record type:
type Discount = { receive: int; payFor: int }
The available discounts could be described by a map associating an item with a discount:
type Discounts = Map<Item, Discount>
To take discounts into account, we add a method to the Factory interface:
    type Factory =
        abstract member make: Catalog -> Interface
        abstract member make: Catalog * Discounts -> Interface
Now, imagine the business also requires that adding an item to a shopping cart should never decrease the price. We can codify this requirement as a property as follows (Listing 3).
Listing 3

    let addingItemIncreasesTotal =
        Prop.forAll Arb.catalog (fun catalog ->
            Prop.forAll (Arb.discounts catalog) (fun discounts ->
                Prop.forAll (Arb.countAndItems catalog) (fun countAndItems ->
                    let cart = factory.make(catalog, discounts)
                    List.iter (fun (count, item) -> Interface.add cart count item) countAndItems
                    let itemPrice = Seq.head (Map.toSeq catalog)
                    let totalBefore = Interface.total cart
                    Interface.add cart 1 (fst itemPrice)
                    let totalAfter = Interface.total cart
                    totalAfter .>=. totalBefore)))
There are now three nested Prop.forAll in the property: It must hold for all catalogs, but also for all discounts that are applicable for the catalog, and also any combination of counts and items already in the shopping cart before that one additional item is purchased. The subsequent lines prepare the shopping cart, calculate the total, add one item (for simplicity, it just uses the first item from the catalog), and calculate the total again, comparing them in the final line.
Now, this property uses the custom function Arb.discounts to generate discounts for a catalog. It is defined as follows:
    let discounts catalog =
        let items = Seq.map fst (Map.toSeq catalog)
        Arb.map (Arb.pickOneOf items) discount

This generates a map with the keys picked from the items in the catalog, and the values generated as an arbitrary discount. Here is the definition of discount:

    let discount =
        let arbCount = Arb.choose 1 20
        let convertTo (receive, payFor) = { receive = receive; payFor = payFor }
        let convertFrom discount = (discount.receive, discount.payFor)
        let valid (receive, payFor) = payFor < receive
        Arb.convert convertTo convertFrom (Arb.filter valid (Arb.pair arbCount arbCount))

The key part here is the Arb.filter valid (Arb.pair arbCount arbCount) expression at the end: It says that a discount is constructed from a pair of arbitrary counts, as long as that pair is valid. An arbitrary count arbCount is defined in the first line as an integer between 1 and 20, and the valid function ensures the first business requirement – that customers always pay for fewer items than they receive. The convertTo and convertFrom functions merely convert between these pairs and the Discount type.
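Filtering a generator amounts to drawing candidates and discarding the invalid ones. The following plain-F# sketch illustrates that idea with System.Random. It is not FsCheck's implementation, and the function randomDiscount is a hypothetical name introduced here.

```fsharp
type Discount = { receive: int; payFor: int }

// Same validity rule as in the article: pay for fewer items than received.
let valid (receive, payFor) = payFor < receive

// Draw pairs of counts in 1..20 and retry until one passes `valid`.
// (The upper bound of Random.Next is exclusive, hence 21.)
let rec randomDiscount (rng: System.Random) : Discount =
    let pair = (rng.Next(1, 21), rng.Next(1, 21))
    if valid pair then { receive = fst pair; payFor = snd pair }
    else randomDiscount rng

let rng = System.Random(1)
let samples = List.init 5 (fun _ -> randomDiscount rng)
samples |> List.iter (printfn "%A")
```

Every sample satisfies payFor < receive by construction, so properties built on top of it never waste a test case on a discount the business would reject.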
Checking this property against a correct shopping-cart implementation shows a problem (Listing 4).
Listing 4

    quick (addingItemIncreasesTotal factory8);;
    Falsifiable, after 1 test (4 shrinks) (StdGen (1040086835,296699864)):
    Label of failing property: Price 1 >= Price 3
    Original:
    map [(Item "uI", Price 2)]
    map [(Item "uI", { receive = 4 payFor = 2 })]
    [(1, Item "uI"); (2, Item "uI")]
    Shrunk:
    map [(Item "", Price 1)]
    map [(Item "", { receive = 4 payFor = 1 })]
    [(1, Item ""); (2, Item "")]
The cause is apparent from the shrunk counterexample. It comes with a catalog with a single item, which is discounted as “buy 4, only pay for 1”. If three items are already in the shopping cart (first, one was purchased, then two more), buying one more gets the cart over the threshold of 4 and decreases the price to 1.
The insight from this is that, in “buy X, only pay for Y”, really Y must be X-1 and cannot be less, which can be codified and communicated back to the business.
This example again may be a bit simplistic, but demonstrates the value of property-based testing for exploring the requirements and resulting specification, and nailing down seemingly unlikely edge cases.
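The arithmetic behind the counterexample, and behind the X-1 insight, can be checked directly. The sketch below assumes a pricing rule that the article does not spell out: every complete group of receive units is charged as payFor units, and leftover units cost full price. The functions discountedTotal and neverDecreases are hypothetical.

```fsharp
type Discount = { receive: int; payFor: int }

// Assumed pricing rule (not spelled out in the article): every complete group of
// `receive` units is charged as `payFor` units; leftover units cost full price.
let discountedTotal (unitPrice: int) (discount: Discount) (count: int) : int =
    let groups = count / discount.receive
    let leftover = count % discount.receive
    unitPrice * (groups * discount.payFor + leftover)

// True if the total never drops when one more unit is added, for counts 0..maxCount.
let neverDecreases (discount: Discount) (maxCount: int) : bool =
    [0 .. maxCount - 1]
    |> List.forall (fun n ->
        discountedTotal 1 discount (n + 1) >= discountedTotal 1 discount n)

let buy4Pay1 = { receive = 4; payFor = 1 }
let buy4Pay3 = { receive = 4; payFor = 3 }

// Shrunk counterexample: three units cost 3, the fourth drops the total to 1.
printfn "%d -> %d" (discountedTotal 1 buy4Pay1 3) (discountedTotal 1 buy4Pay1 4)
// prints: 3 -> 1

// With payFor = receive - 1, completing a group leaves the total unchanged.
printfn "%b %b" (neverDecreases buy4Pay1 20) (neverDecreases buy4Pay3 20)
// prints: false true
```

Under this rule, completing a group replaces receive - 1 full-price units with payFor units, so the total stays non-decreasing only when payFor is at least receive - 1.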
Conclusion
Properties and property-based testing are powerful additions to the software developer’s toolbox. The most immediate benefit is increased test coverage, and thus fewer bugs. A more important consequence lies in the properties themselves: Expressing and thinking about properties encourages thinking about the specification of the system under development. This leads to a better understanding of the business requirements, and frequently to simpler and more elegant designs.