Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Testing a too general program

Suppose the correct definition of a predicate would be

len([],0).
len([_|T],N)  :-  len(T,X), N is  X+1.

However, we end up with the following erroneous definition in stead.

len2([],0).
len2([_|T],N)  :-  len(T,X),  ( N  is  X+1 ; N is X + 2, N = 10000 ).

All the standard testing doesn’t reveal the mistake because it works just like len/2 except when it stumbles upon a list of length exactly 9999 elements where there are two possible answers.

as user mjano314 observes. How is it possible to detect such an error?


Note that len2/2 above uses len/2. In this manner there is precisely one single case where the definition is too general. Would len2/2 be directly recursive, we would have infinitely many cases that are too general. Obviously, in such a situation it would be easier to locate errors.

like image 262
false Avatar asked Jul 01 '21 19:07

false


People also ask

What are 4 types of testing in programming?

There are four main stages of testing that need to be completed before a program can be cleared for use: unit testing, integration testing, system testing, and acceptance testing.

What are the 3 stages of testing our programs?

Testing your program is essential to any software development. Throughout the software development life cycle, there are three levels of software testing: functional, integration and end-to-end.

What are the 3 classifications of testing?

Let us explore some of the most common testing types: Accessibility testing. Acceptance testing. Black box testing.

Who writes the unit tests in software testing?

In this process, unit tests are written first, by the software engineers (often with pair programming in the extreme programming methodology). The tests are expected to fail initially. Each failing test is followed by writing just enough code to make it pass.

What is operational acceptance testing in software testing?

Operational acceptance testing. Operational acceptance is used to conduct operational readiness (pre-release) of a product, service or system as part of a quality management system. OAT is a common type of non-functional software testing, used mainly in software development and software maintenance projects.

Are your testing and engineering processes properly integrated?

Often it is seen that testing and engineering processes are not properly integrated. This means that components or subsystems are often tested for flaws before they are mature enough to be tested on all parameters. Moreover, there may be some project specific needs that need to be looked into.

How to improve software testing performance?

Testing improvement can best be achieved by evaluating effectiveness of testing at top of every software test assignment. While this assessment is primarily performed by testers, it should involve developers, users of software, and quality assurance professionals if function exists within the IT organization.


2 Answers

If we already suspect that the predicate len2(X,Y) is not functional while we expect it to be, meaning in this case that there are no two answers with the same value for the first argument and different values for the second argument, then we can verify our suspicion by searching for such two answers with the following snippet:

len2(X,Y1), len2(X,Y2), Y1\=Y2

In this case, the program will give us an answer with Y1=9999, Y2=10000 and X a list of 9999 variables.

However, if the fault is not present or if the code of the predicate was such that the input triggering the fault was not generated in finite time (imagine that it generates all even-length lists before any odd-length list), then the code above will not finish. This means, in my opinion, that this approach is only useful as a debugging tool but not really suitable as part of some automated testing/validation of the predicate.

like image 62
jnmonette Avatar answered Sep 20 '22 09:09

jnmonette


As has been noted by @jnmonette, there is a functional dependency from the first to the second argument. With a query like

?- len2(L, N), dif(N, M), len2(L, M).
   L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J|...], N = 9999, M = 10000
;  L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J|...], N = 10000, M = 9999
;  loops.

the error in len2/2 can be detected. After all, L cannot have two different lengths. Additionally,

?- len2([_|L], N), len2(L, N).
   N = 10000, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J|...]
;  loops.

identifies the error going into the other direction. The lengths of L and [_|L] cannot be the same. This can be generalized to cover all such errors:

?- len2(L, N), phrase(([_],...), L,K), len2(K, N).
   L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J|...], N = 10000, K = [_B,_C,_D,_E,_F,_G,_H,_I,_J,_K|...]
;  loops.

So far we have used Prolog directly. We were able to state general properties by posing this queries. However, we will only find counterexamples should there be some, but otherwise we are left in the uncertain. And, we were pretty lucky that the actual definition enumerates answers in a fair manner letting each and every answer appear in finite time. In case of a more demanding definition (think of exchanging the order of clauses in len/2) simply add length(L, _) in front of all queries so far.

However, in case of a running query we inevitably ask: Should we continue to wait for an answer, or can we already abort the query? After all, for a correct implementation the query will not produce any answer and thus loop indefinitely.

There is no way (in current implementations) to delegate such a query at least into the background running there with lower priority. And therefore such queries are not used for testing at all.

On the other hand, such queries are a very powerful way to express many testable properties. For example many of the bugs in clpfd-systems of SICStus, SWI, and Scryer have been identified in this manner using condor. The cruder support however does not lead to very elegant solutions.

To start to address this problem, the following annotation may help:

:/-& len2(L, N), dif(N, M), len2(L, M).
:/-& len2(L, N), phrase(([_],...), L,K), len2(K, N).

The :/- means there is no solution - similarly to :- \+ Q_0. and the additional &, an asciified ∞, meaning Prolog's execution will be infinite. This annotation therefore leaves room to try out better strategies that disprove that there is no solution.

In GUPU this annotation is executed as a Prolog goal with a (relatively short) timeout. Also alternate strategies are tried in particular iterative deepening which in this case do time out as well. So effectively, the error remains undetected. But with more resources or a better strategy the error might be discovered.

like image 37
false Avatar answered Sep 18 '22 09:09

false