Algorithms and correctness

From Algowiki
Jump to navigation Jump to search

Algorithmic problem

An algorithmic problem is described by:

  1. a set of feasible inputs;
  2. for each input a set of outputs (a.k.a. feasible solutions or solutions, for short);
  3. optionally, an objective function, which assigns a real number to each feasible solution (the quality of the solution). If an objective function is specified, it is also specified whether the objective function is to be minimized or to be maximized.

Instructions and operations

Imperative programming languages offer possibilities to specify instructions (a.k.a. statements). An instruction specifies a sequence of (machine) operations; executing the instruction on a machine means that the machine performs this sequence of operations. Instructions may be bundled as subroutines (a.k.a. procedures, functions, methods).

Algorithm

  1. An algorithm is associated with an algorithmic problem.
  2. An algorithm is an abstract description of a process that can be formulated (a.k.a. implemented) as a subroutine, which computes some feasible output for any given input of the associated algorithmic problem.
  3. If an objective function is given, the objective function value of the generated solution is a criterion for the quality of an algorithm. More specifically:
    1. in case of minimization: a low objective function value is favored;
    2. in case of maximization: a high objective function value is favored.

Iterative and recursive algorithms

  1. Basically, a non-trivial algorithm is a loop or recursion plus some preprocessing (a.k.a. initialization) and/or some postprocessing.
  2. If an algorithm consists of two or more loops/recursions that are executed strictly after each other, it may be viewed as two or more algorithms that are executed after each other. Therefore, without loss of generality, an algorithm may be viewed as one loop or one recursion plus some pre/postprocessing.
  3. An iteration of a loop may contain another loop or recursion. Analogously, a recursive call may contain another loop or recursion. We say that a loop/recursion inside another loop/recursion is nested or the inner loop/recursion. Correspondingly, the nesting loop/recursion is the outer one.

Remarks:

  1. Clearly, a loop may be transformed into a recursion and vice versa. So, every algorithm may be formulated either as a loop or as a recursion. However, in most cases, one of these two options looks simpler and "more natural" than the other one.
  2. Formulating an algorithm as a loop might be favorable in many cases because a loop allows more control than a recursion. More specifically, a loop may be implemented as an iterator whose method for going one step forward implements the execution of one iteration of the loop. Such an implementation allows one to terminate the loop early, to suspend execution and resume execution later on, and to execute some additional instructions between two iterations (e.g. for visualization or for testing). The crucial point is that, for all of these purposes, the code of the algorithm need not be modified (the source need not even be available).

Correctness

An algorithm is correct if three conditions are fulfilled for every feasible input:

  1. All operations are well-defined. For example, access to an array component outside the array's index range and accessing an attribute or method of void are ill-defined operations.
  2. Each loop and recursion terminates regularly, that is, the break condition is fulfilled after a finite number of operations.
  3. The output delivered by the algorithm is a feasible solution to the given input.

Invariant

  1. The invariant of a loop consists of all assertions that are true immediately before the first iteration, immediately after the last iteration, and between two iterations. These assertions are called the invariant assertions of this loop.
  2. The invariant of a recursion consists of all assertions that are fulfilled immediately after each recursive call. These assertions are called the invariant assertions of this recursion.

Remarks:

  1. If an algorithm only accesses its input and its own local data, the invariant assertions are assertions about these local data and nothing else.
  2. The invariant of an algorithm has a specific function in the correctness proof for the algorithm. Typically, in a documentation, only those invariant assertions are stated that are relevant for this purpose.
  3. The invariant is also essential for testing purposes: a white-box test of a loop/recursion should definitely check maintenance of all relevant invariant assertions.
  4. Last not least, the invariant is the very "essence" of the algorithmic idea; understanding the algorithm deeply amounts to understanding the invariant.

Variant

  1. The variant of a loop consists of all differences between the state immediately before an iteration and the state immediately after that iteration (typically, but not exclusively, the values of integral loop variables).
  2. The variant of a recursion consists of all differences between the input of a recursive call [math]\displaystyle{ X }[/math] and the inputs of all recursive calls that are directly called in [math]\displaystyle{ X }[/math] (typically, but not exclusively, some integral measure of the size of the input).

Remarks:

  1. If an algorithm only accesses its input and its own local data, the variant states changes of the contents of these local data and nothing else.
  2. The variant of an algorithm has a specific function in the correctness proof for the algorithm. Typically, in a documentation, only those variant assertions are stated that are relevant for this purpose.
  3. The variant is also essential for testing purposes: a white-box test of a loop/recursion should definitely check all relevant changes.

Correctness proofs

  1. Assuming the algorithm is correct:
    1. The variant implies that the break condition will be fulfilled
      1. after a finite number of iterations in case of a loop;
      2. after a finite number of recursive calls in case of a recursion.
    2. Correctness of the output follows from what the invariant says about the state immediately after the last iteration.
  2. Proving the invariant amounts to an induction on the number of iterations performed so far / the recursion parameter over which the variant of the recursion id defined.
    1. The invariant is the induction hypothesis.
    2. Proving the induction basis amounts to proving that the initialization establishes the invariant.
    3. Proving the induction step amounts to proving that an iteration / a recursive call maintains the invariant.

Remarks:

  1. In a nutshell, the variant proves termination, and the invariant proves that the output is correct.
  2. Typically, well-definedness of all operations is not considered explicitly in correctness proofs. The reason is that well-definedness is non-trivial in rare cases only.
  3. In many cases, correctness of the output follows immediately from what the invariant says about the state immediately after the last iteration; in other cases, additional arguments are necessary to prove correctness of the output.
  4. The variant is also essential to estimate, asymptotically, the number of iterations / the recursion depth.