## Monday, May 19, 2014

### Combining array bounds checking and loop condition checking into a single operation

In a memory-safe language, the naive way to loop over an array involves too many runtime checks:

int i = 0;
while (i < array.length) {
    doSomething(array[i]);
    i = i + 1;
}

The variable i is checked against array.length in the loop condition, and then checked against 0 and array.length again when array[i] is dereferenced. That's three comparisons for every element of the array. A clever compiler can remove some of them, but that requires complicated mathematical reasoning. For example, to remove the check against 0, the compiler must build a proof by mathematical induction that i always stays non-negative.

In this post I'll try to make it easier for the compiler to remove all checks except one. First, let's note that looping over a linked list, implemented as a sum type, requires only one check per element to be completely memory-safe:

data List a = Empty | Append a (List a)

product :: List Int -> Int
product Empty = 1
product (Append value tail) = value * product tail

We can see that looping over a linked list requires only one check, which serves double purpose as a bounds check (for memory safety) and a loop condition check (for loop termination). Then how come looping over an array requires three checks? Can we reduce the number of checks without losing memory-safety?
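The same single-check traversal can be written in Java with an explicit null test (a sketch of mine, not a translation endorsed anywhere above; the class name is made up):

```java
// A Java analogue of the linked list above: traversing it needs exactly one
// check per element -- the null test, which serves both as the memory-safety
// check and as the loop-termination check.
class Node {
    final int value;
    final Node next;  // null marks the end of the list

    Node(int value, Node next) {
        this.value = value;
        this.next = next;
    }

    static int product(Node list) {
        int result = 1;
        for (Node node = list; node != null; node = node.next) {
            result *= node.value;  // no bounds check needed here
        }
        return result;
    }

    public static void main(String[] args) {
        Node list = new Node(2, new Node(3, new Node(4, null)));
        System.out.println(Node.product(list));  // prints 24
    }
}
```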

The answer is that we can have only one check per element for arrays too, if we treat them like lists. We just need an iterator class which is guaranteed to contain a valid index into the array, so that dereferencing the iterator doesn't require a bounds check, and moving the iterator forward or backward requires exactly one bounds check:

#include <iostream>

template <typename Type> class NonEmptyArray;  // defined below

template <typename Type> class NonEmptyIterator {
    friend class NonEmptyArray<Type>;

private:
    Type* array;
    unsigned int maxIndex;
    unsigned int index;  // 0 <= index <= maxIndex
    NonEmptyIterator(Type* array_, unsigned int maxIndex_):
        array(array_), maxIndex(maxIndex_), index(0) {}

public:
    Type get() {
        return array[index];
    }

    void set(Type value) {
        array[index] = value;
    }

    bool increment() {
        if (index < maxIndex) {
            index++;
            return true;
        } else {
            return false;
        }
    }

    bool decrement() {
        if (index > 0) {
            index--;
            return true;
        } else {
            return false;
        }
    }
};

template <typename Type> class NonEmptyArray {
private:
    Type* array;
    unsigned int maxIndex;  // this means length - 1

public:
    NonEmptyArray(unsigned int maxIndex_): maxIndex(maxIndex_) {
        array = new Type[maxIndex + 1];
    }

    ~NonEmptyArray() {
        delete [] array;
    }

    // Noncopyable boilerplate omitted...

    NonEmptyIterator<Type> iterator() {
        return NonEmptyIterator<Type>(array, maxIndex);
    }
};

int main() {
    NonEmptyArray<double> array(2);  // that means length is 3
    NonEmptyIterator<double> it(array.iterator());

    // Fill the array
    double x = 1.0;
    do {
        it.set(x);
        x += 1.0;
    } while (it.increment());

    // Compute the sum of the array elements
    double sum = 0.0;
    do {
        sum += it.get();
    } while (it.decrement());
    std::cout << sum << std::endl;  // prints 6
}

The above iterator class works only for non-empty arrays, because it needs to maintain the invariant that get() always returns a valid value. We could work around that by simulating sum types in C++, but I don't want to overcomplicate the code. The important thing is that once the iterator is constructed, calling the methods get(), set(), increment() and decrement() in any order is completely memory-safe for the lifetime of the array, while requiring only one bounds check for the typical case of traversing the array in a loop.

(Clever readers might point out that the check inside increment() and decrement() returns a boolean, which is then inspected in the loop condition, so that's technically two checks. But if the method calls are inlined, the compiler can trivially combine these two checks into one without doing any fancy mathematical reasoning.)
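To make that concrete, here is a Java sketch (my own, with made-up names) of what the fill loop reduces to once increment() is inlined: the single `index < maxIndex` comparison decides both that the next array access is in bounds and whether the loop continues.

```java
public class FusedLoop {
    // Fill array[0..maxIndex] with 1.0, 2.0, ... using one comparison per
    // element: the inlined body of increment() is the only check.
    static double[] fill(int maxIndex) {
        double[] array = new double[maxIndex + 1];
        int index = 0;
        double x = 1.0;
        while (true) {
            array[index] = x;  // always in bounds, by the invariant
            x += 1.0;
            if (index < maxIndex) {
                index++;   // still in bounds: keep looping
            } else {
                break;     // last element written: stop
            }
        }
        return array;
    }

    public static void main(String[] args) {
        double sum = 0.0;
        for (double value : fill(2)) {
            sum += value;
        }
        System.out.println(sum);  // prints 6.0
    }
}
```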

It's an interesting exercise to adapt the above approach to more complicated cases, like traversing two arrays of the same length at once. But I feel that's not really the point. Unlike approaches based on theorem provers or dependent types (or even "non-dependent types"), this approach to eliminating bounds checking probably doesn't scale to more complicated use cases. But it does have the advantage of being simple and understandable to most programmers, without using any type-theoretic tricks.
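For what it's worth, here is one way the two-array case might look, sketched in Java rather than C++ (class and method names are mine): a single iterator owns both arrays, checks once at construction that their lengths match, and from then on one index and one comparison per step serve both arrays.

```java
// Sketch: a paired iterator over two equal-length, non-empty arrays. The
// length check happens once in the constructor; after that, one shared index
// (and one comparison in increment()) is enough for both arrays.
public class PairIterator {
    private final double[] first;
    private final double[] second;
    private final int maxIndex;
    private int index = 0;  // invariant: 0 <= index <= maxIndex

    public PairIterator(double[] first, double[] second) {
        if (first.length != second.length || first.length == 0) {
            throw new IllegalArgumentException("need equal non-empty arrays");
        }
        this.first = first;
        this.second = second;
        this.maxIndex = first.length - 1;
    }

    // Valid by the invariant: no bounds check needed for either array.
    public double dotTerm() {
        return first[index] * second[index];
    }

    public boolean increment() {
        if (index < maxIndex) {
            index++;
            return true;
        }
        return false;
    }

    public static double dotProduct(double[] a, double[] b) {
        PairIterator it = new PairIterator(a, b);
        double sum = 0.0;
        do {
            sum += it.dotTerm();
        } while (it.increment());
        return sum;
    }

    public static void main(String[] args) {
        double[] a = { 1.0, 2.0, 3.0 };
        double[] b = { 4.0, 5.0, 6.0 };
        System.out.println(dotProduct(a, b));  // prints 32.0
    }
}
```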

### Using real arithmetic to eliminate bounds checking

This post is a repost of a question I asked on Lambda the Ultimate.

The usual problem with eliminating array bounds checking is that integer arithmetic is undecidable. For example, the compiler can't figure out that you can index into an array of m·n elements using an expression like i·n+j, even if it's known statically that 0≤i<m and 0≤j<n. Or rather, that particular problem is simple, but it's hard to come up with a general algorithm that would work for many different programs. To solve that problem, some people are trying to use dependent types and machine-checkable proofs, like Hongwei Xi, or restrict their attention to Presburger arithmetic (which is decidable but doesn't support multiplication), like Corneliu Popeea.

To take a different approach to the problem, we can note that statements like "if i ∈ [0,m-1] and j ∈ [0,n-1], then i·n+j ∈ [0,m·n-1]" (all intervals are closed) can be interpreted and proved in real arithmetic, instead of integer arithmetic. The nice thing is that real arithmetic is decidable. If that approach works, we can have a language with integer-parameterized types like "array of n elements" and "integer between m and n", and have a simple definition of what the type checker can and cannot automatically prove about programs using such types, without having to write any proofs explicitly.

Of course the system will be limited by the fact that it's talking about reals rather than integers. For example, there will be problems with integer-specific operations on type parameters, like division and remainder, but I don't know how often these will be required.

In the comments to my post on LtU, Tom Primožič has carried out some experiments with several SMT solvers, including Microsoft's Z3. Indeed, it turns out that integer solvers can't handle the above example with matrix addressing, but a real arithmetic solver does the job easily. Here's a translation of the matrix addressing example, which you can try out in Z3's online interface:
(declare-const i Int)
(declare-const j Int)
(declare-const m Int)
(declare-const n Int)
(assert (and (<= 0 i) (<= i (- m 1))))
(assert (and (<= 0 j) (<= j (- n 1))))
(assert (and (<= 0 n) (<= 0 m)))

(assert (not (<= (+ (* i n) j) (- (* m n) 1))))
(check-sat-using qfnra-nlsat)

Note that the last line invokes the nonlinear real arithmetic solver NLSat. If you replace the last line with (check-sat) to use the default integer solver instead, it will timeout and fail to find a proof.

## Tuesday, May 6, 2014

### Functional equivalence in value-level programming

When you construct a container like Set or Map, you need either to pass the functions used for comparing or hashing elements explicitly, or to have them picked up automatically based on the types of the elements. "Value-level programming" tries to discourage type-directed dispatch, so we need to pass the functions explicitly. That leads to a problem when you need to quickly merge two Sets or Maps. The merging function cannot assume that the same hashing function was used in both containers, so it might need to rehash all elements, while with type-directed dispatch it could assume that the hashing function for the same type is always the same and skip the rehashing.

That's a genuine problem with value-level programming. There are several possible solutions:

1) If two containers have pointers to the same hashing function, that means it's safe to skip rehashing. This approach is guaranteed to be safe, but it's sometimes too conservative.

2) Each hashing function could contain an explicit ID, so the merging function can skip rehashing if the IDs match. This approach is not guaranteed to be safe, but it seems to discourage programmer errors pretty well.

3) The compiler could enforce the fact that there's only one value of a specific type, like HashFunction<String>. That can be done with private constructors in Java, or with abstract data types in functional languages. This approach is guaranteed to be safe, but it needs an escape hatch, possibly in the form of (1) or (2).
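As a sketch of how (1) and (2) might combine, here's a small Java example (all names here are hypothetical, not from any real library):

```java
// Sketch of options (1) and (2): a hash function carries an explicit ID, and
// the merge skips rehashing when either the function references are identical
// or the IDs match. All names are made up for illustration.
public class MergeSketch {
    interface HashFunction<T> {
        int hash(T value);
        String id();  // option (2): an explicit identifier
    }

    static <T> boolean canSkipRehash(HashFunction<T> a, HashFunction<T> b) {
        return a == b                  // option (1): same function object
            || a.id().equals(b.id());  // option (2): matching IDs
    }

    public static void main(String[] args) {
        HashFunction<String> h1 = new HashFunction<String>() {
            public int hash(String s) { return s.hashCode(); }
            public String id() { return "builtin-string-hash"; }
        };
        HashFunction<String> h2 = new HashFunction<String>() {
            public int hash(String s) { return s.hashCode(); }
            public String id() { return "builtin-string-hash"; }
        };
        System.out.println(canSkipRehash(h1, h1));  // true: same reference
        System.out.println(canSkipRehash(h1, h2));  // true: matching IDs
    }
}
```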

Overall it seems to me that value-level solutions to the above problem are not worse than type-directed solutions. I don't know if there are other, more severe variants of that problem.

### Simulating template specialization with value-level polymorphism

When I showed my last post about translating a generic algorithm from C++ to Java to my friend who's an expert in C++, his main question was, what about template specialization? The simplest example of specialization is std::advance, which advances an iterator by n steps. It works in O(n) time for iterators that only support "it++", and in O(1) time for iterators that support "it+=n".

In value-level programming, that can be simulated by having a method in the ForwardIteratorOperations interface that returns Maybe&lt;RandomAccessIteratorOperations&gt;. For iterators that don't support fast random access, that method would return Maybe.Nothing, and for those that do, it would return a Maybe.Something containing a valid RandomAccessIteratorOperations. Since Java doesn't have sum types like Maybe, we can simulate it with a nullable return value instead. I've written a translation of std::advance into Java using that approach; you can find the code at the end of this post.

The problem is that the above approach doesn't scale to all uses of template specialization in C++, because if there are multiple interrelated concepts that you can specialize on, the number of possible conversions grows quadratically or worse. Another complication is that some concepts can have several type parameters, like multi-parameter type classes in Haskell. I'm still trying to figure out if that's an important use case.

In fact, I'm still trying to figure out if template specialization is worth the trouble at all. The C++ STL seems to have few uses of specialization. The examples I've found are std::advance, std::distance, vector<bool> which is known to be broken, and some specializations on trivially destructible types like pointers that allow you to skip calling destructors. The Boost libraries use specialization more, but it's often to compensate for the drawbacks of C++, like the lack of lambdas or garbage collection. I haven't found a really convincing use case for specialization yet, but I'll change my view if I find one.

Why am I so against specialization? Well, part of the reason is that I'm trying to explore "value-level programming", which means having no type-directed overloading or dispatch anywhere. But the bigger problem is that specialization in C++ seems to be incompatible with modularity and separate compilation, and would stay incompatible even if templates supported separate compilation. If you specialize someone else's templates, and their headers don't include yours, then everything will compile without errors, but the runtime behavior is undefined. That violates the principle that "well-typed programs don't have type errors", which feels very important to me. With the approach I outlined above, such errors are provably impossible, but at the cost of spelling out many things explicitly.

APPENDIX: the translation of std::advance into Java.

public class Main {

    static interface Forward<Iterator> {
        Iterator increment(Iterator iterator);
        boolean equal(Iterator first, Iterator second);
        /* Nullable */ RandomAccess<Iterator> getFastRandomAccess();
    }

    static interface RandomAccess<Iterator> {
        Iterator add(Iterator iterator, Integer delta);
        Integer subtract(Iterator first, Iterator second);
    }

    static interface Input<Iterator, Value> {
        Value get(Iterator iterator);
    }

    // The analogue of std::advance: O(1) when fast random access is
    // available, O(delta) otherwise.
    static <Iterator> Iterator advance(
        Iterator iterator,
        Integer delta,
        Forward<Iterator> forward
    ) {
        RandomAccess<Iterator> randomAccess = forward.getFastRandomAccess();
        if (randomAccess != null) {
            return randomAccess.add(iterator, delta);
        } else {
            while (delta > 0) {
                iterator = forward.increment(iterator);
                delta = delta - 1;
            }
            return iterator;
        }
    }

    static class List<Value> {
        public final Value value;
        /* Nullable */ public final List<Value> next;
        public List(
            Value value,
            List<Value> next
        ) {
            this.value = value;
            this.next = next;
        }
    }

    static class ArrayRandomAccess implements RandomAccess<Integer> {
        public Integer add(Integer iterator, Integer delta) {
            return iterator + delta;
        }
        public Integer subtract(Integer first, Integer second) {
            return first - second;
        }
    }

    static class ArrayForward implements Forward<Integer> {
        public Integer increment(Integer iterator) {
            System.out.println("Using integer increment");
            return iterator + 1;
        }
        public boolean equal(Integer first, Integer second) {
            return first.equals(second);
        }
        public RandomAccess<Integer> getFastRandomAccess() {
            return new ArrayRandomAccess();
        }
    }

    static class ArrayInput<Value> implements Input<Integer, Value> {
        private final Value [] array;
        public ArrayInput(Value [] array) {
            this.array = array;
        }
        public Value get(Integer index) {
            return array[index];
        }
    }

    static class ListForward<Value> implements Forward<List<Value>> {
        public List<Value> increment(List<Value> iterator) {
            System.out.println("Using list increment");
            return iterator.next;
        }
        public boolean equal(List<Value> first, List<Value> second) {
            return first == second;
        }
        public RandomAccess<List<Value>> getFastRandomAccess() {
            return null;
        }
    }

    static class ListInput<Value> implements Input<List<Value>, Value> {
        public Value get(List<Value> list) {
            return list.value;
        }
    }

    static <Iterator, Value> Value advanceAndGet(
        Iterator iterator,
        Integer delta,
        Forward<Iterator> forward,
        Input<Iterator, Value> input
    ) {
        return input.get(advance(iterator, delta, forward));
    }

    public static void main(String [] args) {
        Double [] array = new Double [] { 1.0, 2.0, 3.0 };
        List<Double> list =
            new List<Double>(1.0,
                new List<Double>(2.0,
                    new List<Double>(3.0, null)));

        System.out.println(advanceAndGet(
            0, 2, new ArrayForward(), new ArrayInput<Double>(array)));

        System.out.println(advanceAndGet(
            list, 2, new ListForward<Double>(), new ListInput<Double>()));
    }
}

And the output is this:

3.0
Using list increment
Using list increment
3.0

## Sunday, May 4, 2014

### An example of value-level programming in Java

In this post, which is just a big listing of code, I translate a C++ generic algorithm from the <algorithm> STL header into Java, using only basic features of Java. To prove that the generic algorithm can be used on any third-party data structure, I give an example of using it on a Java built-in array. I hope the example is nontrivial enough to show that many STL algorithms can be translated in the same way. More generally, I want to explore the possibility that "value-level programming" might be a good solution to the problems that C++ generic programming was intended to solve, as described in this interview with Alex Stepanov.

You can paste the code into http://www.compileonline.com/compile_java_online.php to try it out:

public class Main {

    static interface BidirectionalIteratorOperations<Iterator> {
        Iterator increment(Iterator iterator);
        Iterator decrement(Iterator iterator);
        boolean equal(Iterator first, Iterator second);
    }

    // This part factored out for convenience.
    static interface Dereference<Iterator, Value> {
        Value get(Iterator iterator);
        void set(Iterator iterator, Value value);
    }

    static interface UnaryPredicate<Value> {
        boolean test(Value value);
    }

    // Line-by-line translation.
    // Note that Iterator and Value can be _any_ types.
    static <Iterator, Value> Iterator partition(
        Iterator first,
        Iterator last,
        UnaryPredicate<Value> predicate,
        Dereference<Iterator, Value> dereference,
        BidirectionalIteratorOperations<Iterator> operations
    ) {
        while (!operations.equal(first, last)) {
            while (predicate.test(dereference.get(first))) {
                first = operations.increment(first);
                if (operations.equal(first, last)) {
                    return first;
                }
            }
            do {
                last = operations.decrement(last);
                if (operations.equal(first, last)) {
                    return first;
                }
            } while (!predicate.test(dereference.get(last)));
            swap(first, last, dereference);
            first = operations.increment(first);
        }
        return first;
    }

    // This isn't the true swap from C++, which swaps values.
    static <Iterator, Value> void swap(
        Iterator first,
        Iterator second,
        Dereference<Iterator, Value> dereference
    ) {
        Value temp = dereference.get(first);
        dereference.set(first, dereference.get(second));
        dereference.set(second, temp);
    }

    // Let's use the above to partition a Java built-in array,
    // using Integers as iterators.
    public static void main(String [] args) {
        final Double [] data =
            new Double [] { 1.0, 2.0, 3.0, 4.0, 5.0 };
        UnaryPredicate<Double> predicate =
            new UnaryPredicate<Double>() {
                public boolean test(Double value) {
                    return (value < 1.5) || (value > 4.5);
                }
            };
        Dereference<Integer, Double> dereference =
            new Dereference<Integer, Double>() {
                public Double get(Integer index) {
                    return data[index];
                }
                public void set(Integer index, Double value) {
                    data[index] = value;
                }
            };
        BidirectionalIteratorOperations<Integer> operations =
            new BidirectionalIteratorOperations<Integer>() {
                public Integer increment(Integer index) {
                    return index + 1;
                }
                public Integer decrement(Integer index) {
                    return index - 1;
                }
                public boolean equal(Integer first, Integer second) {
                    return first.equals(second);
                }
            };
        partition(0, data.length, predicate, dereference, operations);
        System.out.println(java.util.Arrays.toString(data));
    }
}

### Exception handling is for bugs

Should we use exceptions or return values to indicate failure? If we take the usual advice, "don't use exceptions for control flow", then we should never use exceptions at all, because control flow in unusual situations is still control flow. Maybe the advice is intended to mean something like "use exceptions only in unusual situations", but how do different programmers agree on which situations are unusual? For example, is "file not found" unusual? Or how about division by zero?

To decide whether to use exceptions or return values, my rule of thumb is that exceptions should be used for situations that indicate bugs. In other words, you should arrange your code so that it throws an exception only if there's a bug somewhere. That way, if you see an exception in your logs, you should find and fix the bug that caused it to be thrown. For example, "file not found" should not be an exception, because you cannot write a program that never gets "file not found". But division by zero should be an exception, because you can and should write programs that never divide by zero.

Using exceptions only to indicate bugs doesn't mean you should avoid exceptions. On the contrary, it's useful to have many assertions in the code and treat assertion failures as exceptions. If you use exception handling only for bugs, as I suggest, then all exceptions become assertion failures in some sense.
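Here is a small Java sketch of the rule of thumb (the method names are mine): an expected failure like a missing file comes back as a value, while a violated precondition, which can only happen through a bug, throws.

```java
public class FailureStyles {
    // "File not found" is an expected outcome, so it is reported as a value;
    // null here stands in for a Maybe-style "nothing".
    static /* Nullable */ String readConfig(java.io.File file) {
        if (!file.exists()) {
            return null;  // expected: the caller must handle this case
        }
        return "...contents...";
    }

    // Division by zero can only happen through a bug in the caller, so a
    // violated precondition throws -- an assertion in exception form.
    static int average(int sum, int count) {
        if (count == 0) {
            throw new IllegalArgumentException("bug: average of zero items");
        }
        return sum / count;
    }

    public static void main(String[] args) {
        System.out.println(readConfig(new java.io.File("/no/such/file")));  // prints null
        System.out.println(average(10, 2));  // prints 5
    }
}
```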

## Saturday, May 3, 2014

I often see people calling some piece of code "readable" if it reads like an English-language description of the task, and uses complex programming techniques to make it actually work as a program. For example, many programs in Ruby are like that, or you can check out this post for another example in Haskell.

To me, such code doesn't look readable at all. I mostly read code in order to make changes or find bugs, or figure out something about the implementation. Complex programming techniques often make these tasks more difficult, regardless of their other benefits.

A piece of code looks readable to me if I can understand how it works, not just what it does. Using readable English identifiers is good, but it is also important to have straightforward control flow. That makes it easier to build a mental model of how the code executes. I wish more programmers paid attention to straightforward control flow, instead of using complex programming techniques to achieve surface readability.

## Friday, May 2, 2014

### A snippet from my new programming language

For the last few days I've been thinking about a hypothetical programming language called MVL, which is short for "Manifestly typed Value-level Language" or maybe "Verbose ML".

Here's some example code in that language:

GENERIC(Type) RECORD Group(
    operation: FUNCTION(Type, Type): Type,
    inverse: FUNCTION(Type): Type,
    identity: Type
)

GENERIC(Type) RECORD Field(
    multiplication: Group(Type)
)

GENERIC(Scalar, Vector) RECORD VectorSpace(
    scalars: Field(Scalar),
    vectors: Group(Vector),
    scale: FUNCTION(Vector, Scalar): Vector
)

(Yes, I know that multiplication is not a group. But we can throw exceptions on divide by zero, so it's close enough.)

The idea of "value-level programming" (which I got from Gabriel Gonzalez) is to pass these records around as arguments to functions, instead of using something like OO interfaces, Haskell typeclasses or concepts in C++ templates. The resulting code is similar to Haskell after translating typeclasses to dictionaries. Since it's annoying to always pass the records around, I'm thinking of a system like Agda's instance arguments or Scala's implicit arguments to make it easier:

GENERIC(Type) FUNCTION power(
    x: Type,
    y: Integer,
    IMPLICIT group: Group(Type)
): Type
    IF y > 0
        RETURN group.operation(x, power(x, y - 1))
    ELSE IF y = 0
        RETURN group.identity
    ELSE
        RETURN group.inverse(power(x, -y))

FUNCTION floatMultiply(x: Float, y: Float): Float
    RETURN x * y

FUNCTION floatInverse(x: Float): Float
    RETURN 1 / x

VAR floatMultiplicationGroup: Group(Float) = (
    operation: floatMultiply,
    inverse: floatInverse,
    identity: 1.0
)

VAR x: Float = power(0.5, -2)
# Returns 4.0

When you call the "power" function, you don't need to explicitly pass the group as an argument, because it's marked as implicit and picked up from the closest surrounding scope. The top level call picks up the value of floatMultiplicationGroup, and the recursive calls pick up the value from the enclosing call. If there are multiple matches in the same scope, that's a compile-time error, and you should pass the value explicitly.
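For comparison, here is the same power function in plain Java without any implicit-argument machinery (my own translation of the MVL snippet; the class and constant names are made up). Every call site must mention the group explicitly, which is exactly the verbosity the IMPLICIT mechanism is meant to remove.

```java
public class PowerExample {
    // The Group record from the MVL snippet, as a plain interface.
    interface Group<T> {
        T operation(T x, T y);
        T inverse(T x);
        T identity();
    }

    // The power function, with the group passed explicitly at every call.
    static <T> T power(T x, int y, Group<T> group) {
        if (y > 0) {
            return group.operation(x, power(x, y - 1, group));
        } else if (y == 0) {
            return group.identity();
        } else {
            return group.inverse(power(x, -y, group));
        }
    }

    // The analogue of floatMultiplicationGroup.
    static final Group<Double> FLOAT_MULTIPLICATION = new Group<Double>() {
        public Double operation(Double x, Double y) { return x * y; }
        public Double inverse(Double x) { return 1 / x; }
        public Double identity() { return 1.0; }
    };

    public static void main(String[] args) {
        System.out.println(power(0.5, -2, FLOAT_MULTIPLICATION));  // prints 4.0
    }
}
```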

Now all that remains is to make that idea into a workable language :-)