A Simple Analysis to Eliminate Overflow Checking on Increments
Henry Cejtin
henry@sourcelight.com
Fri, 13 Jul 2001 18:20:24 -0500
Yes, the fact that if I use `=' and start in the wrong place my loop will
continue until overflow is good: it means the program will die right there.
If I use `<=', it will stop right away, but clearly I was confused and this
may result in other errors. The notion is to die as quickly as possible near
any error.
As to bounds checking, my notion was that the important thing is to realise
that if i starts out in [0, Array.length a], then
fun loop i =
if i = Array.length a
then ()
else (Array.update (a, i, ???);
loop (i + 1))
is ok (by induction). Of course you need the `eureka' step of producing that
inductive assumption, but given the Array.update, it isn't bad.
On an very weakly related note, we really need something better than the
basis library as a way of creating arrays, and, more importantly, strings and
vectors. (These are more important because being immutable, it really
requires unsafe casts or another copy.)
The point is that tabulate is a convenient special case, but not the general
case that you want. The general case you want is unfold. It takes 3
arguments: the size (length of the result), the state, and a function of type
where 'elem is the type of the elements. In addition to this, you would
unfoldL, which produces the elements in the reverse order. (I have no notion
of the correct capitalization for this.)
Using these, tabulate becomes
fun tabulate (size: int, proc: int -> 'elem) =
unfold (size,
0,
fn i => (proc i, i + 1))
With these, including unfoldL, lots of things are easy and efficient. One
typical example is where you have a list of characters and you want to build
the string which is those characters but in reverse order (because you
accumulated the characters in a list, so that reversed it). The fact that
currently you have to reverse the list is silly.
Other cases where you have to carry along some state as you build up the
elements are very common and currently you have to pick one of the following:
Produce an array instead of a vector and then do a copy at the end.
Use a ref cell to hold the state and mutate as you walk along.
I hate both of these. Besides, the unfold and unfoldL really are the right
idiom and they should be supported by the basis library.