As we shall see, introducing assignment into our programming language
leads us into a thicket of difficult conceptual issues. Nevertheless,
viewing systems as
collections of objects with local state is a
powerful technique for maintaining a
modular design. As a simple
example, consider the design of a
function
rand that, whenever
it is called, returns an integer chosen at random.
It is not at all clear what is meant by chosen at random.
What we presumably want is for successive calls to
rand to produce a sequence of numbers that has
statistical properties of uniform distribution. We will not discuss methods
for generating suitable sequences here. Rather, let us assume that we have a
function
rand_update
that has the property that if we start with a given number
$x_{1}$ and form
We can implement rand as a
function
with a local state variable x that is
initialized to some fixed value
random_init.
Each call to rand computes
rand_update
of the current value of x, returns this as the
random number, and also stores this as the new value of
x.
function make_rand() {
let x = random_init;
return () => {
x = rand_update(x);
return x;
};
}
const rand = make_rand();
Of course, we could generate the same sequence of random numbers
without using assignment by simply calling
rand_update
directly. However, this would mean that any part of our program that used
random numbers would have to explicitly remember the current value of
x to be passed as an argument to
rand_update.
To realize what an annoyance this would be, consider using random numbers
to implement a technique called
Monte Carlo simulation.
もちろん、代入を使わなくても、単に
rand_update
を直接呼び出すことで同じ乱数列を生成できます。しかし、その場合、乱数を使うプログラムのあらゆる部分が、
rand_update
に引数として渡すための x の現在の値を明示的に覚えておかなければなりません。これがどれほど厄介かを理解するために、乱数を使って
モンテカルロシミュレーションと呼ばれる手法を実装することを考えてみましょう。
The Monte Carlo method consists of choosing sample experiments at random
from a large set and then making deductions on the basis of the
probabilities estimated from tabulating the results of those experiments.
For example, we can approximate
$\pi$ using the fact that
$6/\pi^2$ is the probability that two integers
chosen at random will have no factors in common; that is, that their
greatest common divisor will be 1.
To obtain the approximation to $\pi$, we perform
a large number of experiments. In each experiment we choose two integers at
random and perform a test
to see if their GCD is 1. The fraction of times that the test is passed
gives us our estimate of $6/\pi^2$, and from this
we obtain our approximation to $\pi$.
The heart of our program is a
function
monte_carlo,
which takes as arguments the number of times to try an experiment, together
with the experiment, represented as a no-argument
function
that will return either true or false each time it is run.
The function
monte_carlo
runs the experiment for the designated number of trials and returns a
number telling the fraction of the trials in which the experiment was
found to be true.
Now let us try the same computation using
rand_update
directly rather than rand, the way we would be
forced to proceed if we did not use assignment to model local state:
While the program is still simple, it betrays some painful breaches of
modularity. In our first version of the program, using
rand, we can express the Monte Carlo method
directly as a general
monte_carlo
function
that takes as an argument an arbitrary
experiment
function.
In our second version of the program, with no local state for the
random-number generator,
random_gcd_test
must explicitly manipulate the random numbers
x1 and x2 and
recycle x2 through the iterative loop as the
new input to
rand_update.
This explicit handling of the random numbers intertwines the structure of
accumulating test results with the fact that our particular experiment uses
two random numbers, whereas other Monte Carlo experiments might use one
random number or three. Even the top-level
function
estimate_pi
has to be concerned with supplying an initial random number. The fact that
the random-number generator's insides are leaking out into other parts
of the program makes it difficult for us to isolate the Monte Carlo idea so
that it can be applied to other tasks. In the first version of the program,
assignment encapsulates the state of the random-number generator within the
rand
function,
so that the details of random-number generation remain independent of the
rest of the program.
The general phenomenon illustrated by the Monte Carlo example is this: From
the point of view of one part of a complex process, the other parts appear
to change with time. They have hidden time-varying local state. If we wish
to write computer programs whose structure reflects this decomposition, we
make computational objects (such as bank accounts and random-number
generators) whose behavior changes with time. We model state with local
state variables, and we model the changes of state with assignments to those
variables.
It is tempting to conclude this discussion by saying that, by introducing
assignment and the technique of hiding state in local variables, we are able
to structure systems in a more modular fashion than if all state had to be
manipulated explicitly, by passing additional parameters. Unfortunately,
as we shall see, the story is not so simple.
Monte Carlo integration
is a method of estimating definite
integrals by means of Monte Carlo simulation. Consider computing the
area of a region of space described by a predicate
$P(x, y)$ that is true for points
$(x, y)$ in the region and false for points not
in the region. For example, the region contained within a circle of radius
$3$ centered at
$(5, 7)$ is described by the predicate that tests
whether $(x-5)^2 + (y-7)^2\leq 3^2$. To estimate
the area of the region described by such a predicate, begin by choosing a
rectangle that contains the region. For example, a rectangle with diagonally
opposite corners at $(2, 4)$ and
$(8, 10)$ contains the circle above. The desired
integral is the area of that portion of the rectangle that lies in the
region. We can estimate the integral by picking, at random, points
$(x, y)$ that lie in the rectangle, and testing
$P(x, y)$ for each point to determine whether the
point lies in the region. If we try this with many points, then the fraction
of points that fall in the region should give an estimate of the proportion
of the rectangle that lies in the region. Hence, multiplying this fraction
by the area of the entire rectangle should produce an estimate of the
integral.
Implement Monte Carlo integration as a
function
estimate_integral
that takes as arguments a predicate P, upper
and lower bounds x1,
x2, y1, and
y2 for the rectangle, and the number of trials
to perform in order to produce the estimate. Your
function
should use the same
monte_carlo
function
that was used above to estimate $\pi$. Use your
estimate_integral
to produce an estimate of $\pi$ by measuring the
area of a unit circle.
You will find it useful to have a
function
that returns a number chosen at random from a given range. The following
random_in_range
function
implements this in terms of the
math_random
function
used in section 1.2.6, which returns a
nonnegative number less
than 1.
It is useful to be able to
reset a random-number generator to produce
a sequence starting from a given value. Design a new
rand
function
that is called with an argument that is either the
string "generate" or the string
"reset"
and behaves as follows:
rand("generate")
produces a new random number;
rand("reset")($new$-$value$)
resets the internal state variable to the designated $new$-$value$. Thus, by resetting the
state, one can generate repeatable sequences. These are very handy to have
when testing and debugging programs that use random numbers.
let state = 2;
function rand(symbol) {
if (symbol === "reset") {
return new_state => {
state = new_state;
};
} else {
// symbol is "generate"
state = (state * 1010) % 1101;
return state;
}
}
One common way to implement
rand_update
is to use the rule that $x$ is updated to
$ax+b$ modulo $m$,
where $a$, $b$, and
$m$ are appropriately chosen integers.
Chapter 3 of
Knuth 1997b includes an extensive
discussion of techniques for generating sequences of random numbers and
establishing their statistical properties. Notice that the
rand_update
function
computes a mathematical function: Given the same input twice, it
produces the same output. Therefore, the number sequence produced by
rand_update
certainly is not random, if by random we
insist that each number in the sequence is unrelated to the preceding
number. The relation between real randomness and so-called
pseudo-random sequences, which are produced by well-determined
computations and yet have suitable statistical properties, is a
complex question involving difficult issues in mathematics and
philosophy.
Kolmogorov,
Solomonoff, and
Chaitin have made great
progress in clarifying these issues; a discussion can be found in
Chaitin 1975.