3.1.2 The Benefits of Introducing Assignment

As we shall see, introducing assignment into our programming language leads us into a thicket of difficult conceptual issues. Nevertheless, viewing systems as collections of objects with local state is a powerful technique for maintaining a modular design. As a simple example, consider the design of a function rand that, whenever it is called, returns an integer chosen at random.

It is not at all clear what is meant by "chosen at random." What we presumably want is for successive calls to rand to produce a sequence of numbers that has the statistical properties of a uniform distribution. We will not discuss methods for generating suitable sequences here. Rather, let us assume that we have a function rand_update with the property that if we start with a given number $x_{1}$ and form

$x_2$ = rand_update($x_1$);
$x_3$ = rand_update($x_2$);

then the sequence of values $x_1, x_2, x_3, \ldots$ will have the desired statistical properties.

[1]
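
As a concrete illustration, here is a minimal sketch of one possible rand_update. It assumes the linear congruential rule with multiplier 16807, increment 0, and modulus $2^{31}-1$ (the Park-Miller "minimal standard" constants). These constants and the starting value 1 are assumptions chosen for illustration; the text does not prescribe them.

```javascript
// Hypothetical rand_update: x -> (a * x + b) mod m.
// The constants are an assumption (Park-Miller "minimal standard"), chosen so
// that a * x stays below 2^53 and the arithmetic is exact in JavaScript numbers.
const a = 16807;
const b = 0;
const m = 2147483647; // 2^31 - 1

function rand_update(x) {
    return (a * x + b) % m;
}

// Form the sequence x1, x2, x3 by feeding each value back into rand_update.
const x1 = 1; // assumed starting value
const x2 = rand_update(x1);
const x3 = rand_update(x2);
console.log(x1, x2, x3); // 1 16807 282475249
```

Because rand_update is an ordinary mathematical function, starting again from the same $x_1$ reproduces exactly the same sequence.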

We can implement rand as a function with a local state variable x that is initialized to some fixed value random_init. Each call to rand computes rand_update of the current value of x, returns this as the random number, and also stores this as the new value of x.

function make_rand() {
    let x = random_init;
    return () => {
        x = rand_update(x);
        return x;
    };
}
const rand = make_rand();
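
To see the local state at work, the definitions above can be run with stand-in values for random_init and rand_update. Both stand-ins below are assumptions for illustration, since the text leaves them unspecified. The point is only that each generator created by make_rand carries its own private x.

```javascript
// Assumed stand-ins; the text does not fix these.
const random_init = 1;
function rand_update(x) {
    return (16807 * x) % 2147483647;
}

function make_rand() {
    let x = random_init;          // local state, invisible to callers
    return () => {
        x = rand_update(x);       // advance the state
        return x;                 // and report the new value
    };
}

const rand = make_rand();
const first = rand();
const second = rand();

// Successive calls walk along the rand_update sequence.
console.log(second === rand_update(first)); // true

// A second generator starts over from random_init, unaffected by the first.
const other_rand = make_rand();
console.log(other_rand() === first);        // true
```

Callers never see or pass x; the only way to observe it is through the values that rand returns.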

Of course, we could generate the same sequence of random numbers without using assignment by simply calling rand_update directly. However, this would mean that any part of our program that used random numbers would have to explicitly remember the current value of x to be passed as an argument to rand_update. To realize what an annoyance this would be, consider using random numbers to implement a technique called Monte Carlo simulation.

The Monte Carlo method consists of choosing sample experiments at random from a large set and then making deductions on the basis of the probabilities estimated from tabulating the results of those experiments. For example, we can approximate $\pi$ using the fact that $6/\pi^2$ is the probability that two integers chosen at random will have no factors in common; that is, that their greatest common divisor will be 1.

[2]

To obtain the approximation to $\pi$, we perform a large number of experiments. In each experiment we choose two integers at random and perform a test to see if their GCD is 1. The fraction of times that the test is passed gives us our estimate of $6/\pi^2$, and from this we obtain our approximation to $\pi$.
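
The step from the fraction to $\pi$ is an inversion: if $p \approx 6/\pi^2$, then $\pi \approx \sqrt{6/p}$. As a sanity check that involves no randomness, the sketch below counts coprime pairs exhaustively among the integers 1 to 100 instead of sampling them. The bound 100 and the name coprime_fraction are assumptions for illustration, and Math.sqrt is plain JavaScript standing in for math_sqrt.

```javascript
// Euclid's algorithm for the greatest common divisor.
function gcd(a, b) {
    return b === 0 ? a : gcd(b, a % b);
}

// Fraction of pairs (i, j) with 1 <= i, j <= n whose GCD is 1.
function coprime_fraction(n) {
    let coprime = 0;
    for (let i = 1; i <= n; i = i + 1) {
        for (let j = 1; j <= n; j = j + 1) {
            if (gcd(i, j) === 1) {
                coprime = coprime + 1;
            }
        }
    }
    return coprime / (n * n);
}

const p = coprime_fraction(100);        // should be close to 6 / pi^2
const pi_estimate = Math.sqrt(6 / p);   // invert p = 6 / pi^2
console.log(p, pi_estimate);
```

The Monte Carlo version replaces the exhaustive double loop with random sampling, which is what makes the same idea usable when the set of experiments is too large to enumerate.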

The heart of our program is a function monte_carlo, which takes as arguments the number of times to try an experiment, together with the experiment, represented as a no-argument function that will return either true or false each time it is run. The function monte_carlo runs the experiment for the designated number of trials and returns a number telling the fraction of the trials in which the experiment was found to be true.

function estimate_pi(trials) {
    return math_sqrt(6 / monte_carlo(trials, dirichlet_test));
}
function dirichlet_test() {
    return gcd(rand(), rand()) === 1;
}
function monte_carlo(trials, experiment) {
    function iter(trials_remaining, trials_passed) {
        return trials_remaining === 0
               ? trials_passed / trials
               : experiment()
               ? iter(trials_remaining - 1, trials_passed + 1)
               : iter(trials_remaining - 1, trials_passed);
    }
    return iter(trials, 0);
}

Now let us try the same computation using rand_update directly rather than rand, the way we would be forced to proceed if we did not use assignment to model local state:

function estimate_pi(trials) {
    return math_sqrt(6 / random_gcd_test(trials, random_init));
}
function random_gcd_test(trials, initial_x) {
    function iter(trials_remaining, trials_passed, x) {
        const x1 = rand_update(x);
        const x2 = rand_update(x1);
        return trials_remaining === 0
               ? trials_passed / trials
               : gcd(x1, x2) === 1
               ? iter(trials_remaining - 1, trials_passed + 1, x2)
               : iter(trials_remaining - 1, trials_passed, x2);
    }
    return iter(trials, 0, initial_x);
}
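
Both versions consume the same rand_update sequence in the same order, so with identical stand-ins they should produce the same estimate. The sketch below checks this. The values of random_init and rand_update, the gcd definition, and the use of Math.sqrt in place of math_sqrt are assumptions for illustration, and the two estimate functions are renamed so that they can coexist in one program.

```javascript
// Assumed stand-ins for names the text leaves unspecified.
const random_init = 1;
const rand_update = x => (16807 * x) % 2147483647;
const gcd = (a, b) => b === 0 ? a : gcd(b, a % b);

// Version 1: the generator's state is hidden inside rand.
function make_rand() {
    let x = random_init;
    return () => {
        x = rand_update(x);
        return x;
    };
}
const rand = make_rand();

function monte_carlo(trials, experiment) {
    function iter(trials_remaining, trials_passed) {
        return trials_remaining === 0
               ? trials_passed / trials
               : experiment()
               ? iter(trials_remaining - 1, trials_passed + 1)
               : iter(trials_remaining - 1, trials_passed);
    }
    return iter(trials, 0);
}
function estimate_pi_with_state(trials) {
    return Math.sqrt(6 / monte_carlo(trials, () => gcd(rand(), rand()) === 1));
}

// Version 2: the current value x is threaded through every call.
function estimate_pi_threaded(trials) {
    function iter(trials_remaining, trials_passed, x) {
        const x1 = rand_update(x);
        const x2 = rand_update(x1);
        return trials_remaining === 0
               ? trials_passed / trials
               : gcd(x1, x2) === 1
               ? iter(trials_remaining - 1, trials_passed + 1, x2)
               : iter(trials_remaining - 1, trials_passed, x2);
    }
    return Math.sqrt(6 / iter(trials, 0, random_init));
}

const with_state = estimate_pi_with_state(1000);
const threaded = estimate_pi_threaded(1000);
console.log(with_state === threaded); // true
```

The results agree, so the difference between the two versions lies entirely in how the program is organized, not in what it computes.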

While the program is still simple, it betrays some painful breaches of modularity. In our first version of the program, using rand, we can express the Monte Carlo method directly as a general monte_carlo function that takes as an argument an arbitrary experiment function. In our second version of the program, with no local state for the random-number generator, random_gcd_test must explicitly manipulate the random numbers x1 and x2 and recycle x2 through the iterative loop as the new input to rand_update. This explicit handling of the random numbers intertwines the structure of accumulating test results with the fact that our particular experiment uses two random numbers, whereas other Monte Carlo experiments might use one random number or three. Even the top-level function estimate_pi has to be concerned with supplying an initial random number. The fact that the random-number generator's insides are leaking out into other parts of the program makes it difficult for us to isolate the Monte Carlo idea so that it can be applied to other tasks. In the first version of the program, assignment encapsulates the state of the random-number generator within the rand function, so that the details of random-number generation remain independent of the rest of the program.
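
To make the modularity claim concrete, the sketch below reuses the general monte_carlo unchanged with an experiment that draws only one random number per trial. The experiment itself (does a draw land in the lowest quarter of the generator's range?), the name lowest_quarter_test, and the stand-ins for random_init and rand_update are all assumptions for illustration.

```javascript
// Assumed stand-ins for names the text leaves unspecified.
const modulus = 2147483647;
const random_init = 1;
const rand_update = x => (16807 * x) % modulus;

function make_rand() {
    let x = random_init;
    return () => {
        x = rand_update(x);
        return x;
    };
}
const rand = make_rand();

// The general monte_carlo from the first version, unchanged.
function monte_carlo(trials, experiment) {
    function iter(trials_remaining, trials_passed) {
        return trials_remaining === 0
               ? trials_passed / trials
               : experiment()
               ? iter(trials_remaining - 1, trials_passed + 1)
               : iter(trials_remaining - 1, trials_passed);
    }
    return iter(trials, 0);
}

// A hypothetical experiment that uses a single random number per trial:
// does the draw fall in the lowest quarter of the generator's range?
function lowest_quarter_test() {
    return rand() < modulus / 4;
}

const fraction = monte_carlo(2000, lowest_quarter_test);
console.log(fraction); // expected to be near 0.25
```

Nothing in monte_carlo had to change to accommodate the new experiment. The second version of the program offers no such reuse, because its loop is written around exactly two draws per trial.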

The general phenomenon illustrated by the Monte Carlo example is this: From the point of view of one part of a complex process, the other parts appear to change with time. They have hidden time-varying local state. If we wish to write computer programs whose structure reflects this decomposition, we make computational objects (such as bank accounts and random-number generators) whose behavior changes with time. We model state with local state variables, and we model the changes of state with assignments to those variables.

It is tempting to conclude this discussion by saying that, by introducing assignment and the technique of hiding state in local variables, we are able to structure systems in a more modular fashion than if all state had to be manipulated explicitly, by passing additional parameters. Unfortunately, as we shall see, the story is not so simple.

Exercise 3.5

Monte Carlo integration is a method of estimating definite integrals by means of Monte Carlo simulation. Consider computing the area of a region of space described by a predicate $P(x, y)$ that is true for points $(x, y)$ in the region and false for points not in the region. For example, the region contained within a circle of radius $3$ centered at $(5, 7)$ is described by the predicate that tests whether $(x-5)^2 + (y-7)^2\leq 3^2$. To estimate the area of the region described by such a predicate, begin by choosing a rectangle that contains the region. For example, a rectangle with diagonally opposite corners at $(2, 4)$ and $(8, 10)$ contains the circle above. The desired integral is the area of that portion of the rectangle that lies in the region. We can estimate the integral by picking, at random, points $(x, y)$ that lie in the rectangle, and testing $P(x, y)$ for each point to determine whether the point lies in the region. If we try this with many points, then the fraction of points that fall in the region should give an estimate of the proportion of the rectangle that lies in the region. Hence, multiplying this fraction by the area of the entire rectangle should produce an estimate of the integral.

Implement Monte Carlo integration as a function estimate_integral that takes as arguments a predicate P, upper and lower bounds x1, x2, y1, and y2 for the rectangle, and the number of trials to perform in order to produce the estimate. Your function should use the same monte_carlo function that was used above to estimate $\pi$. Use your estimate_integral to produce an estimate of $\pi$ by measuring the area of a unit circle.

You will find it useful to have a function that returns a number chosen at random from a given range. The following random_in_range function implements this in terms of the math_random function used in section 1.2.6; math_random returns a nonnegative number less than 1.

function random_in_range(low, high) {
    const range = high - low;
    return low + math_random() * range;
}

function estimate_integral(pred, x1, x2, y1, y2, trials) {
    const area_rect = (x2 - x1) * (y2 - y1);
    return monte_carlo(trials,
                       () => pred(random_in_range(x1, x2),
                                  random_in_range(y1, y2))) * area_rect;
}
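
The exercise also asks for an estimate of $\pi$ from the area of a unit circle. Here is a minimal sketch of that final step, runnable in plain JavaScript. Math.random stands in for math_random, the predicate name in_unit_circle and the trial count are assumptions, and monte_carlo is restated as a loop that computes the same fraction as the recursive version, to avoid deep recursion in engines that do not eliminate tail calls.

```javascript
// Loop-based restatement of monte_carlo: fraction of trials that pass.
function monte_carlo(trials, experiment) {
    let trials_passed = 0;
    for (let i = 0; i < trials; i = i + 1) {
        if (experiment()) {
            trials_passed = trials_passed + 1;
        }
    }
    return trials_passed / trials;
}

// Math.random stands in for math_random: a number in [0, 1).
function random_in_range(low, high) {
    const range = high - low;
    return low + Math.random() * range;
}

function estimate_integral(pred, x1, x2, y1, y2, trials) {
    const area_rect = (x2 - x1) * (y2 - y1);
    return monte_carlo(trials,
                       () => pred(random_in_range(x1, x2),
                                  random_in_range(y1, y2))) * area_rect;
}

// Unit circle centered at the origin; its area is pi.
function in_unit_circle(x, y) {
    return x * x + y * y <= 1;
}

// The bounding rectangle is the square from (-1, -1) to (1, 1).
const pi_estimate = estimate_integral(in_unit_circle, -1, 1, -1, 1, 100000);
console.log(pi_estimate); // varies from run to run
```

Since the bounding square has area 4, the estimate is four times the fraction of sampled points that land inside the circle.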

Exercise 3.6

It is useful to be able to reset a random-number generator to produce a sequence starting from a given value. Design a new rand function that is called with an argument that is either the string "generate" or the string "reset" and behaves as follows: rand("generate") produces a new random number; rand("reset")($\textit{new-value}$) resets the internal state variable to the designated $\textit{new-value}$. Thus, by resetting the state, one can generate repeatable sequences. These are very handy to have when testing and debugging programs that use random numbers.

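// Internal state of the generator, with an arbitrary initial value.
let state = 2;

function rand(symbol) {
    if (symbol === "reset") {
        // Return a function that installs the designated new value.
        return new_state => {
                   state = new_state;
               };
    } else {
        // symbol is "generate": advance the state with a simple
        // stand-in for rand_update and return the new value.
        state = (state * 1010) % 1101;
        return state;
    }
}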
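
The solution above keeps state in a global variable. In the spirit of make_rand, a variant can hide the state inside a closure. The sketch below is one such arrangement, reusing the solution's update rule as a stand-in for rand_update. The name make_resettable_rand and the values passed to it are assumptions for illustration.

```javascript
// Stand-in update rule taken from the solution above.
function rand_update(x) {
    return (x * 1010) % 1101;
}

// Hypothetical constructor: each generator owns its private state.
function make_resettable_rand(initial_state) {
    let state = initial_state;
    return symbol => {
        if (symbol === "reset") {
            return new_state => {
                state = new_state;
            };
        } else {
            // symbol is "generate"
            state = rand_update(state);
            return state;
        }
    };
}

const rand = make_resettable_rand(2);

rand("reset")(7);
const first_run = [rand("generate"), rand("generate")];

rand("reset")(7);
const second_run = [rand("generate"), rand("generate")];

console.log(first_run, second_run); // [ 464, 715 ] [ 464, 715 ]
```

Resetting to the same value replays the same sequence, which is the repeatability the exercise is after.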

[1]

One common way to implement rand_update is to use the rule that $x$ is updated to $ax+b$ modulo $m$, where $a$, $b$, and $m$ are appropriately chosen integers. Chapter 3 of Knuth 1997b includes an extensive discussion of techniques for generating sequences of random numbers and establishing their statistical properties. Notice that the rand_update function computes a mathematical function: Given the same input twice, it produces the same output. Therefore, the number sequence produced by rand_update certainly is not random, if by random we insist that each number in the sequence is unrelated to the preceding number. The relation between real randomness and so-called pseudo-random sequences, which are produced by well-determined computations and yet have suitable statistical properties, is a complex question involving difficult issues in mathematics and philosophy. Kolmogorov, Solomonoff, and Chaitin have made great progress in clarifying these issues; a discussion can be found in Chaitin 1975.

[2]

This theorem is due to G. Lejeune Dirichlet. See section 4.5.2 of Knuth 1997b for a discussion and a proof.
