Programmers write programs as text, i.e. sequences of characters, entered in a programming environment or a text editor. To run our evaluator, we need to start with a representation of this program text as a JavaScript value. In section 2.3.1 we introduced strings to represent text. We would like to evaluate programs such as "const size = 2; 5 * size;" from section 1.1.2. Unfortunately, such program text does not provide enough structure to the evaluator. In this example, the program parts "size = 2" and "5 * size" look similar, but carry very different meanings. Abstract syntax functions such as declaration_value_expression would be difficult and error-prone to implement by examining the program text. In this section, we therefore introduce a function parse that translates program text to a tagged-list representation, reminiscent of the tagged data of section 2.4.2. For example, the application of parse to the program string above produces a data structure that reflects the structure of the program: a sequence consisting of a constant declaration associating the name size with the value 2 and a multiplication.
parse("const size = 2; 5 * size;");
list("sequence",
list(list("constant_declaration",
list("name", "size"), list("literal", 2)),
list("binary_operator_combination", "*",
list("literal", 5), list("name", "size"))))The evaluator is reminiscent of the symbolic differentiation program discussed in section 2.3.2. Both programs operate on symbolic data. In both programs, the result of operating on an object is determined by operating recursively on the pieces of the object and combining the results in a way that depends on the type of the object. In both programs we used data abstraction to decouple the general rules of operation from the details of how the objects are represented. In the differentiation program this meant that the same differentiation function could deal with algebraic expressions in prefix form, in infix form, or in some other form. For the evaluator, this means that the syntax of the language being evaluated is determined solely by parse and the functions that classify and extract pieces of the tagged lists produced by parse.
Literal expressions are parsed into tagged lists with tag "literal" and the actual value. \[ \begin{align*} \ll\ \mathit{literal}\textit{-}\mathit{expression}\ \gg & = \texttt{list("literal", }\mathit{value}\texttt{)} \end{align*} \] where $value$ is the JavaScript value represented by the $literal$-$expression$ string. Here $\ll\ $$literal$-$expression$$\ \gg$ denotes the result of parsing the string $literal$-$expression$.
parse("1;");
list("literal", 1)parse("'hello world';");
list("literal", "hello world")parse("null;");
list("literal", null)function is_literal(component) {
return is_tagged_list(component, "literal");
}
function is_tagged_list(component, the_tag) {
return is_pair(component) && head(component) === the_tag;
}
function literal_value(component) {
return head(tail(component));
}
literal_value(parse("null;"));
null
We provide a constructor for literals, which will come in handy:
function make_literal(value) {
return list("literal", value);
}
The tagged-list representation for names includes the tag "name" as first element and the string representing the name as second element. \[ \begin{align*} \ll\ \mathit{name}\ \gg & = \texttt{list("name", }\mathit{symbol}\texttt{)} \end{align*} \] where $symbol$ is a string that contains the characters that make up the $name$ as written in the program. The syntax predicate for names is is_name. The symbol is accessed using the selector symbol_of_name. We provide a constructor for names, to be used by operator_combination_to_application:
function make_name(symbol) {
return list("name", symbol);
}
We do not need to distinguish between expressions and expression statements. Consequently, parse can ignore the difference between the two kinds of components: \[ \ll\ \mathit{expression}\texttt{;}\ \gg {=} \ll\ \mathit{expression}\ \gg \]
Function applications are parsed as follows: \[ \ll\ \mathit{fun}\textit{-}\mathit{expr}\texttt{(}\mathit{arg}\textit{-}\mathit{expr}_1\texttt{, }\ldots\texttt{, } \mathit{arg}\textit{-}\mathit{expr}_n \texttt{)}\ \gg \\ = \\ \texttt{list("application",} \ll\ \mathit{fun}\textit{-}\mathit{expr}\gg\texttt{, list(} \ll\ \mathit{arg}\textit{-}\mathit{expr}_1\;\gg \texttt{, }\ldots\texttt{, } \ll\ \mathit{arg}\textit{-}\mathit{expr}_n\;\gg \texttt{))} \] We declare is_application as the syntax predicate and function_expression and arg_expressions as the selectors. We add a constructor for function applications, to be used by operator_combination_to_application:
function make_application(function_expression, argument_expressions) {
return list("application",
function_expression, argument_expressions);
}
Conditional expressions are parsed as follows:
$\ll\ \mathit{predicate}\ \texttt{?}\ \mathit{consequent}\textit{-}\mathit{expression}\ \texttt{:}\ \mathit{alternative}\textit{-}\mathit{expression}\ \gg\ \ = $ list("conditional_expression", $\ll\ $$predicate$$\ \gg$, $\ll\ $$consequent$-$expression$$\ \gg$, $\ll\ $$alternative$-$expression$$\ \gg$)Similarly, conditional statements are parsed as follows:
$\ll\ \textbf{if}\ \texttt{(} \mathit{predicate} \texttt{)}\ \mathit{consequent}\textit{-}\mathit{block}\ \textbf{else}\ \mathit{alternative}\textit{-}\mathit{block}\ \gg\ \ =$ list("conditional_statement", $\ll\ $$predicate$$\ \gg$, $\ll\ $$consequent$-$block$$\ \gg$, $\ll\ $$alternative$-$block$$\ \gg$)The syntax predicate is_conditional returns true for both kinds of conditionals, and the selectors conditional_predicate, conditional_consequent, and conditional_alternative can be applied to both kinds.
A lambda expression whose body is an expression is parsed as if the body consisted of a block containing a single return statement whose return expression is the body of the lambda expression. \[ \ll\ \texttt{(}\mathit{name}_1\texttt{, }\ldots\texttt{, } \mathit{name}_n \texttt{) =>}\ \mathit{expression}\ \gg \\ = \\ \ll\ \texttt{(}\mathit{name}_1\texttt{, }\ldots\texttt{, } \mathit{name}_n \texttt{) => \{}\ \textbf{return} \ \mathit{expression}\texttt{;}\ \texttt{\}}\ \gg \] A lambda expression whose body is a block is parsed as follows:
$\ll\ \texttt{(}\mathit{name}_1\texttt{, }\ldots\texttt{, } \mathit{name}_n \texttt{) =>}\ \mathit{block}\ \gg\ \ =$ list("lambda_expression", list($\ll\ name_1\;\gg$, $\ldots$, $\ll\ name_n\;\gg$), $\ll\ block\ \gg$)The syntax predicate is is_lambda_expression and the selector for the body of the lambda expression is lambda_body. The selector for the parameters, called lambda_parameter_symbols, additionally extracts the symbols from the names.
function lambda_parameter_symbols(component) {
return map(symbol_of_name, head(tail(component)));
}
function make_lambda_expression(parameters, body) {
return list("lambda_expression", parameters, body);
}
A sequence statement packages a sequence of statements into a single statement. A sequence of statements is parsed as follows: \[ \ll\ \mathit{statement}_1 \cdots \mathit{statement}_n\;\gg \\ = \\ \texttt{list("sequence", list(} \ll\ \mathit{statement}_1\;\gg \texttt{, } \ldots \texttt{, } \ll\ \mathit{statement}_n\;\gg \texttt{))} \] The syntax predicate is is_sequence and the selector is sequence_statements. We retrieve the first of a list of statements using first_statement and the remaining statements using rest_statements. We test whether the list is empty using the predicate is_empty_sequence and whether it contains only one element using the predicate is_last_statement.[1]
function first_statement(stmts) { return head(stmts); }
function rest_statements(stmts) { return tail(stmts); }
function is_empty_sequence(stmts) { return is_null(stmts); }
function is_last_statement(stmts) { return is_null(tail(stmts)); }
Blocks are parsed as follows:[2] \[ \begin{align*} \ll\ \texttt{\{}\ \mathit{statements}\ \texttt{\}}\ \gg & = \texttt{list("block",}\ \ll\ \mathit{statements}\ \gg \texttt{)} \end{align*} \] Here $statements$ refers to a sequence of statements, as shown above. The syntax predicate is is_block and the selector is block_body.
Return statements are parsed as follows: \[ \begin{align*} \ll\ \textbf{return}\ \mathit{expression} \texttt{;}\ \gg & = \texttt{list("return\_statement",}\ \ll\ \mathit{expression}\ \gg \texttt{)} \end{align*} \] The syntax predicate and selector are, respectively, is_return_statement and return_expression.
Assignments are parsed as follows: \[ \begin{align*} \ll\ \mathit{name}\ \ \texttt{=}\ \ \mathit{expression}\ \gg & = \texttt{list("assignment", }\ll\ \mathit{name}\gg \texttt{, }\ll\ \mathit{expression}\ \gg \texttt{)} \end{align*} \] The syntax predicate is is_assignment and the selectors are assignment_symbol and assignment_value_expression. The symbol is wrapped in a tagged list representing the name, and thus assignment_symbol needs to unwrap it.
function assignment_symbol(component) {
return symbol_of_name(head(tail(component))));
}
Constant and variable declarations are parsed as follows: \[ \ll\ \textbf{const}\ \mathit{name}\ \ \texttt{=}\ \ \mathit{expression}\texttt{;}\ \gg \\ = \\ \texttt{list("constant\_declaration", } \ll\ \mathit{name}\ \gg \texttt{, }\ll\ \mathit{expression}\ \gg \texttt{)}\\[3mm] \ll\ \textbf{let}\ \mathit{name} \ \ \texttt{=}\ \ \mathit{expression}\texttt{;}\ \gg \\ = \\ \texttt{list("variable\_declaration", } \ll\ \mathit{name}\ \gg \texttt{, }\ll\ \mathit{expression}\ \gg \texttt{)} \] The selectors declaration_symbol and declaration_value_expression apply to both kinds.
function declaration_symbol(component) {
return symbol_of_name(head(tail(component)));
}
function declaration_value_expression(component) {
return head(tail(tail(component)));
}
function make_constant_declaration(name, value_expression) {
return list("constant_declaration", name, value_expression);
}
Function declarations are parsed as follows:
$\ll\ \textbf{function}\ \mathit{name} \texttt{(}\mathit{name}_1\texttt{, }\ldots\texttt{, } \mathit{name}_n \texttt{)} \ \mathit{block}\ \gg\ \ =$ list("function_declaration", $\ll\ $$name$$\ \gg$, list($\ll\ $$name$$_1\;\gg$, $\ldots$, $\ll\ $$name$$_n\;\gg$), $\ll\ $$block$$\ \gg$)The syntax predicate is_function_declaration recognizes these. The selectors are function_declaration_name, function_declaration_parameters, and function_declaration_body.
The syntax predicate is_declaration returns true for all three kinds of declarations.
function is_declaration(component) {
return is_tagged_list(component, "constant_declaration") ||
is_tagged_list(component, "variable_declaration") ||
is_tagged_list(component, "function_declaration");
}
Some syntactic forms in our language can be defined in terms of components involving other syntactic forms, rather than being implemented directly. One example is function declaration, which evaluate transforms into a constant declaration whose value expression is a lambda expression.[3]
function function_decl_to_constant_decl(component) {
return make_constant_declaration(
function_declaration_name(component),
make_lambda_expression(
function_declaration_parameters(component),
function_declaration_body(component)));
}
Similarly, we define operator combinations in terms of function applications. Operator combinations are unary or binary and carry their operator symbol as second element in the tagged-list representation: \[ \ll\ \mathit{unary}\textit{-}\mathit{operator}\ \ \mathit{expression}\ \gg \\ = \\ \texttt{list("unary\_operator\_combination", "}\mathit{unary}\textit{-}\mathit{operator}\texttt{"},\ \texttt{list(}\ll\ \mathit{expression}\ \gg \texttt{))} \] where $unary$-$operator$ is ! (for logical negation) or -unary (for numeric negation), and \[ \ll\ \mathit{expression}_1\ \mathit{binary}\textit{-}\mathit{operator}\ \ \mathit{expression}_2\;\gg \\ = \\ \texttt{list("binary\_operator\_combination", "}\mathit{binary}\textit{-}\mathit{operator}\texttt{"},\\ \texttt{list(}\ll\ \mathit{expression}_1\;\gg\texttt{,}\ \ll\ \mathit{expression}_2\;\gg \texttt{))} \] where $binary$-$operator$ is +, -, *, /, %, ===, !==, >, <, >= or <=. The syntax predicates are is_operator_combination, is_unary_operator_combination, and is_binary_operator_combination, and the selectors are operator_symbol, first_operand, and second_operand.
The evaluator uses operator_combination_to_application to transform an operator combination into a function application whose function expression is the name of the operator:
function operator_combination_to_application(component) {
const operator = operator_symbol(component);
return is_unary_operator_combination(component)
? make_application(make_name(operator),
list(first_operand(component)))
: make_application(make_name(operator),
list(first_operand(component),
second_operand(component)));
}
Components (such as function declarations and operator combinations) that we choose to implement as syntactic transformations are called derived components. Logical composition operations are also derived components (see exercise 4.4).
$\ll\ \mathit{expression}_1\ \ \mathit{logical}\textit{-}\mathit{operation}\ \ \mathit{expression}_2\;\gg \ \ = $ list("logical_composition", "$logical$-$operation$", list($\ll\ $$expression$$_1\;\gg$, $\ll\ $$expression$$_2\;\gg$))where $logical$-$operation$ is && or ||. Install && and || as new syntactic forms for the evaluator by declaring appropriate syntax functions and evaluation functions eval_and and eval_or. Alternatively, show how to implement && and || as derived components.
function make_conditional_expr_decl(predicate, consequent_expression, alternative_expression) {
return list("conditional_expression", predicate, consequent_expression, alternative_expression);
}
function make_literal(value) {
return list("literal", value);
}
// Syntax selectors
function logical_operation(component) {
return head(tail(component));
}
function first_expression(component) {
return head(tail(tail(component)));
}
function second_expression(component) {
return head(tail(tail(tail(component))));
}
function logical_comp_decl_to_conditional_expr_decl(component) {
const operation = logical_operation(component);
return operation === "&&"
? make_conditional_expr_decl(
first_expression(component),
second_expression(component),
false
)
: operation === "||"
? make_conditional_expr_decl(
first_expression(component),
true,
second_expression(component)
)
: error(component, "unknown operation -- logical_comp_decl_to_conditional_expr_decl");
}
display(logical_comp_decl_to_conditional_expr_decl(parse("a && b;")));
display(logical_comp_decl_to_conditional_expr_decl(parse("a || b;")));
display(logical_comp_decl_to_conditional_expr_decl(parse("(a && !b) || (!a && b);")));
let* x = 3; let* y = x + 2; let* z = x + y + 5; display(x * z);
{
let x = 3;
{
let y = x + 2;
{
let z = x + y + 5;
display(x * z);
}
}
}For example, recall the imperative-style version of the iterative factorial function from section 3.1.3:
function factorial(n) {
let product = 1;
let counter = 1;
function iter() {
if (counter > n) {
return product;
} else {
product = counter * product;
counter = counter + 1;
return iter();
}
}
return iter();
}
function factorial(n) {
let product = 1;
let counter = 1;
while (counter <= n) {
product = counter * product;
counter = counter + 1;
}
return product;
}
$\ll\ \textbf{while} \ \texttt{(}\ \mathit{predicate}\ \texttt{)}\ \mathit{block}\ \gg \ \ =$ $\ll\ $while ($predicate$) $block$$\ \gg$ = list("while_loop", $\ll\ $$predicate$$\ \gg$, $\ll\ $$block$$\ \gg$)
function factorial(n) {
let product = 1;
let counter = 1;
while_loop(() => counter <= n,
() => {
product = counter * product;
counter = counter + 1;
});
return product;
}
For such a program, JavaScript
statically
distinguishes between value-producing and
non-value-producing statements. (Here
statically
means that
we can make the distinction by inspecting the program
rather than by running it.)
All declarations are
non-value-producing, and all
expression statements and conditional statements are
value-producing.
The value of an expression statement is the value of the expression.
The value of a conditional statement is the value of the branch that
gets executed, or the value
undefined if that branch is
not value-producing.
A block is value-producing if its body (sequence of statements)
is value-producing, and then its value is the value of its body.
A sequence is value-producing if any of
its component statements is value-producing, and then its value is
the value of its last value-producing component statement.
Finally, if the whole
program is not value-producing, its value is the value
undefined.
1; 2; 3;
1; { if (true) {} else { 2; } }
1; const x = 2;
1; { let x = 2; { x = x + 3; } }