The feature in question is the notation for anonymous functions: the underscore. It seems that in Scala
_ (just underscore) is the abbreviation for the identity function
f => f((_:Integer).intValue) is the abbreviation for
((x:Integer) => x.intValue)(Integer.valueOf(_)) abbreviates
(x => Integer.valueOf(x))anObject.method(_ ~ xxx ~ yyy) is the abbreviation of
anObject.method(f => f ~ xxx ~ yyy)Let us briefly review the Montagovian semantics of natural languages. A declarative sentence such as
(1) John saw Mary.denotes a proposition, which is true just in case a person denoted by the name John indeed saw a person denoted by the name Mary. We can write the proposition as a formula in the Simple Theory of Types (STT):
(2) see mary johnwhere
mary and john are individuals (of the type e) who go by the names Mary and John, respectively; see, the meaning of the transitive verb `see', is a function symbol of the type e -> e -> t. Here t is the type of propositions. (In Church's STT, the type e is called i and the type t called o. For some reason, Montague renamed the types).
Let us consider a sentence
(3) It was Mary who John saw.or, simpler
(4) Mary, John saw.We'd like to write a logical formula representing the meaning of (4). We would like our semantics to be compositional: the meaning of a sentence should be composed from the meaning of its parts. Thus, we have to determine the meaning of the part ``John saw.'' A transitive verb requires two arguments: first an object, then a subject. The phrase ``John saw'' seems to be missing an object. To handle such phrases -- verb phrases (VP) missing a component -- linguists have introduced a so-called `trace', often written as an underscore. The trace is assumed silent (not pronounced). Thus the sentence (4) is to be written
(4') Mary, John saw _.
The sentences with traces are common. Examples include raised questions and relative clauses:
(5) Who did John see _.
(6) I know whom John saw _.
(7) The person John saw _ ran away.
Some theories of quantifier scope use trace to explain how a quantifier word such as `someone' influences the meaning of -- `takes scope over' -- the entire sentence. This so-called `quantifier raising' posits that a sentence such as
Alice thinks someone left.should be analyzed as if it were
someone (Alice thinks _ left).
It seems like a good idea to take the meaning of ``John saw _'' to be
(8) fun x -> see x johnThe meaning of (4') then can be composed from the meaning of its components as
(9) (fun x -> see x john) maryA beta-reduction then gives us (2). Chomsky might say that the transformation from the surface form of the utterance (4') to the logical, hidden form (2) involves a movement of Mary into the position occupied by the trace.
We now face the question of the meaning of trace and using that meaning to compositionally derive (8) from ``John saw _.'' We observe that ``John saw _'' looks like ``John saw Mary'' with ``Mary'' taken out. Therefore, ``John saw _'' is a context, a term with a hole; trace denotes the hole. Continuations are the meanings of contexts. Since ``John saw _'' is a sub-phrase of the larger sentence (4'), our context is partial -- or delimited, -- with a delimited continuation as its meaning. A delimited continuation can be captured by a delimited control operator shift:
reset (see (shift (fun k -> k)) john)indeed reduces to
fun x -> see x johnTherefore, we may take the meaning of trace to be
(shift (fun (k -> k)). To type that expression, we need shift with the answer-type modification.
The topic of delimited control in natural languages is discussed in great detail by Chung-chieh Shan in a book chapter ``Linguistic Side Effects.'' He analyzes more complex sentences, for example:
(10) Who _ saw his mother?
A pronoun could be treated like a trace. The sentence (10) hence exhibits three components with side-effect: the question word, the trace, and the pronoun. Chung-chieh Shan explains their interactions.
Coming back to Scala, it seems that _ is the trace and the expression boundary is a reset.
It is remarkable that natural and programming languages can inform each other. Consciously or not, Scala seems like the best example of such mutual development.
Abstract Categorial Grammars (ACG) represent syntax-semantic interface by regarding the surface and logical forms as two different interpretations of an abstract form. All three forms are represented in ACG uniformly, as closed (normal) terms in typed linear lambda-calculus over a signature defining base types, constants and the type assignments to the constants. An interpretation of one, abstract calculus, in another, object calculus is formalized in a lexicon, which is a homomorphic, typing preserving map of constants and types of the abstract calculus into terms and types of the object calculus.
Strictly speaking, the mapping of an abstract term tA to an object term includes not only the mapping of constants of tA as specified by the lexicon, but also the normalization of the result. The ACG literature shows the normalization (beta-reduction) step when presenting examples; the step, however, is left implicit in the formalism. Since simply-typed lambda-calculus is strongly normalizing, the formalities are not needed.
To model semantics, that is, design an ACG lexicon to map an abstract form to a logical form, extensions are often necessary. For example, the linearity requirement is dropped. We carry out further extensions. First, we distinguish a subset of terms of the logical form calculus and call them values. We define the process of reduction of terms to values as left-to-right call-by-value evaluation. We call the logical form calculus a (semantic) metalanguage.
The mapping of an abstract term tA to a logical form is an explicitly two-stage process: we map the constants of tA to metalanguage terms, and evaluate the result. The evaluation may fail to yield a value: the mapping of tA to a semantic form is partial. We take the failure as an indication of ungrammaticality of tA. The corresponding surface forms are thus judged syntactically correct but meaningless. In a departure from the Categorial Grammar tradition, the rules of grammatical composition are specified by the evaluation rules of the metalanguage -- and only incompletely by the underlying type calculus. Types are quickly-decidable approximation of the evaluation rules. Whereas categorial grammars and type-logical grammars stress finding proofs, we stress deterministic evaluation.
To avoid type-lifting and still model (covert) movements we add to our metalanguage multi-prompt delimited continuations. Such continuations implement mutable dynamic binding (or, `local store'), which is a computational model of Cooper storage.
We can use an existing programming language as our metalanguage, which gives us a ready-made implementation of computational abstract grammars. We straightforwardly encode abstract forms and lexicons in OCaml, and see the logical forms as the result of running the corresponding OCaml programs. We also take advantage of the abstraction, modularity, and interactive development and debugging facilities of OCaml.
Symantics1.hs [6K]
A Haskell translation of an early version of symantics1.ml, for comparison
Sylvain Pogodalla: Syntax-Semantics Architecture: Comparing CG and ACG
Section 4 of the Advanced ESSLLI09 course on Abstract Categorial Grammars
< http://www.loria.fr/equipes/calligramme/acg/publications/esslli-09/2009-esslli-acg-week-2-part-2.pdf >
We use the examples from Sec 4.4.
Typed Tagless Interpretations
Abstract Categorial Grammars are the instance of Typed Tagless (Final) Interpretations.
Dynamic Logic in ACG: Discourse Anaphora and Scoping Islands
Our enhancement to ACG affects only the mapping from abstract language to semantics. We retain all ACG's benefits of parsing from the surface form. Crucially, by avoiding type lifting we keep the order of the abstract signature low, so that parsing remains tractable.
We regard the mapping from abstract language to semantics partial: some sentences, albeit well-formed, just don't make sense. We model this partial mapping as a potentially failing computation in a call-by-value language with multi-prompt delimited control. The evaluation and type inference rules of the language are simple and deterministic. Control prompts may be regarded as loci of binding or quantification, used by quantified phrases and pronouns and set by context. We arrive at the mechanism of interaction of a phrase with its context that determines the scope.
symantics2.ml [11K]
OCaml code with the analyses of the puzzles discussed in the talk
The code presents abstract forms for the puzzles, surface forms and the computed logical formulas, if exist.
< http://www.loria.fr/equipes/calligramme/acg/ >
Abstract Categorial Grammar Homepage
The delimcc library: delimited continuations for (byte-code) OCaml
We specify the compilation from the intermediate language to the STT denotation as reduction rules. Hence the intermediate language is (call-by-name) lambda calculus whose constants are STT formulas. To compile an intermediate language term to its STT denotation we evaluate the term. The compilation is not always compositional: so-called scopal expressions contribute their meaning in the places other than where they are seen or heard. Scopal phenomena include generalized quantification, wh-phrases, topicalization, focus, anaphora and binding. To account for these phenomena, our intermediate language includes control operators shift and reset. We are thus able to improve on the previous continuation-based analyses of quantification, binding, raised and in-situ wh-questions, binding in wh-questions, and superiority.
The evaluation of an intermediate language term does not always lead to the STT denotation: the evaluation may get stuck. In that case, we regard the corresponding source language phrase as ungrammatical. We introduce a type system to delineate a set of intermediate language terms whose evaluation never fails to produce denotation. Being well-typed becomes an alternative criterion of grammaticality. Typeability, unlike reducibility, has an advantage of being applicable to separate phrases rather than whole sentences. Our main result is that both typing and call-by-name are necessary to correctly predict superiority and binding in wh-questions with topicalization, without resorting to thunking or type raising and thus maintaining the uniformity of the analyses.
gengo-cbn-talk.pdf [224K]
Presentation at the workshop, August 6, 2008. Hamburg, Germany.
gengo-side-effects-cbn.elf [19K]
The complete code with all the analyses described in the paper
Call-by-name typed shift/reset calculus
Our intermediate language
Joint work with Chung-chieh Shan.
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML