
Semantics of Natural Languages



Montagovian semantics and Scala programming

The Scala programming language has a feature that is prominent in natural languages. Perhaps the feature was introduced into Scala so that our linguistic intuitions could guide our programming.

The feature in question is the notation for anonymous functions: the underscore. It seems that in Scala

Let us briefly review the Montagovian semantics of natural languages. A declarative sentence such as

     	 (1) John saw Mary.
denotes a proposition, which is true just in case a person denoted by the name John indeed saw a person denoted by the name Mary. We can write the proposition as a formula in the Simple Theory of Types (STT):
     	 (2) see mary john
where mary and john are individuals (of the type e) who go by the names Mary and John, respectively; see, the meaning of the transitive verb `see', is a function symbol of the type e -> e -> t. Here t is the type of propositions. (In Church's STT, the type e is called i and the type t is called o. For some reason, Montague renamed the types.)
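
To make the typing concrete, here is a minimal sketch in Python of formula (2) evaluated in a toy model. The model is an assumption made purely for illustration: individuals (type e) are strings, propositions (type t) are booleans, and the set SAW is a made-up fact base. Only the curried shape of see, object first and then subject, comes from the text.

```python
# Hypothetical toy model: individuals (type e) are strings,
# propositions (type t) are booleans.
SAW = {("john", "mary")}  # assumed facts, as (subject, object) pairs

def see(obj):
    """Curried transitive verb of type e -> e -> t: object first, then subject."""
    return lambda subj: (subj, obj) in SAW

mary, john = "mary", "john"

# Formula (2): see mary john
john_saw_mary = see(mary)(john)

# Swapping the arguments gives a different proposition: Mary saw John.
mary_saw_john = see(john)(mary)
```

In this model john_saw_mary holds and mary_saw_john does not, which shows that the argument order of see matters.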

Let us consider a sentence

     	 (3) It was Mary who John saw.
or, simpler
     	 (4) Mary, John saw.
We'd like to write a logical formula representing the meaning of (4). We would like our semantics to be compositional: the meaning of a sentence should be composed from the meaning of its parts. Thus, we have to determine the meaning of the part ``John saw.'' A transitive verb requires two arguments: first an object, then a subject. The phrase ``John saw'' seems to be missing an object. To handle such phrases -- verb phrases (VP) missing a component -- linguists have introduced a so-called `trace', often written as an underscore. The trace is assumed silent (not pronounced). Thus the sentence (4) is to be written
     	 (4') Mary, John saw _.

Sentences with traces are common. Examples include raised questions and relative clauses:

     	(5) Who did John see _?
     	(6) I know whom John saw _.
     	(7) The person John saw _ ran away.
Some theories of quantifier scope use a trace to explain how a quantifier word such as `someone' influences the meaning of -- `takes scope over' -- the entire sentence. This so-called `quantifier raising' posits that a sentence such as
     	 Alice thinks someone left.
should be analyzed as if it were
     	 someone (Alice thinks _ left).
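
The raised analysis can be sketched in Python by building formulas as strings. Everything named here (left, thinks, someone, and the fixed variable name x) is a hypothetical illustration rather than part of any particular theory; a single fixed variable name suffices only because the example has one quantifier.

```python
def left(x):
    """Intransitive verb: builds the formula left(x)."""
    return f"left({x})"

def thinks(subj, prop):
    """Attitude verb: builds the formula thinks(subj, prop)."""
    return f"thinks({subj}, {prop})"

def someone(scope):
    """Quantifier: takes its scope, a function from an individual to a formula."""
    return f"exists x. {scope('x')}"

# The scope "Alice thinks _ left", as a function of whatever fills the trace.
def alice_thinks_trace_left(trace):
    return thinks("alice", left(trace))

# Raised analysis: someone (Alice thinks _ left).
wide = someone(alice_thinks_trace_left)

# For comparison: the quantifier stays inside the embedded clause.
narrow = thinks("alice", someone(left))
```

The raised form puts the existential outside thinks, whereas the unraised form leaves it inside the embedded clause.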

It seems like a good idea to take the meaning of ``John saw _'' to be

     	 (8) fun x -> see x john
The meaning of (4') then can be composed from the meaning of its components as
     	 (9) (fun x -> see x john) mary
A beta-reduction then gives us (2). Chomsky might say that the transformation from the surface form of the utterance (4') to the logical, hidden form (2) involves a movement of Mary into the position occupied by the trace.

We now face the question of what the trace means, and of how to use that meaning to derive (8) compositionally from ``John saw _.'' We observe that ``John saw _'' looks like ``John saw Mary'' with ``Mary'' taken out. Therefore, ``John saw _'' is a context, a term with a hole; the trace denotes the hole. Continuations are the meanings of contexts. Since ``John saw _'' is a sub-phrase of the larger sentence (4'), our context is partial -- or delimited -- with a delimited continuation as its meaning. A delimited continuation can be captured by the delimited control operator shift:

     	 reset (see (shift (fun k -> k)) john)
indeed reduces to
     	 fun x -> see x john
Therefore, we may take the meaning of the trace to be (shift (fun k -> k)). To type that expression, we need shift with answer-type modification.
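
The reduction can be checked with a small continuation-passing sketch in Python. It is an illustration under simplifying assumptions, not a full shift/reset implementation: the helper names ret and app are ours, the verb see builds a formula string, and the body passed to shift is an ordinary function of the captured continuation. Python is untyped, so the change of answer type (from a formula to a function) goes unnoticed here; a typed language would need answer-type modification.

```python
def ret(v):
    """Lift a plain value into CPS: pass it to the current continuation."""
    return lambda k: k(v)

def app(fm, am):
    """CPS application: evaluate the function term, then the argument term."""
    return lambda k: fm(lambda f: am(lambda a: k(f(a))))

def shift(body):
    """Hand the continuation up to the nearest reset to body (simplified)."""
    return lambda k: body(k)

def reset(m):
    """Delimit the context: run the term with the identity continuation."""
    return m(lambda v: v)

def see(obj):
    """Curried verb that builds the formula text: object first, then subject."""
    return lambda subj: f"see {obj} {subj}"

# The trace: capture the delimited context and return it as the answer.
trace = shift(lambda k: k)

# reset (see (shift (fun k -> k)) john)
john_saw_trace = reset(app(app(ret(see), trace), ret("john")))

# Applying the captured context to mary, as in (9).
sentence_4 = john_saw_trace("mary")
```

Here john_saw_trace is a function that behaves like fun x -> see x john, and applying it to "mary" yields the formula text of (2).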

The topic of delimited control in natural languages is discussed in great detail by Chung-chieh Shan in a book chapter ``Linguistic Side Effects.'' He analyzes more complex sentences, for example:

     	(10) Who _ saw his mother?

A pronoun could be treated like a trace. The sentence (10) hence exhibits three components with side effects: the question word, the trace, and the pronoun. Chung-chieh Shan explains their interactions.

Coming back to Scala, it seems that _ is the trace and the expression boundary is a reset.

It is remarkable that natural and programming languages can inform each other. Whether by conscious design or not, Scala seems like the best example of such mutual development.

The current version is November 2, 2009.
Chung-chieh Shan: Linguistic Side Effects.
In ``Direct compositionality,'' ed. Chris Barker and Pauline Jacobson, pp. 132-163. Oxford University Press, 2007.


Canonical Constituents and Non-canonical Coordination: Simple Categorial Grammar Account

[The Abstract of the paper]
A variation of the standard non-associative Lambek calculus with the slightly non-standard yet very traditional semantic interpretation turns out to straightforwardly and uniformly express the instances of non-canonical coordination while maintaining phrase structure constituents. Non-canonical coordination looks just as canonical on our analyses. Gapping, typically problematic in Categorial Grammar--based approaches, is analyzed just like the ordinary object coordination. Furthermore, the calculus uniformly treats quantification in any position, quantification ambiguity and islands. It lets us give what seems to be the simplest account for both narrow- and wide-scope quantification into coordinated phrases and of narrow- and wide-scope modal auxiliaries in gapping.

The calculus lets us express standard covert movements and anaphoric-like references (analogues of overt movements) in types -- as well as describe how the context can block these movements.

The current version is November 2014.
NL.pdf [269K]
The extended article
The slightly shorter paper was published in the Proceedings of LENLS 11, the International Workshop on Logic and Engineering of Natural Language Semantics. November 22-24, 2014, Japan, pp. 138-151.

lenls-talk.pdf [216K]
The annotated slides of the talk at LENLS 11. November 23, 2014. Keio University, Japan.

HOCCG.hs [28K]
The accompanying source code to verify the analyses described in the paper
The code checks NL derivations and computes the logical formula representing the meaning of the corresponding sentences. The code is in Haskell, in the tagless-final style.


Continuation Hierarchy and Quantifier Scope

[The Abstract of the chapter]
We present a directly compositional and type-directed analysis of quantifier ambiguity, scope islands, wide-scope indefinites and inverse linking. It is based on Danvy and Filinski's continuation hierarchy, with deterministic semantic composition rules that are uniquely determined by the formation rules of the overt syntax. We thus obtain a compositional, uniform and parsimonious treatment of quantifiers in subject, object, embedded-NP and embedded-clause positions without resorting to Logical Forms, Cooper storage, type-shifting and other ad hoc mechanisms.

To safely combine the continuation hierarchy with quantification, we give a precise logical meaning to often used informal devices such as picking a variable and binding it off. Type inference determines variable names, banishing ``unbound traces''.

Quantifier ambiguity arises in our analysis solely because quantifier words are polysemous, or come in several strengths. The continuation hierarchy lets us assign strengths to quantifiers, which determines their scope. Indefinites and universals differ in their scoping behavior because their lexical entries are assigned different strengths. PPs and embedded clauses, like the main clause, delimit the scope of embedded quantifiers. Unlike the main clause, their limit extends only up to a certain hierarchy level, letting higher-level quantifiers escape and take wider scope. This interplay of strength and islands accounts for the complex quantifier scope phenomena.

We present an economical ``direct style'', or continuation hierarchy on-demand, in which quantifier-free lexical entries and phrases keep their simple, unlifted types.

Joint work with Chung-chieh Shan.

The current version is 2014.
inverse-scope.pdf [380K]
The paper appeared as a chapter in the book Formal Approaches to Semantics and Pragmatics: Japanese and Beyond published in the series Studies in Linguistics and Philosophy (E. McCready, K. Yabushita and K. Yoshimoto, eds.)
Springer Netherlands, 2014, pp. 105-134.

QuanCPS.hs [24K]
The accompanying source code to verify the analyses described in the paper and compute the denotations.
The code is in Haskell, in the tagless-final style.


Applicative Abstract Categorial Grammars

[The Abstract of the paper]
We present the grammar/semantic formalism of Applicative Abstract Categorial Grammars (AACG), based on the recent techniques from functional programming: applicative functors, staged languages and typed final language embeddings. AACG is a generalization of Abstract Categorial Grammars (ACG), retaining the benefits of ACG as a grammar formalism and making it possible and convenient to express a variety of semantic theories.

We use the AACG formalism to uniformly formulate Potts' analyses of expressives, the dynamic-logic account of anaphora, and the continuation tower treatment of quantifier strength, quantifier ambiguity and scope islands. Carrying out these analyses in ACG required compromises with the accompanying ballooning of parsing complexity, or was not possible at all. The AACG formalism brings modularity, which comes from the compositionality of applicative functors, in contrast to monads, and the extensibility of the typed final embedding. The separately developed analyses of expressives and QNP are used as they are to compute truth conditions of sentences with both these features.

AACG is implemented as a `semantic calculator', which is the ordinary Haskell interpreter. The calculator lets us interactively write grammar derivations in a linguist-readable form and see their yields, inferred types and computed truth conditions. We easily extend fragments with more lexical items and operators, and experiment with different semantic-mapping assemblies. The mechanization lets a semanticist test more and more complex examples, making empirical tests of a semantic theory more extensive, organized and systematic.

The current version is July 2015.
AACG.pdf [277K]
The paper published in the Proceedings of NLCS'15: Third Workshop on Natural Language and Computer Science
EPiC Volume 32, pp. 29-38, 2015

Abstract.hs [10K]
The definition of the Abstract language and its mapping to the surface `syntax'
We define Abstract for a small fragment, later extended with sentence conjunction, anaphora and quantifiers.

Logic.hs [5K]
The language of meaning: First-order predicate logic

SemTr.hs [18K]
The syntax-semantic interface
We demonstrate the interface for the small fragment, and its modular extension for expressives and two levels of quantification. We illustrate the successive, compositional, multi-stage re-writing of Abstract into the meaning.

Applicatives.hs [3K]
Standard CPS and State Applicatives, which could have been defined in the standard applicative library

CAG-talk.pdf [168K]
Annotated slides of the talk presented at the 3rd CAuLD workshop: Logical Methods for Discourse. December 15. Nancy, France.

Abstract Categorial Grammar Homepage


Compilation by evaluation as syntax-semantics interface

We regard the relation of a natural language phrase to its meaning as a two-stage transformation. First, the source natural-language phrase in concrete syntax is elaborated to an intermediate language term. The elaboration is syntactic and involves parsing, dis-inflection, etc. The intermediate language is then deterministically compiled to Church's simple theory of types (STT). The resulting STT formula is taken to be the denotation, the meaning, of the original source language phrase.

We specify the compilation from the intermediate language to the STT denotation as reduction rules. Hence the intermediate language is (call-by-name) lambda calculus whose constants are STT formulas. To compile an intermediate language term to its STT denotation we evaluate the term. The compilation is not always compositional: so-called scopal expressions contribute their meaning in the places other than where they are seen or heard. Scopal phenomena include generalized quantification, wh-phrases, topicalization, focus, anaphora and binding. To account for these phenomena, our intermediate language includes control operators shift and reset. We are thus able to improve on the previous continuation-based analyses of quantification, binding, raised and in-situ wh-questions, binding in wh-questions, and superiority.

The evaluation of an intermediate language term does not always lead to the STT denotation: the evaluation may get stuck. In that case, we regard the corresponding source language phrase as ungrammatical. We introduce a type system to delineate a set of intermediate language terms whose evaluation never fails to produce a denotation. Being well-typed becomes an alternative criterion of grammaticality. Typeability, unlike reducibility, has the advantage of being applicable to separate phrases rather than whole sentences. Our main result is that both typing and call-by-name are necessary to correctly predict superiority and binding in wh-questions with topicalization, without resorting to thunking or type raising and thus maintaining the uniformity of the analyses.

The current version is August 2008.
gengo-side-effects-cbn.pdf [161K]
Call-by-name linguistic side effects
ESSLLI 2008 Workshop on Symmetric calculi and Ludics for the semantic interpretation. 4-7 August, 2008. Hamburg, Germany.

gengo-cbn-talk.pdf [224K]
Presentation at the workshop, August 6, 2008. Hamburg, Germany.

gengo-side-effects-cbn.elf [19K]
The complete code with all the analyses described in the paper

Call-by-name typed shift/reset calculus
Our intermediate language


Talk: Delimited Continuations in Computer Science and Linguistics

We give a detailed introduction to delimited continuations -- the meanings of partial contexts -- and point out some of their occurrences in multi-processing, transactions, and non-deterministic computations. After briefly touching on the formalism and the logic of delimited continuations, we concentrate on two of their particular uses: placing and retrieving multiple contextual marks (dynamic binding, anaphora) and meta-programming (generating code, generating denotations of interrogative sentences and clauses).
The current version is December 4, 2007.
delimcc-tohoku-talk-notes.pdf [246K]
delimcc-tohoku-talk.pdf [166K]
Slides with and without annotations
Slides and notes of the talk given at the Research Center for Language, Brain and Cognition. Tohoku University, Sendai, Japan. December 4, 2007.

Last updated July 8, 2015
