previous   next   up   top

Lambda: the ultimate syntax-semantics interface

 

Day in, day out, the semanticist checks types and simplifies terms. Can the digital computer help with these calculations? In fact, modern programming languages offer exactly the modularity features that make it easy to try fragments out and scale them up. To show how, we express so-called extensible interpreters as functional programs and apply the technique to natural-language semantics and logic. Specifically, we work our way from the simply typed lambda calculus and a context-free grammar to CCG and to a dynamic and continuation treatment of quantification and anaphora. Striving to be comprehensible and informative to both linguists and programmers, we use the programming language Haskell without assuming specific knowledge of it.

This course on computational Montagovian semantics has been presented together with Chung-chieh Shan at the North American Summer School on Logic, Language and Information (NASSLLI 2012) < http://nasslli2012.com/courses/lambda-the-ultimate > at University of Texas, Austin, TX, USA, June 18-22, 2012. This five-day course is an extended version of the two-day short course presented at NASSLLI 2010 in Bloomington, IN in June 2010. This page collects the notes for the course in the form of commented Haskell code.


 

Goals of the course

Our immediate goal is to introduce Montagovian semantics to Computer Scientists and derivation calculators to natural-language semanticists. We hope that a linguist would gain a helpful tool. Building syntactic derivations and normalizing semantic terms by hand consigns us to small fragments of the language. Computers can help to scale up. Programming language researchers could benefit too, by finding an interesting application to build tools for.

Ideally the course would be a beginning of a beautiful friendship -- or collaboration of natural- and programming-language researchers. At the very least, the course would aid their mutual comprehension.

We have a grander goal in mind. We hope that linguists will draw more advantage of the ideas of side effects, continuations, regions, staging (a.k.a. quotation) and dependent types. These ideas happen to have been developed more in programming language theory and are only recently being consciously applied to natural language semantics. Linguistic applications of these ideas are certain to prompt further development, benefiting the programming language theory as well. We look forward to computer scientists learning from linguists how to build theories of programming language competence . Emotional arguments about ``the best'' programming language should to be replaced by a scientific, predictive theory of how programmers perceive and apply a programming language or its feature.

 

Overview

Our plan for the course is to talk about syntax (context-free grammars, or CFG), semantics (Simple Theory of Types, or lambda-calculus) and a calculational way of relating the two. We start with a trivial fragment and show how to write derivations and re-interpret them in various ways: to spell them out or to compute a logical formula. We then grow our fragments and interpreters: we add quantification, anaphora, subject relative clauses to our natural-language fragment; we add state to the semantic calculus; we add the interpretation of a sentence as an imperative program performing an information ``update''. We explain de Groote's dynamic logic analysis of donkey sentences.

slides.pdf [75K]
Slides for the lectures, including the exercises

language-map.pdf [19K]
The map of languages and interpretations

 

Main ideas

All throughout the course we will repeatedly encounter the following points:
Calculemus
Let us calculate yields and denotations.
The multitude of fragments and languages
In the course we will be talking about many languages and fragments: Fragments of the natural language to analyze, languages in which to express meanings, and the language in which to describe building the analyses and the descriptions. We call the latter language our metalanguage. It will be Haskell rather than English. After all, we want to not only describe analyzes but mechanically execute them.
Multiple interpretations
A single CFG derivation can be interpreted to yield a text string in English or Japanese, an audio file -- or a semantic denotation, a formula in classical or dynamic logic.
Growing fragments and languages
We stress stepwise refinement and extensibility of fragments, grammars, and interpretations. When we add new features (such as quantification to our fragment or state to our logic), we preserve all existing interpretations, reusing the previous work rather than re-writing it from scratch.
Interactivity
We emphasize ``command line'' testing of sample phrases and derivations, developing our fragments and interpretations by small increments, immediately testing all new additions. We aim to give an impression of GHCi as a syntax-semantic calculator .
Montagovian tradition
We hope that our exposition could be thought of as a ``rational reconstruction'' of Montague's approach.

language-map.pdf [19K]
The map of languages and interpretations

 

Context-Free Grammars (CFG) derivations as a domain-specific language embedded in Haskell

For linguists not familiar with Haskell, we introduce the language by appealing to the intuition of a calculator. GHCi can calculate with numbers, boolean formulas and strings. A particular strength of the metalanguage is the ability to name frequently occurring expressions. One can think of such definitions as shortcuts, or bookmarks. As we keep entering similar shortcuts, we may want to abstract over their differences -- to parameterize the definitions.

CFG1EN.hs [<1K]
Definitions (or, `bookmarks') and CFG-like derivations

CFG1Sem.hs [<1K]
Semantic interpretation of a CFG derivation

CFG2EN.hs [<1K]
CFG2Sem.hs [<1K]
Same as before, but now with type annotations. The file CFG2Sem.hs tries to repair way too permissive grammar embedding with semantics: ``Using semantics to fix up syntax''

CFG2ENDyn.hs [2K]
Preventing bad derivations `at run time'

CFG3EN.hs [3K]
Introducing type constants; accomplishing the goal that our terms represent all and only valid CFG derivations

CFG3Sem.hs [2K]
Type functions: from syntactic categories to semantic types

CFG4.hs [3K]
Unifying syntax with semantics

 

The syntax and interpretations of semantics

Our language for denotations is essentially Church's ``Simple Theory of Types,'' also known as simply-typed lambda-calculus. It is a form of a higher-order predicate logic, which is often called Ty2.

We have demonstrated how to interpret syntactic (CFG) derivations in several ways. We apply the same approach to semantic forms, interpreting a semantic formula so to evaluate it in a particular world, to print it out, or to simplify it.

Prop.hs [6K]
Warm-up: Embedding Propositional Logic, the language of very simple denotations

Lambda.hs [3K]
Another warm-up: embedding pure lambda-calculus, illustrating higher-order abstract syntax (HOAS)

Semantics.hs [7K]
The grammar of the language of denotations, and its many interpretations

 

Many interpretations of CFG derivations

CFG.hs [4K]
Context-free grammars, in the tagless-final style; syntactic (as English phrase) and semantic interpretations

RickPerry.hs [3K]
Extending the fragment with adjectives and copula for a Rick Perry example from the bootcamp

CFGJ.hs [3K]
Interpreting a CFG derivation as a string in Japanese

 

Context-free grammars with quantifiers

We introduce the first major extension: quantifiers.

QCFG.hs [4K]
Adding QNP in the tradition of Montague

QCFGJ.hs [2K]
Likewise, extending the Japanese interpretation

QHCFG.hs [3K]
A different way to add quantification, relying on higher-order abstract syntax (HOAS). We thus attempt a `rational reconstruction' of Montague's general approach of `administrative pronouns', which gave rise to Quantifier Raising (QR).

 

Dynamic Logic and donkey anaphora

< http://www.inria.fr/rocquencourt/rendez-vous/modele-et-algo/dynamic-logic-a-type-theoretic-view >
Philippe de Groote. Dynamic logic: a type-theoretic view
Talk slides at `Le modele et l'algorithme', Rocquencourt, 2010.

Dynamics.hs [5K]
Implementing de Groote's approach: extending our fragment with pronouns, and the language of denotations with state

CCG.hs [6K]
A sketch of Combinatorial Categorial Grammar (CCG)

Tower.hs [8K]
Chung-chieh Shan's implementation of the continuation semantics of
  Chung-chieh Shan and Chris Barker. 2006. Explaining crossover and superiority as left-to-right evaluation.
  Linguistics and Philosophy 29(1):91-134.
and the tower notation of
  Chris Barker and Chung-chieh Shan. 2008. Donkey anaphora is in-scope binding. Semantics and Pragmatics 1(1):1-46.

 

References and further reading

Closely related to the present course in spirit but with a different subject matter or metalanguage:

Closely related to the present course in subject matter (semantics):

The technique of extensible language embeddings is described in the following publications:

Version
The current version is July 2012.


Last updated July 4, 2012

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML