Natural Language Technology

= Språkteknologi

(Course code CTH: TDA 510 (4.0p), GU: INN 770 (5p))

Course information, Spring 2006

Contents

  • News
  • The goals of the course
  • Teachers
  • Literature
  • Prerequisites
  • Lectures
  • Laborations
  • Exercises
  • Examination
  • Links

    News

    5/5 Updated version of Exercise 5. The lectures on speech technology and statistics have swapped order (so that those who want to use speech in their Lab 2 have more time to do it).

    26/4 The lecture and exercise on Thursday 4 May are cancelled. Instead, you can participate in the Dialogverkstad at Lindholmen. We can take the ferry ("Älvsnabben") from Rosenlund at 9.40.

    26/4 An up-to-date version of Lab2 PM.

    18/4 Lab1 PM now contains an example of all the phases of the work, if done by using a GF resource grammar (the last section in the PM).

    11/4 Lab 1 reporting system now available at Fire. Please use this for reporting your lab!

    29/3 Lectures of next week swapped: 3/4 is on translation systems and 6/4 on parsing algorithms.

    23/3 New versions available of extract (1.0beta) and GF (2.5). Download or use the ones in lanctec06/bin/ (linux binaries; this gf is a late build of 2.4, with no essential difference from 2.5).

    13/3 Added link to Lab times and updated the Lab 1 description.

    26/2/2006 The first version of the course page. Copied from 2005. Most things need to be updated. The course starts on Thursday 16 March at 10.00 in ES52.


    The goals of the course

    The course gives an introduction to the computer implementation and applications of natural language (such as English, Swedish, Arabic). The students will learn to develop simple applications, and to understand the possibilities and limitations of the main techniques in the area. Participants with a computer science background should also gain a better understanding of the relation between formal and natural languages.

    Applications on which we are going to work include:

    The lectures are in English, but the languages we consider will include both English and Swedish, as well as students' own languages, depending on interest.

    Official description in Swedish: Kursplan.


    Teachers

    Aarne Ranta, aarne@cs.chalmers.se, office 6115 in the ED building, phone 772 10 82 (1082 inside CTH). Course Responsible (Kursansvarig).

    Literature

    Recommended (will be available in Cremona): Nugues, Pierre M., An Introduction to Language Processing with Perl and Prolog: An Outline of Theories, Implementation, and Application with Special Consideration of English, French, and German. Springer Verlag, Series: Cognitive Technologies. 2005, Approx. 600 p., Hardcover ISBN: 3-540-25031-X.
    Unfortunately, this book is delayed, but you will be able to start the course without a book.

    An alternative is Jurafsky, D., Martin, J.H., Speech and Language Processing, Prentice Hall, 2000; paperback ("international edition") 2003. Price £37.99 at Amazon.co.uk. Ask Cremona as well.

    Other material on this web page.


    Accounts and laboration times

    Course accounts will be handed out at the first lecture. Work preferably in pairs; working individually is also possible.

    Lab room: 6225A on both Tue 10-12 and Wed 10-12, beginning on Tue 21 March. Here is a list of lab times.

    Supervision times: by individual agreement.


    Prerequisites

    Programming experience, data structures, automata, formal languages. No linguistics required.

    Lectures

  • Mon 13-15 in ES53 and Thu 10-12 in ES52.
  • first lecture: Thursday 16/3.

  • Here is a preliminary lecture schedule, also containing links to the lecture slides.


    Laborations

    Two laborations. Their deadlines are Reporting via Fire.


    Exercises

    Thursday 13-15 in ES52.

    See the preliminary lecture and exercise schedule for next week's exercises.

    Students are expected to do the exercises in advance, and they can earn extra points in this way. To earn extra points, you mark on a list (circulated at the beginning of each exercise class) which exercises you have done. You may be picked to present your solution on the blackboard: the point is to make the exercise classes lively and interactive. The credit one can earn in this way is 15% of the final mark (which is slightly less than the difference between 3 and 4, and 4 and 5).


    Examination

    Written exam. Date Tuesday 30 May, time 08.30-12.30, place house M.

    Read: course slides, GF Tutorial. Questions will be problem-oriented and similar to the exercises.

    There will be 0-2 questions on each of the following topics: morphology, syntax, semantics, translation, parsing algorithms, statistical methods, speech. Altogether there will be 6 questions, of which CTH students must answer 4 and GU students 5.

    The course is graded on the usual scales (3-4-5, G-VG). The grade is jointly determined by the exercises (the number of solved ones), and the written exam.

    A re-exam (omtenta) will be arranged by individual agreement, probably as an oral exam.

    Here is a copy of the Exam of 2004.


    Links

    Grammatical Framework

    Functional Morphology

    Chalmers Language Technology Group

    Computational Linguistics at Gothenburg University. This course will provide prerequisites to some of their advanced courses.

    The Systran Machine Translation System. You can test it online.


    By Aarne Ranta <aarne@cs.chalmers.se>
    Last modified: Wed Mar 29 22:40:55 CEST 2006