My portfolio moved to

http://sarahhtmd.de.tl

thanks

19. 12. 2006 Grammar (syntax) - External Structure

 

19. 12. 2006 Types of lexical information: Grammar (syntax) – External structure



Syntax

  • Wikipedia (http://en.wikipedia.org/wiki/Syntax 2.1.07) : In linguistics, syntax is the study of the rules, or "patterned relations", that govern the way words combine to form phrases and phrases combine to form sentences. The word originates from the Greek words συν (syn), meaning "co-" or "together", and τάξις (táxis), meaning "sequence, order, or arrangement". The combinatory behavior of words is governed to a first approximation by their part of speech.

  • Syntax is the structure of sentences

  • Syntax= Grammar= the order of words in a sentence



Categories

  • Parts of speech (POS)

  • subcategories

  • phrasal categories



Main relations

  • structural relations

  • paradigmatic

  • syntagmatic

  • semiotic relations

  • interpretation

  • realisation



Semiotics





Words, Context, External Structure

Task: Identify the POS of each word in this text



Mr Bush (noun, name) accepted (verb, past tense) Mr Rumsfeld's (noun, name) resignation (abstract noun) after (preposition, defining the order) November (noun, name of a month) mid-elections (compound noun) in (preposition, defining the place) which (relative pronoun) the (definite article) Republicans (noun, name) lost (verb, past tense) control (abstract noun) of (preposition, belonging) both (quantifier) the (definite article) House of Representatives (compound noun, name) and (conjunction, connection) the (definite article) Senate (noun, name). Public (adjective) discontent (abstract noun) over (preposition) the (definite article) conduct (abstract noun) of (preposition) the (definite article) Iraq war (compound noun) was seen (verb, auxiliary verb + lexical verb) as (preposition) a (indefinite article) major (adjective) factor (noun) in (preposition, place, location) the (definite article) defeat (abstract noun).
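The hand annotation above can be imitated, very roughly, with a lookup table. The following is only a minimal sketch: the name `POS_LEXICON` and the selection of words are assumptions made for illustration, and a plain lookup cannot use context the way a human annotator does.

```python
# Hypothetical mini-lexicon: word form -> part of speech (taken from the annotated text above).
POS_LEXICON = {
    "the": "definite article",
    "a": "indefinite article",
    "accepted": "verb",
    "lost": "verb",
    "resignation": "noun",
    "control": "noun",
    "of": "preposition",
    "in": "preposition",
    "and": "conjunction",
    "major": "adjective",
}


def tag(sentence):
    """Return (word, POS) pairs; words missing from the lexicon are tagged 'unknown'."""
    pairs = []
    for word in sentence.split():
        key = word.strip(".,").lower()  # ignore punctuation and capitalisation
        pairs.append((word, POS_LEXICON.get(key, "unknown")))
    return pairs


if __name__ == "__main__":
    for word, pos in tag("The Republicans lost control of the Senate."):
        print(f"{word}\t{pos}")
```

Note that a word such as "control" can also be a verb; deciding between the two readings needs the external structure (the context), which this sketch ignores.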





Determiners

  • Articles (definite „the“ / indefinite „a(n)“): define the relation between the reader and the writer. If a writer uses „the“, he or she expects the reader to know what is being referred to, either because it is obvious or because it was mentioned before

  • Possessives ( my, your, his, her, its, our, their): first element in nominal expressions

  • Demonstratives ( proximal (this) / distal (that) )

  • Quantifiers

  • cardinal numbers ( one, two, ...)

  • existential: some (not many, depends on the set you are talking about), several (2 < several < 10), few, many

  • dual: both ( 2)

  • universal: each ( individually), every, all



Adjectives

  • scalar ( small, big, ...): you can say „very“ with scalar adjectives ( very small, very big)

  • polar ( dead, pregnant, ...): you normally cannot say „very“ with polar adjectives or it would have a special meaning ( „ very pregnant“: she has a very huge stomach)

  • appraisive (good, wonderful, ...): you might use them with „very“, but then it might sound exaggerated or even ironic; they are not descriptive adjectives (they express only an attitude)

  • ordinal ( first, second, ...)

  • adverbs of degree ( can be used with scalar adjectives): very, highly, extremely, incredibly, ...





Nouns

  • Proper nouns (names): Places, personal, product, ...

  • Common nouns: Countable nouns ( knife, fork, spoon, ...), uncountable nouns (bread ( a slice of bread), butter (a piece of butter), jam ( a spoonful of jam))

Task: What happens when you count „uncountable“ nouns

  • when you order something („two teas, please“)

  • when you mean different types of bread ( brown bread, toast, ...)





Pronouns

  • personal pronouns ( I / me, you, he / him, ...)

  • possessive pronouns ( mine, yours, his, ...)

  • demonstrative pronouns ( this ( proximal), that (distal))

  • quantifier pronouns (cardinal numbers (one, two, ...), existential (some, several, few, ...), dual (both), universal (every, each, ...))

  • relative pronouns (like conjunctions)







Verbs



Main Verbs

  • finite forms ( person, number, tense)

  • non- finite forms ( infinitives, participles)



Periphrastic Verbs (auxiliary verb + non- finite main verb)

  • modal ( can, will, ...)

  • aspectual (be + prespart (continuous), have + pastpart (perfect))

  • passive: be + pastpart



„It might have been being repaired“

  • might: modal verb (-> attitude)

  • have: auxiliary verb of the perfect

  • been: past participle of „be“ (have + been = perfect)

  • being: present participle of „be“ (been + being = continuous)

  • repaired: past participle of the main verb (being + repaired = passive)





Adverbs

  • Deictic

  • Time

  • Place

  • Direction

  • Manner

  • Degree



Deictic ( Wikipedia (http://en.wikipedia.org/wiki/Deictic_expression ) 3.1.07)

  • In linguistics, a deictic expression is an expression that refers to the personal, temporal, or spatial aspect of an utterance, and whose meaning therefore depends on the context in which it is used



Prepositions

  • make nominal expressions into adverbial expressions

  • categories: see adverbs



Task: What is the meaning of the preposition „of“ ?

  • The „Advanced Learner's Dictionary of Current English“, 7th edition, distinguishes between 13 (!) different meanings for the word „of“.

  • belonging to sb „the paintings of Monet“

  • belonging to sth, being part of sth „ the director of the company“

  • coming from a particular background „ the people of Wales“

  • concerning or showing sth/ sb „ a photo of my dog“

  • used to say what sb/ sth is, consists of „ the city of Dublin“

  • used with measurements and expressions of time „2 kilos of potatoes“

  • used to show that sth/ sb belongs to a group „some of his friends“

  • used to show the position of sth/ sb in place or time „just north of Detroit“

  • used after nouns formed from verbs „the arrival of the police“

  • used after some verbs before mentioning sth/ sb involved in the action „He was cleared of all blame“

  • used after some adjectives before mentioning sb/ sth that a feeling relates to „to be proud of sth“

  • used to give your opinion of sb's behaviour „it was kind of you to offer“

  • used when one noun describes a second one „ Where's that idiot of a boy?“



Construct prepositional phrases corresponding to the types of adverbs

  • Deictic (here, there, now, then)

  • Time: „ after the match“ ,

  • Place: „above the house“, „a fence around the garden“, „the fox escaped into his hole“

  • Direction: „he hit against his leg“

  • Manner: „ without a trace“ , „like any other day“, „with great enthusiasm“

  • Degree: „the water is warm enough for swimming“



Conjunctions

  • coordinating conjunctions (and, but)

  • subordinating conjunctions: make sentences (clauses) into adjective-like noun modifiers

  • basically: make sentences (clauses) into adverb-like verb modifiers



Task: Find examples of conjunctions of each type





Interjections

  • Interjections link parts of dialogues together ( Hi, ehh, huh)

  • They may also be expressions of subjective reactions ( Ouch, wow)



Task: Find 5 more interjections ( 3.1.07 http://en.wiktionary.org/wiki/Category:English_interjections )

  • ay, aye

  • come on

  • damn

  • gosh

  • oh dear









The structure of language



The sign hierarchy: Ranks

  • Signs are structured in terms of their position in a size hierarchy; the positions in the hierarchy are sometimes referred to as ranks.



Main ranks

Each sign has a structure ( internal/ external) and a semiotic relation ( function and realisation)

  • Dialogue

  • monologue/ text

  • sentence

  • word

  • morpheme

  • phoneme







SIGN rank | Internal Structure | External Structure | Interpretation | Realisation

Dialogue | Turns, texts | Social interaction | Communication | Prosody, gesture

Text | Sentences | Components of dialogues | Speech acts | Prosody, gesture

Sentence | Phrases, words | Parts of narrative, argumentative, etc texts | Propositions | Prosody, rhythm

Word | Stems, affixes | Functional parts of sentences | Complex states, properties, events, ... | Phonemes, word prosody

Morpheme | Phonemes, syllables | Parts of words | Simple states, properties, events, ... |

Phoneme | Distinctive features | Syllables | Encoding of morphemes into sounds | Phonetic segments, allophones of phonemes



Distinctive features: for example voicing or nasality

Encoding: the meaning of a morpheme

Prosody: speech melody, rhythm, accentuation, ...





Text structure

News homepage

  • the whole page consists of a text structure

  • the smaller articles and links ( texts) embedded in the document are texts as text parts







Structure and Constitutive Relations





Constitutive Relations



Structural relations

  • Syntagmatic relations („glue“, combinatory relations which create larger signs (and their realisations and interpretations) from smaller signs (and their realisations and interpretations))

  • Paradigmatic relations ( „choice“, classificatory relations of similarity and difference between signs)



Semiotic relations

  • realisation: the visual appearance or acoustic representation of signs (other senses may also be involved)

  • interpretation: the assignment of meaning to a sign





Syntagmatic relations

  • combinatory relations which create larger signs (and their realisations and interpretations) from smaller signs (and their realisations and interpretations)


  • Phonology: Consonants and vowels are glued together as core and periphery of syllables

  • Morphology: lexical morphemes and affixes are glued together into stems, stems are glued together into compound words, stems and inflections are glued together into words

  • Syntax: verbs and nouns are glued together as the subject and verb of sentences





Structures and syntagmatic relations

Syllable (example: „strengths“)

  • onset: s t r

  • rhyme

    • nucleus: ε

    • coda: ng θ s





Morphological Syntagmatic Relations

[Tree diagram: a STEM built from the parts „day to day“, „bath room“ and „clean er“, with the node labels C-Stem, Predicate, Object and Verbal]





Syntactic syntagmatic relations

Sentence: „The loud smoker is being a nuisance“

  • Subject: the loud smoker

  • Predicate

    • verbal: is being

    • object: a nuisance



12.12.06 Lexical databases- toolbox

 




Presentation on „Toolbox“ by Sascha Griffith

  • a computational tool developed by the SIL International (formerly known as the Summer Institute of Linguistics)

  • designed for fieldwork purposes

  • a database application that interlinearizes, analyses and stores text and can convert this into an alphabetically ordered dictionary

  • was formerly called „Shoebox“

  • eases work of linguist

    SIL lists Toolbox



    Toolbox' main functions are

    • viewing and searching ( click on arrows on the top left of the page, click on „database“ and „search“ to search)

    • browsing ( Alt + R)

    • editing („Edit“, „Field“)

    • sorting ( „Database“ , „Sorting“, sorting by field)



    To get the software



    Entering text

    • select text in source document

    • Press ctrl-c

    • Go to text row (tx) and paste (ctrl-v) the text into the right column

    • Enter the title of the text into text identification

    • Enter an abbreviation of the title into reference line (the references can later be automatically numbered)

    • copy and paste selected text passages into Toolbox

    • When the text is added one should press enter so that ft (free translation) reappears

    • After entering the text press Alt-I (mb, ge & ps will reappear).

    • Sometimes Toolbox does not recognize morphemes, then you have to add them by leaving a space between them and add them to the dictionary



    Making an entry

    • Mark a word in the line mb

    • Click on this word using the right mouse button

    • Click 'Insert'

    • Enter the lexical properties into the dictionary field at the bottom of the screen



    Wordlist, Concordance and dictionary

    • The dictionary is entered manually as shown

    • A wordlist can be produced using the menu 'tools' and in this menu 'wordlist' (or by pressing alt-l)

    • A concordance can be produced by using the menu 'tools' and in this menu 'concordance' (or by pressing ctrl-l)

    • A new text window can be added to a new text file by choosing the menu 'database', where one will find 'new record'



    Additional Information

    • you can export Toolbox data into a word processor file

    • To add a new data category (e.g. the pronunciation) click on the left column in the text window and press ctrl-e.

       

    Note on dictionary making

    • What is seen in the left column of the text window are the data categories (or datcats), which Toolbox calls fields; what can be seen on the right side is the (language) data, or records, as Toolbox refers to them. The fields represent the microstructure of a dictionary.
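    The relation between fields and their data can be illustrated with a small parser. This is a sketch under the assumption that each field is stored as one line starting with a backslash marker followed by its data; the marker `lx` and all sample values are made up for illustration, while `ps` and `ge` are the markers mentioned above.

```python
def parse_record(text):
    """Split one record into (field marker, data) pairs."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # this simple sketch skips blank lines and anything without a marker
        marker, _, data = line[1:].partition(" ")
        fields.append((marker, data.strip()))
    return fields


# Illustrative record; the marker "lx" and all values are assumptions.
SAMPLE = r"""
\lx table
\ps n
\ge furniture with a flat top
"""

if __name__ == "__main__":
    for marker, data in parse_record(SAMPLE):
        print(f"{marker:4} {data}")
```

    The list of markers returned for an entry is one way to look at its microstructure: which data categories it has, and in which order.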



    How are words built?

       

    • Inflection: marks the syntagmatic relation of words to their contexts

    • syntactic context: agreement in person, number, case

    • situational context: temporal relations, quantity

    • form: stem + affix


    • root/ morpheme creation: creates new POS ( parts of speech) and meaning

    • parts of 2 or more existing stems („galumph“)

    • -> Jabberwocky


    • Derivation: creates new part of speech

    • stem + affix


    • Compounding: creates meanings and sometimes new POS

    • at least 2 existing stems



    Internal structure of words

    English words consist of a STEM and an INFLECTION

    • STEM has a lexical meaning

    • INFLECTION has a grammatical meaning



    Stems of English words are ...

    • simple ( roots, lexical morphemes) -> boy, table, chair, red

    • complex, at least one of the following

    • Derivations ( a stem and a derivational affix) -> re- write

    • Compounds ( a stem plus another stem) -> table- cloth

    • Both ( synthetic compound) ( derivation plus a stem) -> bus driver

    Words are signs

    • inflected word -> phrase semantics, stress

    • compound word -> lexical semantics, stress („Hyde 'Park“)

    • derived word -> lexical semantics, stress

    • morpheme -> lexical semantics, phonemes, stress

       

    A WORD is

    • a stem plus an inflection

       

    A STEM is either

    • a root (lexical morpheme) or

    • a derived stem ( stem plus affix) (derivation) or

    • a compound stem (stem plus stem) ( compounding)

    and nothing else is a stem ( recursive definition)



    A DERIVED STEM is either

    • a root ( zero derivation)

    • or a derived stem with an affix

    and nothing else is a derived stem



    A COMPOUND STEM is

    • a derived stem or a word + a derived stem or a word

    • a compound + a compound stem

    and nothing else is a compound stem
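    The recursive definitions above can be written down as a recursive data type. The sketch below is only one possible reading: the class names, the `prefix` flag and the helper `spell` are assumptions made for illustration, and spelling adjustments at morpheme boundaries are ignored.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Root:
    form: str  # lexical morpheme, e.g. "table"


@dataclass
class Derived:
    stem: "Stem"
    affix: str  # derivational affix, e.g. "re-" or "-er"
    prefix: bool = False


@dataclass
class Compound:
    first: "Stem"
    second: "Stem"


# A stem is a root, a derived stem or a compound stem, and nothing else.
Stem = Union[Root, Derived, Compound]


@dataclass
class Word:
    stem: Stem
    inflection: str = ""  # e.g. "s"; empty for an uninflected form


def spell(item):
    """Concatenate the parts of a stem or word into a plain string."""
    if isinstance(item, Root):
        return item.form
    if isinstance(item, Derived):
        affix = item.affix.strip("-")
        inner = spell(item.stem)  # recursion: the inner stem may itself be complex
        return affix + inner if item.prefix else inner + affix
    if isinstance(item, Compound):
        return spell(item.first) + spell(item.second)
    if isinstance(item, Word):
        return spell(item.stem) + item.inflection
    raise TypeError(f"not a stem or word: {item!r}")


if __name__ == "__main__":
    rewrite = Derived(Root("write"), "re-", prefix=True)
    cleaner = Compound(Compound(Root("bath"), Root("room")), Derived(Root("clean"), "-er"))
    print(spell(rewrite))
    print(spell(Word(cleaner, "s")))
```

    Because `Derived` and `Compound` contain stems themselves, stems of any depth can be built, which mirrors the "and nothing else" clauses of the definitions.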

5. 12. 2006 Types of lexical information: morphology ( inflection and word formation)

 




Morphology

  • structure of words

  • inflect to their environment

  • example „this person – these people“



Word formation

  • creativity how words are constructed

  • Why?

    -> new developments in science (Finland: new handy telephone (handy to carry/ use))

    -> German word „Handy“

  • Who?

  • Scientists

  • Engineers

  • Product branding companies

  • Poets

  • Everybody else



Branding

  • inventing new words to fit new products

  • Product branding companies

  • Lexicon Branding, Sausalito (most famous one: „Pentium“)

  • the language of advertising

  • „Swiffer“: sweep swiftly



Jabberwocky (from „Through the Looking-Glass“ by Lewis Carroll)

'Twas brillig, and the slithy toves

Did gyre and gimble in the wabe;

All mimsy were the borogoves,

And the mome raths outgrabe.

"Beware the Jabberwock, my son!

The jaws that bite, the claws that catch!

Beware the Jubjub bird, and shun

The frumious Bandersnatch!"

He took his vorpal sword in hand:

Long time the manxome foe he sought-

So rested he by the Tumtum tree,

And stood awhile in thought.

And, as in uffish thought he stood,

The Jabberwock, with eyes of flame,

Came whiffling through the tulgey wood,

And burbled as it came!

One, two! One, two! And through and through

The vorpal blade went snicker-snack!

He left it dead, and with its head

He went galumphing back.

"And hast thou slain the Jabberwock?

Come to my arms, my beamish boy!

A frabjous day! Callooh! Callay!"

He chortled in his joy.

'Twas brillig, and the slithy toves

Did gyre and gimble in the wabe;

All mimsy were the borogoves,

And the mome raths outgrabe.



  • Galumphing

  • galloping

  • triumphing

  • jumping



  • creation of new basic simple words

  • NOT putting two words together

  • only bits of words

  • onomatopoeia (snicker-snack)

  • reduplication

  • repetition




  • German translation:

    • „Der Zipferlake“ by Christian Enzensberger

    Verdaustig war's und glasse Wieben
    rotterten gorkicht im Gemank;
    Gar elump war der Pluckerwank,
    Und die gabben Schweisel frieben.

    »Hab acht vorm Zipferlak, mein Kind!
    Sein Maul ist beiß, sein Griff ist bohr!
    Vorm Fliegelflagel sieh dich vor,
    Dem mampfen Schnatterrind!«

    Er zückt' sein scharfbefifftes Schwert,
    Den Feind zu futzen ohne Saum;
    Und lehnt' sich an den Dudelbaum,
    Und stand da lang in sich gekehrt.

    In sich gekeimt, so stand er hier,
    Da kam verschnoff der Zipferlak
    Mit Flammenlefze angewackt
    Und gurgt in seiner Gier!

    Mit eins! Mit zwei! und bis aufs Bein!
    Die biffe Klinge ritscheropf!
    Trennt er vom Hals den toten Kopf,
    Und wichernd springt er heim.

    »Vom Zipferlak hast uns befreit?
    Komm an mein Herz, aromer Sohn!
    O blumer Tag! O schlusse Fron!«
    So kröpfte er vor Freud.

    Verdaustig war's und glasse Wieben
    rotterten gorkicht im Gemank;
    Gar elump war der Pluckerwank,
    Und die gabben Schweisel frieben.

    Morphological structure


    Branches of morphology


    morphology

    • inflection -> table – tables

    • word formation

    • derivation

    • compounding





    paradigmatic relations

    • relation of similarity and difference

    • classification

    • opposites

    • rhyme (partly similar)


    syntagmatic relations

    • composition of relation

    • combinatory

    • put small pieces together to make a big one

    • combine


    http://www.sil.org/linguistics/GlossaryOfLinguisticterms/WhatIsAParadigmaticLexicalRela.htm

    A paradigmatic lexical relation is a culturally determined pattern of association between lexical units that

    • share one or more core semantic components

    • belong to the same lexical category

    • fill the same syntactic position in a syntactic construction, and

    • have the same semantic function.

    Examples: English

     

    Here is a table showing some common paradigmatic lexical relations in English with example sets and underlying structure:

     

    Lexical relation | Example set | Underlying structure

    Synonym | A "happy" synonym set: {happy, joyful, glad} | simple set

    Scalar property | A temperature set: {cold, cool, lukewarm, warm, hot} | scale

    Opposite | A social relation set: {(student, teacher), (patient, doctor)} | set of pairs

    Generic-specific | Animal: dog (collie, terrier), cat (Persian, Siamese) | tree



    Reminder: Signs


    DIALOGUE social relation

    intonation

    TEXT →description

    intonation

    SENTENCE →state/ event

    accent, intonation

    WORD →entity, prop

    phonemes, stress




    Morphology sketch


    Inflection

    • function ( external structure)

    • marks the relation of words to their context

    • no change in the basic meaning of words

    • form ( internal structure)

    • affix ( prefix, suffix, infix), superfix, stem vowel change


    Word formation

    • function ( external structure)

    • creation of new words / parts of speech / meanings

    • in principle infinite extendability of the lexicon

    • Form (internal structure)

    • Root/morpheme creation (blending, abbreviation, ...)

    • Derivation: 1 stem + affix (prefix, suffix, infix), superfix, vowel change

    • Compounding: 2 stems, perhaps with interfix or inflection-like affix



    Internal structure of words

    • smallest word parts : morphemes

    • grammatical morphemes ( structural morpheme)

    • closed set

    • free: prepositions, auxiliary verbs, conjunctions

    • bound: affixes (prefixes, suffixes)

    • lexical morphemes ( content morpheme, root)

    • open set



    Morphemes and allomorphs




28. 11. 2006 Types of lexical information: Pronunciation

 




Surface structure

  • Two levels

  • linguistic description ( -> Metalanguage)

  • units of language ( -> Objectlanguage)




Surface structure of ...

  • Dictionaries

  • metalanguage : the typography and layout of a book, hypertext, ...

  • Words in dictionaries

  • object language: spelling, pronunciation



Types of lexical information: Pronunciation

[Model of types of lexical information]





Rendering structures

  • Pronunciation rules -> acoustic modality

  • Spelling -> visual modality

  • Sound- Spelling rules -> Inter- modality- conversion





Representation of sounds- prosodic hierarchy

  • phonemes

  • function: „smallest word – distinguishing segment“

  • internal structure: “configurations of distinctive phonetic features”

  • external structure (see syllables)

  • rendering: “contextual variants”, “allophones”

  • syllables

  • function: “word distinguishing phoneme configurations”

  • internal structure: “configurations of sequential features (consonantal, vocalic; voiced, unvoiced; ...) and simultaneous features (tone, accent)”

  • external structure (word)

  • rendering: a function of the rendering of phonemes




Basics of English Syllable Structure

  • Basic syllable structure

  • CCCVVCCC, e.g. /streIndZ/ („strange“); affricates such as /dZ/ count as 1 phoneme, though phonetically they have 2 parts.

  • More detailed syllable structure as a map

  • this kind of map is sometimes called a transition network or a state diagramme - each transition from one circle/node/state describes the correct position of one phoneme.
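The basic template CCCVVCCC can be checked mechanically. The sketch below assumes that a syllable is given as a list of phoneme symbols (so that an affricate such as "dZ" is one item) and that the set `VOWELS` lists the vowel symbols; both the symbol set and the function names are assumptions made for illustration. It only tests the number of consonants and vowels, not which phonemes may follow each other, which is what the more detailed transition network adds.

```python
import re

# Assumed set of vowel symbols in a SAMPA-like transcription.
VOWELS = {"i", "I", "e", "E", "{", "a", "A", "O", "o", "U", "u", "V", "@"}

# Up to three consonants, one or two vowel symbols, up to three consonants.
TEMPLATE = re.compile(r"C{0,3}V{1,2}C{0,3}")


def cv_pattern(phonemes):
    """Map each phoneme symbol to C (consonant) or V (vowel)."""
    return "".join("V" if p in VOWELS else "C" for p in phonemes)


def fits_template(phonemes):
    """True if the syllable fits the basic CCCVVCCC template."""
    return TEMPLATE.fullmatch(cv_pattern(phonemes)) is not None


if __name__ == "__main__":
    strange = ["s", "t", "r", "e", "I", "n", "dZ"]
    print(cv_pattern(strange), fits_template(strange))
```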




Phonemes

  • There are several ways of defining phonemes, depending on which of the four sign components is in focus

  1. the minimal word-distinguishing sound segment (based on the contrastive function of phonemes)

  2. The smallest unit of a syllable (based on external sound structure)

  3. Consists of distinctive features (based on the internal sound structure)

  4. Consists of a set of allophones (based on the rendering of phonemes)




Description of sounds

  • For general pronunciation representation in the lexicon -> phonemic transcription

  • just enough phonetic detail to distinguish words

  • For detailed representation of speech pronunciation -> phonetic transcription

  • based on articulatory phonetics (about speech production)

  • remember the other dimensions of speech description:

  • acoustic phonetics (about speech wave transmission)

  • auditory phonetics (about speech perception)




„Swallowing“ characters

  • „chbimmim“ -> „Ich bin mit dem“ (Auto gefahren), German for „I went by car“

  • you do not actually swallow characters: in fast speech the tongue has no time to articulate every sound, so the utterance is reduced and some sounds are left out

     


Spelling- to- Sound rules

  • Spelling: VISUAL modality

  • ghoti ... /fish/ -> „gh“ = „f“ in „tough“ , „o“ = „i“ like in „women“ and „ti“ = „sh“ like in „nation“

  • “i before e except after c”, consonant doubling

  • Graphemes:

  • character combination corresponding to a phoneme

  • Transcribe phonemically (without stress marks):

  • If the bread dough is tough, knead it roughly, even though when you’re through you’ll have had enough and will throw it at the ceiling.

  • /If D@ brEd d@U: Iz tVf ni:d It rVfli: i:v@n D@U: wEn ju@ Tru: ju:l @v h{d InVf @nd wIl Tr@U It {t D@ si:lIN/


  • Task

  • make a list of 5 spelling rules

     

  • make a list of 5 main spelling problems




      Basic Rules

Remember this poem to decide if a word should be spelled ie or ei.

Put i before e

Except after c

Or when it sounds like a

As in neighbor or neigh.

 

Examples for line 1:

mischief

believe

field

 

 

Examples for line 2:

receiver

conceited

 

 

Examples for line 3:

eight

weigh

freight

 

Some Exceptions:

friend

neither

leisure

foreign

       

 

 


 

      Follow these steps to decide if a final consonant needs to be doubled when a suffix or verb ending is added.

      • If the word is one syllable or is stressed on the last syllable (say the word out loud to determine stress)

      • and has a single final consonant

      • and that single final consonant is preceded by a single vowel

      • and the suffix begins with a vowel,

      • then double the final consonant.

      • Example: control + able

      • The stress is on the last syllable: trol

      • There is a single final consonant: l

      • The final consonant has a vowel before it: o

      • The suffix, able, begins with a vowel

      • Therefore, you double the l before adding the suffix.

      • Write controllable.

      • Example: enter + ing

      • The stress is on the first syllable (en), not the last

      • Therefore, you do not double the final consonant.

      • Write entering.

      How to handle a final e when adding a suffix or verb ending

      • If the suffix or verb ending begins with a vowel, drop the final e.

 

Examples:

amuse + ing = amusing

 

 

creative + ity = creativity

     

 


 

 


 

        If the suffix or verb ending begins with a consonant, keep the final e.

 

Examples:

measure + ment = measurement

 

 

definite + ly = definitely
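The consonant-doubling rule and the final-e rule quoted above can be turned into a small function. This is a minimal sketch: the function name `add_suffix` and the parameter `stressed_last` are assumptions, stress has to be supplied by the caller because it cannot be read off the spelling, and cases the quoted rules do not mention (for example words ending in w, x or y) are not handled.

```python
VOWEL_LETTERS = set("aeiou")


def add_suffix(word, suffix, stressed_last=True):
    """Attach a suffix using the doubling rule and the final-e rule.

    stressed_last: True for one-syllable words and for words stressed
    on the last syllable (e.g. "control"), False otherwise (e.g. "enter").
    """
    suffix_starts_with_vowel = suffix[0] in VOWEL_LETTERS

    if word.endswith("e"):
        # Final-e rule: drop the e before a vowel, keep it before a consonant.
        return (word[:-1] if suffix_starts_with_vowel else word) + suffix

    # A single final consonant preceded by a single vowel letter.
    single_final_consonant = (
        len(word) >= 2
        and word[-1] not in VOWEL_LETTERS
        and word[-2] in VOWEL_LETTERS
        and (len(word) < 3 or word[-3] not in VOWEL_LETTERS)
    )
    if stressed_last and single_final_consonant and suffix_starts_with_vowel:
        return word + word[-1] + suffix  # double the final consonant
    return word + suffix


if __name__ == "__main__":
    print(add_suffix("control", "able"))
    print(add_suffix("enter", "ing", stressed_last=False))
    print(add_suffix("amuse", "ing"))
    print(add_suffix("measure", "ment"))
```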

     

 



 

 

Examples:

belief = beliefs

 

 

half = halves

      • Most nouns ending in o add s. However, some add es.

        There is no rule to follow here.

 

Examples:

studio = studios

 

 

cargo = cargoes

 

( http://www.mc3.edu/aa/lal/workshops/wksp_spelling/spellingrules.html )






21. 11. 2006 Lexicon Data and their Structure

 




Lexicon structure and their data types

  • Microstructure

  • number of lexicon articles/entries/records

  • order of DatCats ( datacategories)

  • Mesostructure

  • Interrelation of lexicon entries

  • relation to external information

  • Macrostructure

  • order of lexicon entries

  • selection of sort key

  • sorting order not trivial! ( cf. Languages which are only spoken -> IPA)


    Sorting is NOT trivial; example: „@“

  • you would expect „h@me“ close to the word „home“

  • you would expect „intern@t“ close to the word „internet“

  • you would expect „@home“ close to the word „at“

    So where do you sort „@“?
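The problem can be tried out directly: the result depends entirely on the sort key that is chosen. The sketch below compares plain code-point order with one alternative key; treating „@“ as if it were spelled „at“ is just an assumption for illustration, and neither key satisfies all three expectations above.

```python
words = ["home", "h@me", "internet", "intern@t", "at", "@home"]

# Option 1: plain code-point order; "@" (U+0040) sorts before all letters.
by_code_point = sorted(words)


# Option 2 (an assumption for illustration): treat "@" as if it were spelled "at".
def at_key(word):
    return word.replace("@", "at")


by_at_spelling = sorted(words, key=at_key)

if __name__ == "__main__":
    print(by_code_point)   # ['@home', 'at', 'h@me', 'home', 'intern@t', 'internet']
    print(by_at_spelling)  # ['at', '@home', 'h@me', 'home', 'intern@t', 'internet']
```

With the first key „@home“ ends up before every ordinary word; with the second it lands next to „at“, but „h@me“ is then sorted as if it were „hatme“.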


Haus -> Häuser

Hauses -> Häuser

Hause -> Häusern

Haus -> Häuser ( which form would you find in a dictionary? -> Haus)


Latin a-declension

flamm – a, – ae, - ae, -am, -a, -ae, -arum, - is, - as, -is ( which one would you find here? -> flamma )


Microstructure

  • words (most) (except for picture dictionaries)

  • grammatical information: syntax

  • part of speech (POS)

  • inflectional class

  • valence (which verb takes (how many) objects; transitive, intransitive)

  • representation of meaning (formats differ)

  • semantics

  • definition

  • corpus reference := usage examples



Detour: CORPUS

-> collection of language material

  • texts

  • transcripts

  • speech ( transcription in IPA)

  • examples : Oxford corpus, Longman corpus

-> with additional information

  • Part Of Speech

  • lemma ( de- grammaticalized form of a word)

  • transcription

  • annotations

-> with a specific structure

  • interlinear glossing

  • special make up




Other types of lexicons


  • Word frequency lexicon

  • the most frequent one first

  • Lexicon of "phrasal verbs"

  • by part of speech and a special structure

  • rhyming lexicon

  • by word ending

  • picture lexicon

  • by prototype




Problematic issues in lexicography


  • ambiguity

  • synonyms ( two word forms , same meaning)

  • polysemy ( one word form, two or more (slightly) different meanings)

  • homonyms ( one word form, meaning completely different)


  • word search

  • languages with inflectional prefixes

  • orthographic ambiguity

  • picture lexicons?

     

  • Language change

  • new words

  • new meaning



  • Solutions to problems

  • ambiguity : enumeration

  • word search: „arbitrary“ definition

  • language change: new edition

  • more fundamental solutions




Methods of creating lexicons

  • introspection

  • look inside ( by trained linguist)

  • reflecting one's own language use

  • „social“ filter: relevance, importance, adequacy

  • Questionnaire

  • in comparative linguistics

  • typology

  • unknown language -> picture dictionary

  • point at picture ( might be rude in some countries)

     

  • requirements and limitations

  • intended use: researching morphology, use in computer systems, translation

  • intended usergroup: experts, lay, translators,linguists,

  • intended coverage: general, special purpose

  • available sources: availability of language experts (native speakers)

  • example questionnaire :

  • Asking questions for translation, explanation

  • Social filters apply

  • http://www.spectrum.unibielefeld.de/~ttrippel/htmd/questionnaire_short.html


     

  • corpus

     

     


Corpus based lexicon creation

  • "reflect the evidence"

  • include "words" found

  • exclude items not in corpus

  • based on corpora

  • list all words: wordlist

  • words in context: concordance

  • distribution analysis: HMM

  • flat tabular lexicon

  • generalizations in the lexicon

  • declarative lexicons
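The first two corpus-based steps, the wordlist and the concordance, can be sketched in a few lines. This is only an illustration: the tokenizer is deliberately crude, the function names are assumptions, and the one-sentence `CORPUS` stands in for a real corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens (a very crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def wordlist(text):
    """All word forms found in the corpus, with their frequencies."""
    return Counter(tokenize(text))


def concordance(text, keyword, width=2):
    """Keyword-in-context lines: `width` tokens on each side of every hit."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


CORPUS = "The Republicans lost control of both the House of Representatives and the Senate."

if __name__ == "__main__":
    for word, count in wordlist(CORPUS).most_common(3):
        print(word, count)
    for line in concordance(CORPUS, "of"):
        print(line)
```

Only items that occur in the corpus can appear in the wordlist, which is exactly the "reflect the evidence" principle: nothing is included that was not found.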




Hierarchy of lexicon and corpus types



Corpus based lexicon creation application

  • SIL toolbox

  • Summer Institute of Linguistics

  • famous for fieldwork tools

  • language database: www.ethnologue.org

  • previously named "shoebox"

  • future: fieldworks

  • Interlinearization of text

  • one line "base" text

  • one line gloss

  • one line morphology

  • ....




Lexicon Database Applications

  • Lists

  • Table

  • Tables

  • Relational Database Management Systems (RDBMS)

  • samples

  • Corpus based lexicon management

  • Graph based lexicons




Relational Model for a Lexicon

  • table structures

  • efficient storage and retrieval in Relational Database Management Systems (RDBMS)

  • often used for technological applications

  • used for some web based lexicons

  • translation = mapping of two different columns

  • example: http://dict.tu-chemnitz.de
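A relational lexicon of this kind can be tried out with the `sqlite3` module from the Python standard library. The sketch below is only an illustration: the table name, the column names and the three sample entries are assumptions, not data from any real dictionary.

```python
import sqlite3


def build_lexicon():
    """Create an in-memory table with one row per lexicon entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lexicon (english TEXT, german TEXT, pos TEXT)")
    conn.executemany(
        "INSERT INTO lexicon VALUES (?, ?, ?)",
        [  # illustrative sample entries
            ("house", "Haus", "noun"),
            ("love", "Liebe", "noun"),
            ("bread", "Brot", "noun"),
        ],
    )
    return conn


def translate(conn, english):
    """Translation = mapping from one column to another column of the same row."""
    row = conn.execute(
        "SELECT german FROM lexicon WHERE english = ?", (english,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_lexicon()
    print(translate(conn, "house"))
    print(translate(conn, "cat"))
```

Looking up in the other direction only requires swapping the two columns in the query, which is why a table suits bilingual word lists well.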




Graph based lexicon

  • Lexical information = nodes in a graph

  • microstructure = (labeled) arcs between nodes

  • crossreferences = arcs between nodes

  • mesostructure = reference to external knowledge

  • macrostructure = access structure, starting at each node
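The graph view can be sketched as a list of labelled arcs. The arc labels (`pos`, `definition`, `see_also`) and the second entry are assumptions made for illustration; the entry for „love“ uses the values from the table example in these notes.

```python
# Each arc is (source node, label, target node).
ARCS = [
    ("love", "pos", "noun"),
    ("love", "definition", "a feeling of strong affection"),
    ("love", "see_also", "affection"),  # cross-reference to another entry
    ("affection", "pos", "noun"),
]


def entry(node):
    """Collect the labelled arcs leaving one node, i.e. its microstructure."""
    result = {}
    for source, label, target in ARCS:
        if source == node:
            result.setdefault(label, []).append(target)
    return result


if __name__ == "__main__":
    print(entry("love"))
    print(entry("affection"))
```

Access can start at any node, so there is no single fixed order of entries as in a printed dictionary.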



Summary

  • Lexicon structures and data types

  • microstructure data types

  • different macrostructures

  • Lexicon creation

  • questionnaire

  • corpus

  • Lexicon representation formats

  • RDBMS

  • graphs

14. 11. 2006 Lexical databases

 




Surface structure (appearance, rendering) of dictionaries

  • semasiological dictionary (reader's dictionary, decoding dictionary)

  • onomasiological dictionary (writer's dictionary, encoding dictionary)



An overview of surface structures



The deep structure of dictionaries

semasiological dictionary

  • basic form : table

  • rows: lexical entries with specific microstructure

  • columns: single types of lexical information

  • if orthography or phonology ambiguous

  • either item is repeated with the new information

  • or sub table

  • depends on kind of ambiguity

  • homonymy (homography, homophony)

  • polysemy

  • homonym

  • a word that has the same pronunciation and spelling as another word, but a different meaning.

    Example: The word stalk, meaning either part of a plant or to follow (someone) around.

  • Homograph

  • a word that has the same spelling as another word, but a different meaning. Example: The spelling to cleave may denote to adhere to or to divide or split.

  • Homophone

  • a word that has the same pronunciation as another word, but whose meaning and/or spelling are different, . Example: All of to, too, and two, or there, their, and they’re ( http://en.wikipedia.org/wiki/Homonym )

  • polyseme

  • a word or phrase with multiple, related meanings. http://en.wikipedia.org/wiki/Polysemy



Dictionary Information

  • Metadata: catalogue information about the production of the dictionary, intended for dictionary identification

  • Types of lexical information in dictionary entries:

  • FORM (cf. appearance), e.g. spelling, pronunciation

  • STRUCTURE (cf. formulation), e.g. construction of words, place of words in larger constructions (e.g. sentences)

  • CONTENT (cf. Meaning): definition, relations with other words, examples



The task

Exercise: To understand what a database basically is,

create a table with one of the following:

  • a list of your CDs (well, some of them), with name, artist, ...

  • a list of your friends, with names, addresses, etc.


Name | First name | Birthday

Bechtloff | Corinna | 20. 01. 1986

Bryczek | Natalia | 04. 03. 1987

Höppner | Marie Luise | 12. 09. 1986

Kerker | Kristina | 14. 06. 1986

Schneider | Sarah | 21. 06. 1987



Basic model of a table

  • table: a list of rows

  • row: a list of fields

  • column: a list of fields in the same row position
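This basic model translates directly into nested lists. In the sketch below the names `HEADER`, `TABLE` and `column` are assumptions; the first row is the dictionary entry used in the HTML example in these notes, and the second row is an illustrative addition.

```python
HEADER = ["word", "pos", "definition"]

# table: a list of rows; row: a list of fields
TABLE = [
    ["love", "noun", "a feeling of strong affection"],
    ["table", "noun", "a piece of furniture with a flat top"],  # illustrative row
]


def column(table, header, name):
    """column: the fields that share the same position in every row."""
    index = header.index(name)
    return [row[index] for row in table]


if __name__ == "__main__":
    print(column(TABLE, HEADER, "word"))
    print(column(TABLE, HEADER, "pos"))
```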



How to ...

... create tables in Open Office/ Microsoft Word

  • „Table“ -> „Insert“ -> „Table“ -> choose number of rows/ columns -> create table


How to ...

... create tables in Ms Excel/ Open Office Calc

  • start program


The html table model


<html>

<head>

<title>Example of the HTML table model</title>

</head>

<body>

<table border="20">

<tr>

<td>love</td>

<td>noun</td>

<td>a feeling of strong affection</td>

</tr>

</table>

</body>