The following manual describes the guidelines that the editors asked the database contributors to follow in filling in the questionnaire. It defines all the key concepts, discusses a number of potential issues, and gives some examples.
The Leipzig Valency Classes Project (funded by the DFG) is carrying out a large-scale cross-linguistic comparison of valency classes (or "verb types", in terms of Levin 1993). With respect to their valency properties, verbs fall into different classes in all languages. The project is inspired by Levin (1993), a classic study of syntactic classes of verbs in English, which shows that a semantic classification of verbs can be achieved by applying syntactic diagnostics. Yet this study, like an earlier study by Apresjan (1967) on Russian, has not been followed up cross-linguistically, which leaves open the question of which aspects of these classifications are universal and which are language-particular. Similarly, valency dictionaries are few in number and mostly deal with European languages, so they cannot fill the gap.
To make progress in the cross-linguistic study of valency classes, the members of the Valency Classes Project (Andrej Malchukov, Martin Haspelmath, Bernard Comrie, Iren Hartmann) have assembled a group of contributors, specialists in the valency patterns of a representative set of languages, who are collaborating to obtain a consistent set of cross-linguistic data. The contributors have been asked to do three things as part of their contribution to the project:
In the following, the database questionnaire is described. (Items (ii) and (iii) are described elsewhere.)
The purpose of the Valency Database Questionnaire is to gather comparable valency data on a wide variety of languages. The data will be comparable because the verbs are elicited in response to a list of 70 pre-defined verb meanings, and because the valency information is recorded in a standard way. The 70 verb meanings are described in Table 1, in three columns: (i) the meaning label, (ii) the role frame, and (iii) a typical context. (Since the English verbs used as meaning labels sometimes have several meanings, we have added a sentence for each verb meaning that makes the intended meaning clear. These sentences are not crucial; they are just intended to help the contributors find a context for a counterpart.)
Table 1: The 70 verb meanings
meaning label | role frame | typical context |
---|---|---|
RAIN | (it) rains | It rained yesterday. |
BE DRY | S is dry | The ground is dry. |
BURN | S burns | The house is burning. |
SINK | S sinks | The boat sank. |
ROLL | A rolls | The ball is rolling. |
BE A HUNTER | S is a hunter | This man is a hunter. |
BE HUNGRY | E is hungry | The baby is hungry. |
BE SAD | E is sad | The little girl was sad. |
DIE | S dies | The snake died. |
FEEL COLD | S is cold | I’m cold. |
FEEL PAIN | E feels pain in M | My arm is hurting. = I’m feeling pain in my arm. |
SCREAM | S screams | The man screamed. |
LAUGH | S laughs | The little girl laughed. |
PLAY | S plays | The child is playing. |
LIVE | S lives somewhere (L) | The old people live in town. |
LEAVE | A left L | The boy left the village. |
GO | S goes somewhere (L) | The woman went to the market. |
SING | S sings | The boy sang (a song). |
JUMP | A jumps | The girl jumped. |
SIT DOWN | S sits down (somewhere (L)) | The children sat down on the bench. |
SIT | S sits somewhere (L) | The children sat on the floor. |
RUN | A runs | The horse is running. |
CLIMB | A climbs (up L) | The men climbed (up) the tree. |
COUGH | S coughs | The old man coughed. |
BLINK | S blinks | I blinked (my eyes). |
SHAVE | A shaves (his beard/hair) | The man shaved his beard/cut his hair. |
DRESS | A dresses P | The mother dressed her daughter. |
WASH | A washes P | The mother washed the baby. |
EAT | A eats P | The boy ate the fruit. |
HELP | A helps X | I helped the boys. |
FOLLOW | A follows X | The boys followed the girls. |
MEET | A meets X | The men met the boys. |
HUG | A hugs P | The mother hugged her little boy. |
SEARCH FOR | A searches for X | The men searched for the women. |
THINK | A thinks about X | The girl thought about her grandmother yesterday. |
KNOW | A knows P | The girl knew the boy. |
LIKE | E likes M | The boy liked his new toy. |
FEAR | E fears M | The man feared the bear. |
FRIGHTEN | A frightens P | The bear frightened the man. |
SMELL | E smells M | The bear smelled the boy. |
LOOK AT | A looks at P | The boy looked at the girl. |
SEE | E sees M | The man saw the bear. |
TALK | A talks (to X) (about Y) | The girl talked to the boy about her dog. |
ASK FOR | A asks (X) for Y | The boy asked his parents for money. |
SHOUT AT | A shouts at X | The woman shouted at the children. |
TELL | A tells (X) Y | The girl told the boy a funny story. |
SAY | A says “...” (to X) | They said “no” to me. |
NAME | A names X (a) Y | The parents called the baby Anna. |
BUILD | A builds P (out of X) | The men built a house out of wood. |
BREAK | A breaks P (with I) | The boy broke the window with a stone. |
KILL | A kills P (with I) | The man killed his enemy with a club. |
BEAT | A beats P (with I) | The boy beat the snake with a stick. |
HIT | A hits P (with I) | The boy hit the snake with a stick. |
TOUCH | A touches P (with I) | The boy touched the snake with a stick. |
CUT | A cuts P (with I) | The woman cut the bread with a sharp knife. |
TAKE | A takes P (from X) | The man took the money from his friend. |
TEAR | A tears P (from X) | The girl tore the page from the book. |
PEEL | A peels (X off) P | The boy peeled the bark off the stick. |
HIDE | A hides T (from X) | The boy hid the frog from his mother. |
SHOW | A shows T (to R) | The girls showed pictures to the teacher. |
GIVE | A gives T to R | We gave the books to the children. |
SEND | A sends T (to X) | The girl sent flowers to her grandmother. |
CARRY | A carries T (to X) | The men carried the boxes to the market. |
THROW | A throws T somewhere (L) | The boy threw the ball into the window. |
TIE | A ties P (to L) (with I) | The man tied the horse with a rope to the tree. |
PUT | A puts T somewhere (L) | I put the cup onto the table. |
POUR | A pours T somewhere (L) | The man poured water into the glass. |
COVER | A covers P (with X) | The woman covered the boy with a blanket. |
FILL | A fills P (with X) | The girl filled the glass with water. |
LOAD | A loads T (onto L) | The farmer loaded hay onto the truck. = The farmer loaded the truck with hay. |
This sample of 70 verbs, which was selected in part on the basis of feedback from the contributors, is hoped to be representative of the verbal lexicon. In particular, these are verb meanings that have been reported in the literature to show distinctive grammatical behaviour. Of course, the editors had to strike a balance between the temptation to include more interesting verb meanings and the risk of overloading contributors with a more extensive list. Further interesting verbs can be added to the database for individual languages (see §9.2).
Contributors are asked to provide four types of information, in four different tables of the database:
Please find a counterpart verb in your language for each of the pre-defined verb meanings. Note that the meaning definition is given by the role frame (plus optional comments in the meaning comments field, and example), not by the meaning label (COVER, FILL, etc.). The latter is only a label that serves as a heading for the record. The meaning thus includes information about semantic roles, for which the role frame is crucial (e.g. the meaning intended by WASH is 'A washes P', not 'A washes').
For each verb meaning of the database, please find the semantically most fitting non-derived verb in the project language. More generally, the verb should have a "basic" flavour, i.e. verbs that are used very rarely should be avoided if a more commonly used verb with a similar meaning is available. The "basicness" of the counterpart verbs is more important than exact semantic equivalence between the pre-defined meanings and the counterparts. The only requirement is that the correspondence is close enough to be helpful for the project.
The relationship between pre-defined meanings and counterpart verbs is many to many. This means that if there are two basic verbs in your language that correspond to a meaning, you should include them both, and if a single verb corresponds to two different meanings, it needs to be given only once.
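The many-to-many relationship can be pictured as a set of link records between meanings and verbs. The following is only a minimal sketch: the verb names `verb_1` and `verb_2` are hypothetical placeholders, and the layout is an assumption for illustration, not the structure of the project database.

```python
from collections import defaultdict

# Hypothetical link records: (pre-defined meaning label, counterpart verb).
# "verb_1" and "verb_2" are placeholders, not data from any language.
links = [
    ("HIT", "verb_1"),
    ("BEAT", "verb_1"),   # a single verb corresponding to two meanings
    ("HIT", "verb_2"),    # a second basic verb for the same meaning
]

verbs_by_meaning = defaultdict(set)
meanings_by_verb = defaultdict(set)
for meaning, verb in links:
    verbs_by_meaning[meaning].add(verb)
    meanings_by_verb[verb].add(meaning)

# Each verb is stored once, however many meanings it is linked to.
verb_records = sorted(meanings_by_verb)
```

Here `verb_1` is entered once although it is linked to both HIT and BEAT, while HIT is linked to two verbs.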
If there is no counterpart in your language and a paraphrase would have to be used (e.g. if HIDE is expressed as 'put something somewhere where others cannot see it'), please do not give a counterpart. (This means checking the field "No counterpart" in the Meanings layout.)
This does not mean that verbs that are complex in some way should be excluded. If a complex verbal expression, whether a morphologically derived form or a syntactic phrase, is conventional (or lexicalized), it should be included, even if it is productively derived. Thus, complex predicates like English carry out or German teil-nehmen 'take part', as well as serial-verb complexes like Yoruba bá ... wí 'rebuke', should be regarded as single verbs.
If a verb has other meanings than the pre-defined meaning for which it is a counterpart, it is helpful to include these other meanings in the comments field. (It is also possible to create new meanings; see §9.3 below for this advanced feature.)
Homonymous verbs should be distinguished by a number, e.g. lie (1) ('be in a horizontal position') and lie (2) ('say something that is untrue').
The "comments" field can contain further non-structured information about all aspects of the verb and its relationship to the pre-defined meaning.
DATABASE FIELDS:
relationships:
Please provide glossed examples of sentences illustrating the coding frames and alternations. There should be (at least) one example for each of the counterpart verbs with the basic coding frame, as well as (at least) one example of each of the alternations (with some verb).
Ideally there would also be an example of each possible verb-alternation pair (where the verb should be shown in the non-basic alternant form). However, since some alternations are very productive and occur with a large number of verbs, this would be too much work for most contributors, so a complete set of verb-alternation pair examples is not required.
DATABASE FIELDS:
relationships:
In our terminology, the valency of a verb is the list of its arguments with their coding properties (coding frame), their behavioural properties (syntactic-function frame), and with the relationship of the arguments to the roles in the verb's role frame.
Coding properties involve the following techniques (Haspelmath 2005; Malchukov, Haspelmath & Comrie 2010):
Behavioural properties concern the behaviour with respect to patterns like reflexivization, passivization, causativization, argument omission, and various cross-clausal constructions such as control, switch-reference, raising and coordination. As a verb's valency consists of both the coding properties and the behavioural properties of its arguments, all these are in principle relevant to valency. However, in the Leipzig Valency Classes Project, only coding properties and one aspect of the behavioural properties, alternations, are taken into account.
An example of a full specification of a verb's valency is given below for English remind:
English remind
coding frame: | A > V.subj[A] > R > of+T |
syntactic-function frame: | subject_X direct.object_Y oblique.object_Z |
role frame: | A reminds R of T (or: agent_A addressee_R theme_T; or: A cause (remember R T)) |
An argument of a verb is a phrase whose occurrence is made possible by a specific verb, and which therefore cannot occur with a generic verb. This can be tested by attempting to move a phrase into a neighboring clause with an anaphoric verb, as shown in (1a, 1c). Adjuncts, by contrast, are not tied to particular verbs and can therefore be moved out into a clause with an anaphoric verb (1b, 1d):
(1)
If you are unsure whether a participant is an argument or an adjunct, please make an arbitrary choice and include a comment in the verb layout's comment field. (Please do not use language-specific criteria for "argumenthood", because these are really criteria for different concepts labeled "argument", not for comparable 'argument' concepts.)
Different verbs have different coding frames, but in all languages, coding frames recur, e.g. in German, the sample coding frames shown in (2a-c) occur in many different verbs:
(2)
(Note: capital letters other than V are used as argument/participant variables.)
Since coding frames are not unique, but recur, they are in a separate table of the database. Once a coding frame has been set up for a verb, all other verbs using the same coding frame can make use of the same record, i.e. coding frames need to be entered only once and can thereafter be simply selected from the list of existing coding frames.
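The idea of entering a coding frame once and then selecting it for further verbs can be sketched as a small get-or-create table. This is an illustrative assumption about how such reuse might work, not a description of the actual database; the class `CodingFrameTable`, the method `get_or_create` and the verb `verb_x` are hypothetical, while the frame string is the one this manual gives for German bitten.

```python
class CodingFrameTable:
    """Stores each distinct coding frame once and hands out its record id."""

    def __init__(self):
        self._ids = {}     # frame string -> record id
        self.frames = []   # record id -> frame string

    def get_or_create(self, frame: str) -> int:
        key = " ".join(frame.split())  # normalise whitespace only
        if key not in self._ids:
            self._ids[key] = len(self.frames)
            self.frames.append(key)
        return self._ids[key]


table = CodingFrameTable()
verb_frame = {}
# First verb: the frame is created.
verb_frame["bitten"] = table.get_or_create("A-nom V R-acc um+T")
# Hypothetical second verb with the same frame: the existing record is reused.
verb_frame["verb_x"] = table.get_or_create("A-nom  V R-acc um+T")
```

Both verbs end up pointing at the same record, so the table holds a single frame.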
A coding frame is a list of all the argument variables with their standard coding, i.e. flagging (case-marking or adposition marking), indexing ("agreement" or "cross-referencing"), possibly word order, plus a verb variable.
The following conventions are used:
Note that case labels and index labels are language-specific entities, just like adpositions. There is no need to standardize (or abbreviate) them, because they will eventually have to be defined in full detail anyway (but not within our database). Transparent labels are good for mnemonic reasons, though. If word order is not consistent or particularly distinctive, the order in which the argument variables and the verb variable occur in the coding frame is in principle arbitrary, but for mnemonic reasons, an order that corresponds to the most frequent clause order is advisable.
It is important to note that the coding frame descriptions are necessarily schematic and partial and cannot possibly describe the coding fully. For example, if an index can either be a prefix or a suffix, then a salient or dominant position should be chosen, without worrying about the incompleteness of the coding frame. Another example is word order, which often obeys complex regularities that cannot possibly be captured by simple schemas.
In most cases, arguments are noun phrases or adpositional phrases with specific coding. However, arguments may also be adverbial arguments, in particular locational arguments. These arguments do not have any specific coding, cf. (3). In such cases, all we can say is that the argument is a locational phrase of some sort, and the coding frame is "A V T LOC".
(3)
Whether the arguments are obligatory or optional is not recorded in the coding frames, because it may depend on too many different factors, most of which have nothing to do with the verb-specific coding. If an argument can be omitted only with certain verbs, then this should be regarded as an argument-omission alternation (e.g. object omission with existential/generic meaning, which is possible for objects of certain verbs in English: He often reads/eats/drinks).
If a verb is associated with several different coding frames that are not related by a regular alternation, then the verb has to be split up (somewhat artificially) into two different homonymous verbs. (In reality, the relationship between verbs and coding frames is many-to-many, but we simplify our database structure by pretending that it is many-to-one.)
The contributors indicate the semantic roles of the arguments by using the same variables as argument variables that are used as participant variables in the role frames of the pre-defined meanings, for instance:
pre-defined meaning label: | ASK FOR |
pre-defined role frame: | 'A asks R for T' |
German counterpart verb: | bitten |
coding frame: | A-nom V R-acc um+T |
In this way, it becomes clear that the nominative argument has the asker role, the accusative argument has the askee role, and the complement of the preposition um has the role of thing asked for.
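How a coding frame string ties role variables to their coding can be illustrated with a small parser. This is a hedged sketch: the token shapes it accepts (`X-case`, `adposition+X`, bare `X`, `LOC`, a verb token beginning with `V`, and `>` as an order marker) are inferred from the frames quoted in this manual and are an assumption, not the project's formal notation. The function name `parse_coding_frame` is hypothetical.

```python
import re

# An argument token: optional "adposition+", a variable, optional "-case".
ARG = re.compile(
    r"^(?:(?P<adp>[^\W\d_]+)\+)?(?P<var>LOC|[A-Z])(?:-(?P<case>\w+))?$"
)


def parse_coding_frame(frame):
    """Return (verb_token, {variable: flagging}) for a coding frame string."""
    verb, args = None, {}
    for token in frame.split():
        if token == ">":  # order marker, as in "A > V > R > T"
            continue
        if token == "V" or token.startswith(("V.", "V-")):
            verb = token  # e.g. "V", "V.subj[A]", "V-sja"
            continue
        m = ARG.match(token)
        if m is None:
            raise ValueError(f"unrecognised token: {token!r}")
        args[m["var"]] = {"adposition": m["adp"], "case": m["case"]}
    return verb, args


verb, args = parse_coding_frame("A-nom V R-acc um+T")
```

For the bitten frame, `args` maps A to nominative case, R to accusative case, and T to the preposition um, which mirrors the role assignment described above.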
It is important to be aware that the semantic roles in the role frames do not necessarily correspond to syntactic arguments in your language. (The role frames are not "semantic valency frames", and we do not work with a semantic definition of arguments.) The role frames are intended to convey the verb's meaning and to provide readily identifiable participant variables, not to prejudge the number of syntactic arguments that a counterpart verb will have.
Note that in the role frames, some arguments are given in parentheses. This has no particular meaning, but is intended to draw attention to the fact that we do not expect all roles to correspond to arguments of the counterpart verb of your language. The roles in parentheses seem to us particularly likely not to correspond to arguments, but this is just an expectation based on our impressions. What counts are the facts of your language.
If your verb is associated with two coding frames that are related via an uncoded alternation (e.g. English give, which can have the coding frame "A > V > R > T" or the coding frame "A > V > T > to+R"), the decision of which coding frame to take as basic may not be possible on a principled basis, so an arbitrary decision may have to be taken.
If your verb has an argument with a semantic role that is not included in the role frame, please use additional role variables and characterize the additional roles in the comment field of the coding verb.
DATABASE FIELDS:
For the purposes of our project, a valency alternation is defined as a set of two different coding frames that are productively (or at least regularly) associated with both members of a set of verb pairs sharing the same verb stem. (Thus, we are restricting ourselves to coding alternations.) Valency alternations may or may not be coded in the verb form, and accordingly we distinguish two types of alternations, uncoded alternations (such as the English Dative Shift alternation) and coded alternations (such as the English Passive alternation).
Examples of alternations are the following:
language | alternation | coding frames | example verbs |
---|---|---|---|
English | Dative Shift (uncoded) | A > V > R > T <> A > V > T > to+R | give, sell, lend, ... |
Russian | Anticausative alternation (coded) | A-nom V X-acc <> X-nom V-sja | lomat' 'break', otkryt' 'open', ... |
German | Passive Voice (coded) | A-nom X-acc V <> X-nom von+A V-t+werden | nehmen, bringen, schicken, lachen, ... |
Alternations are relevant to the project to the extent that they are sensitive to verb classification (e.g., some varieties of differential object marking apply to any transitive verb, which does not yield an interesting clustering of verb types). More generally, the alternations most relevant for our purposes are those that are fairly productive (not restricted to a few lexical items) but, most importantly, are sensitive to lexical classes. That is, we are interested in alternations that are distinctive for the verbal lexicon (as sampled here) rather than in those that apply across the board or apply to just a few items.
One of the alternants needs to be designated as "basic" for the purposes of the database. In a coded alternation, this will be the alternant with no coding on the verb. In an uncoded alternation (as well as in a doubly coded alternation), the contributor has to choose one alternant as "basic". It is understood that a principled decision will often be difficult or impossible, so an arbitrary decision can be taken.
Contributors will ideally enter all (major) valency alternations of their language into the "alternations" table of the database, but if there are more than ten such alternations, then this will be a lot of work, and they should contact the editors.
Then for each verb, they will have to decide whether the verb occurs in the alternation or not, or whether it applies only marginally.
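The three-way decision per verb (occurs, does not occur, occurs only marginally) can be sketched as follows. The class and field names are hypothetical, the choice of which Dative Shift frame counts as basic is arbitrary here, and the entry for eat is an illustrative assumption rather than project data; give and sell are taken from the example list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Occurs(Enum):
    REGULARLY = "regularly"
    MARGINALLY = "marginally"
    NEVER = "never"


@dataclass
class Alternation:
    name: str
    coded: bool              # is the alternation marked on the verb form?
    basic_frame: str
    derived_frame: str
    verbs: dict = field(default_factory=dict)  # verb -> Occurs value

    def record(self, verb, occurs):
        self.verbs[verb] = occurs

    def members(self):
        """Verbs that regularly undergo the alternation."""
        return sorted(v for v, o in self.verbs.items() if o is Occurs.REGULARLY)


dative_shift = Alternation(
    name="Dative Shift",
    coded=False,
    basic_frame="A > V > T > to+R",   # arbitrary choice of basic alternant
    derived_frame="A > V > R > T",
)
dative_shift.record("give", Occurs.REGULARLY)
dative_shift.record("sell", Occurs.REGULARLY)
dative_shift.record("eat", Occurs.NEVER)      # illustrative assumption
```

Recording an explicit value for every verb, including the negative cases, is what makes it possible to compare verb classes across alternations.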
DATABASE FIELDS:
relationships:
In this subsection only uncoded alternations are considered; discussion of coded alternations is postponed to the next subsection. The two subsections should be seen as largely complementary, as alternations will be coded by dedicated markers in some languages (apparently especially languages with a richer morphology) and left uncoded in other languages (like English).
We begin with English, a rather atypical example of a language with a particularly rich set of uncoded alternations. Levin (1993) mentions, in particular, the following alternations. (The list below mentions only fairly productive alternations and does not include verb-coded alternations like the passive alternation; the terminology partially differs from Levin's.)
Languages with richer morphology often use coded alternations for many argument alternations left uncoded in English. Thus in Even, a Tungusic language, the “middle alternation” is signaled by the mediopassive marker, the “inchoative-causative alternation” is signaled by the causative marker (in competition with the mediopassive), while equivalents of English verbs allowing for a “reciprocal alternation” commonly involve a lexicalized sociative marker (e.g. baka-lda [find-SOC] ‘meet’).
The same kind of taxonomy that we adopted for uncoded alternations can be used for the domain of verb-coded alternations:
Note that some languages have several causative markers, for example, for building intransitive vs. transitive causatives. These can be used to test for transitivity of less prototypical transitive verbs.
There may be several subtypes of applicatives, depending on which object is promoted (for example, in Hoocąk (Siouan), there are 4 different applicative markers, including the benefactive applicative, the instrumental applicative and two types of locative applicatives). On the other hand, the general applicative in Salish has been claimed to have different meanings depending on the verb’s class. Applicatives may be used to render many of the alternations listed in §8.2, including the dative shift, or locative alternation (cf. (f)). Other languages may use directional markers to code some of these alternations (cf. Russian na-gruzit’ seno na telegu [PREF-load hay on cart] ‘to load the hay on the cart’ vs. za-gruzit’ telegu senom [PREF-load cart hay.INS] ‘to load the cart with hay’; German laden vs. be-laden).
Not all alternations can be easily classified as valency-increasing/rearranging or valency-reducing. For example, some (Austronesian) languages show a variety of “voice” (or “focus”) forms (“actor focus”, “goal focus”, etc), used for ‘promotion’ of different objects to the subject position; for these languages it will be relevant which Vs allow for which voice constructions. On the other hand, head-marking languages of the “hierarchical type” show a direct-inverse alternation triggered by the relative prominence of the A and P arguments. In that case it is relevant to study the use of direct-inverse alternations with different groups of two and three argument verbs (in the latter case, it is also relevant which of the object arguments takes part in the alternation; e.g., Theme or Recipient of a ditransitive verb). But also for the domain of monotransitives some languages may show further differentiation; e.g., some languages (like Tlapanec) may have different inverse forms for different subtypes of 2-argument verbs.
For ten additional verb meanings, counterparts may optionally be given, see Table 2.
Table 2: Ten additional verb meanings
meaning label | role frame | typical context |
---|---|---|
BRING | A brings T to R | The girl brought flowers to me. |
PUSH | A pushes P (somewhere (L)) | The boy pushed the girl (into the water). |
DIG | A digs (for X) | The woman is digging for potatoes. |
WIPE | A wipes T (off X) | The women wiped dirt off the table. |
STEAL | A steals T (from X) | The thief stole money from the old lady. |
GRIND | A grinds P (with I) | The women ground the seeds (with mortar and pestle). |
HEAR | E hears M | The boy heard the bear. |
TEACH | A teaches R T | The old lady taught the girl a song. |
COOK | A cooks P | The women cooked the meat. |
BOIL | S boils | The water is boiling. |
For each alternation and each coding frame, contributors may optionally enter additional verbs that are not counterparts of the pre-defined meanings but that undergo the alternation or occur in the coding frame. These further verbs should be linked to newly created meanings (see §9.3).
Ideally, the information provided on these additional verbs would be complete, i.e. it would include the coding frame (also if the verb is given as an additional example of an alternation) and the behaviour in alternations (also if the verb is given as an additional example of a coding frame).
In addition to the 70 pre-defined verb meanings, users may create further verb meanings if they want to enter verbs that have other meanings.
DATABASE FIELDS:
If a coding frame is used for an open class of verbs (i.e. if it is productive), please specify (optionally) which verbs may belong to this class (in terms of relevant semantic or formal features). This should be done in the field "Coding-frame class". (For highly restricted classes, the best way of characterizing them would be to give additional verbs, see §9.2.)
Apresjan, Jurij D. 1967. Eksperimentaljnoe issledovanie russkogo glagola. Moskva: Nauka.
Dixon, R. M. W. 1991/2002. A New Approach to English Grammar, on Semantic Principles. Oxford: Clarendon Press.
Haspelmath, Martin. 2005. Ditransitive Constructions: The Verb ‘Give’. In: Haspelmath & al. (eds.), 426–429.
Levin, Beth. 1993. English Verb Classes and Alternations. Chicago: University of Chicago Press.
Malchukov, Andrej, Martin Haspelmath, and Bernard Comrie. 2010. Ditransitive constructions: a typological overview. To appear in Malchukov, A., M. Haspelmath, and B. Comrie (eds.). Studies in Ditransitive constructions. A comparative handbook. Berlin: De Gruyter Mouton.
Sauerland, Uli. 1994. German diathesis and verb morphology. In: Jones, Doug (ed). 1994. Working Papers and Projects on Verb Class Alternations in Bangla, German, English, and Korean. MIT AI Memo 1517, 37–92.