What is T-PAS?
T-PAS is an inventory of Typed Predicate-Argument Structures (T-PAS) for about 1000 Italian verbs of average polysemy. A Typed Predicate-Argument Structure is a linguistic object composed of a verb and its argument(s), together with the specification of the Semantic Type of each argument. An example of a T-PAS can be seen in the figure below. A verb may have from 1 to more than 20 T-PAS associated with it. T-PAS are corpus-derived objects, i.e., they are acquired through the manual clustering and annotation of corpus instances containing the verb, following the Corpus Pattern Analysis (CPA) methodology proposed by Patrick Hanks. T-PAS are also sense-stable objects: each one represents one sense of the verb.
What’s inside?
When you access T-PAS, the following information is available to you.
The T-PAS Resource Online Interface contains:
- a repository of corpus-derived predicate-argument structures for Italian verbs (called patterns), with a semantic specification of their argument slots and, for each pattern, a sense description, i.e. a brief definition of the pattern
- 5 Good Dictionary Examples (GDEX) instantiating each pattern of the verbs in the inventory
- an inventory of ca. 200 corpus-derived semantic classes (called Semantic Types), organised in a hierarchy (called the System of Semantic Types) and used for the semantic specification of the arguments; a brief definition and some examples are provided for each Semantic Type
- a tool for browsing Semantic Types in patterns, i.e. a search engine in which you can look for a specific type in any argument position, or in a specific one, and retrieve the list of patterns that contain it
How does it work?
Here is a list of actions you can perform in T-PAS:
- look for a specific verb by typing its infinitive form in the search engine
- click on the verb you have searched for to open the list of patterns for that verb
- read the 5 GDEX for each pattern: click on “show all corpus examples” or open each pattern by clicking on the button on the right
- use the “browse semantic types in patterns” tool: type a specific semantic type in the search engine and select an argument position (subject, object, etc.) if you want to know in which patterns the semantic type is used in that position. Leave the position empty if you are interested in all the occurrences of that semantic type
- if you do not know which semantic type to look for, check the “system of semantic types”, open the hierarchy and read the definitions. You can also look for a specific semantic type through the search engine
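The logic behind the “browse semantic types in patterns” tool can be illustrated with a minimal Python sketch. It is not the implementation of the online interface: the record layout (a "pattern" string plus an "arguments" mapping from argument position to Semantic Types), the name find_patterns and the subject/object assignments in the sample are all assumptions made for illustration. The pattern strings themselves are examples quoted on this page.

```python
# Hypothetical in-memory records; the real resource may store patterns differently.
SAMPLE = [
    {"pattern": "[Human] fare [Activity]",
     "arguments": {"subject": ["Human"], "object": ["Activity"]}},
    {"pattern": "[Human] | [Institution] annunciare [Event]",
     "arguments": {"subject": ["Human", "Institution"], "object": ["Event"]}},
    {"pattern": "[Human1] leggere [Human2 = Writer]",
     "arguments": {"subject": ["Human"], "object": ["Human"]}},
    {"pattern": "[Animate] bere [Beverage]",
     "arguments": {"subject": ["Animate"], "object": ["Beverage"]}},
]


def find_patterns(records, semantic_type, position=None):
    """Return the patterns that use semantic_type.

    If position is None, every argument position is searched;
    otherwise only the named position (e.g. "subject") is checked.
    """
    hits = []
    for record in records:
        for pos, types in record["arguments"].items():
            if position is not None and pos != position:
                continue
            if semantic_type in types:
                hits.append(record["pattern"])
                break  # one hit per pattern is enough
    return hits


any_position = find_patterns(SAMPLE, "Human")
as_object = find_patterns(SAMPLE, "Human", position="object")
```

Leaving position unset mirrors leaving the argument position empty in the tool: the first call returns every sample pattern in which [Human] occurs, the second only those in which it fills the object slot.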
How to download data?
You can download the resource from here or from the online interface.
You will download a zip archive containing:
- the database with the patterns and all the related semantic and syntactic information, in JSON format
- the System of Semantic Types, together with their definitions and examples, in JSON format
- 5 Good Dictionary Examples for each pattern, in JSON format
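Once downloaded, the JSON files can be read directly from the archive. The Python sketch below is a minimal example that assumes nothing about the actual file names or the JSON schema: it loads every member whose name ends in .json. The function name load_json_members and the stand-in archive built in memory (with an invented patterns.json) are hypothetical and only serve to make the example runnable.

```python
import io
import json
import zipfile


def load_json_members(archive):
    """Return {member name: parsed JSON} for every .json file in a zip archive.

    archive may be a path on disk or a binary file-like object.
    """
    loaded = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".json"):
                with zf.open(name) as handle:
                    loaded[name] = json.load(handle)
    return loaded


# Stand-in archive with invented content, so the sketch runs without the real download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("patterns.json", json.dumps([{"verb": "bere"}]))
    zf.writestr("readme.txt", "not a JSON file")
buffer.seek(0)

data = load_json_members(buffer)
```

With the real resource, pass the path of the downloaded zip file instead of the in-memory buffer and inspect the keys of the returned dictionary to see which files it contains.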
T-PAS Key Notions
This section presents some key notions about T-PAS, in particular:
- patterns
- system of semantic types
- GDEX
Patterns
Typed predicate-argument structures are patterns that display the semantic properties of verbs and their arguments: for each meaning of a verb, a specific pattern is provided.
The patterns are corpus-derived, i.e., they are acquired through the manual clustering and annotation of corpus instances, following the CPA methodology (i.e., Corpus Pattern Analysis; Hanks, 2013).
In the resource, each pattern is labelled with a pattern number and connected to a list of corpus instances realising that specific verb meaning. The Skema editor makes it possible to record different kinds of semantic and lexical information in each pattern:
- the verb, which in T-PAS is generally in its infinitive form – e.g., fare (Eng., ‘to do’)
[Human] fare [Activity]
- the Semantic Types (e.g. [Human], [Activity], always portrayed within square brackets), specifying the semantics of the arguments selected by the verb
[Human] fare [Activity]
- argument positions, which are filled by the Semantic Types in the patterns (they can be optional, but are still recorded in the pattern if they are relevant to the sense of the verb):
- subject (in red)
- object (in green)
- adverbial (in grey)
- clausals (in violet)
- prepositional complement (in orange)
- predicative complement (in blue)
- the sense description, i.e. a brief definition of the meaning of the verb in that specific pattern, in which the Semantic Types are maintained
[Human] esegue [Activity]
- a lexical set (optional) for each Semantic Type in the pattern, i.e. a selection of the most representative lexical items instantiating that Semantic Type (e.g. vino = ‘wine’ | birra = ‘beer’ | aranciata = ‘orangeade’ are good candidates for the lexical set of [Beverage])
[Animate] bere [Beverage {birra | caffè | tè | bibita | bevanda | aperitivo | cocktail | liquore | vino | acqua | latte | grappino | birretta | spritz | mojito | birrozza | tisana | cappuccino | cioccolata | whisky | vodka | rum | rhum | cognac | pozione | elisir | sangue | liquido | acqua}]
- the roles (optional) played by some specific Semantic Types in certain contexts: in particular, the Semantic Type [Human] can acquire the role of Athlete, Doctor, Musician, Host, Guest, Writer, etc., depending on the verb selecting it as an argument
[Human1 = Doctor | Nurse] curare [Human2 = Patient]
- the features (optional) associated with the Semantic Types, i.e. certain semantic characteristics required by the pattern syntax (e.g. Plural) or by the specific verb meaning (e.g. Female, Negative, Visible)
[Animate1 : Female] partorire [Animate2 {figlio | bambino | neonato}]
- prepositions (for prepositional complements)
[Human] volare a|in|su|per|verso [Location]
- particles (for adverbials)
[Human] buttare via [Abstract Entity]
- complementizers (for clausals)
[Human] dichiarare che|di [Proposition] | : [Proposition]
- obligatory determiners (for lexical sets), which can be implemented according to the specific argument position in question
[Human] mangiare {la foglia}
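The bracket notation used in the examples above can be read mechanically. The following Python sketch extracts, for each argument slot, the Semantic Type, its numeric index, its roles (after =), its features (after :) and its lexical set (inside curly brackets). It is only an illustration of the display notation shown on this page: the names Slot, parse_slot and parse_pattern are hypothetical, the code does not reflect the Skema editor or the JSON schema of the download, and it assumes that a slot does not combine a role and a feature in an order other than feature first.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slot:
    semantic_type: str
    index: Optional[int] = None
    roles: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)
    lexical_set: List[str] = field(default_factory=list)


def parse_slot(text):
    """Parse the content of one [...] slot, e.g. 'Human1 = Doctor | Nurse'."""
    lexical_set, roles, features = [], [], []
    match = re.search(r"\{([^}]*)\}", text)
    if match:  # lexical set, e.g. {figlio | bambino}
        lexical_set = [item.strip() for item in match.group(1).split("|")]
        text = text[:match.start()] + text[match.end():]
    if "=" in text:  # roles, e.g. = Doctor | Nurse
        text, role_part = text.split("=", 1)
        roles = [item.strip() for item in role_part.split("|")]
    if ":" in text:  # features, e.g. : Female
        text, feature_part = text.split(":", 1)
        features = [item.strip() for item in feature_part.split("|")]
    name = re.fullmatch(r"(.*?)(\d+)?", text.strip())
    index = int(name.group(2)) if name.group(2) else None
    return Slot(name.group(1).strip(), index, roles, features, lexical_set)


def parse_pattern(pattern):
    """Return the list of Slot objects found in a pattern string."""
    return [parse_slot(m.group(1)) for m in re.finditer(r"\[([^\]]*)\]", pattern)]


doctor = parse_pattern("[Human1 = Doctor | Nurse] curare [Human2 = Patient]")
birth = parse_pattern("[Animate1 : Female] partorire [Animate2 {figlio | bambino | neonato}]")
```

Material outside the square brackets (the verb, prepositions, particles, complementizers and free-standing lexical sets such as {la foglia}) is deliberately ignored by this sketch.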
Relevant Syntactic and Semantic Phenomena
- semantic type alternation: an argument may be systematically realized by more than one semantic type. We call this a semantic type alternation, and indicate it with a vertical bar |
[Human] | [Institution] annunciare [Event]
- syntactic alternation of arguments: different syntactic realizations within the argument position are encoded as alternating subcategorization frames within the same pattern
[Human] finire [Activity] | di [Activity]
- argument optionality: an argument is marked as optional (in round brackets) when the corpus regularly shows concordances in which it is omitted; the omission does not affect the sense of the verb, and the argument remains understood
[Human1] | [Institution] dare ([Entity]) (a [Human2])
- idiomatic uses: verbs may enter into fixed constructions, which usually correspond to idiomatic senses of the verb. In these cases the meaning of the T-PAS is not directly compositional, that is, it is not the sum of the meanings of its parts
[Human] bersi {il cervello}
- phrasal verbs: a phrasal verb is a verb made up of a main verb together with a particle that has an adverbial function
[Human] buttare via [Abstract Entity]
- metonymic sub-patterns: metonymic displacements in arguments are encoded as Semantic Type Shifts, which can occur on any of the arguments of a given pattern. Such shifts take place when a Semantic Type is forced by the verb to be understood as a different one (which complies with its selectional requirements)
[Human] leggere [Document]
[Human1] leggere [Human2 = Writer]
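Argument optionality can be made concrete by listing the realisations that a pattern with optional arguments licenses. The Python sketch below treats every span in round brackets as optional and generates each combination of kept and omitted arguments. The function name expand_optional is hypothetical, nested brackets are not supported, and the sketch works on the display notation only.

```python
import itertools
import re


def expand_optional(pattern):
    """Return every variant of pattern obtained by keeping or dropping
    each parenthesised (optional) argument."""
    # re.split with a capturing group alternates fixed text and optional text.
    parts = re.split(r"\(([^()]*)\)", pattern)
    fixed, optional = parts[0::2], parts[1::2]
    variants = []
    for choice in itertools.product([True, False], repeat=len(optional)):
        pieces = [fixed[0]]
        for keep, opt, following in zip(choice, optional, fixed[1:]):
            if keep:
                pieces.append(opt)
            pieces.append(following)
        # Collapse the extra whitespace left behind by dropped arguments.
        variants.append(" ".join("".join(pieces).split()))
    return variants


variants = expand_optional("[Human1] | [Institution] dare ([Entity]) (a [Human2])")
for variant in variants:
    print(variant)
```

For the dare pattern above this yields four variants, from the full form with both [Entity] and a [Human2] down to the bare "[Human1] | [Institution] dare"; all of them belong to the same pattern and therefore to the same sense of the verb.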
System of Semantic Types
The System of Semantic Types used to classify the semantics of arguments is a hierarchy of general semantic categories obtained by manual clustering of the lexical items found in the argument positions of corpus-derived valency structures. The System currently contains ca. 200 Semantic Types that are hierarchically organised on the basis of the ‘is a’ (subsumption) relation (e.g. [Human] is an [Animate]).
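The ‘is a’ relation can be sketched in Python as a child-to-parent map that is walked upwards. Only the link from [Human] to [Animate] comes from the text above; the link from [Animate] to [Entity] and the names PARENT, ancestors and is_a are assumptions made for illustration and do not reproduce the actual hierarchy.

```python
# Child -> parent links. Only "Human" -> "Animate" is stated in the text;
# "Animate" -> "Entity" is an assumed link added for illustration.
PARENT = {
    "Human": "Animate",
    "Animate": "Entity",
}


def ancestors(semantic_type, parent=PARENT):
    """Return the chain of types above semantic_type, nearest first."""
    chain = []
    while semantic_type in parent:
        semantic_type = parent[semantic_type]
        chain.append(semantic_type)
    return chain


def is_a(child, ancestor, parent=PARENT):
    """True if child is ancestor itself or is subsumed by it."""
    return child == ancestor or ancestor in ancestors(child, parent)


human_is_animate = is_a("Human", "Animate")
animate_is_human = is_a("Animate", "Human")
```

Subsumption is directional: under these assumptions is_a("Human", "Animate") holds, while is_a("Animate", "Human") does not.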
Corpus Instances and GDEX
The reference corpus for the resource is the web corpus itWaC (reduced), provided by Sketch Engine. It contains around 935 million tokens.
GDEX (Good Dictionary Examples) are 5 examples per pattern, taken from the reference corpus, that instantiate the pattern and help you understand it better.