Pine
An intuitive web interface for Treebanking.
Built by Alexander Gottlieb.
Terminology
Pine uses Dependency Grammar.
Parent, Head = The word that a child/daughter/descendent depends on through a dependency relation.
Child, Daughter, Descendent = The dependent of a parent/head word.
Artificial Root = The topmost node of a dependency tree. It is not a word in the sentence; it is drawn only to respect the rule that every word must have exactly one parent. See the Universal Dependencies explanation.
Root Descendent = The first word descending from the artificial root, usually a verb.
File Support
Pine supports uploading treebanks in the Universal Dependencies CoNLL-U format (.conllu).
When importing a treebank with partial or invalid sentences, some words may be altered to match the CoNLL-U specification. In particular:
- any orphan words (words with no parent specified) are attached to the root descendent
- if multiple words are attached to the artificial root, the first is chosen as the root descendent and all the others are attached to it
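The two repairs above can be sketched in a few lines of Python. This is only an illustration, not Pine's actual implementation: the function name `normalize_heads`, the list-of-heads representation, and the choice to raise an error when no word is attached to the artificial root are all assumptions made for the example.

```python
from typing import List, Optional


def normalize_heads(heads: List[Optional[int]]) -> List[int]:
    """Repair the HEAD column of one sentence (hypothetical helper).

    heads[i] is the head of word i + 1; 0 means the artificial root
    and None means no parent was specified (an orphan).
    """
    # Words attached to the artificial root, in sentence order.
    root_candidates = [i + 1 for i, h in enumerate(heads) if h == 0]
    if not root_candidates:
        # Assumption: the source does not say what happens in this case.
        raise ValueError("no word is attached to the artificial root")

    # The first candidate becomes the root descendent.
    root = root_candidates[0]

    fixed = []
    for word_id, head in enumerate(heads, start=1):
        if word_id == root:
            fixed.append(0)        # keep the root descendent under the root
        elif head is None or head == 0:
            fixed.append(root)     # orphans and extra roots attach to it
        else:
            fixed.append(head)     # valid heads are left unchanged
    return fixed


# Word 3 is an orphan and words 2 and 4 both point at the artificial root.
print(normalize_heads([2, 0, None, 0]))  # [2, 0, 2, 2]
```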
Multiword Tokens
Multiword tokens are not supported. If an imported file contains a multiword token, only its component words are imported, as ordinary words. In the following example from Universal Dependencies, the multiword token line 'vámonos' would be ignored, while 'vamos' and 'nos' would be imported as separate words:
1-2 vámonos _
1 vamos ir
2 nos nosotros
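As a rough sketch of this behaviour (not Pine's own code), multiword token lines can be recognised by their range ID such as `1-2` in the first tab-separated column and skipped. The function name `drop_multiword_tokens` is hypothetical, and the rows below are abbreviated to three columns like the example above; real CoNLL-U lines have ten.

```python
import re
from typing import List

# In CoNLL-U, a multiword token has a range ID of the form "start-end".
MWT_ID = re.compile(r"^\d+-\d+$")


def drop_multiword_tokens(lines: List[str]) -> List[str]:
    """Return the lines of one sentence without multiword token lines."""
    kept = []
    for line in lines:
        first_field = line.split("\t", 1)[0]
        if MWT_ID.match(first_field):
            continue  # skip the range line; its component words follow it
        kept.append(line)
    return kept


rows = ["1-2\tvámonos\t_", "1\tvamos\tir", "2\tnos\tnosotros"]
for row in drop_multiword_tokens(rows):
    print(row)
```

Comment lines and ordinary word lines pass through unchanged, because their first field does not match the range pattern.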
Empty Nodes
Empty nodes (ellipsis) have limited support. They can be imported and exported but not viewed or edited in Pine. This means you can upload and edit a CoNLL-U file containing empty nodes, and they will appear in the exported CoNLL-U file.
If you delete a word that is followed by one or more empty nodes, those empty nodes are deleted as well. To use an example from Universal Dependencies:
1 Sue Sue
2 likes like
3 coffee coffee
4 and and
5 Bill Bill
5.1 likes like
6 tea tea
Deleting word 5 'Bill' will also delete empty node 5.1 'likes'.
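A minimal sketch of this deletion rule, assuming each row is an (ID, form) pair with string IDs and that the empty nodes belonging to word N are the ones whose ID starts with `N.`. The function name `delete_word` is hypothetical. The sketch does not renumber the remaining words or repair heads that pointed at the deleted word, which a real editor would also need to handle.

```python
from typing import List, Tuple

Row = Tuple[str, str]  # (ID, form), e.g. ("5.1", "likes")


def delete_word(rows: List[Row], word_id: str) -> List[Row]:
    """Remove a word together with the empty nodes attached to its ID."""
    empty_prefix = f"{word_id}."  # empty nodes of word 5 are 5.1, 5.2, ...
    return [
        (row_id, form)
        for row_id, form in rows
        if row_id != word_id and not row_id.startswith(empty_prefix)
    ]


sentence = [
    ("1", "Sue"), ("2", "likes"), ("3", "coffee"), ("4", "and"),
    ("5", "Bill"), ("5.1", "likes"), ("6", "tea"),
]
print([row_id for row_id, _ in delete_word(sentence, "5")])
# ['1', '2', '3', '4', '6']
```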
Annotation Data
Pine currently has limited support for the following annotations: Features (FEATS), Enhanced Dependencies (DEPS), Miscellaneous (MISC). Existing values in these fields are kept on import and written back on export, but they cannot be edited in Pine.