This page describes the general schema of the Ġabra database. Since the database is based on JSON, this is not a schema in the traditional sense; it is rather a set of guidelines for what fields can be contained in each collection.
_id fields are not included here.

lexemes
Field | Type | Description | Example / allowed values |
---|---|---|---|
lemma | string | Main lemma (required) | "bahrad", "kiteb", "ħarġa" |
alternatives | array | List of spelling alternatives | ["bahraġ"] |
pos | string | Part of speech | "ADJ" / "ADP" / "ADV" / "AUX" / "CONJ" / "DET" / "INTJ" / "NOUN" / "NUM" / "PART" / "PRON" / "PROPN" / "PUNCT" / "SCONJ" / "SYM" / "VERB" / "X" |
sources | array | Source keys | ["Spagnol2011","Falzon2013"] |
glosses | array | English glosses, with examples | |
root | object | Root of entry | {"radicals":"k-t-b"}, {"radicals":"b-ħ-b-ħ","variant":2} |
headword | object | Headword for entry | {"lemma":"abbozz","pos":"NOUN"} |
form | string | General form | "mimated" / "comparative" / "verbalnoun" / "diminutive" / "participle" / "accretive" |
derived_form | integer | Derived form of verb (1–10) | |
gender | string | | "m" / "f" |
transitive | boolean | | |
intransitive | boolean | | |
ditransitive | boolean | | |
hypothetical | boolean | | |
archaic | boolean | | |
multiword | boolean | | |
pending | boolean | Flagged as incorrect or new suggestion | |
phonetic | string | Phonetic description of lemma | "'skrɛjjɛn" |
apertium_paradigm | string | Name of paradigm in Apertium lexicon | "epi/ku__adj" |
onomastic_type | string | Onomastic type (proper nouns) | "toponym" / "organisation" / "anthroponym" / "cognomen" / "other" |
comment | string | General comment | |

Source: lexeme.json
Last updated 2020-10-27T19:07:17.847Z
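
Since the lexemes fields are guidelines rather than an enforced schema, a client may want to check documents itself. The following is a minimal sketch of such a check, covering only the required lemma, the allowed pos values and the 1–10 range of derived_form. The function name check_lexeme is hypothetical and not part of Ġabra.

```python
# Allowed values copied from the lexemes table; nothing here is enforced by the database.
POS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}


def check_lexeme(doc):
    """Return a list of problems found in a lexeme document (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []

    # lemma is the only field marked as required.
    lemma = doc.get("lemma")
    if not isinstance(lemma, str) or not lemma:
        problems.append("lemma is required and must be a non-empty string")

    # pos is optional, but if present it must be one of the listed tags.
    if "pos" in doc and doc["pos"] not in POS_TAGS:
        problems.append("pos has an unknown value: %r" % (doc["pos"],))

    # derived_form is optional, but if present it must be an integer from 1 to 10.
    derived = doc.get("derived_form")
    if derived is not None:
        if not isinstance(derived, int) or not 1 <= derived <= 10:
            problems.append("derived_form must be an integer from 1 to 10")

    return problems


if __name__ == "__main__":
    print(check_lexeme({"lemma": "kiteb", "pos": "NOUN"}))
    print(check_lexeme({"pos": "NOUNS", "derived_form": 11}))
```

The other enumerated fields (form, gender, onomastic_type) could be checked in the same way.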
wordforms
Field | Type | Description | Example / allowed values |
---|---|---|---|
lexeme_id | object | Should be a valid ID in lexemes collection (required) | |
surface_form | string | Surface form (required) | "skrejjen" |
alternatives | array | List of spelling alternatives | ["doxxa","duxxa"] |
gloss | string | English gloss | |
sources | array | Source keys | ["Spagnol2011","Falzon2013"] |
gender | string | m (masculine), f (feminine), mf (both masculine and feminine) | "m" / "f" / "mf" |
number | string | sg (singular), dl (dual), sgv (singulative), coll (collective), sp (both sg and pl), pl (plural), pl_ind (indeterminate plural - probably not needed), pl_det (determinate plural), pl_pl (plural of plural) | "sg" / "dl" / "pl" / "sgv" / "coll" / "sp" / "pl_ind" / "pl_det" / "pl_pl" |
plural_form | string | Plural type | "counted" |
subject | null,object | Subject agreement (verbs) | {"person":"p3","number":"sg","gender":"m"} |
dir_obj | null,object | Direct object agreement | {"person":"p3","number":"pl"} |
ind_obj | null,object | Indirect object agreement | {"person":"p1","number":"pl"} |
possessor | null,object | Agreement for nouns which inflect for possessive | {"person":"p3","number":"sg","gender":"m"} |
form | string | General morphological form | "comparative" / "superlative" / "diminutive" / "interrogative" / "mimated" / "verbalnoun" |
aspect | string | Aspect (verbs) | "perf" / "impf" / "imp" / "pastpart" / "prespart" |
polarity | string | | "pos" / "neg" |
stem | string | | |
phonetic | string | Phonetic transcription | "'skrɛjjɛn" |
pattern | string | Vowel-consonant pattern | "CCVVCVC" |
hypothetical | boolean | | |
archaic | boolean | | |
generated | boolean | | |
pending | boolean | Flagged as incorrect or new suggestion | |

Source: wordform.json
Last updated 2020-01-27T08:24:49.637Z
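
The agreement fields (subject, dir_obj, ind_obj, possessor) are each either null or an object with person, number and, optionally, gender. As an illustration of how a client might handle both cases, here is a small sketch that turns them into a compact label. The function names and the label format are assumptions made for this example only.

```python
AGREEMENT_FIELDS = ("subject", "dir_obj", "ind_obj", "possessor")


def agreement_label(agr):
    """Join person/number/gender of one agreement object, e.g. 'p3.sg.m'.

    Returns an empty string for null (None) or empty objects.
    """
    if not agr:
        return ""
    parts = [agr.get(key) for key in ("person", "number", "gender")]
    return ".".join(p for p in parts if p)


def wordform_label(doc):
    """Describe all agreement fields that are present in a wordform document."""
    labels = []
    for field in AGREEMENT_FIELDS:
        label = agreement_label(doc.get(field))
        if label:
            labels.append("%s=%s" % (field, label))
    return " ".join(labels)


if __name__ == "__main__":
    # Agreement values taken from the examples in the wordforms table.
    doc = {
        "subject": {"person": "p3", "number": "sg", "gender": "m"},
        "dir_obj": None,
        "ind_obj": {"person": "p1", "number": "pl"},
    }
    print(wordform_label(doc))
```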
roots
Field | Type | Description | Example / allowed values |
---|---|---|---|
radicals | string | Radicals separated with hyphens (required) | "k-t-b", "ċ-p-ċ-p" |
variant | integer | For distinguishing different roots with same radicals | |
alternatives | string | Alternative roots or cross-reference | "b-h-r-d", "see h-ż-ż" |
type | string | Root class (required) | "strong" / "geminated" / "weak-initial" / "weak-medial" / "weak-final" / "irregular" |
sources | array | Source keys (all roots come from Spagnol2011) (required) | ["Spagnol2011"] |

Source: root.json
Last updated 2019-06-20T17:55:50.022Z
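
Because radicals are stored as one hyphen-separated string and variant only appears when two roots share the same radicals, a client will typically need to split the string and treat the pair (radicals, variant) as the identifying key. A minimal sketch follows; the helper names are hypothetical.

```python
def split_radicals(radicals):
    """Split a hyphen-separated radicals string into a list of radicals."""
    return radicals.split("-")


def root_key(root):
    """Return (radicals, variant) for a root object; variant is None if absent.

    Works for documents in roots and for the root object embedded in a lexeme,
    since both carry the same two fields.
    """
    return (root["radicals"], root.get("variant"))


if __name__ == "__main__":
    print(split_radicals("k-t-b"))
    print(root_key({"radicals": "b-ħ-b-ħ", "variant": 2}))
```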
sources
Field | Type | Description | Example / allowed values |
---|---|---|---|
key | string | Key (required) | "Spagnol2011" |
author | string | Full author name | "Michael Spagnol" |
title | string | Title of resource | "A Tale of Two Morphologies. Verb structure and argument alternations in Maltese" |
year | integer | Year of release | 2011 |
note | string | General note | "Germany: University of Konstanz dissertation" |

Source: source.json
Last updated 2019-06-20T17:55:50.022Z
logs
Field | Type | Description | Example / allowed values |
---|---|---|---|
collection | string | Collection (required) | "lexemes" / "wordforms" / "roots" |
object_id | ObjectId | Must be valid ID in collection (required) | |
date | ISODate | Date/time of edit (required) | |
action | string | Type of edit | "created" / "modified" / "deleted" |
username | string | Username of user making edit (required) | "john.camilleri" |
ip | string | IP address of user making edit | "192.168.0.1" |
new_value | object | New document, if available (required) | |

Source: log.json
Last updated 2019-06-20T17:55:50.022Z
Note: Because of a previous bug, the collection field may erroneously show 'lexemes' instead of 'wordforms', in particular for deletions.
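
When processing logs, one cautious way to account for this is to flag the entries most likely to be affected, rather than trusting collection outright. The sketch below only marks deletions recorded against 'lexemes' as needing a manual check. This is a heuristic assumed for illustration (the function name is hypothetical), and it does not catch affected entries with other actions.

```python
def collection_is_suspect(entry):
    """Return True if a log entry's collection value may be affected by the bug.

    Heuristic only: flags deletions logged against 'lexemes', which may in
    fact refer to documents in 'wordforms'.
    """
    return entry.get("collection") == "lexemes" and entry.get("action") == "deleted"


if __name__ == "__main__":
    entries = [
        {"collection": "lexemes", "action": "deleted", "username": "john.camilleri"},
        {"collection": "lexemes", "action": "modified", "username": "john.camilleri"},
        {"collection": "wordforms", "action": "deleted", "username": "john.camilleri"},
    ]
    to_review = [e for e in entries if collection_is_suspect(e)]
    print(len(to_review))
```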

The values of the pos field in lexemes are described below. See: https://universaldependencies.org/u/pos/
Tag | Description |
---|---|
ADJ | adjective |
ADP | adposition |
ADV | adverb |
AUX | auxiliary verb |
CONJ | coordinating conjunction |
DET | determiner |
INTJ | interjection |
NOUN | noun |
NUM | numeral |
PART | particle |
PRON | pronoun |
PROPN | proper noun |
PUNCT | punctuation |
SCONJ | subordinating conjunction |
SYM | symbol |
VERB | verb |
X | other |