aspol and aspol_filtered are analysis-ready data sets of housing policy documents. For more information on the format, see https://universaldependencies.org/format.html
Format
A data frame with one row per token; aspol has 451,660 rows and 13 columns:
- kunta
Municipality name
- sent
Sentence number per document/municipality
- ID
Word index, an integer starting at 1 for each new sentence (stored as character in this data set)
- FORM
Word form or punctuation symbol
- LEMMA
Lemma or stem of word form
- UPOSTAG
Universal part-of-speech tag
- XPOSTAG
Language-specific part-of-speech
- FEATS
List of morphological features
- HEAD
Head of the current word: either a value of ID or zero (0) for the sentence root
- DEPREL
Universal dependency relation to the HEAD
- DEPS
Enhanced dependency graph in the form of a list of head-deprel pairs
- MISC
Any other annotation
- doc
Name of the document the text was read from
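Because HEAD refers back to ID within the same sentence, a token's head can be looked up by matching the two columns. The following is a minimal sketch in base R, not part of the package: `toy` is a small hand-made data frame with hypothetical rows (not taken from aspol), and the lookup key assumes that kunta and sent together identify a sentence. With the real data the key may also need doc, and ID values that are not plain integers (CoNLL-U allows ranges such as "1-2") would become NA on conversion.

```r
# Hypothetical toy data mimicking a few aspol columns; not real rows.
toy <- data.frame(
  kunta  = c("A", "A", "A", "A"),
  sent   = c(1L, 1L, 1L, 2L),
  ID     = c("1", "2", "3", "1"),
  FORM   = c("Kunta", "rakentaa", "asuntoja", "Loppu"),
  HEAD   = c("2", "0", "2", "0"),
  DEPREL = c("nsubj", "root", "obj", "root"),
  stringsAsFactors = FALSE
)

# ID and HEAD are character columns; convert them to integers.
toy$ID   <- as.integer(toy$ID)
toy$HEAD <- as.integer(toy$HEAD)

# Build one key per token and one per head, then look up the head's word form.
# Assumption: kunta + sent identify a sentence.
token_key <- paste(toy$kunta, toy$sent, toy$ID)
head_key  <- paste(toy$kunta, toy$sent, toy$HEAD)
toy$HEAD_FORM <- toy$FORM[match(head_key, token_key)]  # NA where HEAD is 0 (root)

print(toy[, c("sent", "ID", "FORM", "HEAD", "DEPREL", "HEAD_FORM")])
```

Root tokens have HEAD equal to 0, which matches no ID, so their HEAD_FORM is NA.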
Examples
aspol
#> # A tibble: 451,660 × 13
#> kunta sent ID FORM LEMMA UPOSTAG XPOSTAG FEATS HEAD DEPREL DEPS MISC
#> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Enont… 1 1 Khall Khall PROPN _ Case… 0 root _ "_"
#> 2 Enont… 1 2 19.4… 19.4… NUM _ _ 1 nmod _ "_"
#> 3 Enont… 1 3 $ $ PUNCT _ _ 4 punct _ "_"
#> 4 Enont… 1 4 126 126 NUM _ NumT… 1 nummod _ "Spa…
#> 5 Enont… 2 1 ( ( PUNCT _ _ 2 punct _ "Spa…
#> 6 Enont… 2 2 N N NOUN _ Abbr… 0 root _ "Spa…
#> 7 Enont… 3 1 Enon… Enon… PROPN _ Case… 0 root _ "Spa…
#> 8 Enont… 4 1 KUNTA kunta NOUN _ Case… 0 root _ "Spa…
#> 9 Enont… 5 1 VUOK… vuok… NOUN _ Case… 2 nmod:… _ "Spa…
#> 10 Enont… 5 2 KEHI… kehi… NOUN _ Case… 0 root _ "Spa…
#> # ℹ 451,650 more rows
#> # ℹ 1 more variable: doc <chr>
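A common next step with token-level data like this is to count lemmas of a given part of speech per municipality. The sketch below uses base R only and a hypothetical stand-in data frame (`toy`, with made-up rows) so that it runs without the package; with the real data, `toy` would be replaced by aspol or aspol_filtered.

```r
# Hypothetical stand-in for aspol with only the columns used here.
toy <- data.frame(
  kunta   = c("A", "A", "A", "B", "B"),
  LEMMA   = c("kunta", "asunto", "asunto", "asunto", "ja"),
  UPOSTAG = c("NOUN", "NOUN", "NOUN", "NOUN", "CCONJ"),
  stringsAsFactors = FALSE
)

# Keep nouns only, using the universal part-of-speech tag.
nouns <- toy[toy$UPOSTAG == "NOUN", ]

# Cross-tabulate municipality by lemma and reshape to long format.
counts <- as.data.frame(
  table(kunta = nouns$kunta, LEMMA = nouns$LEMMA),
  responseName = "n",
  stringsAsFactors = FALSE
)

# Drop combinations that never occur, then sort by municipality and frequency.
counts <- counts[counts$n > 0, ]
counts <- counts[order(counts$kunta, -counts$n), ]

print(counts)
```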