
aspol and aspol_filtered are analysis-ready data sets of housing policy documents, with one row per token and columns that follow the Universal Dependencies CoNLL-U format. For details on the format, see https://universaldependencies.org/format.html

Usage

aspol

Format

A data frame with 451,660 rows and 13 columns:

kunta

Municipality name

sent

Sentence number per document/municipality

ID

Word index, integer starting at 1 for each new sentence

FORM

Word form or punctuation symbol

LEMMA

Lemma or stem of word form

UPOSTAG

Universal part-of-speech tag

XPOSTAG

Language-specific part-of-speech

FEATS

List of morphological features

HEAD

Head of the current word, which is either a value of ID or zero (0)

DEPREL

Universal dependency relation to the HEAD

DEPS

Enhanced dependency graph in the form of a list of head-deprel pairs

MISC

Any other annotation

doc

Name of the document the row was read from
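
The relation between ID and HEAD can be illustrated with a small sketch. It does not use the real data: `toy` is a made-up data frame with a subset of the columns above, and the column `HEAD_FORM` is a hypothetical name introduced only for this illustration. ID and HEAD are kept as character, as in the printed tibble.

```r
# Toy rows with the same column layout as aspol (all values are made up).
toy <- data.frame(
  kunta   = "Example",
  sent    = c(1L, 1L, 1L),
  ID      = c("1", "2", "3"),
  FORM    = c("Kunta", "rakentaa", "asuntoja"),
  LEMMA   = c("kunta", "rakentaa", "asunto"),
  UPOSTAG = c("NOUN", "VERB", "NOUN"),
  HEAD    = c("2", "0", "2"),
  DEPREL  = c("nsubj", "root", "obj"),
  doc     = "example.pdf",
  stringsAsFactors = FALSE
)

# Look up the head word of each token by matching HEAD against ID within the
# same municipality, document and sentence. HEAD == "0" marks the root, which
# has no head row, so its HEAD_FORM is NA.
key      <- paste(toy$kunta, toy$doc, toy$sent, toy$ID)
head_key <- paste(toy$kunta, toy$doc, toy$sent, toy$HEAD)
toy$HEAD_FORM <- toy$FORM[match(head_key, key)]

toy[, c("ID", "FORM", "HEAD", "DEPREL", "HEAD_FORM")]
```

The same lookup should carry over to aspol, assuming that sent restarts within each document and municipality as described above.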

Source

Finnish municipalities

Examples

aspol
#> # A tibble: 451,660 × 13
#>    kunta   sent ID    FORM  LEMMA UPOSTAG XPOSTAG FEATS HEAD  DEPREL DEPS  MISC 
#>    <chr>  <int> <chr> <chr> <chr> <chr>   <chr>   <chr> <chr> <chr>  <chr> <chr>
#>  1 Enont…     1 1     Khall Khall PROPN   _       Case… 0     root   _     "_"  
#>  2 Enont…     1 2     19.4… 19.4… NUM     _       _     1     nmod   _     "_"  
#>  3 Enont…     1 3     $     $     PUNCT   _       _     4     punct  _     "_"  
#>  4 Enont…     1 4     126   126   NUM     _       NumT… 1     nummod _     "Spa…
#>  5 Enont…     2 1     (     (     PUNCT   _       _     2     punct  _     "Spa…
#>  6 Enont…     2 2     N     N     NOUN    _       Abbr… 0     root   _     "Spa…
#>  7 Enont…     3 1     Enon… Enon… PROPN   _       Case… 0     root   _     "Spa…
#>  8 Enont…     4 1     KUNTA kunta NOUN    _       Case… 0     root   _     "Spa…
#>  9 Enont…     5 1     VUOK… vuok… NOUN    _       Case… 2     nmod:… _     "Spa…
#> 10 Enont…     5 2     KEHI… kehi… NOUN    _       Case… 0     root   _     "Spa…
#> # ℹ 451,650 more rows
#> # ℹ 1 more variable: doc <chr>
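
A common next step with token-level data like this is counting lemmas per municipality for a given part of speech. The following is a minimal base R sketch under stated assumptions: `count_lemmas()` is a hypothetical helper, not part of the package, and `toy` is made-up data that only mimics the kunta, LEMMA and UPOSTAG columns.

```r
# Hypothetical helper: count lemmas per municipality for the chosen UPOS tags.
count_lemmas <- function(x, upos = "NOUN") {
  keep <- x[x$UPOSTAG %in% upos, c("kunta", "LEMMA")]
  out <- as.data.frame(
    table(kunta = keep$kunta, LEMMA = keep$LEMMA),
    responseName = "n",
    stringsAsFactors = FALSE
  )
  out <- out[out$n > 0, ]           # drop kunta/LEMMA pairs that never occur
  out[order(out$kunta, -out$n), ]   # most frequent lemma first per municipality
}

# Made-up rows standing in for aspol.
toy <- data.frame(
  kunta   = c("A", "A", "A", "B"),
  LEMMA   = c("asunto", "asunto", "rakentaa", "kunta"),
  UPOSTAG = c("NOUN", "NOUN", "VERB", "NOUN"),
  stringsAsFactors = FALSE
)

res <- count_lemmas(toy)
res
```

With the data loaded, `count_lemmas(aspol)` would be the analogous call, provided the column names match the Format section.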