sylvia-ee/SE-WordNet


Background

Word frequencies consistently follow a power-law distribution in which the frequency $f$ of a word is approximately inversely proportional to its rank $r$ in a frequency table (Zipf, 1936).

$$ f(r) \propto \frac{1}{r^s} $$

Where:

  • $f(r)$ = frequency of the word with rank $r$
  • $r$ = rank of the word in a frequency table
  • $s$ = power-law exponent (typically $s \approx 1$)
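As a rough illustration of the relationship above, taking logarithms gives $\log f(r) = \log C - s \log r$, so $s$ can be estimated as the negated slope of a straight-line fit in log-log space. The sketch below is not part of this repository: the function names `zipf_frequencies` and `estimate_exponent` and the `scale` constant are hypothetical, and ordinary least squares is used here only for simplicity.

```python
import math


def zipf_frequencies(n_words, s=1.0, scale=1000.0):
    """Ideal Zipfian frequencies f(r) = scale / r**s for ranks 1..n_words.

    `scale` is an arbitrary proportionality constant (an assumption).
    """
    return [scale / (r ** s) for r in range(1, n_words + 1)]


def estimate_exponent(frequencies):
    """Estimate s by least squares on log f(r) = log C - s * log r."""
    # Rank words by descending frequency; rank 1 is the most frequent.
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)

    # The fitted slope is -s, so negate it to recover the exponent.
    return -(cov / var)


# Synthetic data only: recover the exponent from ideal Zipfian frequencies.
freqs = zipf_frequencies(50, s=1.0)
s_hat = estimate_exponent(freqs)
```

On real corpus counts the fit would be noisy, and a log-log regression is known to be a crude estimator for power laws, so this is best read as a sanity check on the formula rather than as a recommended fitting procedure.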

Why does this Zipfian distribution emerge? I present a definitional framework and accompanying code to investigate Manin's (2008) hypothesis that Zipf’s law arises from the semantic organization of language.

Specifically, I use Princeton WordNet 3.0 to derive a quantitative measure of semantic organization from hyponymy and evaluate whether it predicts lemma frequencies in a Zipfian manner.
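To make the idea of a hyponymy-derived measure concrete, here is a minimal sketch on a toy graph. It is an assumption-laden illustration, not the measure defined in this repository: the graph `TOY_HYPONYMS`, the helpers `descendants` and `extent`, and the choice of "one plus the number of distinct descendants" as a proxy are all hypothetical. The toy data stands in for WordNet's noun hierarchy, which is a DAG because a synset can have more than one hypernym.

```python
# Hypothetical toy hyponym graph: each key maps to its direct hyponyms.
# "dog" and "cat" each have two hypernyms, so this is a DAG, not a tree.
TOY_HYPONYMS = {
    "entity": ["animal", "artifact"],
    "animal": ["dog", "cat", "pet"],
    "artifact": ["tool"],
    "pet": ["dog", "cat"],
    "dog": [],
    "cat": [],
    "tool": [],
}


def descendants(node, graph):
    """Return the set of all nodes reachable from `node` via hyponym links."""
    seen = set()
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            # Already reached through another hypernym path; count it once.
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


def extent(node, graph):
    """Assumed proxy for extent: the node itself plus its distinct descendants."""
    return 1 + len(descendants(node, graph))


extents = {node: extent(node, TOY_HYPONYMS) for node in TOY_HYPONYMS}

# Rank nodes by descending extent (ties broken alphabetically), mirroring
# the rank-frequency table used for Zipf's law.
ranked = sorted(extents.items(), key=lambda kv: (-kv[1], kv[0]))
```

Tracking visited nodes matters in a DAG: without the `seen` set, nodes reachable along several hypernym paths would be counted more than once and inflate the measure.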

Definitions

FAQ

About

Schema for testing Manin's (2007) hypothesis that word frequency is proportional to "semantic extent." Approximates semantic extent using WordNet 3.0's noun DAG.
