zipf.Rd
A very short function that reproduces Zipf's law: a harmonic rank-probability distribution, formally
$$p(i)=\frac{i^{-1}}{\sum_{j=1}^{N} j^{-1}},\qquad i=1,\ldots,N$$
The volleyball dataset might reasonably be assumed to follow Zipf's law, but this hypothesis can be rejected at the \(5\%\) level (p-value about 0.042); see the examples.
zipf(n)
Returns a numeric vector summing to one
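The formula above can be sketched in a few lines of base R. This is only an illustration of the harmonic weights, not the package's own source: the name `zipf_sketch` and its integer argument `N` are hypothetical, and the real `zipf()` may also accept other objects, as the examples suggest.

```r
# Hypothetical helper (not the package's zipf()): harmonic rank probabilities
zipf_sketch <- function(N) {
  stopifnot(is.numeric(N), length(N) == 1, N >= 1)
  w <- 1 / seq_len(N)   # unnormalized weights i^{-1}, i = 1, ..., N
  w / sum(w)            # normalize by the N-th harmonic number
}

p6 <- zipf_sketch(6)    # six ranks, as in the icons example
sum(p6)                 # the probabilities sum to one
```

With six ranks the first probability is 1/2.45, which agrees with the leading value printed for `zipf(icons)` in the examples.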
zipf(icons)
#>         NB          L         PB        THC         OA       WAIS
#> 0.40816327 0.20408163 0.13605442 0.10204082 0.08163265 0.06802721
knownp.test(volleyball,zipf(volleyball))
#>
#> Constrained support maximization
#>
#> data: volleyball
#> null hypothesis: p1 = 0.3534858, p2 = 0.1767429, p3 = 0.1178286, p4 = 0.0883714, p5 = 0.0706972, p6 = 0.0589143, p7 = 0.050498, p8 = 0.0441857, p9 = 0.0392762
#> null estimate:
#>         p1         p2         p3         p4         p5         p6         p7
#> 0.35348576 0.17674288 0.11782859 0.08837144 0.07069715 0.05891429 0.05049797
#>         p8         p9
#> 0.04418572 0.03927620
#> (argmax, constrained optimization)
#> Support for null: -33.33813 + K
#>
#> alternative hypothesis: sum p_i=1
#> alternative estimate:
#>           p1           p2           p3           p4           p5           p6
#> 4.174629e-01 7.590639e-02 4.352590e-01 1.051670e-06 9.634895e-04 1.016641e-06
#>           p7           p8           p9
#> 1.000184e-06 2.904115e-02 4.136399e-02
#> (argmax, free optimization)
#> Support for alternative: -25.32838 + K
#>
#> degrees of freedom: 8
#> support difference = 8.009745
#> p-value: 0.04210199
#>
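The reported p-value can be checked by hand from the two support values. The sketch below assumes the test uses the usual asymptotic result that twice the support difference is chi-square distributed under the null (Wilks's theorem). That is an assumption about `knownp.test()`, not something the printed output states. The 8 degrees of freedom correspond to 9 probabilities constrained to sum to one.

```r
# Values copied from the printed output above
support_null <- -33.33813
support_alt  <- -25.32838
df <- 8                                  # 9 probabilities minus 1 constraint

delta <- support_alt - support_null      # support difference, about 8.01

# Assumed reference distribution: 2 * delta ~ chi-square with df degrees of freedom
p_value <- pchisq(2 * delta, df = df, lower.tail = FALSE)
p_value
```

Under that assumption the result agrees with the printed p-value to about four significant figures, the small discrepancy coming from the rounding of the support values.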