Statistics

Things and Stuff Wiki - An organically evolving personal wiki knowledge base. An on-the-fly taxonomy containing a patchwork trail of topic outlines, descriptions, notes, stubs and breadcrumbs, with links to sites, systems, software, manuals, organisations, people, articles, guides, slides, papers, books, comments, videos, screencasts, webcasts, scratchpads and more. Content is orientated towards mostly free/libre/open, mostly Linux. Quality and age varies drastically. Sometimes old things are first, sometimes last. Use the Table of Contents menu to navigate long pages. Zoom in if text is too small. Dead link? Wayback Machine. I probably need to fix the theme CSS after an update. See also libreav.org. Chat to msg me (not checking tho atm).

General

https://en.wikipedia.org/wiki/Statistics - the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.

Probability theory

https://en.wikipedia.org/wiki/Probability_theory

https://news.ycombinator.com/item?id=10319698

https://en.wikipedia.org/wiki/Gaussian_process - a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. every finite linear combination of them is normally distributed. The distribution of a Gaussian process is the joint distribution of all those (infinitely many) random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.
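A minimal sketch of the "every finite collection is multivariate normal" idea: pick a finite grid of inputs, build a covariance matrix, and draw from the resulting multivariate normal. The squared-exponential kernel, the zero mean, the grid, and the names rbf_kernel and sample_gp_prior are assumptions made for illustration, not part of the definition above.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs (assumed kernel)."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def sample_gp_prior(x, n_samples=3, jitter=1e-6, seed=0):
    """Draw functions from a zero-mean GP, evaluated on the finite grid x."""
    rng = np.random.default_rng(seed)
    # Jitter on the diagonal keeps the Cholesky factorisation numerically stable.
    cov = rbf_kernel(x, x) + jitter * np.eye(len(x))
    chol = np.linalg.cholesky(cov)
    # If z ~ N(0, I), then chol @ z ~ N(0, cov).
    z = rng.standard_normal((len(x), n_samples))
    return (chol @ z).T

x = np.linspace(0.0, 5.0, 20)
samples = sample_gp_prior(x)
print(samples.shape)  # one row per sampled function
```

Each row is one sampled function restricted to the grid; a finer grid gives a closer view of the same distribution over functions.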

A Visual Exploration of Gaussian Processes - [1]

Statistics

https://en.wikipedia.org/wiki/Mathematical_statistics

https://en.wikipedia.org/wiki/Probability_distribution

https://en.wikipedia.org/wiki/Median

https://en.wikipedia.org/wiki/Partition_function_(statistical_mechanics) - describes the statistical properties of a system in thermodynamic equilibrium. Partition functions are functions of the thermodynamic state variables, such as the temperature and volume. Most of the aggregate thermodynamic variables of the system, such as the total energy, free energy, entropy, and pressure, can be expressed in terms of the partition function or its derivatives. The partition function is dimensionless: it is a pure number.
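A small sketch of "expressed in terms of the partition function or its derivatives", assuming a system with a few discrete energy levels and the canonical form Z = sum of exp(-beta * E). The three-level system, the units (beta = 1), and the function names are hypothetical. The mean energy is computed twice: directly as a Boltzmann-weighted average, and as minus the derivative of ln Z with respect to beta.

```python
import math

def partition_function(energies, beta):
    """Canonical partition function for discrete energy levels."""
    return sum(math.exp(-beta * e) for e in energies)

def mean_energy_direct(energies, beta):
    """Boltzmann-weighted average energy."""
    z = partition_function(energies, beta)
    return sum(e * math.exp(-beta * e) for e in energies) / z

def mean_energy_from_log_z(energies, beta, h=1e-6):
    """Mean energy as -d(ln Z)/d(beta), using a central finite difference."""
    up = math.log(partition_function(energies, beta + h))
    down = math.log(partition_function(energies, beta - h))
    return -(up - down) / (2 * h)

levels = [0.0, 1.0, 2.0]  # hypothetical energy levels, in units where beta = 1 / (k_B T)
beta = 1.0
print(mean_energy_direct(levels, beta))
print(mean_energy_from_log_z(levels, beta))
```

The two printed values should agree to within the finite-difference error.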

https://en.wikipedia.org/wiki/Partition_function_(quantum_field_theory) - the generating functional of all correlation functions, generalizing the characteristic function of probability theory.

https://en.wikipedia.org/wiki/Principal_component_analysis - a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. The resulting vectors (each being a linear combination of the variables and containing n observations) are an uncorrelated orthogonal basis set. PCA is sensitive to the relative scaling of the original variables.
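A rough sketch of the procedure described above, using the SVD of the mean-centred data as one common way to obtain the orthogonal transformation. The synthetic two-column dataset and the function name pca are made up for illustration.

```python
import numpy as np

def pca(data, n_components):
    """PCA via SVD of the mean-centred data matrix (rows are observations)."""
    centered = data - data.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]           # orthonormal principal axes
    scores = centered @ components.T         # data expressed in the new basis
    variance = singular_values[:n_components] ** 2 / (len(data) - 1)
    return scores, components, variance

# Hypothetical data: two strongly correlated variables.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=200)])

scores, components, variance = pca(data, 2)
print(variance)               # first component carries most of the variance
print(np.cov(scores.T))       # off-diagonal entries are close to zero
```

Because PCA is sensitive to relative scaling, it is common to standardise each column before calling something like this when the variables have different units.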

https://en.wikipedia.org/wiki/Likelihood_function

Ergodicity, what's it mean - Avoid Boring People - [2]

YouTube: What is ergodicity? - Alex Adamou

https://github.com/jtleek/datasharing

https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python [3]

http://www.mcgrayne.com/the_theory_that_would_not_die__how_bayes__rule_cracked_the_enigma_code__hunted_d_107493.htm [4]

http://blogs.scientificamerican.com/cross-check/bayes-s-theorem-what-s-the-big-deal/ [5]

YouTube: Bayes theorem, the geometry of changing beliefs

https://en.wikipedia.org/wiki/Likert_scale - a psychometric scale commonly involved in research that employs questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term (or more fully the Likert-type scale) is often used interchangeably with rating scale, although there are other types of rating scales. The scale is named after its inventor, psychologist Rensis Likert. Likert distinguished between a scale proper, which emerges from collective responses to a set of items (usually eight or more), and the format in which responses are scored along a range. Technically speaking, a Likert scale refers only to the former. The difference between these two concepts has to do with the distinction Likert made between the underlying phenomenon being investigated and the means of capturing variation that points to the underlying phenomenon.

https://en.wikipedia.org/wiki/Principal_component_analysis
- http://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues

http://www.refsmmat.com/statistics/

http://spin.atomicobject.com/2015/02/12/central-limit-theorem-intro/ [6]

http://www.edwardtufte.com/tufte/gould

https://scientistseessquirrel.wordpress.com/2015/02/09/in-defence-of-the-p-value/

http://www.uv.es/sestio/TechRep/tr14-03.pdf [7]

https://news.ycombinator.com/item?id=9544987

https://news.ycombinator.com/item?id=11753805

https://en.wikipedia.org/wiki/Frequency_domain - refers to the analysis of mathematical functions or signals with respect to frequency, rather than time. Put simply, a time-domain graph shows how a signal changes over time, whereas a frequency-domain graph shows how much of the signal lies within each given frequency band over a range of frequencies. A frequency-domain representation can also include information on the phase shift that must be applied to each sinusoid in order to be able to recombine the frequency components to recover the original time signal.
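A quick sketch of moving between the two views with NumPy's FFT. The sample rate and the two test tones (5 Hz and 12 Hz) are arbitrary choices for the example.

```python
import numpy as np

sample_rate = 100.0                       # Hz, hypothetical
t = np.arange(200) / sample_rate          # 2 seconds of samples
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# Time domain -> frequency domain (real-input FFT).
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
amplitude = 2 * np.abs(spectrum) / len(signal)   # how much signal at each frequency
phase = np.angle(spectrum)                       # phase shift of each sinusoid

# The two strongest bins should be the two tones that were mixed in.
top_two = np.sort(freqs[np.argsort(amplitude)[-2:]])
print(top_two)

# Amplitude and phase together are enough to recover the time signal.
recovered = np.fft.irfft(spectrum, n=len(signal))
print(np.allclose(recovered, signal))
```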

YouTube: Microscopy: Fourier Space (Bo Huang)

https://en.wikipedia.org/wiki/Simpson's_paradox - a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. This result is often encountered in social-science and medical-science statistics and is particularly problematic when frequency data is unduly given causal interpretations. The paradox can be resolved when causal relations are appropriately addressed in the statistical modeling. Simpson's paradox has been used as an exemplar to illustrate to the non-specialist or public audience the kind of misleading results mis-applied statistics can generate. Martin Gardner wrote a popular account of Simpson's paradox in his March 1976 Mathematical Games column in Scientific American.
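A worked example of the reversal, with illustrative counts chosen for the demonstration (they are not from the text above). Option A has the higher success rate within each group, yet option B looks better once the groups are pooled, because the group sizes are very unbalanced.

```python
from fractions import Fraction

# (successes, trials) per option and group; illustrative counts only.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def group_rate(option, group):
    """Success rate of one option within one group."""
    successes, trials = outcomes[option][group]
    return Fraction(successes, trials)

def pooled_rate(option):
    """Success rate of one option with all groups combined."""
    successes = sum(s for s, _ in outcomes[option].values())
    trials = sum(n for _, n in outcomes[option].values())
    return Fraction(successes, trials)

for group in ("small", "large"):
    print(group, float(group_rate("A", group)), float(group_rate("B", group)))
print("pooled", float(pooled_rate("A")), float(pooled_rate("B")))
```

Here A is mostly applied to the harder "large" group and B to the easier "small" group, so pooling mixes the effect of the option with the effect of the group.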

https://en.wikipedia.org/wiki/Kriging - or Gaussian process regression, a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances. Under suitable assumptions on the priors, kriging gives the best linear unbiased prediction of the intermediate values. Interpolating methods based on other criteria such as smoothness (e.g., smoothing spline) may not yield the most likely intermediate values. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener–Kolmogorov prediction, after Norbert Wiener and Andrey Kolmogorov.

Figure: Example of one-dimensional data interpolation by kriging, with confidence intervals. Squares indicate the location of the data. The kriging interpolation, shown in red, runs along the means of the normally distributed confidence intervals shown in gray. The dashed curve shows a spline that is smooth, but departs significantly from the expected intermediate values given by those means.

The theoretical basis for the method was developed by the French mathematician Georges Matheron in 1960, based on the Master's thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes. The English verb is to krige and the most common noun is kriging; both are often pronounced with a hard "g", following an Anglicized pronunciation of the name "Krige". The word is sometimes capitalized as Kriging in the literature. Though computationally intensive in its basic formulation, kriging can be scaled to larger problems using various approximation methods.
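A bare-bones sketch of one-dimensional kriging in its basic formulation. It assumes simple kriging (a known zero mean), a squared-exponential covariance with unit variance, and a handful of made-up observations; the names rbf_kernel and krige are hypothetical. The linear solve against the observation covariance matrix is the step that makes the basic method expensive for large datasets.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential prior covariance (assumed for this sketch)."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def krige(x_obs, y_obs, x_new, length_scale=1.0, noise=1e-8):
    """Simple kriging: predictive mean and variance at x_new under a zero-mean GP prior."""
    k_oo = rbf_kernel(x_obs, x_obs, length_scale) + noise * np.eye(len(x_obs))
    k_no = rbf_kernel(x_new, x_obs, length_scale)
    # One row of kriging weights per prediction point.
    weights = np.linalg.solve(k_oo, k_no.T).T
    mean = weights @ y_obs
    variance = 1.0 - np.sum(weights * k_no, axis=1)  # prior variance is 1 here
    return mean, variance

x_obs = np.array([0.0, 1.0, 3.0, 4.0])   # hypothetical sample locations
y_obs = np.sin(x_obs)                    # hypothetical measured values
x_new = np.linspace(0.0, 4.0, 9)

mean, variance = krige(x_obs, y_obs, x_new)
print(mean)
print(variance)  # near zero at the data, largest in the gap around x = 2
```

The predicted mean passes (almost exactly) through the observations, and the variance grows with distance from them, which is what the confidence bands in a typical kriging plot show.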

Psychometrics

https://en.wikipedia.org/wiki/Psychometrics
