Minimum-distance estimation

Method for fitting a statistical model to data

Minimum-distance estimation (MDE) is a conceptual method for fitting a statistical model to data, usually to the empirical distribution of the data. Often-used estimators such as ordinary least squares can be thought of as special cases of minimum-distance estimation.

While consistent and asymptotically normal, minimum-distance estimators are generally not statistically efficient when compared to maximum likelihood estimators, because they omit the Jacobian usually present in the likelihood function. This, however, substantially reduces the computational complexity of the optimization problem.

Definition

Let $X_1,\ldots,X_n$ be an independent and identically distributed (iid) random sample from a population with distribution $F(x;\theta)$, $\theta \in \Theta$, where $\Theta \subseteq \mathbb{R}^k$ ($k \geq 1$).

Let $F_n(x)$ be the empirical distribution function based on the sample.

Let $\hat{\theta}$ be an estimator for $\theta$. Then $F(x;\hat{\theta})$ is an estimator for $F(x;\theta)$.

Let $d[\cdot,\cdot]$ be a functional returning some measure of "distance" between its two arguments. The functional $d$ is also called the criterion function.

If there exists a $\hat{\theta} \in \Theta$ such that $d[F(x;\hat{\theta}),F_n(x)] = \inf\{\, d[F(x;\theta),F_n(x)] : \theta \in \Theta \,\}$, then $\hat{\theta}$ is called the minimum-distance estimate of $\theta$ (Drossos & Philippou 1980, p. 121).
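
As a concrete numerical illustration of this definition, the following sketch estimates the rate of an exponential distribution by minimising a Cramér–von Mises-type distance between $F(x;\hat{\theta})$ and $F_n(x)$; the exponential model, the simulated sample and the particular choice of distance are illustrative assumptions rather than part of the general definition.

```python
import numpy as np
from scipy import optimize, stats


def cvm_distance(rate, data):
    """Cramér–von Mises-type distance between the fitted exponential CDF
    F(x; rate) and the empirical CDF F_n (computational form of the statistic)."""
    x = np.sort(data)
    n = len(x)
    F_theta = stats.expon.cdf(x, scale=1.0 / rate)  # F(x_(i); theta) at the order statistics
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + np.sum((F_theta - (2 * i - 1) / (2 * n)) ** 2)


# Illustrative data: a simulated exponential sample with rate 2 (an assumption).
rng = np.random.default_rng(0)
data = rng.exponential(scale=0.5, size=500)

# The minimum-distance estimate is the parameter value minimising d[F(.; theta), F_n].
result = optimize.minimize_scalar(cvm_distance, bounds=(1e-6, 50.0),
                                  args=(data,), method="bounded")
print("minimum-distance estimate of the rate:", result.x)
```

Replacing cvm_distance with any other criterion function $d$ (for example those listed below) leaves the structure of the estimator unchanged.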

Statistics used in estimation

Most theoretical studies of minimum-distance estimation, and most applications, make use of "distance" measures which underlie already-established goodness of fit tests: the test statistic used in one of these tests is used as the distance measure to be minimised. Below are some examples of statistical tests that have been used for minimum-distance estimation.

Chi-square criterion

The chi-square test uses as its criterion the sum, over predefined groups, of the squared difference between the increments of the empirical distribution function and of the estimated distribution function over each group, weighted by the increment of the estimated distribution for that group.
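
With the groups written, for concreteness, as intervals $(a_j, b_j]$, this criterion takes the form

$$ d[F(\cdot;\theta),F_n] = \sum_{j} \frac{\Bigl(\bigl[F_n(b_j)-F_n(a_j)\bigr]-\bigl[F(b_j;\theta)-F(a_j;\theta)\bigr]\Bigr)^{2}}{F(b_j;\theta)-F(a_j;\theta)}. $$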

Cramér–von Mises criterion

The Cramér–von Mises criterion uses the integral of the squared difference between the empirical and the estimated distribution functions (Parr & Schucany 1980, p. 616).
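
In the notation of the definition above, this criterion can be written (up to the conventional factor of $n$) as

$$ d[F(\cdot;\theta),F_n] = \int_{-\infty}^{\infty} \bigl[F_n(x)-F(x;\theta)\bigr]^{2}\,dF(x;\theta). $$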

Kolmogorov–Smirnov criterion

The Kolmogorov–Smirnov test uses the supremum of the absolute difference between the empirical and the estimated distribution functions (Parr & Schucany 1980, p. 616).
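
In the same notation, this criterion is

$$ d[F(\cdot;\theta),F_n] = \sup_{x}\,\bigl|F_n(x)-F(x;\theta)\bigr|. $$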

Anderson–Darling criterion

The Anderson–Darling test is similar to the Cramér–von Mises criterion except that the integral is of a weighted version of the squared difference, where the weighting relates to the variance of the empirical distribution function (Parr & Schucany 1980, p. 616).
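
In the same notation, the weighted form underlying the Anderson–Darling statistic is (again up to a factor of $n$)

$$ d[F(\cdot;\theta),F_n] = \int_{-\infty}^{\infty} \frac{\bigl[F_n(x)-F(x;\theta)\bigr]^{2}}{F(x;\theta)\bigl[1-F(x;\theta)\bigr]}\,dF(x;\theta), $$

where the weight is inversely proportional to the variance of $F_n(x)$ under the fitted model.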

Theoretical results

The theory of minimum-distance estimation is related to that of the asymptotic distribution of the corresponding statistical goodness of fit tests. The cases of the Cramér–von Mises criterion, the Kolmogorov–Smirnov test and the Anderson–Darling test are often handled simultaneously as special cases of a more general formulation of the distance measure. Examples of the theoretical results that are available include the consistency of the parameter estimates and their asymptotic covariance matrices.

See also

  • Maximum likelihood estimation
  • Maximum spacing estimation

References

  • Boos, Dennis D. (1982). "Minimum Anderson–Darling estimation". Communications in Statistics – Theory and Methods. 11 (24): 2747–2774. doi:10.1080/03610928208828420. S2CID 119812213.
  • Blyth, Colin R. (June 1970). "On the Inference and Decision Models of Statistics". The Annals of Mathematical Statistics. 41 (3): 1034–1058. doi:10.1214/aoms/1177696980.
  • Drossos, Constantine A.; Philippou, Andreas N. (December 1980). "A Note on Minimum Distance Estimates". Annals of the Institute of Statistical Mathematics. 32 (1): 121–123. doi:10.1007/BF02480318. S2CID 120207485.
  • Parr, William C.; Schucany, William R. (1980). "Minimum Distance and Robust Estimation". Journal of the American Statistical Association. 75 (371): 616–624. CiteSeerX 10.1.1.878.5446. doi:10.1080/01621459.1980.10477522. JSTOR 2287658.
  • Wolfowitz, J. (March 1957). "The minimum distance method". The Annals of Mathematical Statistics. 28 (1): 75–88. doi:10.1214/aoms/1177707038.