Geometric Distribution - Wikipedia

Not to be confused with Hypergeometric distribution.

In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:

  • The probability distribution of the number $X$ of Bernoulli trials needed to get one success, supported on $\mathbb{N} = \{1, 2, 3, \ldots\}$;
  • The probability distribution of the number $Y = X - 1$ of failures before the first success, supported on $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$.

Geometric
[Plots: probability mass function; cumulative distribution function]

| Property | Number of trials $X$ | Number of failures $Y$ |
| --- | --- | --- |
| Notation | $\mathrm{Geom}(p)$ | $\mathrm{Geom}(p)$ |
| Parameters | $0 < p \leq 1$ success probability (real) | $0 < p \leq 1$ success probability (real) |
| Support | $k$ trials where $k \in \mathbb{N} = \{1, 2, 3, \dotsc\}$ | $k$ failures where $k \in \mathbb{N}_0 = \{0, 1, 2, \dotsc\}$ |
| PMF | $(1-p)^{k-1}p$ | $(1-p)^{k}p$ |
| CDF | $1-(1-p)^{\lfloor x\rfloor}$ for $x \geq 1$, $0$ for $x < 1$ | $1-(1-p)^{\lfloor x\rfloor + 1}$ for $x \geq 0$, $0$ for $x < 0$ |
| Mean | $\frac{1}{p}$ | $\frac{1-p}{p}$ |
| Median | $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil$ (not unique if $-1/\log_2(1-p)$ is an integer) | $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$ (not unique if $-1/\log_2(1-p)$ is an integer) |
| Mode | $1$ | $0$ |
| Variance | $\frac{1-p}{p^2}$ | $\frac{1-p}{p^2}$ |
| Skewness | $\frac{2-p}{\sqrt{1-p}}$ | $\frac{2-p}{\sqrt{1-p}}$ |
| Excess kurtosis | $6 + \frac{p^2}{1-p}$ | $6 + \frac{p^2}{1-p}$ |
| Entropy | $\frac{-(1-p)\log(1-p) - p\log p}{p}$ | $\frac{-(1-p)\log(1-p) - p\log p}{p}$ |
| MGF | $\frac{pe^{t}}{1-(1-p)e^{t}}$ for $t < -\ln(1-p)$ | $\frac{p}{1-(1-p)e^{t}}$ for $t < -\ln(1-p)$ |
| CF | $\frac{pe^{it}}{1-(1-p)e^{it}}$ | $\frac{p}{1-(1-p)e^{it}}$ |
| PGF | $\frac{pz}{1-(1-p)z}$ | $\frac{p}{1-(1-p)z}$ |
| Fisher information | $\frac{1}{p^2(1-p)}$ | $\frac{1}{p^2(1-p)}$ |
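
As a quick sanity check on the summary values above, the closed-form mean, variance, and median can be compared with values accumulated directly from the PMF. The following is a minimal Python sketch; the function names `closed_form_summary` and `numeric_summary` and the truncation length `terms` are illustrative assumptions, and the closed-form median requires $0 < p < 1$.

```python
import math

def closed_form_summary(p, failures=False):
    """Mean, variance and median from the closed forms (requires 0 < p < 1).

    failures=False counts trials (support 1, 2, ...);
    failures=True counts failures (support 0, 1, ...).
    """
    shift = 1 if failures else 0
    mean = 1 / p - shift                      # 1/p or (1-p)/p
    variance = (1 - p) / p**2                 # same for both conventions
    median = math.ceil(-1 / math.log2(1 - p)) - shift
    return mean, variance, median

def numeric_summary(p, failures=False, terms=2000):
    """Same quantities from a truncated sum over the first `terms` support points."""
    start = 0 if failures else 1
    mean = second_moment = cdf = 0.0
    median = None
    for k in range(start, start + terms):
        prob = (1 - p) ** (k - start) * p     # PMF at k
        mean += k * prob
        second_moment += k * k * prob
        cdf += prob
        if median is None and cdf >= 0.5:     # smallest k with CDF >= 1/2
            median = k
    return mean, second_moment - mean**2, median

if __name__ == "__main__":
    for failures in (False, True):
        print(failures, closed_form_summary(0.25, failures), numeric_summary(0.25, failures))
```

The truncated sums agree with the closed forms only up to floating-point error and the neglected tail, which is negligible here because the tail probability decays geometrically.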

These two different geometric distributions should not be confused with each other. Often, the name shifted geometric distribution is adopted for the former one (the distribution of $X$); however, to avoid ambiguity, it is considered wise to indicate which is intended by mentioning the support explicitly.

The geometric distribution gives the probability that the first occurrence of success requires $k$ independent trials, each with success probability $p$. If the probability of success on each trial is $p$, then the probability that the $k$-th trial is the first success is

$$\Pr(X = k) = (1-p)^{k-1} p$$

for $k = 1, 2, 3, 4, \dots$
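
To make the trials form concrete, here is a small Python sketch that evaluates this PMF exactly with rational arithmetic. The function names and the fair-die example (success means rolling a six, so $p = 1/6$) are assumptions chosen for illustration and do not come from the source.

```python
from fractions import Fraction

def pmf_trials(k, p):
    """Pr(X = k) = (1 - p)**(k - 1) * p for k = 1, 2, 3, ...; zero otherwise."""
    if k < 1:
        return 0 * p                      # zero of the same numeric type as p
    return (1 - p) ** (k - 1) * p

def cdf_trials(k, p):
    """Pr(X <= k), obtained by summing the PMF over 1..k."""
    return sum(pmf_trials(j, p) for j in range(1, k + 1))

if __name__ == "__main__":
    p = Fraction(1, 6)                    # hypothetical example: first six on a fair die
    print(pmf_trials(3, p))               # first six exactly on the third roll
    print(cdf_trials(3, p))               # first six within three rolls
    print(1 - (1 - p) ** 3)               # closed form 1 - (1 - p)^k for comparison
```

Using `Fraction` avoids rounding, so the summed CDF matches the closed form $1-(1-p)^{k}$ exactly.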

The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:

$$\Pr(Y = k) = \Pr(X = k + 1) = (1-p)^{k} p$$

for $k = 0, 1, 2, 3, \dots$
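
The failures form can be checked against a direct simulation of Bernoulli trials. The sketch below uses only Python's standard `random` module; the helper names, the sample size, and the seed are illustrative assumptions rather than anything prescribed by the source.

```python
import random

def pmf_failures(k, p):
    """Pr(Y = k) = (1 - p)**k * p for k = 0, 1, 2, ...; zero otherwise."""
    if k < 0:
        return 0.0
    return (1 - p) ** k * p

def sample_failures(p, rng):
    """Run Bernoulli(p) trials and count the failures before the first success."""
    failures = 0
    while rng.random() >= p:              # a trial succeeds when random() < p
        failures += 1
    return failures

def empirical_pmf(p, n_samples=100_000, seed=0):
    """Relative frequencies of Y over n_samples simulated experiments."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        y = sample_failures(p, rng)
        counts[y] = counts.get(y, 0) + 1
    return {k: c / n_samples for k, c in sorted(counts.items())}

if __name__ == "__main__":
    p = 0.3
    freq = empirical_pmf(p)
    for k in range(5):
        print(k, round(freq.get(k, 0.0), 4), round(pmf_failures(k, p), 4))
```

Adding 1 to each simulated value gives a sample of the number of trials $X$, which is the relation $Y = X - 1$ read in the other direction.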

The geometric distribution gets its name because its probabilities follow a geometric sequence. It is sometimes called the Furry distribution after Wendell H. Furry.[1]: 210 
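
The geometric-sequence structure can be made explicit with a short worked calculation, written here for the trials form (the failures form differs only by an index shift). For $0 < p < 1$ consecutive probabilities have the constant ratio $1 - p$, and the geometric series formula shows that the probabilities sum to 1:

```latex
\frac{\Pr(X = k+1)}{\Pr(X = k)}
  = \frac{(1-p)^{k}\,p}{(1-p)^{k-1}\,p}
  = 1 - p ,
\qquad
\sum_{k=1}^{\infty} (1-p)^{k-1}\,p
  = p \sum_{j=0}^{\infty} (1-p)^{j}
  = \frac{p}{1 - (1-p)}
  = 1 .
```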

Contents

  • 1 Definition
  • 2 Properties
    • 2.1 Memorylessness
    • 2.2 Moments and cumulants
      • 2.2.1 Proof of expected value
    • 2.3 Summary statistics
  • 3 Entropy and Fisher's information
    • 3.1 Entropy (geometric distribution, failures before success)
    • 3.2 Fisher's information (geometric distribution, failures before success)
    • 3.3 Entropy (geometric distribution, trials until success)
    • 3.4 Fisher's information (geometric distribution, trials until success)
    • 3.5 General properties
  • 4 Related distributions
  • 5 Statistical inference
    • 5.1 Method of moments
    • 5.2 Maximum likelihood estimation
    • 5.3 Bayesian inference
  • 6 Random variate generation
  • 7 Applications
  • 8 See also
  • 9 References
