In this section we re-present Bolthausen’s original proof of Berry Esseen using Stein’s method and fill in some computational details. I hope to make the argument more transparent to people trying to learn Stein’s method and Berry-Esseen theorem. The letter will denote a constant that varies from line to line, but will be a fixed absolute constant.
The classical Berry-Esseen theorems gives bound between the normal distribution function and the cumulative distribution function of , where are iid with mean 0 variance 1 and . It states
where is universal. Various efforts have been spent on finding the optimal , which is certainly less than . But we are only interested in the existence of such a .
First we can restate (1)as follows. Define
- is the collection of all length sequences of iid variables with prescribed first three absolute moments, ,,and respectively.
- is the indicator that , and
- , where is the standard normal density function, is the same as .
As a check of your understanding, the above supremum is bounded by in (1).
Next we want to get a recursive bound on etc. This will be achieved by manipulating the additive structure . In fact if we can prove
where we can choose a bigger on the right hand side, because the geometric sum can be bounded by , for some (the geometric decay dominates the growth). To bound , we will introduce some continuous approximations of the heavyside function . We define
where means taking minimum of the two sides, and maximum. In words, is the same as for and linearly interpolates and in between. We also denote , as the natural functional limit.
We also define the direct analogue of in the -smoothed case,
The last bound uses the fact that the normal density is bounded uniformly by , and that is supported on the interval and bounded by . The case is treated similarly. Hence taking the supremum over we get (8).
Now we can make further abstraction of (3) by defining , then we can rewrite
Thus we are reducing the problem of bounding difference of distribution functions to that of bounding the associated Stein operator. From now on we will write for , and when , we will write out explicitly.
which combined with (11) gives . By symmetry, the above inequalities hold for as well. So in summary we have , , and are all universally bounded.
Also from the definition , we get by a simple change of variable
where (note the difference from ). Therefore (19)becomes
The next comes the crucial steps in establishing the recursive bound (4). We will bound and the later integral separately:
- Using (18),
We use two comparisons with the Gaussian distribution to get, for any constant ,
Here we again use the uniform bound of the normal density. Observe that the final bound is in terms of instead of . This is due to (17)and nicely relates ‘s to ‘s. Also note that
Thus by conditioning first on , we get
- Now we bound the second term in (19) similarly, using (18)again:
This is the only place where is used.
So in sum, we have proved that
because by Holder’s inequality. Recalling (10), and the relation between and , we can take the supremum over all for a fixed , and , to get
Finally using the bound (8)between and , we can relate with :
It remains then to optimize over . Choosing , we arrive at (4), which concludes the proof of Berry-Esseen’s theorem.