Given a sequence of real numbers c_1, ldots, c_n, and iid random variables a_1, a_2, ldots, a_n taking the values +1 and -1 with equal probability, we want to estimate the following expression:

sup_{x in RR} P(|a_1 c_1 + ldots + a_n c_n - x| < 1)

This innocent-looking expression essentially asks for the largest probability mass that the nth partial sum can place on an interval of length 2 (roughly, the mass at the mode of its distribution), if we think of the two sequences c_i, a_i as infinite.

When all the c_i are equal and nonzero, this supremum is asymptotically of order n^{-1/2}. It is also known that if the c_i form an arithmetic progression, then the supremum is asymptotically n^{-3/2} up to a constant factor.

The classical result of Littlewood and Offord (in the sharp form later proved by Erdős) states that for any sequence c_i with |c_i| >= 1, the supremum above is bounded by a constant multiple of n^{-1/2}. The proof is combinatorial and uses the notion of an antichain: a collection of subsets of a given set such that no member contains another. Sperner's theorem bounds the maximum size of an antichain of subsets of an n-element set by binom(n, floor(n/2)); one then observes that the collection of A subset {1,2,ldots, n} such that

sum_{i in A} c_i - sum_{i notin A} c_i in (x-1, x+1)

forms an antichain (after normalizing so that each c_i >= 1, enlarging A strictly increases the sum by at least 2, pushing it out of the interval). (See Tao and Vu's Additive Combinatorics for details.)
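To make the bound concrete, here is a brute-force check, my own illustration and feasible only for small n since it enumerates all 2^n sign patterns, that for c_i = 1 the supremum is exactly the Sperner bound binom(n, floor(n/2)) / 2^n:

```python
import itertools
import math
from collections import Counter

def max_concentration(c):
    """sup_x P(|a_1 c_1 + ... + a_n c_n - x| < 1) with a_i = +/-1 uniform iid."""
    n = len(c)
    sums = Counter(
        sum(a * ci for a, ci in zip(signs, c))
        for signs in itertools.product((-1, 1), repeat=n)
    )
    # A set of atoms fits inside some open interval of length 2 iff its
    # diameter is < 2, so it suffices to scan windows [v, v + 2) whose
    # left endpoint is an achievable sum.
    best = max(
        sum(cnt for w, cnt in sums.items() if v <= w < v + 2)
        for v in sums
    )
    return best / 2 ** n

# For c_i = 1 the sums are spaced 2 apart, so each window catches a
# single binomial atom and the supremum is the Sperner bound.
n = 10
assert max_concentration([1] * n) == math.comb(n, n // 2) / 2 ** n
```

The same helper also lets one eyeball the n^{-3/2} behavior for arithmetic progressions by plugging in c = [1, 2, ..., n].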

The analogue I considered is on a compact Lie group such as SU(2) or SO(3). But to state the problem properly, we need to make some accommodations in this noncommutative setting.

First we notice that multiplication is no longer commutative. Thus a sum of the form

sum_{i=1}^n a_i c_i,

with c_i in SO(3), no longer makes sense unless we interpret it as an ordered product, say prod_{i=1}^n c_i^{a_i}, and then the order has to be fixed in advance. Alternatively, one could consider the uniform probability measure on the set of all words of length n in two generators x, y, i.e., words of the form

z_1 z_2 z_3 ldots z_n

where each letter z_i in {x, x^{-1}, y, y^{-1}} is chosen independently and uniformly, so that each particular word has probability 4^{-n}. Under this setup we have a discrete-time, continuous-state-space Markov chain on SO(3), with each transition step given by the averaging operator 1/4(x + x^{-1} + y + y^{-1}), where x, for example, denotes multiplication by x.
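This random walk is easy to simulate; the generators below are an arbitrary choice of mine (rotations by 1 radian about the z- and x-axes, which should topologically generate SO(3)), not anything canonical:

```python
# Minimal simulation of the unreduced-word random walk on SO(3).
import numpy as np

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

x, y = rot_z(1.0), rot_x(1.0)          # arbitrary generating pair
steps = [x, x.T, y, y.T]               # for rotations, inverse = transpose

def random_word(n, rng):
    """Product z_1 z_2 ... z_n with each z_i uniform on {x, x^-1, y, y^-1}."""
    w = np.eye(3)
    for i in rng.integers(0, 4, size=n):
        w = w @ steps[i]
    return w

w = random_word(100, np.random.default_rng(0))
# The product stays in SO(3): orthogonal with determinant 1.
assert np.allclose(w @ w.T, np.eye(3))
assert np.isclose(np.linalg.det(w), 1.0)
```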

Now we have two cases to consider:

1. If x, y topologically generate the entire group SO(3), then the Markov chain is ergodic, so by standard Markov chain theory the running distribution converges to the unique stationary distribution, which in this case is the Haar distribution on SO(3). Stationarity is checked as follows: convolving the Haar distribution with one step of the Markov kernel amounts to averaging the convolutions with each of the four multipliers, and each of these preserves the Haar distribution, since Haar measure is by definition invariant under left and right multiplication. Thus as n goes to infinity, the following probability

P(w(x,y) in B(A,epsilon))

where A is an element of SO(3) and w(x,y) denotes a uniformly chosen unreduced word as described above, becomes constant with respect to A in the limit. Hence the supremum approaches mu(B(A, epsilon)), where mu is the Haar probability measure.

2. If x, y topologically generate a lower-dimensional subgroup of SO(3), then we simply get that the supremum equals mu_1(B(A, epsilon)) for A in that lower-dimensional subgroup (which, for a positive-dimensional proper subgroup of SO(3), is essentially a copy of SO(2), i.e., a circle, up to conjugacy), where mu_1 is the Haar probability measure of the subgroup.
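The invariance argument in case 1 can be illustrated numerically. Sampling Haar-distributed rotations via the QR factorization of a Gaussian matrix (a standard trick, and my own choice here, not from the post), the distribution of a statistic such as the trace is unchanged by left multiplication by a fixed rotation:

```python
# Check that left multiplication preserves Haar measure on SO(3):
# if Q is Haar-distributed, then x @ Q is Haar-distributed too.
import numpy as np

def haar_so3(rng):
    """Haar-random element of SO(3) via QR of a 3x3 Gaussian matrix."""
    g = rng.standard_normal((3, 3))
    q, r = np.linalg.qr(g)
    q = q * np.sign(np.diag(r))   # fix column signs to make Q Haar on O(3)
    if np.linalg.det(q) < 0:      # flip one column to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
x = haar_so3(rng)                 # a fixed rotation to multiply by
samples = [haar_so3(rng) for _ in range(20000)]
t0 = np.mean([np.trace(q) for q in samples])
t1 = np.mean([np.trace(x @ q) for q in samples])
# Under Haar measure on SO(3), E[trace] = 0; the two estimates agree.
assert abs(t0) < 0.05 and abs(t1) < 0.05
```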

Next we remark that the uniform distribution on the unreduced words of length n is somewhat unintelligent, because many words are equivalent at the free-group level; for example, x x^{-1} = id. So we could instead consider the uniform distribution on the reduced words, which are of the form

x^{k_1} y^{k_2} x^{k_3} ldots, where the exponents k_i are nonzero integers. The alternation between powers of x and powers of y eliminates the possibility of adjacent cancellation.

One observation I just made is that the set of reduced words can be obtained as the nth running state of a Markov chain on a four-element space, namely {x, x^{-1}, y, y^{-1}}. The transition rule is: if the current state is x, then the next letter cannot be x^{-1}, and similarly for the other states. In other words, trivial adjacent cancellation is forbidden, but everything else is allowed, each of the three remaining letters being chosen with probability 1/3.
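A sketch of this letter-level chain (my own illustration, with capital letters standing for inverses): forbidding only the inverse of the previous letter yields exactly 4 * 3^{n-1} reduced words of length n, each sampled with equal probability.

```python
import random

LETTERS = ["x", "X", "y", "Y"]          # capitals denote inverses
INV = {"x": "X", "X": "x", "y": "Y", "Y": "y"}

def random_reduced_word(n, rng=random):
    """Uniform reduced word: first letter uniform on 4, then uniform on 3."""
    word = [rng.choice(LETTERS)]
    for _ in range(n - 1):
        word.append(rng.choice([z for z in LETTERS if z != INV[word[-1]]]))
    return word

def all_reduced_words(n):
    """Enumerate every path of the letter chain, i.e. every reduced word."""
    words = [[z] for z in LETTERS]
    for _ in range(n - 1):
        words = [w + [z] for w in words for z in LETTERS if z != INV[w[-1]]]
    return words

assert len(all_reduced_words(5)) == 4 * 3 ** 4   # 324 reduced words
w = random_reduced_word(10)
assert all(w[i + 1] != INV[w[i]] for i in range(len(w) - 1))
```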

We can then consider the augmented state space {x, x^{-1}, y, y^{-1}} times SO(3), with the obvious extension of the Markov chain considered before. The nth running distribution, started at the identity element, is then exactly that of a uniformly chosen reduced word of length n evaluated in SO(3). The only thing to worry about is whether this Markov chain is ergodic. Thus we need to demonstrate that if x, y topologically generate SO(3), then for any element A in SO(3), one can come up with a reduced word, ending in any prescribed one of the four letters x, y, x^{-1}, y^{-1}, that is arbitrarily close to A. But this reduces to approximating x^{-1} A, etc., in the unreduced case, and ergodicity is already proved there. So by standard abstract nonsense of Markov chain theory, we again conclude that the nth running distribution of this augmented Markov chain converges to the Haar measure on SO(3), since the product of the uniform measure on the four letters with Haar measure on SO(3) is stationary with respect to this transition kernel, as can be checked by a symmetry consideration.

In conclusion, for either of the two notions of the noncommutative Littlewood-Offord problem, the answer is trivially proportional to the Haar measure of the target set in question, assuming nondegeneracy of the generating set.
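Putting the pieces together, here is a sketch of the augmented chain (the generators are again an arbitrary choice of mine, rotations by 1 radian about two axes): the letter component enforces reducedness while the SO(3) component carries the running product.

```python
import numpy as np

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

x, y = rot_z(1.0), rot_x(1.0)
MAT = {"x": x, "X": x.T, "y": y, "Y": y.T}   # capitals denote inverses
INV = {"x": "X", "X": "x", "y": "Y", "Y": "y"}

def reduced_word_walk(n, rng):
    """Augmented chain: returns (last letter, product of a uniform
    reduced word of length n evaluated in SO(3))."""
    letters = list(MAT)
    z = letters[rng.integers(0, 4)]          # first letter: uniform on 4
    g = MAT[z]
    for _ in range(n - 1):
        choices = [w for w in letters if w != INV[z]]
        z = choices[rng.integers(0, 3)]      # never cancel the last letter
        g = g @ MAT[z]
    return z, g

z, g = reduced_word_walk(200, np.random.default_rng(0))
assert np.allclose(g @ g.T, np.eye(3))       # product stays in SO(3)
assert np.isclose(np.linalg.det(g), 1.0)
```

For long words, histogramming a conjugation-invariant statistic of g (say the trace) over many runs should approach its law under Haar measure, matching the conclusion above.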