### Sobol (Quasi Random) sequence simplified

A number of major investment banks, hedge funds and investment research firms use Sobol as their primary random number generator for pricing OTC products, especially for low-dimension problems.

There are several technical papers already available online on the Sobol sequence generation algorithm. A few such links have been provided towards the end of the article. This article, however, does not intend to be too technical but rather aims at providing a simplified 'Dummies' guide to implementing Sobol sequence generation.

Before getting into the heart of the algorithm, let us briefly describe the following concepts.

##### Bitwise XOR operator

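Represented by the ⊕ sign. Bitwise XOR performs a logical exclusion on two expressions, bit by bit: a result bit is 1 when the corresponding input bits differ and 0 when they are the same. For example, 5 ⊕ 3 = 6, since 101 ⊕ 011 = 110 in binary.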
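The operator can be tried out directly in code. The snippet below is a minimal Python sketch (Python spells XOR as `^`); the operand values are arbitrary and chosen only for illustration.

```python
# Bitwise XOR: a result bit is 1 exactly where the two input bits differ.
a = 0b1100  # 12
b = 0b1010  # 10
result = a ^ b  # Python writes the XOR operator as ^

print(format(a, "04b"), "XOR", format(b, "04b"), "=", format(result, "04b"))
print(result)  # 6

# XOR-ing twice with the same value restores the original number.
restored = result ^ b
print(restored == a)  # True
```

The second property (XOR undoes itself) is what later allows a single XOR to move from one point of the sequence to the next.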

##### Primitive polynomials over GF(2)

[GF(2) simply means that each coefficient a_{k} in the polynomial below can only be 0 or 1, and that arithmetic on the coefficients is done modulo 2]

To generate a Sobol quasi-random sequence, we have to supply primitive polynomials over GF(2) as input to the algorithm. Various researchers have already generated and published these polynomials, and they can be used directly (which we recommend) when implementing the algorithm. One such primitive polynomial table is published at A011260.

Even though these primitive polynomials are readily available, it is nevertheless good to understand what they are. At this stage you could read on for further information or skip this section if you are in a hurry.

So what is a primitive polynomial?

Consider an irreducible polynomial over GF(2) of degree s_{j}:

x^{s_j} + a_{1}x^{s_j-1} + a_{2}x^{s_j-2} + ... + a_{s_j-1}x + 1

This polynomial is said to be primitive if it has the order 2^{s_j} - 1.

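[Order of a polynomial? The order of a polynomial P(x) with P(0) ≠ 0 is the smallest positive integer e for which P(x) exactly divides x^{e} + 1. Because the coefficients are taken modulo 2, x^{e} + 1 and x^{e} - 1 are the same polynomial. For example, 1 + x + x^{2} divides x^{3} + 1 = (1 + x)(1 + x + x^{2}) but divides neither x + 1 nor x^{2} + 1, hence its order is 3 (3 = 2^{2} - 1, s_{j} = 2).]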
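The definition of order can be checked mechanically. The following is a minimal Python sketch, not taken from any library: the function names `poly_order` and `is_primitive` and the bitmask encoding (bit i holds the coefficient of x^{i}) are assumptions made for this illustration. It repeatedly multiplies by x modulo P(x) until the remainder returns to 1.

```python
def poly_order(p: int) -> int:
    """Smallest e >= 1 such that p(x) divides x^e + 1, coefficients mod 2.

    p is a bitmask: bit i holds the coefficient of x^i,
    so 0b111 encodes 1 + x + x^2.
    """
    if p & 1 == 0:
        raise ValueError("the order is only defined when P(0) != 0")
    degree = p.bit_length() - 1
    if degree == 0:
        return 1  # the constant polynomial 1 divides everything
    r = 1  # remainder of x^0 modulo p
    e = 0
    while True:
        r <<= 1                  # multiply the remainder by x
        if (r >> degree) & 1:    # degree reached: subtract (XOR) p
            r ^= p
        e += 1
        if r == 1:               # x^e = 1 (mod p), so p divides x^e + 1
            return e


def is_primitive(p: int) -> bool:
    """Order criterion from the text: order equals 2^degree - 1.

    Reaching this maximal order already implies irreducibility
    (a standard result), so no separate irreducibility test is done.
    """
    degree = p.bit_length() - 1
    return degree >= 1 and p & 1 == 1 and poly_order(p) == 2 ** degree - 1


print(poly_order(0b111))      # 1 + x + x^2         -> 3
print(poly_order(0b1011))     # 1 + x + x^3         -> 7
print(is_primitive(0b10011))  # 1 + x + x^4         -> True
print(is_primitive(0b11111))  # 1 + x + ... + x^4   -> False (order 5)
```

The last case shows why the distinction matters: 1 + x + x^{2} + x^{3} + x^{4} is irreducible but has order 5 rather than 15, so it is not primitive.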

Given below is a list of a few primitive polynomials over GF(2).

| s_{j} | Primitive polynomials |
| --- | --- |
| 1 | 1 + x |
| 2 | 1 + x + x^{2} |
| 3 | 1 + x + x^{3}, 1 + x^{2} + x^{3} |
| 4 | 1 + x + x^{4}, 1 + x^{3} + x^{4} |
| 5 | 1 + x^{2} + x^{5}, 1 + x + x^{2} + x^{3} + x^{5}, 1 + x^{3} + x^{5}, 1 + x + x^{3} + x^{4} + x^{5}, 1 + x^{2} + x^{3} + x^{4} + x^{5}, 1 + x + x^{2} + x^{4} + x^{5} |

##### Direction numbers v_{i} and initialisation numbers m_{i}

Now that we understand what primitive polynomials are, what do we need next?

**Direction numbers!!!!**

To generate a Sobol low-discrepancy quasi-random sequence, we now have to assume an initial set of initialisation numbers m_{1}, m_{2}, m_{3}, ..., m_{s_j}. (Remember s_{j} is also the degree of the primitive polynomial we discussed previously.) Each assumed m_{i} (the i^{th} initialisation number), however, must satisfy two criteria: it should be an odd integer and it should be less than 2^{i}. That is, m_{1} < 2, m_{2} < 2^{2}, m_{3} < 2^{3} and so on.

However, this does not mean that any set of integers satisfying these two criteria will generate a good-quality low-discrepancy sequence. In fact, the quality of the sequence depends quite a bit on the assumed set of initialisation numbers. A few authors have researched and arrived at sets of initialisation numbers which produce good-quality sequences. See [3] for example.

The values for m_{s_j+1} and beyond are determined using the following recursive equation:

m_{k} = 2a_{1}m_{k-1} ⊕ 2^{2}a_{2}m_{k-2} ⊕ ... ⊕ 2^{s_j-1}a_{s_j-1}m_{k-s_j+1} ⊕ 2^{s_j}m_{k-s_j} ⊕ m_{k-s_j}

where ⊕ is the bitwise XOR operator and a_{k} are the coefficients of the primitive polynomial x^{s_j} + a_{1}x^{s_j-1} + a_{2}x^{s_j-2} + ... + a_{s_j-1}x + 1.

**Example**

Suppose we have the primitive polynomial x^{3} + x + 1 (a_{1} = 0 and a_{2} = 1), so that the recursive equation for m_{i} simplifies to the following:

m_{i} = 4m_{i-2} ⊕ 8m_{i-3} ⊕ m_{i-3}

At this stage we assume the initialisation values m_{1} = 1, m_{2} = 3, and m_{3} = 7.

We calculate m_{4}, m_{5}, etc. using the recursive equation above.

m_{4} = 12 ⊕ 8 ⊕ 1 = 5,

m_{5} = 28 ⊕ 24 ⊕ 3 = 7, and so on.
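The recursion is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name `extend_init_numbers` and its argument layout are assumptions made for this illustration; the formula itself is the recursive equation given above, with multiplication by 2^{i} written as a left shift.

```python
def extend_init_numbers(coeffs, m_init, count):
    """Extend m_1..m_s to m_1..m_count using the XOR recursion.

    coeffs: [a_1, ..., a_{s-1}], the inner coefficients of
            x^s + a_1 x^(s-1) + ... + a_{s-1} x + 1
    m_init: [m_1, ..., m_s], the assumed initialisation numbers
    """
    s = len(m_init)
    if len(coeffs) != s - 1:
        raise ValueError("need exactly s - 1 inner coefficients")
    m = list(m_init)
    while len(m) < count:
        k = len(m)  # list index of the value being computed (m_{k+1})
        new = (m[k - s] << s) ^ m[k - s]    # 2^s * m_{k-s} XOR m_{k-s}
        for i, a in enumerate(coeffs, start=1):
            if a:
                new ^= m[k - i] << i        # 2^i * a_i * m_{k-i}
        m.append(new)
    return m


# Worked example: x^3 + x + 1, i.e. a_1 = 0, a_2 = 1, with m = 1, 3, 7.
m = extend_init_numbers([0, 1], [1, 3, 7], 5)
print(m)  # [1, 3, 7, 5, 7]
```

Every generated value stays odd and below 2^{k}, which is the same pair of criteria imposed on the assumed initial values.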

Now that we know how to calculate the initialisation numbers m_{i}, we shall move on to understand how direction numbers are calculated.

Direction numbers v_{k} are simply defined by the equation v_{k} = m_{k}/2^{k}.

So, if m_{1} = 1, m_{2} = 3, m_{3} = 7, m_{4} = 5 and m_{5} = 7, then the direction numbers are v_{1} = 0.5, v_{2} = 0.75, v_{3} = 0.875, v_{4} = 0.3125 and v_{5} = 0.21875.
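A short Python sketch of this step follows. Besides the fractions themselves it also shows an integer-scaled form, v_{k} · 2^{BITS}, because XOR in most languages works on integers rather than on fractions; the constant `BITS` and the scaled representation are assumptions of this sketch, not something the article prescribes.

```python
from fractions import Fraction

m = [1, 3, 7, 5, 7]  # m_1..m_5 from the worked example

# Direction numbers as exact fractions: v_k = m_k / 2^k.
v = [Fraction(m_k, 2 ** k) for k, m_k in enumerate(m, start=1)]
v_float = [float(x) for x in v]
print(v_float)  # [0.5, 0.75, 0.875, 0.3125, 0.21875]

# The same numbers scaled by 2^BITS so that they become integers.
BITS = 5
v_int = [m_k << (BITS - k) for k, m_k in enumerate(m, start=1)]
print(v_int)  # [16, 24, 28, 10, 7]
```

Dividing any entry of `v_int` by 2^{BITS} = 32 gives back the corresponding v_{k}.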

##### The Gray code

The Gray code for a number n is defined by the equation:

G(n) = n ⊕ [n/2], where [.] represents the integral part function.

Using the above equation we get G(1) = 1, G(2) = 3, G(3) = 2, G(4) = 6, G(5) = 7 and so on.
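In code the integral part of n/2 is just a right shift by one bit. A minimal Python sketch (the function name `gray` is chosen here for illustration):

```python
def gray(n: int) -> int:
    """Gray code of n: n XOR floor(n / 2)."""
    return n ^ (n >> 1)


codes = [gray(n) for n in range(1, 6)]
print(codes)  # [1, 3, 2, 6, 7]

# Consecutive Gray codes differ in exactly one bit.
one_bit_steps = all(
    bin(gray(n) ^ gray(n + 1)).count("1") == 1 for n in range(100)
)
print(one_bit_steps)  # True
```

The one-bit-per-step property is the reason the Gray code is useful here: moving from one point to the next changes only one term of the XOR sum.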

##### The Gray code algorithm

The Sobol low discrepancy sequence can now be generated using the equation:

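x^{n} = g_{1}v_{1} ⊕ g_{2}v_{2} ⊕ g_{3}v_{3} ⊕ ...

where ...g_{3}g_{2}g_{1} is the binary representation of G(n), and ⊕ acts on the binary expansions of the direction numbers (for example, 0.5 = 0.100 and 0.75 = 0.110 in binary, so 0.5 ⊕ 0.75 = 0.010 = 0.25).

Using this we get,

x^{1} = 0.5 [G(1) = 1]

x^{2} = 0.5 ⊕ 0.75 = 0.25 [G(2) = 3]

x^{3} = 0.75 [G(3) = 2]

x^{4} = 0.75 ⊕ 0.875 = 0.125 [G(4) = 6]

and so on.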
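The construction can be sketched in Python as follows. To XOR fractions, the sketch stores each direction number as the integer v_{k} · 2^{BITS}; that scaling, the constant `BITS` and the function name `sobol_gray` are assumptions of this illustration. With only five direction numbers it is valid for n below 2^{5}.

```python
BITS = 5
M = [1, 3, 7, 5, 7]  # m_1..m_5 from the worked example
# Integer-scaled direction numbers: V[k-1] = v_k * 2^BITS.
V = [m_k << (BITS - k) for k, m_k in enumerate(M, start=1)]


def sobol_gray(n: int) -> float:
    """x^n = XOR of the v_k whose bit g_k is set in the Gray code of n."""
    if not 0 <= n < 2 ** BITS:
        raise ValueError("n needs more direction numbers than are available")
    g = n ^ (n >> 1)            # Gray code G(n)
    x = 0
    for k, v_k in enumerate(V):
        if (g >> k) & 1:        # bit g_{k+1} of G(n)
            x ^= v_k
    return x / 2 ** BITS        # scale back into [0, 1)


first_points = [sobol_gray(n) for n in range(1, 5)]
print(first_points)  # [0.5, 0.25, 0.75, 0.125]
```

A production generator would typically use 32 or more bits and a correspondingly longer list of direction numbers; five bits are used here only to match the worked example.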

##### A still faster method

So now we know how to generate the sequence. But there is a still more efficient way of implementing the algorithm, as shown by Antonov and Saleev [2]. Before that, let us define one more sequence {c_{i}}:

c_{i} = index of the first 0 digit from the right in the binary representation of i = (...i_{3}i_{2}i_{1})_{2}. We have c_{0} = 1, c_{1} = 2, c_{2} = 1, c_{3} = 3, c_{4} = 1, c_{5} = 2, etc.
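A minimal Python sketch of c_{i} (the function name `first_zero_index` is chosen here for illustration): it shifts the number right until the lowest bit is 0 and counts the steps.

```python
def first_zero_index(i: int) -> int:
    """1-based position, counted from the right, of the lowest 0 bit of i."""
    c = 1
    while i & 1:   # lowest bit is 1: drop it and look one position further
        i >>= 1
        c += 1
    return c


c_values = [first_zero_index(i) for i in range(6)]
print(c_values)  # [1, 2, 1, 3, 1, 2]
```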

The sequence can now be generated using the following equation :

x^{n+1} = x^{n} ⊕ v_{c_n}

As we will show, the above equation also generates the same sequence as that generated using the equation mentioned in the previous section.

The starting point of the sequence is assumed to be x^{0} = 0.

x^{1} = 0 ⊕ v_{c_0} = v_{1} = 0.5

x^{2} = 0.5 ⊕ v_{c_1} = 0.5 ⊕ v_{2} = 0.5 ⊕ 0.75 = 0.25

x^{3} = 0.25 ⊕ v_{c_2} = 0.25 ⊕ v_{1} = 0.25 ⊕ 0.5 = 0.75

x^{4} = 0.75 ⊕ v_{c_3} = 0.75 ⊕ v_{3} = 0.75 ⊕ 0.875 = 0.125
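Both constructions can be put side by side in a short Python sketch. As before, direction numbers are stored as integers scaled by 2^{BITS}; the names `sobol_fast`, `sobol_gray` and `first_zero_index` and the five-bit precision are assumptions of this illustration. The final comparison checks that the one-XOR-per-point recursion reproduces the Gray code construction for every point available at this precision.

```python
BITS = 5
M = [1, 3, 7, 5, 7]  # m_1..m_5 from the worked example
V = [m_k << (BITS - k) for k, m_k in enumerate(M, start=1)]


def first_zero_index(i: int) -> int:
    """1-based position of the lowest 0 bit of i (the c_i of the text)."""
    c = 1
    while i & 1:
        i >>= 1
        c += 1
    return c


def sobol_fast(count: int) -> list:
    """Points x^1..x^count via x^{n+1} = x^n XOR v_{c_n}, with x^0 = 0."""
    if count > 2 ** BITS - 1:
        raise ValueError("only 2**BITS - 1 points are available at this precision")
    x = 0
    points = []
    for n in range(count):
        x ^= V[first_zero_index(n) - 1]   # a single XOR per new point
        points.append(x / 2 ** BITS)
    return points


def sobol_gray(n: int) -> float:
    """Reference value of x^n from the Gray code construction."""
    g = n ^ (n >> 1)
    x = 0
    for k, v_k in enumerate(V):
        if (g >> k) & 1:
            x ^= v_k
    return x / 2 ** BITS


fast = sobol_fast(31)
print(fast[:4])  # [0.5, 0.25, 0.75, 0.125]
print(fast == [sobol_gray(n) for n in range(1, 32)])  # True
```

The saving comes from the loop body: the Gray code version inspects every bit of G(n) for each point, while the recursive version performs one XOR.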

##### Higher dimension problems

We showed you how to generate the Sobol sequence for one dimension. So how do we generate this sequence for more dimensions?

The answer is simple. For each dimension, use a different primitive polynomial, assume a set of initialisation numbers based on the criteria above, generate the rest of the initialisation numbers using the recursive equation described before, calculate the direction numbers v_{i} and finally use the equation stated in the previous section to generate the sequence.
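The recipe can be sketched for two dimensions in Python. Everything specific here is an assumption for illustration: the function names, the five-bit precision, and in particular the initialisation numbers (1, 1 for x^{2} + x + 1 and 1, 3, 7 for x^{3} + x + 1), which merely satisfy the two criteria and are not a tuned set such as those in [3].

```python
BITS = 5

# One (inner coefficients, initialisation numbers) pair per dimension.
DIMENSIONS = [
    ([1], [1, 1]),        # x^2 + x + 1, assumed m_1 = 1, m_2 = 1
    ([0, 1], [1, 3, 7]),  # x^3 + x + 1, assumed m_1 = 1, m_2 = 3, m_3 = 7
]


def direction_ints(coeffs, m_init, bits=BITS):
    """Integer-scaled direction numbers v_k * 2^bits for one dimension."""
    s = len(m_init)
    m = list(m_init)
    while len(m) < bits:
        k = len(m)
        new = (m[k - s] << s) ^ m[k - s]
        for i, a in enumerate(coeffs, start=1):
            if a:
                new ^= m[k - i] << i
        m.append(new)
    return [m_k << (bits - k) for k, m_k in enumerate(m, start=1)]


def sobol_points(count, dimensions=DIMENSIONS, bits=BITS):
    """First `count` points; every dimension is advanced with the same c_n."""
    if count > 2 ** bits - 1:
        raise ValueError("not enough direction numbers for that many points")
    tables = [direction_ints(c, m, bits) for c, m in dimensions]
    state = [0] * len(tables)
    points = []
    for n in range(count):
        c = 1
        while (n >> (c - 1)) & 1:   # position of the lowest 0 bit of n
            c += 1
        state = [x ^ table[c - 1] for x, table in zip(state, tables)]
        points.append(tuple(x / 2 ** bits for x in state))
    return points


pts = sobol_points(4)
print(pts)  # [(0.5, 0.5), (0.75, 0.25), (0.25, 0.75), (0.625, 0.125)]
```

Only the direction number tables differ between dimensions; the index c_{n} depends on n alone and is shared by all of them.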

##### Using Sobol low discrepancy sequence for Monte Carlo simulations

A lot of Monte Carlo based pricers use Sobol for spot path generation. It is thus justified to make a final comment, before closing this article, on how to use Sobol within a Monte Carlo pricer.

Let's suppose we need to simulate spot paths for m underlyings with n time steps. Simply generate the Sobol sequence in m × n dimensions and, within each point, use the coordinate from a unique dimension for each time step and underlying.
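The dimension bookkeeping can be sketched in Python as below. The function name `point_to_shocks`, the ordering of dimensions (all time steps of the first underlying, then the second, and so on) and the conversion of uniforms to standard normal shocks through the inverse normal CDF are all assumptions of this sketch; the article only prescribes one dimension per (underlying, time step) pair. The sample point is made up for the demonstration and is not the output of a real six-dimensional generator.

```python
from statistics import NormalDist


def point_to_shocks(point, n_underlyings, n_steps):
    """Split one (m * n)-dimensional point into an m-by-n grid of shocks.

    Coordinate u * n_steps + t drives underlying u at time step t.
    Every coordinate must lie strictly between 0 and 1.
    """
    if len(point) != n_underlyings * n_steps:
        raise ValueError("point must have m * n coordinates")
    inv = NormalDist().inv_cdf  # uniform -> standard normal (assumed step)
    return [
        [inv(point[u * n_steps + t]) for t in range(n_steps)]
        for u in range(n_underlyings)
    ]


# Hypothetical 6-dimensional point for m = 2 underlyings and n = 3 steps.
point = [0.5, 0.25, 0.75, 0.625, 0.125, 0.375]
shocks = point_to_shocks(point, n_underlyings=2, n_steps=3)
for u, row in enumerate(shocks):
    print("underlying", u, [round(z, 4) for z in row])
```

Each point of the sequence yields one complete set of paths, so a simulation with N paths consumes N points of the m × n dimensional sequence.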

##### Random Number Generator (xll) Excel addin

DeltaQuants' Random Number Generator library can be downloaded from here. The library has been implemented in C++. The library documentation as well as other excel libraries can be downloaded from here.

##### References

1. Bratley, P. and Fox, B. L. (1988), "Algorithm 659: Implementing Sobol’s quasirandom sequence generator".
2. Antonov, I. A. and Saleev, V. M. (1979), "An economic method of computing LPτ-sequences". *Zh. Vych. Mat. Mat. Fiz.* 19: 243–245 (in Russian); *U.S.S.R. Comput. Maths. Math. Phys.* 19: 252–256 (in English).
3. http://web.maths.unsw.edu.au/~fkuo/sobol/
4. Sobol’, I. M. (1985), "Points which uniformly fill a multidimensional cube". Math. Cybern. 2, Znanie, Moscow (in Russian).
5. Joe, S. and Kuo, F. Y. (2003), "Remark on Algorithm 659: Implementing Sobol's quasirandom sequence generator". ACM Trans. Math. Softw. 29: 49–57.