Kim K Tang on Automated Trading: Ist KDB+/Q really that fast?

Let's try to answer this question by implementing the algorithm to generate gaussian distributed random numbers. There are actually two ways using box-muller transform:

Polar form: Given u and v, independent and uniformly distributed in the closed interval [−1, +1], set s = R² = u² + v². If s = 0 or s ≥ 1, throw u and v away and try another pair (u, v).
After that calculate (u;v)*sqrt -2f*log[s] % s
Let's see the implementation in q:
polarform:{:x#1_raze{ u*sqrt -2f*log[s] % s:{u$u::-1+2?2.0}/[1<;2]}\[`int$ x % 2 ;0] };

Basic form: Given u and v, independent and uniformly distributed on (0, 1], just calculate:
sqrt[-2f*log u]*(sin r;cos r:2f*

3.14159*v)

Let's see the implementation in q:
basicform:{hn:`int$ x % 2;pi2:2f*3.14159;
                   x#raze exec s*f,s*g from select s:sqrt -2.0*log u1,f:cos pi2 * u2,g:sin pi2 * u2 from ([]u1:hn?1.0;u2:hn?1.0)};

Executing the command in q with (value "\\t polarform 1000000";value "\\t basicform 1000000")
i got on my machine: 3277 155.

The implementation in basic form is about 21 times faster. Why is this the case?
Actually i dont know the answer. Most people will say the basic form avoids using scan and over. And this seems to be the bottleneck in the algorithm. Kdb+/Q is fast for problems that can be vectorized.

And is the implementation in basic form really that fast?
In c++ boost library there is a function to generate gaussian random numbers. Let's extend kdb+ with this function:

kx::K randn(kx::K k)
{
    kx::vector<kx::qtype::float_> result(kx::value<kx::qtype::int_>(k));
    boost::random::normal_distribution<double> dist;
    std::generate(result.begin(),result.end(),boost::phoenix::bind(dist,gen));

    return result();
}

Now load this function in kb+ with

randn: dll 2: (`$"randn";1)

and compare the speed from the boost library against the basic form using
(value "\\t .bst.randn 1000000";value "\\t basicform 1000000").
The result is 161 160. The basic form is as fast as the function in boost.

Let's increase the number of random numbers to 5 million:

command    time
"\\t basicform 5000000"    908
"\\t polarform 5000000"    17595
"\\t .bst.randn 5000000"    795

There is a little improvement using the function from boost.

Can we still improve the speed?

Actually the cuda library thrust has a function to generate gaussian random numbers.

struct nrd : public thrust::unary_function<unsigned int,float> {   __host__ __device__   float operator()(unsigned int thread_id)   {     unsigned int seed = hash(thread_id);     thrust::minstd_rand rng(seed);     //thrust::default_random_engine rng(seed);     thrust::random::experimental::normal_distribution<float> u(0.0f,1.0f);     return u(rng);   }; };

Now let's extend kdb+ with this function.

As you can see from the table the fastest one is using the function from cuda:

command time "

t basicform 5000000" 908 "

t polarform 5000000" 17595 "

t .cda.randn 5000000" 596 "

t .bst.randn 5000000" 795

Kim K Tang on Automated Trading

Dienstag, 11. Oktober 2011

Ist KDB+/Q really that fast?

Keine Kommentare:

Kommentar veröffentlichen

Über mich

Blog-Archiv