c++ - Making templatized optimization more maintainable -

Sometimes a piece of code can be better optimized by the compiler if it uses a templatized internal implementation for an invariant. For example, if you have a known number of channels in an image, instead of doing something like:

void image::dooperation() {
    for (unsigned int i = 0; i < numpixels; i++) {
        for (unsigned int j = 0; j < mchannels; j++) {
            // ...
        }
    }
}

you can do this:

template<unsigned int c>
void image::dooperationinternal() {
    for (unsigned int i = 0; i < numpixels; i++) {
        for (unsigned int j = 0; j < c; j++) {
            // ...
        }
    }
}

void image::dooperation() {
    switch (mchannels) {
        case 1: dooperationinternal<1>(); break;
        case 2: dooperationinternal<2>(); break;
        case 3: dooperationinternal<3>(); break;
        case 4: dooperationinternal<4>(); break;
    }
}

which allows the compiler to generate different unrolled loops for different channel counts (which can in turn vastly improve runtime efficiency and open up different optimizations such as SIMD instructions and so forth).
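As a concrete illustration of this pattern, here is a minimal sketch on a free-standing buffer rather than the image class. The names sum_samples_fixed and sum_samples are hypothetical, and the interleaved 8-bit layout is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums every sample of an interleaved buffer whose
// channel count C is known at compile time.
template <unsigned C>
unsigned long sum_samples_fixed(const std::vector<unsigned char>& data,
                                std::size_t num_pixels) {
    unsigned long total = 0;
    for (std::size_t i = 0; i < num_pixels; ++i) {
        for (unsigned j = 0; j < C; ++j) {  // C is a constant, so this loop can be unrolled
            total += data[i * C + j];
        }
    }
    return total;
}

// Runtime entry point: a single switch maps the runtime channel count
// to the matching template instantiation.
unsigned long sum_samples(const std::vector<unsigned char>& data, unsigned channels) {
    if (channels < 1 || channels > 4) {
        return 0;  // assumption: unsupported channel counts simply yield 0
    }
    const std::size_t num_pixels = data.size() / channels;
    switch (channels) {
        case 1: return sum_samples_fixed<1>(data, num_pixels);
        case 2: return sum_samples_fixed<2>(data, num_pixels);
        case 3: return sum_samples_fixed<3>(data, num_pixels);
        default: return sum_samples_fixed<4>(data, num_pixels);
    }
}
```

Whether the inner loop is actually unrolled or vectorized depends on the compiler and the optimization flags, so it is worth checking the generated code before relying on it.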

However, this can expand into very large case statements, and every method that has been optimized in this way must have its own unrolled case statement. So, let's say that we instead had an enum format for the known image formats (where the value of the enum happens to map to the channel count). Since the enum only has a range of known values, there is the temptation to try this:

template<image::format f>
void image::dooperationinternal() {
    for (unsigned int i = 0; i < numpixels; i++) {
        for (unsigned int j = 0; j < static_cast<unsigned int>(f); j++) {
            // ...
        }
    }
}

void image::dooperation() {
    const format f = mformat;
    dooperationinternal<f>(); // does not compile: f is only known at run time
}

However, in this case the compiler (rightfully) complains that f is not a constant expression, even though it has a finite range and in theory the compiler could generate the switch logic to cover all of the enumerated values.

So, my question: is there an alternate approach that would allow the compiler to generate the invariant-value-optimized code without requiring a switch-case explosion per function invocation?

Make a jump table array, then invoke. The goal is to create an array of the various functions, then do an array lookup and call the one you want.

First, I'll do the C++11 one. C++1y contains its own integral sequence types, and has easy-to-write auto return types; the C++11 one will return void.
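For comparison, here is a rough sketch of how the same jump table might look with the standard integral sequence types, assuming a C++14 compiler. The names scale_functor, invoke_jump14 and invoke_jump14_impl are hypothetical, and the signature is fixed to int(int) to keep the example short.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical functor: action<N>(x) returns N * x, with N a compile-time constant.
struct scale_functor {
    template <std::size_t N>
    static int action(int x) { return static_cast<int>(N) * x; }
};

// Expands the index pack into a static table of function pointers,
// then calls the entry selected at run time.
template <typename Functor, std::size_t... Is>
int invoke_jump14_impl(std::size_t index, int arg, std::index_sequence<Is...>) {
    using fn = int (*)(int);
    static const fn table[] = { &Functor::template action<Is>... };
    return table[index](arg);
}

// std::make_index_sequence<Max> supplies the pack 0, 1, ..., Max-1.
template <typename Functor, std::size_t Max>
int invoke_jump14(std::size_t index, int arg) {
    static_assert(Max > 0, "the table needs at least one entry");
    if (index >= Max) {
        throw std::out_of_range("invoke_jump14: index out of range");
    }
    return invoke_jump14_impl<Functor>(index, arg, std::make_index_sequence<Max>{});
}
```

A call such as invoke_jump14<scale_functor, 4>(3, 5) selects action<3> at run time. Throwing on an out-of-range index is a choice made for this sketch, not something the original answer specifies.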

Our functor class looks something like this:

#include <iostream>

struct example_functor {
  template<unsigned n>
  static void action(double d) {
    std::cout << n << ":" << d << "\n"; // or whatever, n is a compile time constant
  }
};

In C++11, we want some boilerplate:

template<unsigned...> struct indexes {};
template<unsigned max, unsigned... is> struct make_indexes:make_indexes< max-1, max-1, is... > {};
template<unsigned... is> struct make_indexes<0, is...>:indexes<is...> {};

to create and pattern match packs of indexes.
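To see what the boilerplate produces, the following sketch checks at compile time that make_indexes<3> derives from indexes<0, 1, 2>, and pattern matches a pack to count it. The helper count_indexes is hypothetical and exists only for this illustration.

```cpp
#include <type_traits>

template<unsigned...> struct indexes {};
template<unsigned Max, unsigned... Is>
struct make_indexes : make_indexes<Max - 1, Max - 1, Is...> {};
template<unsigned... Is>
struct make_indexes<0, Is...> : indexes<Is...> {};

// Hypothetical helper: deduces the pack from an indexes<...> argument
// (a make_indexes<N> object converts to its indexes<...> base) and counts it.
template<unsigned... Is>
constexpr unsigned count_indexes(indexes<Is...>) {
    return sizeof...(Is);
}

// make_indexes<3> -> make_indexes<2, 2> -> make_indexes<1, 1, 2>
//                 -> make_indexes<0, 0, 1, 2> -> indexes<0, 1, 2>
static_assert(std::is_base_of<indexes<0, 1, 2>, make_indexes<3> >::value,
              "make_indexes<3> should derive from indexes<0, 1, 2>");
static_assert(count_indexes(make_indexes<5>()) == 5,
              "make_indexes<5> should carry five indexes");
```

The recursion depth grows linearly with the maximum index, so very large tables may run into the compiler's template instantiation depth limit.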

The interface looks like:

template<typename functor, unsigned max, typename... Ts> void invoke_jump( unsigned index, Ts&&... ts );

and is called like:

invoke_jump<example_functor, 10>( 7, 3.14 );

We first create a helper:

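#include <utility> // std::forward

template<typename functor, unsigned... is, typename... Ts>
void do_invoke_jump( unsigned index, indexes<is...>, Ts&&... ts ) {
  // every action<i> shares one signature, so action<0> names the entry type
  typedef decltype(&functor::template action<0>) action_ptr;
  static const action_ptr table[] = { &functor::template action<is>... };
  table[index]( std::forward<Ts>(ts)... ); // the caller must ensure index < max
}

template<typename functor, unsigned max, typename... Ts>
void invoke_jump( unsigned index, Ts&&... ts ) {
  do_invoke_jump<functor>( index, make_indexes<max>(), std::forward<Ts>(ts)... );
}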

which creates a static table of functor::action instantiations, then does a lookup on it and invokes the selected entry.
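Putting the C++11 pieces together, one possible assembled version is sketched below. It differs from the answer in two ways that are my own assumptions: invoke_jump returns false for an out-of-range index instead of indexing past the table, and the hypothetical record_functor writes into a vector so the result can be inspected.

```cpp
#include <utility>
#include <vector>

template<unsigned...> struct indexes {};
template<unsigned Max, unsigned... Is>
struct make_indexes : make_indexes<Max - 1, Max - 1, Is...> {};
template<unsigned... Is>
struct make_indexes<0, Is...> : indexes<Is...> {};

// Hypothetical functor: appends N * d, where N is the compile-time index.
struct record_functor {
    template<unsigned N>
    static void action(std::vector<double>& out, double d) {
        out.push_back(N * d);
    }
};

template<typename Functor, unsigned... Is, typename... Ts>
void do_invoke_jump(unsigned index, indexes<Is...>, Ts&&... ts) {
    // All instantiations share one signature, so action<0> names the entry type.
    typedef decltype(&Functor::template action<0>) action_ptr;
    static const action_ptr table[] = { &Functor::template action<Is>... };
    table[index](std::forward<Ts>(ts)...);
}

// Returns false (and calls nothing) when index is outside [0, Max).
template<typename Functor, unsigned Max, typename... Ts>
bool invoke_jump(unsigned index, Ts&&... ts) {
    if (index >= Max) {
        return false;
    }
    do_invoke_jump<Functor>(index, make_indexes<Max>(), std::forward<Ts>(ts)...);
    return true;
}
```

With this sketch, invoke_jump<record_functor, 10>(7, out, 3.0) looks up entry 7 and calls record_functor::action<7>(out, 3.0). A separate table is instantiated for each distinct combination of argument types.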

In C++03 we don't have the ... syntax, so we have to do more things manually, and there is no perfect forwarding. What I'll do is create a std::vector table instead.

First, a cute little program that runs functor.action<i>() for i in [begin, end) in order:

template<unsigned begin, unsigned end, typename Functor>
struct foreach:foreach<begin, end-1, Functor> {
  foreach(Functor& functor):
    foreach<begin, end-1, Functor>(functor)
  {
    functor.template action<end-1>();
  }
};
template<unsigned begin, typename Functor>
struct foreach<begin,begin,Functor> {
  foreach(Functor&) {} // end of the chain: nothing left to run
};

which I admit is overly cute (the chain of calls is created implicitly by the constructor dependencies).
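The ordering can be checked with a small sketch like the one below. Base-class constructors run before the derived constructor body, so the calls happen in increasing order of i. The order_recorder functor and the record_order function are hypothetical, introduced only to observe that order.

```cpp
#include <vector>

template<unsigned begin, unsigned end, typename Functor>
struct foreach : foreach<begin, end - 1, Functor> {
    foreach(Functor& functor) : foreach<begin, end - 1, Functor>(functor) {
        // Runs after every base constructor, so lower indexes come first.
        functor.template action<end - 1>();
    }
};
template<unsigned begin, typename Functor>
struct foreach<begin, begin, Functor> {
    foreach(Functor&) {}  // empty range: terminates the recursion
};

// Hypothetical functor: remembers the order in which action<i>() was called.
struct order_recorder {
    std::vector<unsigned> seen;
    template<unsigned i>
    void action() { seen.push_back(i); }
};

// Runs action<2>, action<3>, action<4> and returns the recorded order.
std::vector<unsigned> record_order() {
    order_recorder rec;
    foreach<2, 5, order_recorder> run(rec);  // all the work happens in the constructor
    (void)run;
    return rec.seen;
}
```

Note that the foreach object has to be given a name here, because a statement of the form `type(name);` is parsed as a declaration.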

We then use this to build the vector up.

#include <vector>

template<typename signature, typename functor>
struct populatevector {
  std::vector< signature* >* target; // change the signature here to whatever you want
  populatevector(std::vector< signature* >* t):target(t) {}
  template<unsigned i>
  void action() {
    target->push_back( &functor::template action<i> );
  }
};

We can then hook the two up:

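template<typename signature, typename functor, unsigned max>
std::vector< signature* > make_table() {
  std::vector< signature* > retval;
  retval.reserve(max);
  populatevector<signature, functor> worker(&retval);
  // the foreach object needs a name: an unnamed `foreach<...>( worker );` would parse as a declaration of worker
  foreach<0, max, populatevector<signature, functor> > run( worker ); // runtime work is done on this line
  (void)run;
  return retval;
}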

which builds our jump table std::vector.

We can then call the ith element of the jump table easily.

struct example_functor {
  template<unsigned i>
  static void action() {
    std::cout << i << "\n";
  }
};

void test( unsigned i ) {
  static std::vector< void(*)() > table = make_table< void(), example_functor, 100 >();
  if (i < 100)
    table[i]();
}

which, when passed an integer i, prints it and then a newline.

The signature of the function in the table can be whatever you want, so you can pass in a pointer to a type and invoke a method on it, with i being a compile-time constant. The action method has to be static, but it can call non-static methods of its arguments.
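A minimal sketch of that idea follows: a static action that forwards to a non-static member template of the object it is handed. The image_like class, its process member, process_functor and process_with_channels are all hypothetical, and the table is written out by hand to keep the example independent of the machinery above.

```cpp
// Hypothetical stand-in for an image class with a non-static member template.
struct image_like {
    unsigned last_channels;
    image_like() : last_channels(0) {}

    template<unsigned C>
    void process() {
        last_channels = C;  // C is a compile-time constant inside this method
    }
};

// The action itself is static, but it calls a non-static method on its argument.
struct process_functor {
    template<unsigned i>
    static void action(image_like* img) {
        img->process<i>();
    }
};

typedef void (*image_action)(image_like*);

// Looks up the entry for `channels` and invokes it; returns false if there is none.
bool process_with_channels(image_like* img, unsigned channels) {
    static const image_action table[] = {
        &process_functor::action<0>,
        &process_functor::action<1>,
        &process_functor::action<2>,
        &process_functor::action<3>,
        &process_functor::action<4>
    };
    const unsigned count = sizeof(table) / sizeof(table[0]);
    if (img == 0 || channels >= count) {
        return false;
    }
    table[channels](img);
    return true;
}
```

In real code the table would come from invoke_jump or make_table rather than being listed manually; the hand-written array is only there to keep the sketch short.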

The big differences in C++03 are that you need different code for different signatures of the jump table, and a lot of machinery (and a std::vector instead of a static array) to build the jump table.

When doing serious image processing, you'll want to have your scanline functions generated this way, with the per-pixel operations possibly embedded somewhere in the generated scanline function. Doing the jump dispatch once per scanline is usually fast enough, unless your images are 1 pixel wide and a billion pixels tall.
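As a rough sketch of per-scanline dispatch, the example below picks a scanline function from a small table once, then applies it to every row. The inversion operation, the interleaved 8-bit layout and the names invert_scanline, pick_invert_scanline and invert_image are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Inverts one scanline; C is the compile-time channel count, so the inner
// loop has a constant trip count.
template<unsigned C>
void invert_scanline(unsigned char* row, std::size_t width) {
    for (std::size_t x = 0; x < width; ++x) {
        for (unsigned c = 0; c < C; ++c) {
            row[x * C + c] = static_cast<unsigned char>(255 - row[x * C + c]);
        }
    }
}

typedef void (*scanline_fn)(unsigned char*, std::size_t);

// Table lookup indexed by channel count; returns a null pointer if unsupported.
scanline_fn pick_invert_scanline(unsigned channels) {
    static const scanline_fn table[] = {
        nullptr,
        &invert_scanline<1>,
        &invert_scanline<2>,
        &invert_scanline<3>,
        &invert_scanline<4>
    };
    return channels <= 4 ? table[channels] : nullptr;
}

// The dispatch cost is paid once here, not once per pixel.
bool invert_image(std::vector<unsigned char>& pixels, std::size_t width,
                  std::size_t height, unsigned channels) {
    const scanline_fn fn = pick_invert_scanline(channels);
    if (!fn || pixels.size() != width * height * channels) {
        return false;
    }
    for (std::size_t y = 0; y < height; ++y) {
        fn(pixels.data() + y * width * channels, width);
    }
    return true;
}
```

Here the lookup happens once per image; moving it inside the row loop would give the once-per-scanline dispatch described above at the cost of one indirect call per row.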

The above code still needs auditing for correctness: it was written without being compiled.
