How Efficient C++ Delegates Work

These study notes grew out of an English-language article, "Member Function Pointers and the Fastest Possible C++ Delegates". Chinese translations of it are available online; here is one picked at random:

http://www.cnblogs.com/jans2002/archive/2006/10/13/528160.html

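The article describes efficient C++ delegates, but the explanation is dense enough that I did not really follow it on first reading. Here I record what I worked out and try to present efficient C++ delegates in a way that is easier to follow.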

1. What I mean by a delegate

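To delegate is to hand a piece of work to someone else. In a program, as I understand it, this means one class passes a pointer to one of its own functions to another class, which calls that function when needed to get some job done. Put simply, it is a callback mechanism. One way to implement such a callback is to expose an interface: the outside code holds the class's interface and calls back through it when necessary. But this increases the coupling between the caller (the trustee) and the callee (more or less the delegator), because the caller has to know the interface and include its header. A common way to reduce this coupling is a boost::function object. The delegator packs the function pointer it wants to delegate into a function object and hands it to the trustee, which invokes it at the right moment. Neither side needs to know how the other is implemented and no interface is shared; both only need to #include the function header.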
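The callback arrangement described above can be sketched as follows. This is only an illustration: it uses std::function and std::bind, the standard-library counterparts of boost::function and boost::bind, and the class names Downloader and Logger are made up for the example.

```cpp
#include <functional>
#include <vector>

// Trustee: knows only the callable's signature, not who implements it.
class Downloader {
public:
    void set_on_done(std::function<void(int)> cb) { on_done_ = cb; }
    void finish(int bytes) { if (on_done_) on_done_(bytes); }
private:
    std::function<void(int)> on_done_;
};

// Delegator: hands one of its member functions to the trustee.
class Logger {
public:
    void record(int bytes) { seen_.push_back(bytes); }
    const std::vector<int>& seen() const { return seen_; }
private:
    std::vector<int> seen_;
};

// Wires the two together and returns what the logger received.
std::vector<int> run_callback_demo() {
    Logger log;
    Downloader d;
    d.set_on_done(std::bind(&Logger::record, &log, std::placeholders::_1));
    d.finish(128);
    d.finish(256);
    return log.seen();
}
```

Downloader never mentions Logger, and Logger never mentions Downloader; the only thing they share is the signature void(int).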

2. How function works

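Most of us have probably used boost::function and found it pleasant enough, but its source code is dizzying; I half suspect the author wrote it so that nobody could read it, so I suggest beginners read the Functor in the Loki library instead. The core of function is actually not that complicated: without the nested macros and the file iteration it would be understandable at a glance. Here is a simple example that imitates boost::function: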

template<class T> class function;  
template<typename R, typename A0, typename T>  
class  function<R (T::*)(A0) >   
{  
public:  
    typedef R(T::*fun)(A0);  
     function (fun p, T* pthis):m_ptr(p), m_pThis(pthis){}  
  
    R operator()(A0 a)      { return (m_pThis->*m_ptr)(a); }  
    fun m_ptr;  
    T* m_pThis;  
};

class CA
{
public:
	void Fun(int a) { std::cout << a; }  // needs #include <iostream>
};

CA a;
function<void (CA::*)(int)>   f(&CA::Fun, &a);
f(4);  // equivalent to a.Fun(4);

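All function really does is wrap up the this pointer and the function pointer, store them, and make the call when needed. Do not assume the real boost::function and boost::bind are this simple, though; this only imitates the core mechanism (it does not even achieve the decoupling).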
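The imitation above still carries the class T in its type, which is why it does not decouple anything. One way to hide T is sketched below; it is not how boost::function is implemented. It stores the object as a void* and goes through a small static stub that remembers the real type. SimpleDelegate and Counter are hypothetical names for this example.

```cpp
// Delegate for "R f(A0)" that does not mention the target class in its type.
template <typename Signature> class SimpleDelegate;

template <typename R, typename A0>
class SimpleDelegate<R(A0)> {
public:
    SimpleDelegate() : obj_(0), stub_(0) {}

    // Bind an object and one of its member functions, fixed at compile time.
    template <class T, R (T::*Method)(A0)>
    static SimpleDelegate from(T* obj) {
        SimpleDelegate d;
        d.obj_ = obj;
        d.stub_ = &invoke<T, Method>;
        return d;
    }

    R operator()(A0 a) const { return stub_(obj_, a); }

private:
    // The stub restores the real type and performs an ordinary member call.
    template <class T, R (T::*Method)(A0)>
    static R invoke(void* obj, A0 a) {
        return (static_cast<T*>(obj)->*Method)(a);
    }

    void* obj_;
    R (*stub_)(void*, A0);
};

class Counter {
public:
    Counter() : total(0) {}
    int add(int n) { total += n; return total; }
    int total;
};
```

A caller would write SimpleDelegate<int(int)>::from<Counter, &Counter::add>(&c) and can then pass the result around as a plain SimpleDelegate<int(int)>. The cost is one extra indirect call through the stub.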

3. The efficiency of member function pointers

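boost::function is a layer of wrapping around a member function pointer, and its overloaded operator() forwards the arguments once more. So calling through function is at best about as fast as calling the member function pointer directly, and usually slower. Passing every argument by reference does not change this. How efficient, then, is a call through a member function pointer?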

Consider the following experiment:

class B1  
{  
public: virtual void fb1()
        {
            printf("fb1");
        }
};  
class B2  
{
public:
    virtual void fb2(){}
};  
class D1 : public B2, public B1  
{
public: virtual void fb1()
        {
            printf("d1fb1");
        }
};  
class D2 : virtual public B1  
{  
public: virtual void fb1()
        {
            printf("d2fb1");
        }
};

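As a class diagram: D1 derives from B2 and B1, and D2 derives virtually from B1 (the virtual inheritance is drawn with a thick arrow).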

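Next we write three calls through member function pointers. The first calls the class's own virtual function, the second calls an override of a virtual function from the second base class under multiple inheritance, and the third calls an override of a virtual function from a virtual base class.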

    B1* pb1 = new B1;  
    void (B1::*fpb1)() = &B1::fb1;  
    (pb1->*(fpb1))();
    assert(sizeof(fpb1) == 4);   // sizes as observed with 32-bit VC

    D1* pd1 = new D1;  
    void (D1::*fpd1)() = &D1::fb1; 
    (pd1->*(fpd1))();  
    assert(sizeof(fpd1) == 8);


    D2* pd2 = new D2;  
    void (D2::*fpd2)() = &D2::fb1;  
    (pd2->*(fpd2))();
    assert(sizeof(fpd2) == 12);

For these three calls, the release-build disassembly under the VC compiler is as follows.

The first:

0040656B mov ecx,dword ptr [ebp-10h]

0040656E call dword ptr [ebp-14h]

This loads the this pointer and calls the function address.

The second:

004065F0 mov ecx,dword ptr [ebp-18h]

004065F3 add ecx,dword ptr [ebp-20h]

004065F8 call dword ptr [ebp-24h]

This loads the this pointer, offsets it, and then calls the function address.

The third:

004066BD mov ecx,dword ptr [ebp-2Ch]

004066C0 mov edx,dword ptr [ecx]

004066C2 mov eax,dword ptr [ebp-34h]

004066C5 mov ecx,dword ptr [ebp-2Ch]

004066C8 add ecx,dword ptr [edx+eax]

004066CB add ecx,dword ptr [ebp-38h]

004066CE mov esi,esp

004066D0 call dword ptr [ebp-3Ch]

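This loads the this pointer, goes to the virtual base class table, looks up the virtual base's entry, applies the virtual base offset and the this delta, and calls the function address.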

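(For the underlying principles, see another of my study notes, "C++内存布局生成步骤", on the steps by which the C++ memory layout is generated.)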

So a member function pointer stores different information and does different work depending on the situation.
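The 4/8/12-byte figures above are specific to the 32-bit VC compiler. A small sketch like the one below can be used to see what another compiler does. The class names are stand-ins chosen for this example, and some compilers use one size for all three kinds.

```cpp
#include <cstddef>

struct Base1 { virtual ~Base1() {} virtual int id() { return 1; } };
struct Base2 { virtual ~Base2() {} virtual int other() { return 2; } };
struct Multi : Base2, Base1 { virtual int id() { return 11; } };
struct Virt : virtual Base1 { virtual int id() { return 21; } };

struct PtrSizes {
    std::size_t single_inheritance;
    std::size_t multiple_inheritance;
    std::size_t virtual_inheritance;
};

// Reports how large each kind of member function pointer is on this compiler.
PtrSizes member_pointer_sizes() {
    PtrSizes s;
    s.single_inheritance   = sizeof(int (Base1::*)());
    s.multiple_inheritance = sizeof(int (Multi::*)());
    s.virtual_inheritance  = sizeof(int (Virt::*)());
    return s;
}

// Calls id() through a member function pointer, whatever its representation.
template <class T>
int call_id(T& obj, int (T::*fn)()) { return (obj.*fn)(); }
```

Whatever the sizes turn out to be, call_id gives the same results, because the compiler generates the matching adjustment code for each representation.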

Case 1:

typedef void (B1::*FPB1)();
{
    mem function address; // 4 bytes, address of the member function
}

Case 2:

typedef void (D1::*FPD1)();
{
    mem function address; // 4 bytes, address of the member function
    delta added to this pointer; // 4 bytes, offset applied to the this pointer
}

Case 3:

typedef void (D2::*FPD2)();
{
    mem function address; // 4 bytes, address of the member function
    delta added to this pointer; // 4 bytes, offset applied to the this pointer
    index in virtual base class table; // 4 bytes, index into the virtual base class table
}

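You may have tried to convert a member function pointer to void*; whatever you try, it does not compile. Once you know how a member function pointer is built up, the reason is clear: a void* is only 4 bytes here and in general cannot hold everything a member function pointer carries.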
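If a member function pointer really has to be stored opaquely, one option is a buffer sized with sizeof for that exact pointer type rather than a void*. The sketch below assumes nothing about the pointer's internal layout; Widget, OpaqueMemFn, pack and unpack are names invented for the example.

```cpp
#include <cstring>

struct Widget {
    int value;
    int get() { return value; }
};

typedef int (Widget::*WidgetFn)();

// Raw storage sized for the pointer itself rather than for a void*.
struct OpaqueMemFn {
    unsigned char bytes[sizeof(WidgetFn)];
};

// Copies the pointer's bytes into the buffer.
OpaqueMemFn pack(WidgetFn fn) {
    OpaqueMemFn box;
    std::memcpy(box.bytes, &fn, sizeof(fn));
    return box;
}

// Copies the bytes back into a pointer of the original type.
WidgetFn unpack(const OpaqueMemFn& box) {
    WidgetFn fn;
    std::memcpy(&fn, box.bytes, sizeof(fn));
    return fn;
}
```

The round trip is only meaningful when the bytes are unpacked into the same pointer type that was packed.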

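Of the three cases, the second is slower than the first and the third slower still. Is there a way to make every call through a member function pointer as efficient as the first case? There is: treat every member function pointer as if it belonged to the first case.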

4. Building a fast delegate

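We can define a general-purpose class, GenericClass, and convert every other class pointer and member function pointer into a GenericClass pointer and a GenericClass member function pointer. The function class therefore needs to be reworked as follows: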

class GenericClass {};

template<class T> class function;  
template<typename R, typename T>  
class  function<R (T::*)() >   
{  
public:  
    typedef R(GenericClass::*fun)();  
    function (fun p, T* pthis):m_ptr(p), m_pThis(pthis){}  

    R operator()()      { return (m_pThis->*m_ptr)(); }  
    fun m_ptr;  
    GenericClass* m_pThis;  
};

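Next we need a conversion function that turns a class pointer and its member function pointer into their GenericClass counterparts. It has to be a template so that different kinds of member function pointer can get different conversion algorithms.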

template <int N>
struct SimplifyMemFunc 
{
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func)
    {
        return 0;  // primary template: no conversion for a pointer of size N
    }
};

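The implementation above uses template parameters in two places. The first is the template parameter of SimplifyMemFunc, which routes member function pointers with different sizeof values to different SimplifyMemFunc specializations. The second is the Convert method, which can thereby accept any type of this pointer and member function pointer. Convert's three parameters are the this pointer before conversion, the member function pointer before conversion, and (as an output) the generic member function pointer after conversion; its return value is the generic this pointer after conversion.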
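The dispatch-on-sizeof idea can be seen in isolation in the sketch below. It is a simplified stand-in, not the FastDelegate code: SizeHandler, describe and Plain are hypothetical names, and the handlers merely report a word count instead of converting anything.

```cpp
#include <cstddef>

// Primary template: any size we have no dedicated handling for.
template <std::size_t N>
struct SizeHandler {
    static int words() { return 0; }  // 0 means "unsupported"
};

// One machine word: room for a bare code address only.
template <>
struct SizeHandler<sizeof(void*)> {
    static int words() { return 1; }
};

// Two machine words: a code address plus one extra field.
template <>
struct SizeHandler<2 * sizeof(void*)> {
    static int words() { return 2; }
};

// Picks the handler from the size of the pointer type, at compile time.
template <class FuncType>
int describe(FuncType) {
    return SizeHandler<sizeof(FuncType)>::words();
}

struct Plain { void run() {} };
```

Which specialization describe(&Plain::run) selects depends on the compiler, which is exactly the property SimplifyMemFunc relies on.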

When the member function pointer is only 4 bytes, the conversion function is:

const int SINGLE_MEMFUNCPTR_SIZE = sizeof(void (GenericClass::*)());
template <>
struct SimplifyMemFunc<SINGLE_MEMFUNCPTR_SIZE>
{
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func)
    {
            bound_func = reinterpret_cast<GenericMemFuncType>(function_to_bind);
            return reinterpret_cast<GenericClass *>((X*)pthis);
    }
};

Here it is enough to cast the object pointer and the function pointer.

When the member function pointer is 8 bytes, things are a little more involved: the this-pointer offset has to be added before casting.

template<>
struct SimplifyMemFunc< SINGLE_MEMFUNCPTR_SIZE + sizeof(int) >  {
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func) { 
            union {
                XFuncType func;
                struct {	 
                    GenericMemFuncType funcaddress; // points to the actual member function
                    int delta;	     // #BYTES to be added to the 'this' pointer
                }s;
            } u;
            u.func = function_to_bind;
            bound_func = u.s.funcaddress;
            return reinterpret_cast<GenericClass *>(reinterpret_cast<char *>(pthis) + u.s.delta); 
    }
};

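By the earlier reasoning, when the member function pointer is 8 bytes its last 4 bytes hold the this-pointer offset. So in this case a union is used to take the member function pointer apart into a function address and an offset; the offset is then added to this to give the converted generic this pointer.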
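The offset in question is the same adjustment the compiler makes when it converts a derived pointer to a non-first base. The portable sketch below makes that adjustment visible without touching the pointer's internal layout; First, Second and Both are illustrative names.

```cpp
#include <cstddef>

struct First  { virtual ~First() {}  int a; };
struct Second { virtual ~Second() {} int b; virtual int who() { return 2; } };
struct Both : First, Second { virtual int who() { return 12; } };

// Byte distance from the start of a Both object to its Second subobject.
std::ptrdiff_t second_base_offset(Both& obj) {
    Second* as_second = &obj;  // implicit conversion adjusts the address
    return reinterpret_cast<char*>(as_second) - reinterpret_cast<char*>(&obj);
}

// Calls Second::who on a Both object through a member function pointer.
int call_who_via_second(Both& obj) {
    int (Second::*fn)() = &Second::who;
    return (obj.*fn)();  // the compiler adjusts this before the call
}
```

On typical compilers second_base_offset is non-zero because the First subobject occupies the start of the object. The union trick above reads that same kind of number straight out of the member function pointer, which works only on compilers that lay the pointer out this way.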

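When the member function pointer is 12 bytes the situation is at its most complex: adding a single offset is no longer enough, because the virtual base offset table has to be consulted. We could write out the chain of additions by hand, but that would be inelegant. A more elegant approach is to simulate a call on a generic class and let the compiler compute the offset for us.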

struct MicrosoftVirtualMFP {
    void (GenericClass::*codeptr)(); // points to the actual member function
    int delta;		// #bytes to be added to the 'this' pointer
    int vtable_index; // or 0 if no virtual inheritance
};

struct GenericVirtualClass : virtual public GenericClass
{
    typedef GenericVirtualClass * (GenericVirtualClass::*ProbePtrType)();
    GenericVirtualClass * GetThis() { return this; }
};

template <>
struct SimplifyMemFunc<SINGLE_MEMFUNCPTR_SIZE + 2*sizeof(int) >
{

    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func) {
            union {
                XFuncType func;
                GenericClass* (X::*ProbeFunc)();
                MicrosoftVirtualMFP s;
            } u;
            u.func = function_to_bind;
            bound_func = reinterpret_cast<GenericMemFuncType>(u.s.codeptr);
            union {
                GenericVirtualClass::ProbePtrType virtfunc;
                MicrosoftVirtualMFP s;
            } u2;

            u2.virtfunc = &GenericVirtualClass::GetThis;
            u.s.codeptr = u2.s.codeptr;
            return (pthis->*u.ProbeFunc)();
    }
};

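Two member function pointers are built here, u and u2. u holds the function pointer that was passed in, and u2 holds &GenericVirtualClass::GetThis. Assigning u2's function address to u changes u's contents: its delta and vtable_index are still those of the original pointer, but its code address now refers to GetThis.

Calling this modified u on the this pointer then returns the adjusted this pointer, which is exactly the offset we wanted.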
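Why leave the arithmetic to the compiler? With virtual inheritance the distance to the base is not a fixed constant: it can differ depending on which complete object the pointer is part of. The sketch below measures it portably through an ordinary pointer conversion. VBase, Mid and Leaf are names made up for this example, and the exact numbers depend on the ABI.

```cpp
#include <cstddef>

struct VBase { virtual ~VBase() {} int tag; virtual int id() { return 1; } };
struct Mid : virtual VBase { virtual int id() { return 2; } };
struct Leaf : Mid { int extra; virtual int id() { return 3; } };

// Distance from a Mid subobject to its virtual base, computed by the
// compiler-generated conversion (which consults run-time tables).
std::ptrdiff_t vbase_offset(Mid* p) {
    VBase* base = p;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(p);
}

struct Offsets { std::ptrdiff_t in_mid; std::ptrdiff_t in_leaf; };

// Measures the offset once for a standalone Mid and once for a Mid inside Leaf.
Offsets measure_offsets() {
    Mid mid;
    Leaf leaf;
    Offsets o;
    o.in_mid = vbase_offset(&mid);
    o.in_leaf = vbase_offset(&leaf);  // Leaf* converts to Mid* first
    return o;
}
```

On common ABIs the two offsets come out different, since Leaf's own member is placed before the virtual base. This is why the lookup has to happen at run time, and why borrowing the compiler's own code for it (via GetThis) is attractive.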

The complete code follows:

#include <stdio.h>
#include <tchar.h>   // _tmain and _TCHAR (VC only)

class B1  
{  
public: virtual void fb1()
        {
            printf("fb1");
        }
};  
class B2  
{
public:
    virtual void fb2(){}
};  
class D1 : public B2, public B1  
{
public: virtual void fb1()
        {
            printf("d1fb1");
        }
};  
class D2 : virtual public B1  
{  
public: virtual void fb1()
        {
            printf("d2fb1");
        }
}; 

class GenericClass {};

template<class T> class function;  
template<typename R, typename T>  
class  function<R (T::*)() >   
{  
public:  
    typedef R(GenericClass::*fun)();  
    function (fun p, T* pthis):m_ptr(p), m_pThis(pthis){}  

    R operator()()      { return (m_pThis->*m_ptr)(); }  
    fun m_ptr;  
    GenericClass* m_pThis;  
};  

template <int N>
struct SimplifyMemFunc 
{
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func)
    {
        return 0;  // primary template: no conversion for a pointer of size N
    }
};


const int SINGLE_MEMFUNCPTR_SIZE = sizeof(void (GenericClass::*)());
template <>
struct SimplifyMemFunc<SINGLE_MEMFUNCPTR_SIZE>
{
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func)
    {
            bound_func = reinterpret_cast<GenericMemFuncType>(function_to_bind);
            return reinterpret_cast<GenericClass *>(pthis);
    }
};


template<>
struct SimplifyMemFunc< SINGLE_MEMFUNCPTR_SIZE + sizeof(int) >  {
    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func) { 
            union {
                XFuncType func;
                struct {	 
                    GenericMemFuncType funcaddress; // points to the actual member function
                    int delta;	     // #BYTES to be added to the 'this' pointer
                }s;
            } u;
            u.func = function_to_bind;
            bound_func = u.s.funcaddress;
            return reinterpret_cast<GenericClass *>(reinterpret_cast<char *>(pthis) + u.s.delta); 
    }
};




struct MicrosoftVirtualMFP {
    void (GenericClass::*codeptr)(); // points to the actual member function
    int delta;		// #bytes to be added to the 'this' pointer
    int vtable_index; // or 0 if no virtual inheritance
};

struct GenericVirtualClass : virtual public GenericClass
{
    typedef GenericVirtualClass * (GenericVirtualClass::*ProbePtrType)();
    GenericVirtualClass * GetThis() { return this; }
};

template <>
struct SimplifyMemFunc<SINGLE_MEMFUNCPTR_SIZE + 2*sizeof(int) >
{

    template <class X, class XFuncType, class GenericMemFuncType>
    inline static GenericClass *Convert(X *pthis, XFuncType function_to_bind, 
        GenericMemFuncType &bound_func) {
            union {
                XFuncType func;
                GenericClass* (X::*ProbeFunc)();
                MicrosoftVirtualMFP s;
            } u;
            u.func = function_to_bind;
            bound_func = reinterpret_cast<GenericMemFuncType>(u.s.codeptr);
            union {
                GenericVirtualClass::ProbePtrType virtfunc;
                MicrosoftVirtualMFP s;
            } u2;

            u2.virtfunc = &GenericVirtualClass::GetThis;
            u.s.codeptr = u2.s.codeptr;
            return (pthis->*u.ProbeFunc)();
    }
};

template<typename R, typename T, typename X>
function<R (GenericClass::*)() >  MakeDelegate(T* pthis, R (X::*mem_fun)())
{
    typedef R (GenericClass::*GenericMemFuncType) ();

    GenericMemFuncType pGenFun;
    GenericClass* pGenThis = SimplifyMemFunc<sizeof(mem_fun)>::Convert(pthis, mem_fun, pGenFun);

    
    return function<R (GenericClass::*)() >(pGenFun, pGenThis);
}

int _tmain(int argc, _TCHAR* argv[])
{
    B1 b1;
    D1 d1;
    D2 d2;

    function<void (GenericClass::*)() > f1 = MakeDelegate(&b1, &B1::fb1);
    function<void (GenericClass::*)() > f2 = MakeDelegate(&d1, &D1::fb1);
    function<void (GenericClass::*)() > f3 = MakeDelegate(&d2, &D2::fb1);
    
    f1();
    f2();
    f3();
	return 0;
}

5. Efficiency compared with boost

I downloaded the FastDelegate library and compared its running speed with boost's; it came out more than twice as fast.

D2* pd21 = new D2;
boost::function<int ()> bf = boost::bind(&D2::fb1, pd21);
for (int i = 0; i < 100000000; ++i)
{
    bf();           // 1392 ms
}

D2* pd22 = new D2;
for (int i = 0; i < 100000000; ++i)
{
    pd22->fb1();     // 726 ms
}

D2* pd23 = new D2;
fastdelegate::FastDelegate0<int> MyDelegate;
MyDelegate.bind(pd23, &D2::fb1);
for (int i = 0; i < 100000000; ++i)
{
    MyDelegate();    // 551 ms
}
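For readers who want to repeat this kind of measurement without boost or FastDelegate installed, a rough harness might look like the sketch below. It substitutes std::function for boost::function and times with std::chrono. Accumulator, Timing and the time_* helpers are invented for this example, and no timings are claimed for it.

```cpp
#include <chrono>
#include <functional>

struct Accumulator {
    long long total;
    Accumulator() : total(0) {}
    void add(int n) { total += n; }
};

struct Timing { long long result; double milliseconds; };

// Runs call(i) for i in [0, iterations) and reports the elapsed wall time.
template <class Call>
Timing time_loop(Accumulator& acc, int iterations, Call call) {
    typedef std::chrono::steady_clock Clock;
    Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) call(i);
    Clock::time_point stop = Clock::now();
    Timing t;
    t.result = acc.total;
    t.milliseconds = std::chrono::duration<double, std::milli>(stop - start).count();
    return t;
}

// Baseline: a direct member call, wrapped in a lambda the compiler can inline.
Timing time_direct(int iterations) {
    Accumulator acc;
    return time_loop(acc, iterations, [&acc](int i) { acc.add(i); });
}

// The same work routed through a std::function built with std::bind.
Timing time_std_function(int iterations) {
    Accumulator acc;
    std::function<void(int)> f = std::bind(&Accumulator::add, &acc, std::placeholders::_1);
    return time_loop(acc, iterations, f);
}
```

Returning the accumulated total alongside the time makes it harder for an optimizer to discard the loop. An optimizing build may still simplify the direct version heavily, so treat any numbers as indicative only.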

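Of course, an application's bottleneck should never be here. This is simply one way of thinking about compile-time optimization that is worth studying and borrowing from.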

Original link: https://blog.csdn.net/cyxisgreat/article/details/7506672
