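While reading the book 《C程序性能优化》 (C Program Performance Optimization), I came across the author's claim that the gcc option -fomit-frame-pointer can improve program performance. I did not quite see why, so I decided to get to the bottom of it.
Consider the following simple program: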
#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

int main()
{
    int sum = 0;
    sum = add(1, 2);
    printf("%d\n", sum);
    return 0;
}
Compiled without -fomit-frame-pointer, the binary disassembles to the following:
00000000 <add>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 8b 45 0c mov 0xc(%ebp),%eax
6: 8b 55 08 mov 0x8(%ebp),%edx
9: 01 d0 add %edx,%eax
b: 5d pop %ebp
c: c3 ret
0000000d <main>:
d: 55 push %ebp
e: 89 e5 mov %esp,%ebp
10: 83 e4 f0 and $0xfffffff0,%esp
13: 83 ec 20 sub $0x20,%esp
16: c7 44 24 1c 00 00 00 movl $0x0,0x1c(%esp)
1d: 00
1e: c7 44 24 04 02 00 00 movl $0x2,0x4(%esp)
25: 00
26: c7 04 24 01 00 00 00 movl $0x1,(%esp)
2d: e8 fc ff ff ff call 2e <main+0x21>
32: 89 44 24 1c mov %eax,0x1c(%esp)
36: b8 00 00 00 00 mov $0x0,%eax
3b: 8b 54 24 1c mov 0x1c(%esp),%edx
3f: 89 54 24 04 mov %edx,0x4(%esp)
43: 89 04 24 mov %eax,(%esp)
46: e8 fc ff ff ff call 47 <main+0x3a>
4b: b8 00 00 00 00 mov $0x0,%eax
50: c9 leave
51: c3 ret
With -fomit-frame-pointer added, the disassembly becomes:
00000000 <add>:
0: 8b 44 24 08 mov 0x8(%esp),%eax
4: 8b 54 24 04 mov 0x4(%esp),%edx
8: 01 d0 add %edx,%eax
a: c3 ret
0000000b <main>:
b: 55 push %ebp
c: 89 e5 mov %esp,%ebp
e: 83 e4 f0 and $0xfffffff0,%esp
11: 83 ec 20 sub $0x20,%esp
14: c7 44 24 1c 00 00 00 movl $0x0,0x1c(%esp)
1b: 00
1c: c7 44 24 04 02 00 00 movl $0x2,0x4(%esp)
23: 00
24: c7 04 24 01 00 00 00 movl $0x1,(%esp)
2b: e8 fc ff ff ff call 2c <main+0x21>
30: 89 44 24 1c mov %eax,0x1c(%esp)
34: b8 00 00 00 00 mov $0x0,%eax
39: 8b 54 24 1c mov 0x1c(%esp),%edx
3d: 89 54 24 04 mov %edx,0x4(%esp)
41: 89 04 24 mov %eax,(%esp)
44: e8 fc ff ff ff call 45 <main+0x3a>
49: b8 00 00 00 00 mov $0x0,%eax
4e: c9 leave
4f: c3 ret
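To make the two add listings above easier to compare, here is a minimal sketch in C that models the stack as an array and replays both versions of add. It is a toy model under stated assumptions, not how gcc or the CPU actually works: the Machine struct, push, load, call_add and the two add_* functions are hypothetical names introduced only for this illustration, and the return address value is a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit x86 stack. Addresses are plain numbers and
   every slot is 4 bytes wide; nothing here touches the real stack. */
enum { SLOT = 4, STACK_TOP = 0x1000 };

typedef struct {
    uint32_t esp;                   /* stack pointer: lowest address in use */
    uint32_t ebp;                   /* frame pointer: base of current frame */
    uint32_t mem[STACK_TOP / SLOT]; /* simulated stack memory */
} Machine;

static void push(Machine *m, uint32_t value)
{
    assert(m->esp >= SLOT);         /* guard against simulated overflow */
    m->esp -= SLOT;                 /* the stack grows toward lower addresses */
    m->mem[m->esp / SLOT] = value;
}

static uint32_t load(const Machine *m, uint32_t addr)
{
    return m->mem[addr / SLOT];
}

/* Caller side of add(1, 2): the arguments end up on the stack with the
   first argument at the lowest address, then call pushes the return
   address. 0x32 is only a placeholder value. */
static void call_add(Machine *m)
{
    push(m, 2);      /* second argument b */
    push(m, 1);      /* first argument a */
    push(m, 0x32);   /* return address pushed by call */
}

/* add as in the first listing: with a frame pointer. */
int add_with_frame_pointer(void)
{
    Machine m = { .esp = STACK_TOP, .ebp = STACK_TOP };
    call_add(&m);
    push(&m, m.ebp);                      /* push %ebp            */
    m.ebp = m.esp;                        /* mov  %esp,%ebp       */
    uint32_t eax = load(&m, m.ebp + 0xc); /* mov  0xc(%ebp),%eax  */
    uint32_t edx = load(&m, m.ebp + 0x8); /* mov  0x8(%ebp),%edx  */
    return (int)(eax + edx);              /* add  %edx,%eax       */
}

/* add as in the second listing: no prologue, esp-relative access. */
int add_without_frame_pointer(void)
{
    Machine m = { .esp = STACK_TOP, .ebp = STACK_TOP };
    call_add(&m);
    uint32_t eax = load(&m, m.esp + 0x8); /* mov  0x8(%esp),%eax  */
    uint32_t edx = load(&m, m.esp + 0x4); /* mov  0x4(%esp),%edx  */
    return (int)(eax + edx);              /* add  %edx,%eax       */
}
```

Both functions return 3. The offsets differ by 4 because pushing the old ebp moves the stack pointer down one slot before ebp is set, so 0x8(%ebp) in the first listing and 0x4(%esp) in the second name the same argument slot.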
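As you can see, the code compiled with -fomit-frame-pointer is slightly shorter. The main difference is in add: the instructions that save, set up and restore the frame pointer (push %ebp, mov %esp,%ebp, pop %ebp) are gone. That removes a few instructions from every call and also frees ebp for use as a general-purpose register, which is where the performance gain comes from. main is unchanged in both listings; it still sets up ebp, presumably because it realigns the stack with and $0xfffffff0,%esp and needs ebp to restore esp on exit.
Some background: the stack grows from high addresses toward low addresses, whereas the heap grows from low addresses toward high addresses. On x86, esp is the stack pointer (the top of the stack) and ebp is the frame pointer (the base of the current stack frame), so inside a function the value of esp is never greater than that of ebp. On a function call, the caller first places the arguments on the stack, and the call instruction then pushes the return address (not the return value, which comes back in eax). With a frame pointer, add reads its arguments at 0x8(%ebp) and 0xc(%ebp); without one, it reads the same slots relative to the stack pointer, at 0x4(%esp) and 0x8(%esp).
The price of -fomit-frame-pointer is that the chain of saved ebp values no longer exists, so a tool that walks this chain can no longer trace the sequence of function calls. I suppose that toolchains such as gcc and VS record the call order this way, although a debugger can fall back on separate unwind information when the compiler emits it.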
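The point above about tracing the call sequence can be sketched with a toy model in C. This is only an illustration under stated assumptions, not the actual implementation used by gdb, gcc or any other tool: Machine, enter_function, walk_frames and demo_backtrace are hypothetical names, the return address 0x100 is made up, and an ebp of 0 is assumed to mark the outermost frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a 32-bit x86 stack; nothing here touches the real stack. */
enum { SLOT = 4, STACK_TOP = 0x1000 };

typedef struct {
    uint32_t esp;                   /* stack pointer */
    uint32_t ebp;                   /* frame pointer */
    uint32_t mem[STACK_TOP / SLOT]; /* simulated stack memory */
} Machine;

static void push(Machine *m, uint32_t value)
{
    assert(m->esp >= SLOT);         /* guard against simulated overflow */
    m->esp -= SLOT;                 /* the stack grows toward lower addresses */
    m->mem[m->esp / SLOT] = value;
}

static uint32_t load(const Machine *m, uint32_t addr)
{
    return m->mem[addr / SLOT];
}

/* A call followed by the standard prologue:
   call pushes the return address, then push %ebp; mov %esp,%ebp. */
static void enter_function(Machine *m, uint32_t return_address)
{
    push(m, return_address);
    push(m, m->ebp);   /* save the caller's frame pointer */
    m->ebp = m->esp;   /* this frame's base now points at the saved ebp */
}

/* Walk the chain of saved frame pointers. In every frame, the word at
   ebp holds the caller's ebp and the word at ebp + 4 holds the return
   address. Returns the number of return addresses written to out. */
size_t walk_frames(const Machine *m, uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t fp = m->ebp;
    while (fp != 0 && n < max) {
        out[n++] = load(m, fp + SLOT); /* return address of this frame */
        fp = load(m, fp);              /* step to the caller's frame */
    }
    return n;
}

/* Simulate startup code calling main, and main calling add. */
size_t demo_backtrace(uint32_t *out, size_t max)
{
    Machine m = { .esp = STACK_TOP, .ebp = 0 }; /* 0 = outermost frame (assumption) */
    enter_function(&m, 0x100); /* made-up return address into startup code */
    enter_function(&m, 0x32);  /* address after the call to add in the first listing */
    return walk_frames(&m, out, max);
}
```

Called with a buffer of at least two entries, demo_backtrace reports two frames: 0x32 (add returns into main) followed by 0x100 (main returns into the startup code). If add skips the push %ebp / mov %esp,%ebp pair, as it does under -fomit-frame-pointer, its frame never appears in this chain, which is why a walker of this kind loses track of the call.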
Reposted from: https://www.cnblogs.com/islandscape/p/3444122.html