且构网

Why does an int take up 8 bytes in the BSS section but 4 bytes in the DATA section?

Updated: 2022-11-01 20:08:08

I am trying to learn the structure of the executable files produced from C programs. My environment is GCC on a 64-bit Intel processor.

Consider the following code in a.cc:

#include <cstdlib>
#include <cstdio>

int x;

int main(){
  printf("%zu\n", sizeof(x));  // sizeof yields a size_t, so use %zu rather than %d
  return 10;
}

Running size -o a shows:

 text      data     bss     dec     hex filename
 1134       552       8    1694     69e a

I then added another global variable, y, this time initialized:

int y=10; 

Now size a shows (where a is the executable built from a.cc):

 text      data     bss     dec     hex filename
 1134       556      12    1702     6a6 a

As we know, the BSS section accounts for uninitialized global variables and the DATA section stores initialized ones.

  1. Why does the int take up 8 bytes in BSS? The sizeof(x) in my code shows that an int actually takes up 4 bytes.
  2. Adding int y=10 added 4 bytes to DATA, which makes sense since an int should take 4 bytes. But why does it also add 4 bytes to BSS?

The difference between the two size outputs stays the same after deleting the two #include lines.

Update: I think my understanding of BSS is wrong. It may not hold only the uninitialized global variables. As Wikipedia says, "The size that BSS will require at runtime is recorded in the object file, but BSS (unlike the data segment) doesn't take up any actual space in the object file." For example, even the one-line program int main(){} has a bss of 8.

Does the 8 or 16 in BSS come from alignment?

It doesn't: an int takes up 4 bytes regardless of which segment it's in. You can use the nm tool (from the GNU binutils package) with the -S argument to get the names and sizes of all of the symbols in an object file. What you're seeing is most likely a secondary effect of the compiler including, or not including, certain other symbols in the final executable.

For example:

$ cat a1.c
int x;
$ cat a2.c
int x = 1;
$ gcc -c a1.c a2.c
$ nm -S a1.o a2.o

a1.o:
0000000000000004 0000000000000004 C x

a2.o:
0000000000000000 0000000000000004 D x

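One object file has a 4-byte object named x in the uninitialized data segment (C, the letter nm uses for common symbols, which are uninitialized data), while the other object file has a 4-byte object named x in the initialized data segment (D). In both cases the size column reads 4.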
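
As a rough illustration of how padding and other symbols could turn a 4-byte int into the 8 and 12 reported by size, here is a minimal sketch. It is a toy model, not the linker's actual algorithm. The function names (round_up, model_bss_size), the 1-byte runtime object at the start of the section, and the 8-byte alignment of the section's end are all assumptions made for illustration; the real layout depends on the toolchain and its linker script.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
size_t round_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model of a .bss section (assumptions, not linker source):
 *   - the section begins at address `start`
 *   - an assumed 1-byte object from the startup files comes first
 *   - then `n_ints` objects of 4 bytes each, aligned to 4
 *   - the end of the section is padded up to a multiple of 8
 * Returns the section size a tool like size would report under this model.
 */
size_t model_bss_size(size_t start, size_t n_ints)
{
    const size_t int_size = 4;       /* sizeof(int) on the question's platform */
    size_t pos = start;

    pos += 1;                        /* assumed 1-byte runtime object */
    for (size_t i = 0; i < n_ints; i++) {
        pos = round_up(pos, int_size);
        pos += int_size;             /* the int itself is always 4 bytes */
    }
    pos = round_up(pos, 8);          /* assumed 8-byte end alignment */
    return pos - start;
}
```

Under these assumptions, model_bss_size(0, 0) gives 8 (the empty int main(){} case), model_bss_size(0, 1) gives 8 (with int x), and model_bss_size(4, 1) gives 12 (the section start shifted by the 4 bytes that y added to DATA). The int contributes 4 bytes each time; the rest is padding and the assumed extra object. Running nm -S on the linked executable would show which symbols are really there.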