This is the mail archive of the
glibc-linux@ricardo.ecn.wfu.edu
mailing list for the glibc project.
Re: Library mapping
> Regarding the things you've said below, I wonder who decides where these
> segments get mapped. I've been checking the source code for elf binary
> handler functions, load_elf_interp and load_elf_binary, they seem to be
> the functions making do_mmap() calls, reading from the elf header of an
> elf executable.
Notice that this is but one third of the entire problem. After
load_elf_interp has finished, control is transferred to user mode,
into the dynamic linker (ld.so). This is part of glibc, and takes
care of loading the vast majority of shared libraries. Use strace(1)
to see the sequence of system calls made when loading an executable;
here is a fragment:
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\330"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=1384168, ...}) = 0
old_mmap(NULL, 1201988, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x400fb000
mprotect(0x40216000, 42820, PROT_NONE) = 0
old_mmap(0x40216000, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x11a000) = 0x40216000
old_mmap(0x4021d000, 14148, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4021d000
close(3) = 0
munmap(0x40016000, 102788) = 0
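As you can see, the first old_mmap call passes a NULL address, asking
the kernel to assign one. The later old_mmap calls use MAP_FIXED to
place the data and bss at fixed offsets inside the range the kernel
returned.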
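The reserve-then-overlay pattern visible in the strace fragment can be
sketched in user space as follows. This is only an illustration, not
ld.so's code: map_like_ldso and its parameters are names I made up, and
it uses anonymous memory instead of a file so that it stands alone.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `total` bytes at a kernel-chosen address,
 * then overlay a writable mapping of `data_len` bytes at `data_off`
 * inside that range. `data_off` must be a multiple of the page size.
 * Returns the base address, or MAP_FAILED on error.
 */
static void *map_like_ldso(size_t total, size_t data_off, size_t data_len,
                           void **data_out)
{
    void *base, *data;

    assert(data_off + data_len <= total);

    /* Step 1: a NULL address lets the kernel pick where the range goes. */
    base = mmap(NULL, total, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Step 2: MAP_FIXED places the second mapping at an exact address
     * inside the range, replacing that part of the first mapping. */
    data = mmap((char *)base + data_off, data_len,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) {
        munmap(base, total);
        return MAP_FAILED;
    }

    *data_out = data;
    return base;
}
```

Only the first call leaves the choice of address to the kernel; the
second one is pinned relative to whatever the first call returned.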
> I am really confused, where and how the addresses where libraries' text,
> data, bss are decided. What algorithm is used in doing it and where in
> source tree is it? Is it do_mmap() who decides it?
Almost. I'm not sure which specific call you are referring to; perhaps
map_addr = do_mmap(filep, ELF_PAGESTART(addr),
eppnt->p_filesz + ELF_PAGEOFFSET(eppnt->p_vaddr), prot, type,
eppnt->p_offset - ELF_PAGEOFFSET(eppnt->p_vaddr));
in elf_map. Notice that the virtual address is passed as an argument
to elf_map, e.g. from
map_addr = elf_map(interpreter, load_addr + vaddr, eppnt, elf_prot, elf_type);
inside load_elf_interp. load_addr is zero until the load_addr
assignment is executed, which happens once the first segment has been
mapped. The loop around this call iterates over all program headers.
If you do 'readelf -e /lib/ld-linux.so.2', you'll see
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00000000 0x00000000 0x136b1 0x136b1 R E 0x1000
LOAD 0x0136c0 0x000146c0 0x000146c0 0x002dc 0x006c8 RW 0x1000
DYNAMIC 0x013778 0x00014778 0x00014778 0x000b0 0x000b0 RW 0x4
So for the first program header, vaddr is 0, which gives an address
argument of 0 for do_mmap. In this case, do_mmap will assign a virtual
address. How it does so depends on the kernel version, so for further
discussion, please say which kernel version you are looking at.
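To make the argument computation in the elf_map call concrete, here is
a small sketch that redoes the page arithmetic for the two LOAD headers
listed above. The macro bodies assume 4 KiB pages (matching the 0x1000
alignment in the readelf output); ELF_PAGE_SIZE, struct map_args and
elf_map_args are names I introduce for illustration, not kernel code.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as suggested by the Align column above. */
#define ELF_PAGE_SIZE      0x1000UL
#define ELF_PAGESTART(v)   ((v) & ~(ELF_PAGE_SIZE - 1))
#define ELF_PAGEOFFSET(v)  ((v) & (ELF_PAGE_SIZE - 1))

/* Hypothetical holder for the values elf_map hands to do_mmap. */
struct map_args {
    unsigned long addr;    /* page-aligned start address */
    unsigned long len;     /* number of file bytes to map */
    unsigned long offset;  /* page-aligned file offset */
};

static struct map_args elf_map_args(unsigned long load_addr,
                                    unsigned long p_vaddr,
                                    unsigned long p_filesz,
                                    unsigned long p_offset)
{
    struct map_args a;

    /* A loadable segment has the same in-page offset in the file
     * and in memory; otherwise it could not be mmap'ed directly. */
    assert(ELF_PAGEOFFSET(p_vaddr) == ELF_PAGEOFFSET(p_offset));

    a.addr   = ELF_PAGESTART(load_addr + p_vaddr);
    a.len    = p_filesz + ELF_PAGEOFFSET(p_vaddr);
    a.offset = p_offset - ELF_PAGEOFFSET(p_vaddr);
    return a;
}
```

For the first header with load_addr still 0, this gives address 0,
length 0x136b1 and file offset 0. For the second header, assuming for
the sake of the example that the kernel picked 0x40000000 as the load
address, it gives address 0x40014000, length 0x99c and file offset
0x13000.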
In 2.4.10, do_mmap delegates to do_mmap_pgoff, which in turn delegates
to get_unmapped_area. Unless overridden by the architecture, that
function is in mm/mmap.c. If addr is initially 0, it starts off with
TASK_UNMAPPED_BASE, and finds the first unmapped area above that
address. For i386, the value of TASK_UNMAPPED_BASE depends on
CONFIG_05GB. If this is not set, it is
#define TASK_UNMAPPED_BASE (TASK_SIZE / 3)
which is in turn defined by
#define TASK_SIZE (PAGE_OFFSET)
which is in turn defined by
#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
which is in turn defined by
#define __PAGE_OFFSET (PAGE_OFFSET_RAW)
which is in turn defined by
#define PAGE_OFFSET_RAW 0xC0000000
(if CONFIG_05GB is not defined).
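That gives 0xC0000000 / 3 = 0x40000000 as the starting point for the
first mapping, which is consistent with the 0x40... addresses in the
strace output above. Later mappings get their addresses based on
availability, using the algorithm in find_vma.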
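As a rough model of that search, here is a simplified first-fit sketch
in plain C. It is not the kernel's code: struct region and first_fit
are hypothetical stand-ins for the VMA list and get_unmapped_area, the
constants are copied from the macro chain above, and the regions in any
example call are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the macro chain above (i386, CONFIG_05GB not set). */
#define PAGE_OFFSET_RAW     0xC0000000UL
#define TASK_SIZE           PAGE_OFFSET_RAW
#define TASK_UNMAPPED_BASE  (TASK_SIZE / 3)   /* 0x40000000 */

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct region {
    unsigned long start;
    unsigned long end;
};

/*
 * Return the lowest address at or above TASK_UNMAPPED_BASE where `len`
 * bytes fit without overlapping any existing region. `regions` must be
 * sorted by start address and must not overlap. Returns 0 if nothing
 * fits below TASK_SIZE.
 */
static unsigned long first_fit(const struct region *regions, size_t n,
                               unsigned long len)
{
    unsigned long addr = TASK_UNMAPPED_BASE;
    size_t i = 0;

    assert(len > 0);

    /* Skip regions that end at or below the starting address; this is
     * the part of the job that find_vma does in the kernel. */
    while (i < n && regions[i].end <= addr)
        i++;

    for (;; i++) {
        if (len > TASK_SIZE || TASK_SIZE - len < addr)
            return 0;               /* no room left in user space */
        if (i == n || addr + len <= regions[i].start)
            return addr;            /* the gap here is large enough */
        addr = regions[i].end;      /* retry just past this region */
    }
}
```

With no regions at or above 0x40000000, the result is 0x40000000
itself; each region that is already there pushes the next candidate
address up to the end of that region.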
HTH,
Martin