Returning a struct
Whenever a return value won't fit in a register, some special arrangements have to be made. In some cases, space for the returned object is allocated by the caller; sometimes by the callee.
GCC/Sparc
With GCC/Sparc, the returned object is allocated by the caller. The address of the return slot is stored at %sp+64 in the caller, and the callee reads it from %fp+64. The size of the object seems to be stored in the word just after the call and its delay slot instruction (it disassembles as an unimp). Thus, instead of returning with the conventional jmp %i7+8 (usually written as the ret synthetic instruction), the function returns with jmp %i7+12, which skips that extra word. Example:
#include <stdio.h>
#include <string.h>

typedef struct _person
{
    char name[20];
    float age;
} person;

void work(person p)
{
    printf("Name is %s\n", p.name);
    printf("Age is %.1f\n", p.age);
}

void work2(person p, int i)
{
    if (i == 0)
        work(p);
    else
        work2(p, i-1);
}

person setperson(char newname[], float newage)
{
    person* pp = new person;    // never freed; kept because the listings show this allocation
    strcpy(pp->name, newname);
    pp->age = newage;
    return *pp;
}

int main()
{
    person me;
    strcpy(me.name, "Nobody");
    me.age = 0;
    me = setperson("Michael J", 38.75);
    work2(me, 2);
}
compiles to:
setperson__FPcf()
10878: 9d e3 bf 88  save %sp, -120, %sp
1087c: e0 07 a0 40  ld [%fp + 64], %l0
...
108e0: b0 10 00 10  mov %l0, %i0
108e4: 81 c7 e0 0c  jmp %i7 + 12
108e8: 81 e8 00 00  restore
main()
108ec: 9d e3 bf 60  save %sp, -160, %sp
...
108f0: 40 00 40 c6  call strcpy
...
10914: 90 07 bf d8  add %fp, -40, %o0
10918: d0 23 a0 40  st %o0, [%sp + 64]
...
1092c: 7f ff ff d3  call setperson__FPcf
10930: 01 00 00 00  nop
10934: 00 00 00 18  unimp 0x18
10938: d0 07 bf d8  ld [%fp - 40], %o0
1093c: d0 27 bf c0  st %o0, [%fp - 64]
10940: d0 07 bf dc  ld [%fp - 36], %o0
10944: d0 27 bf c4  st %o0, [%fp - 60]
...
10960: d0 07 bf ec  ld [%fp - 20], %o0
10964: d0 27 bf d4  st %o0, [%fp - 44]
10968: 92 07 bf c0  add %fp, -64, %o1
1096c: 90 10 00 09  mov %o1, %o0
10970: 92 10 20 02  mov 2, %o1
10974: 7f ff ff 91  call work2__FG7_personi
...
Note that the address of the returned structure is returned in %i0 as usual, but space for that object (0x18 = 24 bytes of it) is allocated in the stack of the caller (i.e. main()) with the add %fp,-40,%o0 instruction.
Many times, the returned value (%o0 in the caller) is not used, because the returned object is a struct, and the fields of the struct can be referenced directly. This is the case in the above example, where it is known that the returned struct is at %fp-40, and the caller uses this directly.
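The caller-allocated convention described above can be written out by hand in C. This is only a sketch: setperson_lowered and caller_example are hypothetical names, and the explicit result pointer stands in for the hidden address that the compiler passes on the stack. It also uses a bounded copy rather than the original's strcpy.

```c
#include <string.h>

typedef struct _person {
    char  name[20];
    float age;
} person;

/* Hypothetical hand-lowered form of setperson(): the caller supplies the
   address of its own result slot, and the callee fills it in and hands the
   same address back (as the listing does in %i0). */
person *setperson_lowered(person *result, const char *newname, float newage)
{
    strncpy(result->name, newname, sizeof result->name - 1);
    result->name[sizeof result->name - 1] = '\0';
    result->age = newage;
    return result;
}

/* Caller side: the slot is an ordinary local (like %fp-40 in main), so the
   returned pointer can be ignored and the fields read directly. */
float caller_example(void)
{
    person me;
    (void)setperson_lowered(&me, "Michael J", 38.75f);
    return me.age;
}
```

Ignoring the returned pointer in caller_example mirrors the observation that the caller already knows where the struct lives.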
When the struct 'me' is passed to function work2, it is copied to another place on the stack (%fp-64, from %fp-40). This is in case the called function modifies the passed parameter; the C standard says that changes the callee makes to a value parameter do not affect the caller's copy. Since there are only 24 bytes involved, the copy is done with six pairs of 4-byte load and store instructions.
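The reason for the copy can be seen in a small C sketch. The names birthday and caller_age_after_call are made up for illustration, and the 24-byte size assumes the same struct layout as in the listings.

```c
typedef struct _person {
    char  name[20];
    float age;
} person;

/* The callee receives its own copy of the struct, so this write is
   invisible to the caller. */
static float birthday(person p)
{
    p.age += 1.0f;
    return p.age;
}

/* Returns the caller's age field after the call; *callee_saw receives the
   value the callee computed from its private copy. */
float caller_age_after_call(float start, float *callee_saw)
{
    person me = { "Nobody", start };
    *callee_saw = birthday(me);   /* the compiler copies the whole struct here */
    return me.age;                /* unchanged by the callee */
}
```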
At the end of the function that returns the struct, there will be a call (or a jump, or a call with a restore in the delay slot) to a function with a name like .stret4 (implemented in libc.so.1). These functions check for the (optional!) unimp after the call to the function that returns the struct and, if it is present, use its operand as the number of bytes to copy from %o0 to the address held at [%fp + 64]. .stret4 copies 4 bytes at a time; other versions copy 1, 2, and 8 bytes at a time.
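A decompiler or runtime helper that wants to find the size word has to recognise the unimp encoding. The sketch below assumes the SPARC V8 format-2 layout (op in bits 31-30 and op2 in bits 24-22 both zero, constant in the low 22 bits) and treats bits 29-25 as don't-care. decode_unimp is a hypothetical name, and this is not a description of what .stret4 itself does internally.

```c
#include <stdint.h>

/* Returns 1 and stores the constant in *size if 'word' looks like
   "unimp <size>"; returns 0 otherwise and leaves *size untouched.
   Assumed layout: op (bits 31-30) == 0, op2 (bits 24-22) == 0,
   constant in bits 21-0. */
int decode_unimp(uint32_t word, uint32_t *size)
{
    if ((word >> 30) != 0 || ((word >> 22) & 0x7u) != 0)
        return 0;
    *size = word & 0x3FFFFFu;
    return 1;
}
```

Applied to the words from the listing, 0x00000018 decodes as a size of 24, while the nop at 0x01000000 and the ld at 0xd007bfd8 are rejected.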
GCC/X86
The X86 version of the same file is more difficult to follow. Part of the reason for this is that GCC does not restore the stack after a call to a function; it just leaves the old parameters on the stack, and adjusts the stack pointer (based on the ebp register) at the end of each function. In this unoptimised code, structs are sometimes copied with memcpy, and sometimes by the movsl (long block move) instruction. Sometimes strings are copied with strcpy, and sometimes with several move instructions.
As with the Sparc version, a hidden parameter is used to pass the address where the called function can copy the struct to be returned. However, here it is passed as the "zeroth" parameter, rather than in a special part of the stack frame (as with Sparc). Unlike the Sparc version, the size of the return value does not appear to be stored explicitly.
setperson__FPcf()
8048a25: 55                    pushl %ebp
...
8048a2b: 8b 7d 08              movl 8(%ebp),%edi       ; hidden param to edi
8048a2e: 8b 5d 0c              movl 12(%ebp),%ebx      ; newname in ebx
8048a31: 6a 18                 pushl $0x18
8048a33: e8 81 00 00 00        call 0x81
8048a38: 8b f0                 movl %eax,%esi          ; pp in esi
8048a3a: 53                    pushl %ebx
8048a3b: 56                    pushl %esi
8048a3c: e8 6c fe ff ff        call strcpy
8048a41: d9 45 10              flds 16(%ebp)           ; newage
8048a44: d9 5e 14              fstps 20(%esi)          ; pp->age
8048a47: 6a 18                 pushl $0x18
8048a49: 56                    pushl %esi
8048a4a: 57                    pushl %edi
8048a4b: e8 6d fe ff ff        call memcpy             ; Copy ret struct
8048a50: 8b c7                 movl %edi,%eax
8048a52: 8d 65 f4              leal -12(%ebp),%esp     ; Restore stack
...
8048a58: c9                    leave
8048a59: c3                    ret
main()
8048a5d: 55                    pushl %ebp
8048a5e: 8b ec                 movl %esp,%ebp
8048a60: 83 ec 18              subl $0x18,%esp         ; space for 'me'
...
8048a65: a1 cc 8b 04 08        movl 0x8048bcc,%eax     ; "Nobo"
8048a6a: 89 45 e8              movl %eax,-24(%ebp)     ; me.name[0..3]
8048a6d: 66 a1 d0 8b 04 08     movw 0x8048bd0,%ax      ; "dy"
8048a73: 66 89 45 ec           movw %ax,-20(%ebp)      ; me.name[4..5]
8048a77: a0 d2 8b 04 08        movb 0x8048bd2,%al      ; Null
8048a7c: 88 45 ee              movb %al,-18(%ebp)      ; me.name[6]
8048a7f: c7 45 fc 00 00 00 00  movl $0x0,-4(%ebp)      ; me.age = 0
8048a86: 8d 75 e8              leal -24(%ebp),%esi     ; &me (ret addr)
8048a89: 68 00 00 1b 42        pushl $0x421b0000       ; 38.75
8048a8e: 68 d3 8b 04 08        pushl $0x8048bd3        ; "Michael J"
8048a93: 56                    pushl %esi
8048a94: e8 8c ff ff ff        call setperson
8048a99: 6a 02                 pushl $0x2
8048a9b: 83 c4 e8              addl $0xe8,%esp         ; space for copy of 'me'
8048a9e: 8b fc                 movl %esp,%edi
8048aa0: fc                    cld
8048aa1: b9 06 00 00 00        movl $0x6,%ecx          ; 0x18/4
8048aa6: f3 a5                 repz movsl (%esi),(%edi)
8048aa8: e8 34 ff ff ff        call work2
8048aad: 33 c0                 xorl %eax,%eax
8048aaf: 8d 65 e0              leal -32(%ebp),%esp     ; restore stack
...
8048ab4: c9                    leave
8048ab5: c3                    ret
Again, when the struct 'me' is passed to function work2, a second copy is made on the stack, this time by the movsl instruction.
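Two details of the X86 listing can be checked with a few lines of C: the word-by-word copy performed by repz movsl with a count of 6, and the constant 0x421b0000 pushed for 38.75. The helper names copy_dwords and float_bits are invented for this sketch, which assumes a 32-bit IEEE-754 float.

```c
#include <stdint.h>
#include <string.h>

/* Copy 'count' 32-bit words in ascending address order, which is what
   "cld; movl $count,%ecx; repz movsl" does with %esi as source and
   %edi as destination. */
void copy_dwords(uint32_t *dst, const uint32_t *src, uint32_t count)
{
    while (count-- != 0)
        *dst++ = *src++;
}

/* Return the 32-bit pattern of a float, i.e. the immediate that a pushl
   of the value would carry (assumes IEEE-754 single precision). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

With a count of 6, copy_dwords moves 0x18 = 24 bytes, matching the size pushed for memcpy in setperson.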
Static Links
Consider this piece of Pascal code:
PROCEDURE staticlink;
var j: integer;
    procedure inner;
    begin
        writeln('J is ', j);
    end;
begin
    j := 5;
    inner;
end;
It compiles under gpc (with optimisation) to:
Inner.17()
11b28: 9d e3 bf 78  save %sp, -136, %sp
11b2c: c4 27 bf ec  st %g2, [%fp - 20]
11b30: 90 10 20 01  mov 1, %o0
11b34: d0 23 a0 5c  st %o0, [%sp + 92]
11b38: d0 00 bf f8  ld [%g2 - 8], %o0
11b3c: 17 00 00 5b  sethi %hi(0x16c00), %o3
11b40: 92 10 20 03  mov 3, %o1
11b44: 94 10 20 08  mov 8, %o2
11b48: 96 12 e2 90  or %o3, 656, %o3
11b4c: 98 10 20 00  clr %o4
11b50: 9a 10 20 05  mov 5, %o5
11b54: d0 23 a0 60  st %o0, [%sp + 96]
11b58: 90 10 20 04  mov 4, %o0
11b5c: d0 23 a0 64  st %o0, [%sp + 100]
11b60: 11 00 00 a4  sethi %hi(0x29000), %o0
11b64: 40 00 00 67  call _p_write
11b68: 90 12 23 4c  or %o0, 844, %o0
11b6c: 81 c7 e0 08  ret
11b70: 81 e8 00 00  restore
Staticlink.14()
11b74: 9d e3 bf 88  save %sp, -120, %sp
11b78: c4 27 bf ec  st %g2, [%fp - 20]
11b7c: 90 10 20 05  mov 5, %o0
11b80: d0 27 bf e8  st %o0, [%fp - 24]
11b84: 7f ff ff e9  call Inner.17
11b88: 84 07 bf f0  add %fp, -16, %g2
11b8c: 81 c7 e0 08  ret
11b90: 81 e8 00 00  restore
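Reading the listing, %g2 appears to carry a pointer into the enclosing procedure's frame: Staticlink sets it to %fp-16 in the delay slot of the call, having stored j at %fp-24, and Inner loads j from [%g2 - 8]. C has no nested procedures, so the sketch below passes that link explicitly. The struct staticlink_frame and the parameter name link are assumptions made for illustration, and the value is returned rather than printed.

```c
/* Hypothetical model of the part of staticlink's frame that inner needs. */
struct staticlink_frame {
    int j;
};

/* 'link' plays the role of %g2: it lets the nested procedure reach the
   variables of the procedure that encloses it. */
static int inner(const struct staticlink_frame *link)
{
    return link->j;            /* corresponds to ld [%g2 - 8], %o0 */
}

int staticlink(void)
{
    struct staticlink_frame frame;
    frame.j = 5;               /* j := 5 */
    return inner(&frame);      /* the link is set up just before the call */
}
```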
