[NTLUG:Discuss] The wrong computation example from the newsgroup
Christopher Browne
cbbrowne at localhost.brownes.org
Sun Mar 18 13:02:19 CST 2001
On Sun, 18 Mar 2001 12:16:14 CST, the world broke into rejoicing as
Fred James <fredjame at concentric.net> said:
> Having read all of the replies to date I must admit to feeling silly
> for having been sucked in - but sucked in I was.
> I appreciate the cool heads in the group for seeing through to the
> core issue and sharing that knowledge.
> Computers do what they are told, and since people must decide whether
> the results are appropriate an understanding of how the computer
> accomplishes a task is necessary.
Be aware that there may very well be a real problem here; it is rather
surprising that the value the compiler works out at compile time differs
from the one you get when you run a "compiled" version of the computation.
You can get a more precise view of what is going on "under the covers,"
by the way, if rather than compiling to object code, you compile to
assembler.
% gcc -c -S ctest.c
% more ctest.s
For this, I get:
.file "ctest.c"
.version "01.01"
gcc2_compiled.:
.section .rodata
.LC2:
.string "%d = %d\n"
.align 8
.LC0:
.long 0x33333333,0x3fd33333
.align 8
.LC1:
.long 0x66666666,0x3fe66666
.text
.align 4
.globl main
.type main,@function
main:
pushl %ebp
movl %esp,%ebp
subl $40,%esp
movl $60,-4(%ebp)
movl $6,-8(%ebp)
movl $10,-12(%ebp)
addl $-4,%esp
movl -4(%ebp),%eax
leal -8(%ebp),%ecx
cltd
idivl (%ecx)
movl %eax,-20(%ebp)
fildl -20(%ebp)
fldl .LC0
fmulp %st,%st(1)
fildl -12(%ebp)
fldl .LC1
fmulp %st,%st(1)
faddp %st,%st(1)
fnstcw -22(%ebp)
movw -22(%ebp),%dx
orw $3072,%dx
movw %dx,-24(%ebp)
fldcw -24(%ebp)
fistpl -20(%ebp)
movl -20(%ebp),%eax
fldcw -22(%ebp)
pushl %eax
pushl $10
pushl $.LC2
call printf
addl $16,%esp
addl $-12,%esp
pushl $1
call exit
addl $16,%esp
.p2align 4,,7
.L2:
leave
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 2.95.3 20010219 (prerelease)"
The optimized version, coming via:
% gcc -O2 -S ctest.c
is rather shorter:
.file "ctest.c"
.version "01.01"
gcc2_compiled.:
.section .rodata
.LC2:
.string "%d = %d\n"
.text
.align 4
.globl main
.type main,@function
main:
pushl %ebp
movl %esp,%ebp
subl $24,%esp
addl $-4,%esp
pushl $10
pushl $10
pushl $.LC2
call printf
addl $-12,%esp
pushl $1
call exit
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 2.95.3 20010219 (prerelease)"
Color me "not vastly intimate with i386 assembler"; even so, it is not
_overly_ difficult to make out what's going on here.
The first version of the program has a fair bunch of floating point
instructions; the complete _lack_ of FP instructions in the second is
quite conspicuous.
In the second version, _all_ the calculations are performed by the
compiler. The crucial bit is thus:
pushl $10
pushl $10
pushl $.LC2
call printf
This is what sets up the printf() call; it pushes two copies of $10
onto the stack, pushes a reference, $.LC2, to the format string, and
then calls printf. In effect, the computations got optimized out; the
compiler figured out that both
(int) (((60/6)*0.3) + (10*0.7))
(int) ((( a/b)*0.3) + ( c*0.7))
were in fact calculating _exactly the same thing_, and so computed the
value, 10, better known here as $10, and put that into the assembler
code.
The optimized version does no computation whatsoever; all the program
does is push the value 10 onto the stack twice and then print those
values.
For those that are curious, the Alpha equivalent looks like:
--> Unoptimized:
.file 1 "ctest.c"
.set noat
.set noreorder
.section .rodata
$LC0:
.ascii "%d = %d\12\0"
.align 3
$LC1:
.t_floating 2.99999999999999988898e-1
.align 3
$LC2:
.t_floating 6.99999999999999955591e-1
.text
.align 5
.globl main
.ent main
main:
.frame $15,48,$26,0
.mask 0x4008000,-48
ldgp $29,0($27)
$main..ng:
lda $30,-48($30)
stq $26,0($30)
stq $15,8($30)
mov $30,$15
.prologue 1
lda $1,60
stl $1,16($15)
lda $1,6
stl $1,20($15)
lda $1,10
stl $1,24($15)
ldl $24,16($15)
ldl $25,20($15)
divl $24,$25,$27
mov $27,$1
addl $1,$31,$2
stq $2,32($15)
ldt $f11,32($15)
cvtqt $f11,$f10
lda $1,$LC1
ldt $f11,0($1)
mult $f10,$f11,$f10
lds $f12,24($15)
cvtlq $f12,$f12
cvtqt $f12,$f11
lda $1,$LC2
ldt $f12,0($1)
mult $f11,$f12,$f11
addt $f10,$f11,$f10
cvttqc $f10,$f11
stt $f11,32($15)
ldq $1,32($15)
mov $1,$2
addl $2,$31,$1
lda $16,$LC0
lda $17,10
mov $1,$18
jsr $26,printf
ldgp $29,0($26)
lda $16,1
jsr $26,exit
ldgp $29,0($26)
$L2:
mov $15,$30
ldq $26,0($30)
ldq $15,8($30)
lda $30,48($30)
ret $31,($26),1
.end main
.ident "GCC: (GNU) 2.95.3 20010125 (prerelease)"
And, optimized, on Alpha:
.file 1 "ctest.c"
.set noat
.set noreorder
.section .rodata
$LC0:
.ascii "%d = %d\12\0"
.text
.align 5
.globl main
.ent main
main:
.frame $30,16,$26,0
.mask 0x4000000,-16
ldgp $29,0($27)
$main..ng:
lda $30,-16($30)
lda $16,$LC0
lda $17,10
lda $18,10
stq $26,0($30)
.prologue 1
jsr $26,printf
ldgp $29,0($26)
lda $16,1
jsr $26,exit
ldgp $29,0($26)
.end main
.ident "GCC: (GNU) 2.95.3 20010125 (prerelease)"
And this is quite exactly equivalent to the Intel version, albeit using
"lda" to load the values into argument registers rather than "pushl" to
push them onto the stack.
--
(concatenate 'string "aa454" "@freenet.carleton.ca")
http://vip.hex.net/~cbbrowne/oses.html
Rules of the Evil Overlord #17. "When I employ people as advisors, I
will occasionally listen to their advice."
<http://www.eviloverlord.com/>