Warp3D: Compiler C/C++ [eng]

Some of you might wonder why anyone would disassemble executables.

Well, simply because our C/C++ compilers produce slow code.

Today I disassembled StarShipW3D, which Alain Thellier compiled with an old GCC version, 2.90.27...

Well, the generated code and the code in Warp3D are as alike as two peas in a pod. I am almost certain that all of version 4.2 was compiled with this version of GCC at the time...

Now look at this instructive example from W3D_Permedia2.library: the Per2_SetState function. We have to understand that our compilers are "mechanical", meaning that they convert C/C++ code without thinking!

Per2_SetState weighs 134 bytes:

Looking closely at the routine and understanding what it does, it is clear that we are dealing with bit tests. With a little thought, it is possible to group all these tests into a single one!

Each time one of these bits is set in d0, the routine executes a "moveq #0,d0". For a better understanding, simply convert all these comparisons to binary:

$00002000 = %0000000000000000 0010000000000000 (W3D_BLENDING)
$00000400 = %0000000000000000 0000010000000000 (W3D_GOURAUD)
$00000100 = %0000000000000000 0000000100000000 (W3D_TEXMAPPING)
$00000010 = %0000000000000000 0000000000010000 (W3D_GLOBALTEXENV)
$00000200 = %0000000000000000 0000001000000000 (W3D_PERSPECTIVE)
$00000800 = %0000000000000000 0000100000000000 (W3D_ZBUFFER)
$00001000 = %0000000000000000 0001000000000000 (W3D_ZBUFFERUPDATE)
$02000000 = %0000001000000000 0000000000000000 (W3D_SCISSOR)
$00080000 = %0000000000001000 0000000000000000 (W3D_DITHERING)
$00004000 = %0000000000000000 0100000000000000 (W3D_FOGGING)
$00400000 = %0000000001000000 0000000000000000 (W3D_ALPHATEST)
$04000000 = %0000010000000000 0000000000000000 (W3D_CHROMATEST)
$08000000 = %0000100000000000 0000000000000000 (W3D_CULLFACE)

Then you just need to gather all the bits to be tested into a single value, which gives:

%0000111001001000 0111111100010000 = $0E487F10
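For anyone who would rather check the arithmetic in C than by hand, here is a small sketch. The flag values are copied from the table above; the names PER2_TESTED_BITS and per2_tested_bits are made up for this illustration and do not come from the Warp3D sources.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the table above. */
#define W3D_GLOBALTEXENV  UINT32_C(0x00000010)
#define W3D_TEXMAPPING    UINT32_C(0x00000100)
#define W3D_PERSPECTIVE   UINT32_C(0x00000200)
#define W3D_GOURAUD       UINT32_C(0x00000400)
#define W3D_ZBUFFER       UINT32_C(0x00000800)
#define W3D_ZBUFFERUPDATE UINT32_C(0x00001000)
#define W3D_BLENDING      UINT32_C(0x00002000)
#define W3D_FOGGING       UINT32_C(0x00004000)
#define W3D_DITHERING     UINT32_C(0x00080000)
#define W3D_ALPHATEST     UINT32_C(0x00400000)
#define W3D_SCISSOR       UINT32_C(0x02000000)
#define W3D_CHROMATEST    UINT32_C(0x04000000)
#define W3D_CULLFACE      UINT32_C(0x08000000)

/* Hypothetical name: all thirteen tested bits ORed into one value. */
#define PER2_TESTED_BITS \
    (W3D_GLOBALTEXENV | W3D_TEXMAPPING | W3D_PERSPECTIVE | W3D_GOURAUD | \
     W3D_ZBUFFER | W3D_ZBUFFERUPDATE | W3D_BLENDING | W3D_FOGGING | \
     W3D_DITHERING | W3D_ALPHATEST | W3D_SCISSOR | W3D_CHROMATEST | \
     W3D_CULLFACE)

/* Compile-time check against the value given in the text. */
static_assert(PER2_TESTED_BITS == UINT32_C(0x0E487F10),
              "combined mask matches the value in the text");

/* Hypothetical helper so the mask can also be inspected at run time. */
uint32_t per2_tested_bits(void)
{
    return PER2_TESTED_BITS;
}
```

If one of the constants were mistyped, the static_assert would stop the build, which is a cheap way to keep such a hand-built mask honest.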

To get rid of the last "moveq #0,d0", we must invert this value with a not.l:

not.l $0E487F10 = $F1B780EF
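The same inversion can be sketched in C. Be careful: the function below is only my guess at what a single-test version could look like (AND the state with the inverted mask, so the result is already 0 when only listed bits are set). It is an assumption for illustration, not the listing of the optimised routine, and the names PER2_UNTESTED_BITS and per2_check_state are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combined mask of the thirteen tested bits, as computed in the text. */
#define PER2_TESTED_BITS   UINT32_C(0x0E487F10)

/* The not.l result: every bit that none of the comparisons looks at. */
#define PER2_UNTESTED_BITS ((uint32_t)~PER2_TESTED_BITS)

/* Compile-time check against the value given in the text. */
static_assert(PER2_UNTESTED_BITS == UINT32_C(0xF1B780EF),
              "inverted mask matches the value in the text");

/*
 * Hypothetical single test (an assumption, not the original routine):
 * returns 0 when `state` contains only bits from the tested set,
 * and a nonzero value when any other bit is present.
 */
uint32_t per2_check_state(uint32_t state)
{
    return state & PER2_UNTESTED_BITS;
}
```

Note that a single AND is not strictly the same as a chain of equality comparisons: it also returns 0 for any combination of the listed bits and for a state of 0. Whether that matters depends on how Per2_SetState is actually called.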

Here it is, a nicely optimised routine that takes fewer than 12 bytes instead of the original 134:

Of course, this is a rather special case, but it nevertheless gives a good idea of how much a human can improve on what C/C++ compiler robots produce...

An Amiga coder told me that the 68k Macintosh CodeWarrior compiler produced good-quality code, which I have not been able to verify... Maybe it should be ported to our Amigas?

So, will Warp3D end up being faster overall?

We have to believe so; there is a lot of work ahead in any case!

(translated by Squaley)

Friday, 25 April 2014
