Friday, 25 April 2014

Compiler C/C++ [eng]

Some of you might wonder why I disassemble executables.

Well, simply because our C/C++ compilers produce slow code.

Today I disassembled StarShipW3D, which Alain Thellier compiled with an old version of GCC, 2.90.27...

Well, the generated code and Warp3D's are as alike as two peas in a pod. I am almost certain that the whole of version 4.2 was compiled back then with this same version of GCC...

Now look at this edifying example from W3D_Permedia2.library, in the Per2_SetState function. We must understand that our compilers are "mechanical": they convert C/C++ code without thinking!

Per2_SetState weighs 134 bytes:

Looking closely and understanding the routine, it's clear that we are dealing with bit tests. With a little thought, all these tests can be grouped into a single one!

Every time one of these bits is set in d0, the routine does a "moveq #0,d0". To see this more clearly, simply convert all the compared values to binary:
  • $00002000 = %0000000000000000 0010000000000000 (W3D_BLENDING)
  • $00000400 = %0000000000000000 0000010000000000 (W3D_GOURAUD)
  • $00000100 = %0000000000000000 0000000100000000 (W3D_TEXMAPPING)
  • $00000010 = %0000000000000000 0000000000010000 (W3D_GLOBALTEXENV)
  • $00000200 = %0000000000000000 0000001000000000 (W3D_PERSPECTIVE)
  • $00000800 = %0000000000000000 0000100000000000 (W3D_ZBUFFER)
  • $00001000 = %0000000000000000 0001000000000000 (W3D_ZBUFFERUPDATE)
  • $02000000 = %0000001000000000 0000000000000000 (W3D_SCISSOR)
  • $00080000 = %0000000000001000 0000000000000000 (W3D_DITHERING)
  • $00004000 = %0000000000000000 0100000000000000 (W3D_FOGGING)
  • $00400000 = %0000000001000000 0000000000000000 (W3D_ALPHATEST)
  • $04000000 = %0000010000000000 0000000000000000 (W3D_CHROMATEST)
  • $08000000 = %0000100000000000 0000000000000000 (W3D_CULLFACE)

Then you just need to gather all the bits to be tested into a single value, which gives:
  • %0000111001001000 0111111100010000 = $0E487F10

To get rid of the last "moveq #0,d0" as well, we must invert this value with a not.l:
  • not.l $0E487F10 = $F1B780EF
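The arithmetic above can be checked with a few lines of C. This is only a sketch: the flag values are the ones listed in this post, while the array and function names are made up for the illustration and do not come from the Warp3D sources.

```c
#include <stddef.h>
#include <stdint.h>

/* The thirteen values tested by Per2_SetState, as listed in the post. */
static const uint32_t supported_flags[] = {
    0x00002000u, /* W3D_BLENDING      */
    0x00000400u, /* W3D_GOURAUD       */
    0x00000100u, /* W3D_TEXMAPPING    */
    0x00000010u, /* W3D_GLOBALTEXENV  */
    0x00000200u, /* W3D_PERSPECTIVE   */
    0x00000800u, /* W3D_ZBUFFER       */
    0x00001000u, /* W3D_ZBUFFERUPDATE */
    0x02000000u, /* W3D_SCISSOR       */
    0x00080000u, /* W3D_DITHERING     */
    0x00004000u, /* W3D_FOGGING       */
    0x00400000u, /* W3D_ALPHATEST     */
    0x04000000u, /* W3D_CHROMATEST    */
    0x08000000u  /* W3D_CULLFACE      */
};

/* OR every tested bit into one 32-bit value (hypothetical helper name). */
static uint32_t supported_mask(void)
{
    uint32_t mask = 0;
    size_t i;
    for (i = 0; i < sizeof supported_flags / sizeof supported_flags[0]; i++)
        mask |= supported_flags[i];
    return mask;
}

/* 32-bit complement, which is what not.l does on a long word. */
static uint32_t inverted_mask(void)
{
    return (uint32_t)~supported_mask();
}
```

Here supported_mask() gives $0E487F10 and inverted_mask() gives $F1B780EF, the two values computed by hand above.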

Here is a nicely optimised routine that takes less than 12 bytes instead of the original 134:

Of course, this is a rather special case, but it nevertheless gives a good idea of what a human can do to improve on what C/C++ compiler robots produce...

An Amiga coder told me that the CodeWarrior compiler for 68k Macintosh produced good-quality code, which I have not been able to verify... Maybe it should be ported to our Amigas?

So, will Warp3D end up being faster overall?

We have to believe it; in any case, there is lots and lots of work ahead!!
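For readers who prefer C to 68k assembly, the idea can be sketched as follows. This is a reconstruction under assumptions, not the library's code: the function names are invented, the "one test per flag" version only models the shape described above, and the non-zero value it returns for an unknown state is a placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* not.l of $0E487F10, as computed in the post. */
#define PER2_UNSUPPORTED_BITS 0xF1B780EFu

/* Hypothetical model of the compiled routine: one test per flag,
   each one ending in "return 0" (the moveq #0,d0 of the listing). */
static uint32_t set_state_chain(uint32_t state)
{
    static const uint32_t flags[] = {
        0x00002000u, 0x00000400u, 0x00000100u, 0x00000010u, 0x00000200u,
        0x00000800u, 0x00001000u, 0x02000000u, 0x00080000u, 0x00004000u,
        0x00400000u, 0x04000000u, 0x08000000u
    };
    size_t i;
    for (i = 0; i < sizeof flags / sizeof flags[0]; i++)
        if (state == flags[i])
            return 0;
    return 1; /* placeholder for "state not supported" */
}

/* Single-test version: the result is 0 when no unsupported bit is set,
   and non-zero otherwise, so no separate "return 0" is needed. */
static uint32_t set_state_masked(uint32_t state)
{
    return state & PER2_UNSUPPORTED_BITS;
}
```

Both functions return 0 for each of the thirteen flags taken alone and a non-zero value for a single unsupported bit such as $00000001. For arguments that combine several bits, or for 0, they can differ, and which behaviour matches the original depends on how its tests are actually written.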

(translated by Squaley)
   
