(out-of-band) performance with gcc3

Marc-Antoine Parent maparent at acm.org
Wed Jan 21 16:11:37 CET 2004


> Also, what do you mean by "variance" here - variance between your
> consecutive runs, or the difference between -fguess-branch-probability
> and -fno-guess-branch-probability?

Between runs :-(

>> I guess things changed because iirc that was not the case when I tried
>> it last.  I also guess that you are talking only about emulate.cc.  I
> yes.

Oh, so it was a mistake to apply it to the whole code... That might 
change results!
While we're at it, would this flag remove the need for 
-finline-limit=500 ?

OK, I tried. (with the caveat above.)

New values: All of mozart without -fno-guess-branch-probability

Recall: emulator without -fno-guess-branch-probability, but with -finline-limit=500
Bridge: 3720
Compiler: 2920
Diff: 3310
FD: 5490
Knights: 2700
Nrev: 5170
Port: 4820
Rec: 5240
Tak: 3710
Tak Thread: 2720

New run of the same (to give an idea of the variance)
Bridge: 4050
Compiler: 3190
Diff: 3470
FD: 5880
Knights: 2910
Nrev: 5280
Port: 5230
Rec: 5610
Tak: 3790
Tak Thread: 2430


emulator w/o either flag
Bridge: 3800
Compiler: 2980
Diff: 3470
FD: 5570
Knights: 2770
Nrev: 5210
Port: 5030
Rec: 5360
Tak: 3790
Tak Thread: 2290

emulator w/o  -finline-limit=500 but with -fno-guess-branch-probability
Bridge: 4010
Compiler: 3050
Diff: 3370
FD: 5780
Knights: 3060
Nrev: 5160
Port: 5130
Rec: 5370
Tak: 4000
Tak Thread: 2430

emulator with both  -finline-limit=500 and -fno-guess-branch-probability
Bridge: 3760
Compiler: 2860
Diff: 3350
FD: 5590
Knights: 2780
Nrev: 5180
Port: 4830
Rec: 5280
Tak: 3730
Tak Thread: 2220

I would say the differences fall within the run-to-run variation I have 
seen. I admit I do not otherwise idle my computer when running those 
benchmarks, which allows for spikes;
but I do run them with nice -10 to give them priority.
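
To make "within variations" a bit more concrete, here is a minimal Python 
sketch that compares the gap between the two runs of the same configuration 
with the gap between configurations. The numbers are copied from the 
listings in this message; the dictionary keys and function names are my own 
labels, not anything from Mozart or the benchmark, and two runs give only a 
crude estimate of the noise.

```python
# Timings copied from the listings in this message (units as reported by
# the benchmark). Keys are illustrative labels, not Mozart names.
BENCHMARKS = ["Bridge", "Compiler", "Diff", "FD", "Knights",
              "Nrev", "Port", "Rec", "Tak", "Tak Thread"]

RUNS = {
    # -finline-limit=500 only, first and second run of the same build
    "inline_only_run1": [3720, 2920, 3310, 5490, 2700, 5170, 4820, 5240, 3710, 2720],
    "inline_only_run2": [4050, 3190, 3470, 5880, 2910, 5280, 5230, 5610, 3790, 2430],
    # neither flag
    "neither":          [3800, 2980, 3470, 5570, 2770, 5210, 5030, 5360, 3790, 2290],
    # -fno-guess-branch-probability only
    "no_guess_only":    [4010, 3050, 3370, 5780, 3060, 5160, 5130, 5370, 4000, 2430],
    # both flags
    "both":             [3760, 2860, 3350, 5590, 2780, 5180, 4830, 5280, 3730, 2220],
}


def rel_diff(a, b):
    """Relative difference of b against a, in percent."""
    return 100.0 * (b - a) / a


def spread(name_a, name_b):
    """Per-benchmark relative difference between two named result sets."""
    return {bench: rel_diff(a, b)
            for bench, a, b in zip(BENCHMARKS, RUNS[name_a], RUNS[name_b])}


# Noise estimate: same build, two runs. Effect: adding the second flag.
noise = spread("inline_only_run1", "inline_only_run2")
effect = spread("inline_only_run1", "both")

# Benchmarks where the flag's effect is larger than the run-to-run gap.
exceeds = [b for b in BENCHMARKS if abs(effect[b]) > abs(noise[b])]

for bench in BENCHMARKS:
    verdict = "exceeds noise" if bench in exceeds else "within noise"
    print(f"{bench:<11} noise {noise[bench]:+6.2f}%  "
          f"effect {effect[bench]:+6.2f}%  {verdict}")
```

On these figures only Tak Thread moves by more than the gap between the two 
identical runs, and that benchmark is also the one with the largest gap, so 
I would not read much into it without more repetitions.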

I notice that the effect of  -finline-limit=500 is quite limited. It 
was more evident when Denys and I did the testing, a while ago.
But this is a slightly newer version of the compiler, and a new version 
of the OS, which I know changes the task scheduler (hopefully 
nothing that my unnice cannot compensate for).

>> I was curious re what the effect would be on Darwin-ppc.
>> I am using CVS head, and I introduced that flag
>> (-fno-guess-branch-probability) in the Makefile.
>> FYI, I am using gcc version 3.3 20030304 (Apple Computer, Inc. build
>> 1495)
>> and the other flags are -O3 -fstrict-aliasing -fschedule-insns2
>> -fomit-frame-pointer
>> (adding  -finline-limit=500 on the emulator)
>>
>> The results (using Denys's benchmark) are... mixed, mostly negative
>> actually:
>> (But the variance on the results is scary, to be honest)
>
> Well, I'd say the keyword here is "gcc version 3.3 20030304 (Apple
> Computer, Inc. build 1495)", as opposed to "vanilla" gcc 3.3.2 - as it
> comes from gcc.gnu.org ;-[

On our platform, we pretty much have to use the Apple gcc to benefit 
from the way it handles shared libraries and all.
I have built and used the gnu gcc before, but it is not pretty to use 
it here, and I would rather this were not a requirement.
AFAIK, it has little impact on code generation besides linking.

> Could you tell us the flags that are in effect (can be seen in
> emulate.s)?

; GNU C++ version 3.3 20030304 (Apple Computer, Inc. build 1495) (ppc-darwin)
;       compiled by GNU C version 3.3 20030304 (Apple Computer, Inc. build 1495).
; GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=131072
; options passed:  -I/sw/include -I.
; -I/Users/maparent/Programmation/mozart/platform/emulator -D__GNUC__=3
; -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=0 -D__APPLE_CC__=1495
; -D__DYNAMIC__ -DHAVE_CONFIG_H -D__OBJC_EXCEPTIONS__
; -D__CONSTANT_CFSTRINGS__ -DMAC_OS_X_VERSION_MIN_REQUIRED=1030
; -D__GNUG__=3 -fobjc-exceptions -fconstant-cfstrings -fPIC -mtune=G4
; -auxbase -O3 -fno-exceptions -fno-implicit-templates -fstrict-aliasing
; -fschedule-insns2 -fomit-frame-pointer -finline-limit=500 -fverbose-asm
; -D__private_extern__=extern
; options enabled:  -fdefer-pop -fomit-frame-pointer
; -foptimize-sibling-calls -fcse-follow-jumps -fcse-skip-blocks
; -fexpensive-optimizations -fthread-jumps -fstrength-reduce -fpeephole
; -fforce-mem -ffunction-cse -fkeep-static-consts -fcaller-saves
; -freg-struct-return -fgcse -fgcse-lm -fgcse-sm -floop-optimize
; -fcrossjumping -fif-conversion -fif-conversion2 -frerun-cse-after-loop
; -frerun-loop-opt -fdelete-null-pointer-checks -fschedule-insns
; -fschedule-insns2 -fsched-spec -fbranch-count-reg -fPIC
; -freorder-functions -frename-registers -fcprop-registers -fcommon
; -fverbose-asm -fgnu-linker -fregmove -foptimize-register-move
; -fargument-alias -fstrict-aliasing -falign-loops-max-skip
; -falign-jumps-max-skip -fmerge-constants -fzero-initialized-in-bss
; -fident -fpeephole2 -fguess-branch-probability -fmath-errno
; -ftrapping-math -fbranch-predictions -fcoalesce -fweak-coalesced
; -fcoalesce-templates -fss-const-prop -mpowerpc -mpowerpc-gfxopt
; -mnew-mnemonics -msched-prolog -msched-epilog -mtune=G4

(This last flag I set for my private builds only, not for releases)
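
For anyone who wants to check such a header without reading it by eye, here 
is a small Python sketch that pulls the flag lists out of a -fverbose-asm 
header. The function name and the sample text are made up for illustration. 
It assumes every header line starts with the assembler comment character 
(";" in the listing above; other targets may use a different one) and that 
continuation lines start with a "-" flag, as they do here.

```python
def header_options(asm_text, section="options enabled:", comment=";"):
    """Collect the flags listed under one section of a verbose-asm header.

    Hypothetical helper: assumes each header line begins with `comment`
    and that continuation lines of a section begin with a '-' flag.
    """
    flags = []
    collecting = False
    for line in asm_text.splitlines():
        line = line.strip()
        if not line.startswith(comment):
            if collecting:
                break          # header is over, real assembly begins
            continue
        body = line[len(comment):].strip()
        if body.startswith(section):
            collecting = True
            body = body[len(section):]
        elif collecting and not body.startswith("-"):
            break              # a different section has started
        if collecting:
            flags.extend(body.split())
    return flags


# Shortened, made-up sample in the same shape as the header above.
SAMPLE = """\
; options passed:  -O3 -fstrict-aliasing -finline-limit=500
; -fverbose-asm
; options enabled:  -fdefer-pop -fomit-frame-pointer
; -fpeephole2 -fguess-branch-probability -fmath-errno
.text
"""

enabled = header_options(SAMPLE)
passed = header_options(SAMPLE, section="options passed:")

print("guess-branch-probability enabled:",
      "-fguess-branch-probability" in enabled)
print("inline limit passed:",
      [f for f in passed if f.startswith("-finline-limit")])
```

In the listing above, -fguess-branch-probability appears under "options 
enabled", so that particular emulate.s was built without 
-fno-guess-branch-probability.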

> It would be immensely interesting to know how 2.95.3 behaves
> on your platform. If you would take the trouble of obtaining that gcc..

It was possible to ask to run with gcc 2.95.2 in Mac OS X 10.2, but 
that support is no longer installed with the Apple development tools in 
10.3.
I did build with gcc2 at that time, and yes, Mozart was faster. Of 
course, ABI incompatibilities meant compiling everything in that mode, 
which is a nuisance.
Again, I wish we'd stick with the platform default compiler.

Marc-Antoine
-
Please send submissions to hackers at mozart-oz.org
and administrivia mail to hackers-request at mozart-oz.org.
The Mozart Oz web site is at http://www.mozart-oz.org/.
