
#21
Re: A newer GCC compiler.
Posted on: 2016/4/2 22:24
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
Quote:

blitter wrote:

Do you mean grouping "ld" instructions together to speed up the data fetch pipeline? "in" doesn't follow those rules?


Yes.

And no, it doesn't follow the same rules according to the instruction cycle timings in the V810 Architecture manual.

#22
Re: A newer GCC compiler.
Posted on: 2016/4/8 18:57
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
The Good:

The new stack-frame layout is implemented (slightly changed from my original proposal), and R2 is now the permanent-frame-pointer instead of the compiler just using R29 whenever a frame-pointer is needed.


*****************************

GCC 1999-ABI V850 STACK FRAME

CALLER
          incoming-arg0
ap->      16-bytes-reserved

CALLEE
          saved-lp
          saved-??
fp->      saved-fp
          local-variables
          outgoing-arg?
          outgoing-arg0
sp->      16-bytes-reserved

*****************************

GCC 2016-ABI V810 STACK FRAME

CALLER
fp-> ap-> incoming-arg0

CALLEE
          saved-fp
          saved-lp
          saved-??
          local-variables
          outgoing-arg?
          outgoing-arg0
sp->      4-bytes-reserved

*****************************


"-mprolog-function" is working, but I've stopped it from being automatically-enabled whenever any optimization is requested.

The new stack frame layout reduces the code-size of the prolog functions so that there's a good chance that they'll stay in the V810's instruction cache more often. Note: the new prolog functions always save the FP and the LP when they're used.

A stack backtrace is now possible when either "-fno-omit-frame-pointer" or "-mprolog-function" is used.
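
To illustrate why a permanent frame pointer makes a backtrace possible, here is a minimal host-side sketch that walks a chain of frames in simulated stack memory. It assumes, going by the 2016-ABI diagram above, that saved-fp and saved-lp are 4-byte slots sitting directly below the address that fp points at (fp-4 and fp-8). The names stack_mem, read_word, write_word and backtrace, and the convention that a saved fp of 0 ends the chain, are hypothetical and not part of the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated stack memory; addresses are byte offsets into this array. */
#define STACK_WORDS 64
static uint32_t stack_mem[STACK_WORDS];

static uint32_t read_word(uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return stack_mem[addr / 4];
}

static void write_word(uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    stack_mem[addr / 4] = value;
}

/*
 * ASSUMPTION: saved-fp lives at fp-4 and saved-lp at fp-8, as read off the
 * 2016-ABI diagram. A saved fp of 0 marks the outermost frame (a
 * hypothetical convention chosen for this sketch).
 *
 * Collects up to max_depth return addresses, innermost first, and returns
 * how many were found.
 */
static size_t backtrace(uint32_t fp, uint32_t *return_addrs, size_t max_depth)
{
    size_t depth = 0;

    while (fp != 0 && depth < max_depth) {
        return_addrs[depth++] = read_word(fp - 8); /* caller's return address */
        fp = read_word(fp - 4);                    /* step to the caller's frame */
    }
    return depth;
}
```

The walk only works if every function in the chain keeps fp set up, which is what "-fno-omit-frame-pointer" or "-mprolog-function" guarantees according to the post.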

Any C "leaf" functions (i.e. functions that don't call other functions) will omit the prolog function if they don't destroy any callee-saved register, and so small-fast-utility-code will still run as-fast-as-possible.

The NEC-standard register conventions are still the same, except for R2 now being the FP.

Any assembly language code that reads arguments off the stack will need to subtract 16 from the offsets that it uses.
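
As a small worked example of that adjustment, the sketch below converts stack-argument offsets written for the old layout into the ones the new layout expects. The only fact taken from the post is "subtract 16"; the function names and the sample offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the reserved area that the old (1999-ABI) layout placed
   between the entry stack pointer and the first stack argument. */
enum { OLD_ABI_RESERVED_BYTES = 16 };

/* Convert one stack-argument offset from the old ABI to the new ABI. */
static int new_abi_arg_offset(int old_offset)
{
    /* An old-ABI argument can never sit inside the reserved area. */
    assert(old_offset >= OLD_ABI_RESERVED_BYTES);
    return old_offset - OLD_ABI_RESERVED_BYTES;
}

/* Convert a whole table of offsets in place, e.g. one collected while
   auditing a hand-written assembly routine. */
static void convert_arg_offsets(int *offsets, size_t count)
{
    assert(offsets != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        offsets[i] = new_abi_arg_offset(offsets[i]);
}
```

For instance, a routine that used to read an argument at 16[sp] and another at 20[sp] on entry would now read them at 0[sp] and 4[sp].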


The Bad:

Any C "interrupt-handler" functions are probably broken at the moment, until I get around to fixing them.

Does anyone actually write interrupt-handlers in C???

The compiler generates some pretty slow register-saving code for them, so I sort-of assume that folks just write them in assembly. Am I wrong?


The Future (long term):

I'd like to add a few compiler intrinsics for some of the V810 opcodes, particularly the string opcodes and the in/out opcodes. That would allow the compiler to easily in-line some stuff that people have to drop into assembly to do.

It might also be worth considering a change to the standard register usage so that R26-R29 are no longer callee-saved registers, which would spare the compiler from having to save them on the stack whenever someone wants to use a string opcode. But doing so would break all current assembly-language code, and I suspect that people wouldn't want that. Yes, the change in stack-offset in the new ABI also breaks things ... but that's an easy thing to find and fix. Changing ALL the registers would be a much more complicated thing to fix.

#23
Re: A newer GCC compiler.
Posted on: 2016/4/8 22:03
Nintendoid!
Joined 2007/8/8
Great Britain
201 Posts
Coder, Contributor, HOTY09 Entry, Long Time User (11 Years), App Coder
Hi Elmer

Welcome, and great work.

Quote:
Does anyone actually write interrupt-handlers in C???


Yes, using the interrupt_handler function attribute.

Quote:
The compiler generates some pretty slow register-saving code for them, so I sort-of assume that folks just write them in assembly. Am I wrong?


From what I remember, when a function is declared with the interrupt_handler attribute the compiler generates save_interrupt and restore_interrupt prolog/epilogs, which save/restore only four or five registers. What have you observed?

dasi

#24
Re: A newer GCC compiler.
Posted on: 2016/4/8 23:26
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
Quote:

dasi wrote:

Welcome, and great work.


Thanks!


Quote:
From what I remember, when a function is declared with the interrupt_handler attribute the compiler generates save_interrupt and restore_interrupt prolog/epilogs, which save/restore only four or five registers. What have you observed?


Yep, there are those calls, and if your function actually calls anything else that isn't inlined, then the compiler has to save the LP ... and that triggers the generation of calls to save_all_interrupt/restore_all_interrupt which save all the other registers.
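
One way to stay on the cheap path described above is to keep the handler itself free of calls and defer the real work to the main loop. The following is a minimal sketch of that pattern, not something taken from the compiler or from any Virtual Boy library. BUILD_FOR_V810 is a hypothetical macro you would define yourself when building with the V810 port, and vblank_irq, update_game_state and service_pending_frames are made-up names.

```c
/*
 * ASSUMPTION: BUILD_FOR_V810 is a hypothetical, user-defined macro.
 * On a host build the attribute is dropped so the sketch still compiles.
 */
#ifdef BUILD_FOR_V810
#define IRQ_HANDLER __attribute__ ((interrupt_handler))
#else
#define IRQ_HANDLER
#endif

static volatile unsigned frames_seen = 0; /* written only by the handler   */
static unsigned frames_handled = 0;       /* written only by the main loop */
static unsigned game_ticks = 0;

/* Handler that calls nothing: it only bumps one counter, so the
   compiler has no reason to save LP or the full register set. */
IRQ_HANDLER void vblank_irq (void)
{
    frames_seen++;
}

/* Hypothetical per-frame work that is free to call other functions. */
static void update_game_state (void)
{
    game_ticks++;
}

/* Called from the main loop; returns how many frames were caught up. */
unsigned service_pending_frames (void)
{
    unsigned target = frames_seen;           /* read the volatile once */
    unsigned done = target - frames_handled; /* unsigned wrap is fine  */

    while (frames_handled != target) {
        frames_handled++;
        update_game_state();
    }
    return done;
}
```

Each counter has a single writer, so the main loop never performs a read-modify-write on a variable that the handler also changes.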

Now, please remember that I have no idea about VirtualBoy programming, and that I'm more used to consoles that produce traditional TV-output ... but in that world, you've got the hblank interrupt ... which needs to be blindingly fast, and you've got the vblank interrupt ... which usually does a lot of stuff and calls a lot of different things.

I can see that having the save_all_interrupt and restore_all_interrupt functions doesn't really hurt when the compiler needs to use them, because you're going to take a pretty big hit anyway with all of those registers.

But I don't really understand the use of the basic save_interrupt and restore_interrupt functions ... they just seem to slow down the interrupt handling, and don't save very much code space (just how many different interrupt-handler functions are used in a single program that cause you to be worried about a few bytes???).

Anyway ... whatever ... I guess that I should fix the compiler's handling of the prolog/epilog expansion for the interrupt_handler functions.

#25
Re: A newer GCC compiler.
Posted on: 2016/4/8 23:47
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
FYI, I use a custom-modified version of Mednafen for debugging ... basically it just uses a larger font in the debugger so that it's more readable for folks with tired eyes.

It only supports a few platforms (PC Engine, PC-FX, and now VirtualBoy) ... but if anyone is interested, I can add a link to it.

Here is the main VirtualBoy debugger screen ...

[Image: main VirtualBoy debugger screen]



Here is the VirtualBoy memory editor screen ...

[Image: VirtualBoy memory editor screen]

#26
Re: A newer GCC compiler.
Posted on: 2016/4/10 20:09
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
Quote:

dasi wrote:

Yes, using the interrupt_handler function attribute.


I fixed the "interrupt_handler" attribute so that it's working again, although I'm not using the helper-functions anymore, because I really can't see the point.

I could make the code a tiny bit smarter ... but IMHO it's already a little bit better than GCC's V850 code, so any further work on it can wait.


************************************

volatile int __attribute__ ((zda)) zda_frame_count = 0;

__attribute__ ((interrupt_handler)) void my_irq1 (void)
{
  for (int i = 0; i < 100; i++)
    zda_frame_count++;
}

_my_irq1: add -4,sp
          st.w r1,0[sp]
          add -8,sp
          st.w r10,0[sp]
          movea 100,r0,r10
          st.w r11,4[sp]
.L7:      ld.w zdaoff(_zda_frame_count)[r0],r11
          add -1,r10
          add 1,r11
          st.w r11,zdaoff(_zda_frame_count)[r0]
          cmp 0,r10
          bne .L7
          ld.w 0[sp],r10
          ld.w 4[sp],r11
          add 8,sp
          ld.w 0[sp],r1
          add 4,sp
          reti

************************************

volatile int sda_frame_count = 0;

__attribute__ ((noinline)) void increment_sda_frame_count (void)
{
  sda_frame_count++;
}

__attribute__ ((interrupt_handler)) void my_irq2 (void)
{
  for (int i = 0; i < 100; i++)
    increment_sda_frame_count();
}

_increment_sda_frame_count:
          ld.w sdaoff(_sda_frame_count)[gp],r10
          add 1,r10
          st.w r10,sdaoff(_sda_frame_count)[gp]
          jmp [r31]

_my_irq2: add -4,sp
          st.w r1,0[sp]
          mov sp,r1
          addi -72,sp,sp
          st.w r29,-12[r1]
          st.w fp,-4[r1]
          movea 100,r0,r29
          mov r1,fp
          st.w r6,-72[r1]
          st.w r7,-68[r1]
          st.w r8,-64[r1]
          st.w r9,-60[r1]
          st.w r10,-56[r1]
          st.w r11,-52[r1]
          st.w r12,-48[r1]
          st.w r13,-44[r1]
          st.w r14,-40[r1]
          st.w r15,-36[r1]
          st.w r16,-32[r1]
          st.w r17,-28[r1]
          st.w r18,-24[r1]
          st.w r19,-20[r1]
          st.w r30,-16[r1]
          st.w lp,-8[r1]
.L3:      add -1,r29
          jal _increment_sda_frame_count
          cmp 0,r29
          bne .L3
          ld.w -4[fp],r1
          ld.w -72[fp],r6
          ld.w -68[fp],r7
          ld.w -64[fp],r8
          ld.w -60[fp],r9
          ld.w -56[fp],r10
          ld.w -52[fp],r11
          ld.w -48[fp],r12
          ld.w -44[fp],r13
          ld.w -40[fp],r14
          ld.w -36[fp],r15
          ld.w -32[fp],r16
          ld.w -28[fp],r17
          ld.w -24[fp],r18
          ld.w -20[fp],r19
          ld.w -16[fp],r30
          ld.w -12[fp],r29
          ld.w -8[fp],lp
          mov fp,sp
          mov r1,fp
          ld.w 0[sp],r1
          add 4,sp
          reti

************************************

#27
Re: A newer GCC compiler.
Posted on: 2016/4/17 11:29
Virtual Freak
Joined 2014/8/31
USA
82 Posts
Long Time User (4 Years) App Coder
I'm going to be perfectly honest: I've lost track of the number of GCC versions for VB there are lol. Perhaps we should document them somewhere in a sticky thread?

I can think of 4 offhand:
* 2.95 that's existed for ages
* blitter's 4.4
* Dasi's 4.7
* Elmer's 4.7

IIRC, the startup code is more or less the same between all but the last one (in fact, I believe blitter's even reuses the 2.95 file for this and relocations).

Last year, I started my own port, that didn't get far b/c of real life. I would be interested in trying an LLVM port tho at some point, even if it has already been done. V810 is one of the only CPUs where I could reasonably succeed in such a port.

#28
Re: A newer GCC compiler.
Posted on: 2016/4/17 18:05
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
Quote:

cr1901 wrote:

I'm going to be perfectly honest: I've lost track of the number of GCC versions for VB there are lol.


From my POV, it's all about the "dialect" of C that you want to program in.

GCC 2.95 is C89/ANSI-C.
GCC 4.7 is C99 with most of C11.
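
To make the "dialect" difference concrete, here is a small sketch of the same kind of code written in C89 style and in C99 style. The function and type names are hypothetical; the C99 version uses only standard C99 features (loop-scoped declarations, `//` comments, <stdint.h>, <stdbool.h>, designated initializers and compound literals), none of which are available in strict C89.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* C89 style: all declarations at the top of the block, no // comments. */
static long sum_c89(const int *values, int count)
{
    int i;
    long total;

    assert(values != NULL);
    total = 0;
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* C99 style: declare variables where they are first needed. */
static long sum_c99(const int *values, size_t count)
{
    assert(values != NULL);
    long total = 0;
    for (size_t i = 0; i < count; i++)  // loop-scoped counter
        total += values[i];
    return total;
}

/* C99 fixed-width and boolean types. */
struct sprite {
    int16_t x;
    int16_t y;
    bool    visible;
};

/* C99 compound literal with designated initializers. */
static struct sprite make_sprite(void)
{
    return (struct sprite){ .x = 10, .y = 20, .visible = true };
}
```

The first function should compile with either compiler; the other two need a C99-capable one.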

You can also see from the examples of the generated-code that I've shown, that GCC 4.7 is a little smarter than GCC 2.95 about moving some loop-invariant calculations outside the loop itself for speed.
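
For readers unfamiliar with the term, the sketch below shows what moving a loop-invariant calculation out of a loop means at the source level. It is a generic illustration with hypothetical function names, not a claim about exactly which transformations either GCC version performs on any particular code.

```c
#include <assert.h>
#include <stddef.h>

/* As written: scale * bias does not depend on i, but the source
   recomputes it on every iteration. */
static void add_offset_naive(int *data, size_t n, int scale, int bias)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; i++)
        data[i] += scale * bias;
}

/* Hand-hoisted: the invariant product is computed once before the
   loop, which is what a compiler doing this optimization produces
   for you automatically. */
static void add_offset_hoisted(int *data, size_t n, int scale, int bias)
{
    assert(data != NULL);
    const int offset = scale * bias;
    for (size_t i = 0; i < n; i++)
        data[i] += offset;
}
```

Both functions produce the same results; with an older compiler the hand-hoisted form may be the only way to get the faster loop.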

I don't know if the GCC 2.95 version has any problems, but it has been around for a long time, and it's a "classic" good version of GCC that was used for a lot of game development in the early 2000s (with various patches).

OTOH ... it's getting really, really hard to compile a working GCC 2.95 anymore because modern Linux toolchains barf on some of the early-GCC-specific code that's in there. You pretty much need to find an old GCC 3.x compiler from somewhere.

The GCC 4.4 port seems to have a few problems with it, and so (by his own admission) does Dasi's GCC 4.7 port.

The GCC 4.4 port is also producing some pretty inefficient code for some reason.

My GCC 4.7 port hasn't received enough widespread testing yet to see what bugs I've introduced ... but Alex Marshall's "liberis" examples all work properly, and another user has ported a simple shoot-em-up to the PC-FX with no apparent problems.


Quote:
IIRC, the startup code is more or less the same between all but the last one (in fact, I believe blitter's even reuses the 2.95 file for this and relocations).


As I mentioned before, my interest is in the PC-FX, and not the VirtualBoy, and so the linker scripts and the startup code have been tailored to that platform.

It shouldn't be hard to create a "VirtualBoy" patch that changes them into something that works better for the VB.

BTW ... I did add the VirtualBoy's custom Nintendo instructions to binutils.

If someone is interested in being the "go-to guy" for a VirtualBoy version of my patches, then I'd love to hear it.

AFAIK they're stable and working, and I'm not planning on doing anything more to them for a while because I've got other stuff that needs to be done.
Edited by ElmerPCFX on 2016/4/17 18:27

#29
Re: A newer GCC compiler.
Posted on: 2016/4/17 19:26
Nintendoid!
Joined 2007/8/8
Great Britain
201 Posts
Coder, Contributor, HOTY09 Entry, Long Time User (11 Years), App Coder
Quote:
My GCC 4.7 port hasn't received enough widespread testing yet to see what bugs I've introduced ... but Alex Marshall's "liberis" examples all work properly, and another user has ported a simple shoot-em-up to the PC-FX with no apparent problems.

. . .

If someone is interested in being the "go-to guy" for a VirtualBoy version of my patches, then I'd love to hear it.


I'd be happy to help with that and put a build together for testing. There are a few reasonably large Virtual Boy projects around which should give your patches a good workout. :)

dasi
Edited by dasi on 2016/4/17 19:38

#30
Re: A newer GCC compiler.
Posted on: 2016/4/17 20:12
VB Gamer
Joined 2016/3/13
42 Posts
Long Time User (3 Years)
I'll send you a PM.