
#1
A newer GCC compiler.
Posted on: 2016/3/24 23:20
VB Gamer
Joined 2016/3/13
34 Posts
Long Time User (2 Years)
Hi, I'm new to the site ... until yesterday I didn't even know that anyone had updated the GCC V810 patches past the original GCC 2.95 patches that were (AFAIK) done by a bunch of Japanese guys in 2000 (for the PC-FX, I believe).

Anyway ... I'm trying to "open-up" the PC-FX for development and have done my own update of the old 2.95 patches to binutils 2.23.2 and GCC 4.7.4, in order to get a "modern" C compiler with C99 capability, and with nearly-all of C11.

It occurs to me that you guys over here with a love for the VirtualBoy may be interested in the work that I've done, and that you might be a larger group to provide a test-bed, rather than the PC-FX community, where I'm pretty-much the only assembler-capable developer.


I've had a quick "chat" with KR155E, and with his help, I've found the following threads ...

"experimental gcc4 patches"
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=3883

"gccVB optimization options and assembly code"
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5055

"Compiling gccvb 4.4.2 under Cygwin"
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5328


From what-I-can-see, I don't think that my patches are experiencing any of the problems that have been reported in those threads, except for the "movhi optimization" issue ... which isn't really an "issue" as such, it's because the compiler doesn't know where the labels are going to resolve to, so it has to generate full 32-bit loads.

That may be something that the linker can resolve with "whole-program-optimization", but I've not been brave-enough to even try to compile the toolchain with that feature enabled.
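
To illustrate the point about full 32-bit loads: a 32-bit address is normally built from two 16-bit halves, one carried by movhi and one by movea. The host-side C sketch below shows how those halves could be computed. It assumes the usual V810 behaviour that movea sign-extends its immediate, and the names addr_halves, split_address, join_address, check_round_trip and fits_in_one_instruction are made up for this example; they are not part of GCC or binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Halves of a 32-bit address as a movhi/movea pair would carry them. */
typedef struct {
    uint16_t hi;  /* immediate for movhi: contributes hi << 16 */
    uint16_t lo;  /* immediate for movea: sign-extended before the add */
} addr_halves;

/* Split an absolute address. Because the low half is sign-extended,
   the high half is bumped by one whenever bit 15 of the low half is set. */
static addr_halves split_address(uint32_t addr)
{
    addr_halves h;
    h.lo = (uint16_t)(addr & 0xffffu);
    h.hi = (uint16_t)((addr + 0x8000u) >> 16);
    return h;
}

/* Recombine the halves the way the two instructions would at run time. */
static uint32_t join_address(addr_halves h)
{
    uint32_t high = (uint32_t)h.hi << 16;
    /* Portable sign-extension of a 16-bit value into 32 bits. */
    uint32_t low = ((uint32_t)h.lo ^ 0x8000u) - 0x8000u;
    return high + low;
}

/* Sanity check: splitting and joining must give the address back. */
static void check_round_trip(uint32_t addr)
{
    assert(join_address(split_address(addr)) == addr);
}

/* A single sign-extended 16-bit immediate can only reach the bottom
   32 KB and the top 32 KB of the address space. */
static int fits_in_one_instruction(uint32_t addr)
{
    return addr <= 0x00007fffu || addr >= 0xffff8000u;
}
```

For example, 0x05008000 splits into hi = 0x0501 and lo = 0x8000. Until the label's final address is known, there is no way to tell whether something like fits_in_one_instruction would hold for it, so the two-instruction form is the safe default.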


The new patches are built with mingw64/msys2, and not cygwin, so they're Windows-native programs.

In trying to clean-up the code so that I could understand (and debug) what was going on, I removed a bunch of pointless options that don't make any sense to the PC-FX (or VirtualBoy), such as the long-call, long-jump, GHS, and app-regs. Hopefully nobody here cares about those.

I have a version that uses the old GCC 2.95 ABI (with the 16-bytes of stack reserved for r6-r9), and I just completed the transition to the new GCC ABI from 2010 that removes that redundant stack space.

My next task is to change the ABI even more so that I can get useful stack-frames and actually implement a working backtrace function for debugging.

So ... I have a couple of technical questions for the assembly-capable developers here.

I've not seen the VirtualBoy SDK (and don't particularly want to wade through it) ... but are the V810's registers R2 and R5 actually used in whatever VirtualBoy libraries you guys use?

Does the VirtualBoy have single-cycle RAM, or does it have wait-states that slow down RAM access?

Are you using any Nintendo binary-only libraries, or can you re-assemble/re-compile whatever libraries/engines that you're using?

#2
Re: A newer GCC compiler.
Posted on: 2016/3/25 0:36
PVB Elite
Joined 2003/7/26
USA
1446 Posts
PVBCC Entry, Coder, Contributor, Special Achievement, Top10 Poster, HOTY09 Entry, Long Time User (15 Years), App Coder, 20+ Game Ratings, PVBCC 2013 Entry
Quote:

ElmerPCFX wrote:
So ... I have a couple of technical questions for the assembly-capable developers here.


I'm only barely "assembly-capable" on the v810, but I'll take a shot at answering these.

Quote:
I've not seen the VirtualBoy SDK (and don't particularly want to wade through it) ... but are the V810's registers R2 and R5 actually used in whatever VirtualBoy libraries you guys use?


There really is no "the VirtualBoy SDK" due to a lot of fragmenting, but, TMK, most of the existing code out there avoids direct access to registers except in the necessary setting of hardware ports for control of the peripheral hardware. If you want to make use of these registers for a specific purpose (especially if it means improving memory usage of generated code), I'm sure existing projects could be made compatible quite easily.

Quote:
Does the VirtualBoy have single-cycle RAM, or does it have wait-states that slow down RAM access?


The cartridge ROM has either 1 or 2 (the default) wait-states, selectable in software. The RAM used by the video hardware (the "VIP") has 2-5 waits, depending on what part of the display rendering cycle it's currently in. All other areas have a fixed wait-state of 1.

Quote:
Are you using any Nintendo binary-only libraries, or can you re-assemble/re-compile whatever libraries/engines that you're using?


None of the existing, publicly-available, homebrew VB software uses any Nintendo code, binary or otherwise. I can't speak for what anyone has on their personal PCs, though.

#3
Re: A newer GCC compiler.
Posted on: 2016/3/25 4:24
VB Gamer
Joined 2016/3/13
34 Posts
Long Time User (2 Years)
Quote:
RunnerPack wrote:
I'm only barely "assembly-capable" on the v810, but I'll take a shot at answering these.

Thanks!



Quote:
There really is no "the VirtualBoy SDK" due to a lot of fragmenting, but, TMK, most of the existing code out there avoids direct access to registers except in the necessary setting of hardware ports for control of the peripheral hardware. If you want to make use of these registers for a specific purpose (especially if it means improving memory usage of generated code), I'm sure existing projects could be made compatible quite easily.

Ah ... I'm going by the Nintendo Seminar docs, and the PC-FX SDK docs, and the GCC docs ... all of which follow NEC's V810 Architecture Manual, where R2 is reserved as the "Handler Stack Pointer", and R5 is reserved as the "Text Pointer" (which means the address of the start of the program code).

Now, the PC-FX BIOS and the official SDK libraries (which I'm going to ignore), never actually use either of these registers, they're just wasted.

Newer versions of GCC (well after 2.95, I think) added an option "-mapp-regs" that lets the compiler use these 2 registers for the code that it generates.

I'd be quite surprised if anyone here is relying on that option.

I have my own ideas of how I'd like to use those registers.

I'd like to move the Frame Pointer to R2 (right next to the Stack Pointer in R3), and I'd like to use R5 to replace the V850's EP register ... and basically gain another 32KB of fast-access variable space, particularly for use as thread-local variables.

This isn't going to cause any problems on the PC-FX ... but I'm curious if it will cause any problems on the VirtualBoy.

If you're programming bare-metal with no BIOS or Nintendo libraries ... then it shouldn't really cause you guys any trouble, either.

As for "memory usage" ... how "cramped" are you guys? Are you using the "optimize-for-space" option and/or the "prolog-function" option?



Quote:
The cartridge ROM has either 1 or 2 (the default) wait-states, selectable in software. The RAM used by the video hardware (the "VIP") has 2-5 waits, depending on what part of the display rendering cycle it's currently in. All other areas have a fixed wait-state of 1.

OK, thanks! I guess that Nintendo went a little cheap on the memory (again).

The PC-FX runs everything from RAM, so I'm more worried about pipeline-stalls than I am about memory access times.

I guess that you guys have different issues, and that the VirtualBoy's memory timing dwarfs the occasional modify-then-read pipeline-stall.

That means that I should definitely keep the frame-pointer "optional" rather than "required" (which is a pity, because it's so darned useful when implemented properly).



Quote:
None of the existing, publicly-available, homebrew VB software uses any Nintendo code, binary or otherwise. I can't speak for what anyone has on their personal PCs, though.

Excellent, you've got a completely clean-and-legal toolkit, and that means that you've got the source-code to make any changes if you use the new 2010 ABI, or whatever I come up with (if it's an improvement).

#4
Re: A newer GCC compiler.
Posted on: 2016/3/25 7:30
Nintendoid!
Joined 2007/12/14
166 Posts
Coder, Long Time User (10 Years), App Coder
Hi ElmerPCFX, welcome to PVB! Glad to meet another fellow programmer with an interest in improving our little homebrew toolchain. :)

Please check out my thread here http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5252 about GCC 4 and generating PC-relative jumps. I threw together a tool that hacks around the problem by poking at the output ELF file, but if you have a solid grasp of what GCC actually does behind the scenes to produce this unwanted behavior, then hopefully you have a better idea of what a proper solution might be. This bug (among others) is what has put the brakes on my VB development, since I'd much rather be fighting my *own* code than GCC's code.

I've since switched to writing purely in assembly, and FWIW I completely ignore NEC's register allocations, save for those used by the mul/mulu/bitstring/etc. instructions. The VB itself doesn't care either. One of my patches to GCC was to rename 'ep' so that the assembler would recognize 'r30' as a valid alias! :)

#5
Re: A newer GCC compiler.
Posted on: 2016/3/25 16:28
VB Gamer
Joined 2016/3/13
34 Posts
Long Time User (2 Years)
Hi blitter,

It's always good to see someone else that's comfortable in assembly-language.

Quote:

blitter wrote:
Please check out my thread here http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5252 about GCC 4 and generating PC-relative jumps.


I looked at the thread, and it's pretty obvious that the problem is in the symbol-relocation code in binutils.

A quick comparison of the binutils 2.20.1 patch, my binutils 2.23.2 patch, and the current V850 code shows that there's a bug in the binutils 2.20.1 patch in the R_V810_26_PCREL relocation.


insn |= (((addend & 0xfffe) << 16) | ((addend & 0x3f0000) >> 16));


should be


insn |= (((addend & 0xfffe) << 16) | ((addend & 0x3ff0000) >> 16));


The patch that you're using loses the top 4 bits of the 26-bit relative address.

With the bug, the maximum relocation is 0x003fffff ... which corresponds nicely to your observed bad-offset of 0x00400000.

I don't know-for-sure that fixing that will make your problem go away, but I think that it's pretty likely.
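
As a rough check of the arithmetic, the two masks quoted above can be compared on the host. This is only a sketch: pack_pcrel26, unpack_pcrel26, MASK_BUGGY and MASK_FIXED are names invented here, and the unpack step simply reverses the shifts in the quoted expression rather than reproducing any actual binutils code.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_BUGGY 0x003f0000u  /* keeps displacement bits 16..21 only */
#define MASK_FIXED 0x03ff0000u  /* keeps displacement bits 16..25 */

/* Build the bits that get OR-ed into insn, using the same shifts as the
   quoted expression; high_mask selects the buggy or the fixed variant. */
static uint32_t pack_pcrel26(uint32_t addend, uint32_t high_mask)
{
    assert((addend & 1u) == 0u);  /* the 0xfffe mask drops bit 0 anyway */
    return ((addend & 0xfffeu) << 16) | ((addend & high_mask) >> 16);
}

/* Reverse the packing to see which displacement was really encoded. */
static uint32_t unpack_pcrel26(uint32_t field)
{
    return ((field >> 16) & 0xfffeu) | ((field & 0x3ffu) << 16);
}
```

With MASK_BUGGY, a displacement of 0x00400000 packs to 0 and decodes as 0, while 0x003ffffe survives intact. With MASK_FIXED, 0x00400000 packs to 0x00000040 and decodes back to 0x00400000. The two masks differ by 0x03c00000, which is exactly the four lost bits.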

I don't know how-easy it is for you to recompile binutils and test that ... I've had a lot of trouble compiling old versions of binutils and GCC with newer versions of the GNU build tools.

I'm using msys2, which keeps very current on all the latest versions of the GNU tools, and I need a bunch of extra patches to compile binutils 2.23.2, and any GCC that's older than 4.7.
Edited by ElmerPCFX on 2016/3/25 16:41

#6
Re: A newer GCC compiler.
Posted on: 2016/3/25 21:42
VB Gamer
Joined 2016/3/13
34 Posts
Long Time User (2 Years)
blitter: Can I ask what your thinking was behind your 2011-11-23 patch to change the HARD_FRAME_POINTER_REGNUM from 29 to 25?

Do you have an example of whatever the problem was that this was designed to fix?

#7
Re: A newer GCC compiler.
Posted on: 2016/3/26 2:44
PVB Elite
Joined 2008/12/28
Slovenia
617 Posts
Highscore Top Score, Highscore Top Score, Coder, Contributor, 10+ Game Ratings, Long Time User (9 Years), App Coder, PVBCC 2010 Entry, PVBCC 2013 Entry
Quote:

ElmerPCFX wrote:
I'd like to move the Frame Pointer to R2 (right next to the Stack Pointer in R3), and I'd like to use R5 to replace the V850's EP register ... and basically gain another 32KB of fast-access variable space, particularly for use as thread-local variables.

I don't see a need for a frame pointer, and neither did NEC apparently.

And you can already access a 64K range with a single register by using negative displacements. Commercial VB games set register 4 to 0x05008000 and use it to access global variables anywhere in the WRAM (which is 64K long).
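
The displacement arithmetic behind that trick can be sketched in host-side C. This assumes WRAM starts at 0x05000000 (implied by the 0x05008000 base and the 64K size) and that load/store displacements are signed 16-bit values; the names WRAM_START, GP_BASE and gp_displacement are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WRAM_START 0x05000000u             /* assumed start of the 64 KB WRAM */
#define GP_BASE    (WRAM_START + 0x8000u)  /* value loaded into r4 */

/* Returns 1 and stores the displacement if addr can be reached from
   GP_BASE with a signed 16-bit displacement, otherwise returns 0. */
static int gp_displacement(uint32_t addr, int16_t *disp)
{
    int64_t delta = (int64_t)addr - (int64_t)GP_BASE;

    assert(disp != NULL);
    if (delta < INT16_MIN || delta > INT16_MAX)
        return 0;
    *disp = (int16_t)delta;
    return 1;
}
```

Under those assumptions, 0x05000000 is reached with a displacement of -32768 and 0x0500FFFF with 32767, so the whole 64 KB is covered; 0x05010000 is one byte out of reach.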
Quote:

blitter: Can I ask what your thinking was behind your 2011-11-23 patch to change the HARD_FRAME_POINTER_REGNUM from 29 to 25?

Do you have an example of whatever the problem was that this was designed to fix?

Bitstring instructions, probably.

#8
Re: A newer GCC compiler.
Posted on: 2016/3/26 3:42
VB Gamer
Joined 2016/3/13
34 Posts
Long Time User (2 Years)
Quote:

HorvatM wrote:
I don't see a need for a frame pointer, and neither did NEC apparently.

NEC didn't mandate a specific register for the frame-pointer ... there's a huge difference between that and saying that they didn't see the need for a frame-pointer.

Just because you don't see the need for frame pointers and backtraces doesn't change the fact that I do, and so do a huge proportion of experienced C/C++ programmers. In-system backtraces are useful for a whole bunch of things.

I don't know what compiler Nintendo shipped with the VirtualBoy, but it was probably the Green Hills suite.

Which supports frame pointers, as does GCC ... and every C compiler that I know of. Sometimes the compiler absolutely needs to use a frame-pointer ... which GCC does automatically, even when you use the "omit-frame-pointers" option.

Just because the guys that added V850 support to GCC back in the 1990's goofed on the stack order of the saved registers and made the frame-pointer unusable for doing a backtrace, doesn't mean that we need to keep following that mistake in 2016.
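
For anyone wondering why the stack order matters: a backtrace only works if every frame pointer leads to the caller's saved frame pointer and the return address at a predictable place. The sketch below simulates that on the host with ordinary structs. The frame_record layout and the backtrace_walk name are assumptions for illustration, not the actual V810 or V850 stack layout used by GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame record that each frame pointer would point at. */
typedef struct frame_record {
    const struct frame_record *caller_fp;  /* saved frame pointer of the caller */
    uintptr_t return_address;              /* where this function returns to */
} frame_record;

/* Follow the chain of frame records from the innermost frame outwards,
   storing up to max return addresses. Returns how many were stored. */
static size_t backtrace_walk(const frame_record *fp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (fp != NULL && n < max) {
        out[n++] = fp->return_address;
        fp = fp->caller_fp;
    }
    return n;
}
```

If the saved registers sit between the frame pointer and the return address in an order that varies from function to function, the walker has no fixed place to look, which is the kind of problem described above.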


Quote:
And you can already access a 64K range with a single register by using negative displacements. Commercial VB games set register 4 to 0x05008000 and use it to access global variables anywhere in the WRAM (which is 64K long).


I didn't know that the VirtualBoy only had 64KB RAM, thanks!

So you guys don't need anything more than the existing SDA segment (gp-register-relative) support, that's good to know.

But the PC-FX has 2MB RAM, so I could use something a bit more sophisticated.

And you're ignoring the whole point of a thread-local-variable area ... which is another reason to move the TDA segment to R5 on the V810 instead of R30 on the V850.


Quote:
Bitstring instructions, probably.


I'm sorry, but that's a completely unhelpful answer.

Sure ... he's trying to move the hard-frame-pointer away from the registers that are used by the bitstring instructions.

Why? Are you guys doing bitstring instructions in inline-assembly within the C code? If so ... are you telling the compiler what registers you clobber?

Are you doing bitstring instructions from assembly? ... If so, it makes little difference whether the compiler puts its frame-pointer in R29 ... especially since you're probably compiling with "omit-frame-pointers" anyway.

Do you realize the effect that moving that definition has on the compiled-code when the compiler does need a frame pointer ... especially if you're using function-prologues?

#9
Re: A newer GCC compiler.
Posted on: 2016/3/26 4:09
Nintendoid!
Joined 2007/12/14
166 Posts
Coder, Long Time User (10 Years), App Coder
Quote:

HorvatM wrote:
Bitstring instructions, probably.


Precisely.

Quote:

ElmerPCFX wrote:
I'm sorry, but that's a completely unhelpful answer.

Sure ... he's trying to move the hard-frame-pointer away from the registers that are used by the bitstring instructions.

Why? Are you guys doing bitstring instructions in inline-assembly within the C code? If so ... are you telling the compiler what registers you clobber?


Yes, and yes. It has been quite a while but as I recall either r29 was ignored when I specified it in the clobber list or I got some kind of error.

Quote:

Are you doing bitstring instructions from assembly? ... If so, it makes little difference whether the compiler puts its frame-pointer in R29 ... especially since you're probably compiling with "omit-frame-pointers" anyway.


Anything I'm doing from non-inline assembly the compiler should not touch, period, other than to assemble it. But for what it's worth I use -fomit-frame-pointer in my Makefiles. Again, it's been a while so I don't remember the exact problem moving the frame register solved, but it was definitely related to the bitstring instructions.

Quote:

Do you realize the effect that moving that definition has on the compiled-code when the compiler does need a frame pointer ... especially if you're using function-prologues?


Frame pointers and backtraces in my experience are pretty useless in VB homebrew since source-level debugging is pretty nonexistent above the assembly code level. Maybe it was possible with Nintendo's official tools and development hardware, but I for one have never used their tools much less *seen* official VB dev hardware, and I can't name any forum regulars who have either. Thus, and I'm sorry, but I care little about what happens to the frame pointer. :) As for function prologues, I ran some simple tests before publicizing those patches and didn't run into problems, but caveat emptor, YMMV, etc.

#10
Re: A newer GCC compiler.
Posted on: 2016/3/26 4:16
Nintendoid!
Joined 2007/12/14
166 Posts
Coder, Long Time User (10 Years), App Coder
Quote:

ElmerPCFX wrote:
I looked at the thread, and it's pretty obvious that the problem is in the symbol-relocation code in binutils.

A quick comparison of the binutils 2.20.1 patch, my binutils 2.23.2 patch, and the current V850 code shows that there's a bug in the binutils 2.20.1 patch in the R_V810_26_PCREL relocation.


insn |= (((addend & 0xfffe) << 16) | ((addend & 0x3f0000) >> 16));


should be


insn |= (((addend & 0xfffe) << 16) | ((addend & 0x3ff0000) >> 16));


The patch that you're using loses the top 4 bits of the 26-bit relative address.

With the bug, the maximum relocation is 0x003fffff ... which corresponds nicely to your observed bad-offset of 0x00400000.

I don't know-for-sure that fixing that will make your problem go away, but I think that it's pretty likely.


Thank you! I might be able to bring up a gccVB build chain this weekend and test that fix for myself. I agree; it looks likely.

Quote:

I don't know how-easy it is for you to recompile binutils and test that ... I've had a lot of trouble compiling old versions of binutils and GCC with newer versions of the GNU build tools.

I'm using msys2, which keeps very current on all the latest versions of the GNU tools, and I need a bunch of extra patches to compile binutils 2.23.2, and any GCC that's older than 4.7.


I do all my VB dev in Mac OS X. Specifically, I build the toolchain in 10.6 with an older version of GCC installed via macports. The build products continue to work in the latest version of OS X El Capitan, plus as a bonus I can build PPC versions too.