Friday, May 30, 2014

I will be at California Extreme 2014!

Barring unpreventable emergencies (such as my unborn son coming very prematurely), I will be at California Extreme 2014 on July 12th.  I've contacted the CAX organizers and proposed that I give another presentation like I did 3 years ago (wow, can't believe it's been that long!) and now they are (presumably) considering it.  So this may or may not happen.  I am keeping my fingers crossed because I had a lot of fun last time and have some new cool stuff to talk about this time regarding Daphne, Dexter, and more :)

I will really try to have a Dexter rev3 prototype on hand to show off at the show, and to install in available laserdisc game cabinets at the show for those who are interested.  That means I have about a month to get the design finished, get the PCBs fabricated, and get the board assembled.  It's crunch time! :)

Saturday, May 24, 2014

Fixed Daphne 64-bit overflow bug

So, not too long ago, I modified the way vertical blanking timing is calculated to be very accurate.  I was working on Star Rider emulation, so I didn't spend that much time thinking about overflow issues when designing this.  Well, it turns out that my initial design was pretty prone to overflow, plus it was computationally expensive.  Here is how I calculated CPU cycle counts relative to vertical blanking timing:

#define DIVIDER 63000000

void VblankTimer::OnTopFieldVblankStart(unsigned int uFrameIdx)
{
    // this operation is essentially [lines finished] * [microseconds / line] * [cycles / microsecond]
    // (uFrameIdx * 525) * (4004/63) * (cpuHz / 1000000)

    // TODO : would floating point math be faster here?

    // determine the base cycle count using a VERY accurate, self-correcting, and expensive method
    m_u64Line1StartCycle = (uFrameIdx * 525);
    m_u64Line1StartCycle *= 4004;
    m_u64Line1StartCycle *= m_u32CpuHz;
    m_u64Line1StartCycle /= DIVIDER;

    // set up next event
    this->m_pCallback->RegisterCpuEvent(m_u64Line1StartCycle + m_u32TopVblankEndCycleOffset, OnTopFieldVblankEndCallback, this);
}

As I show in my comments, the basic algorithm to compute the CPU cycle count for when vblank of the top field should begin is:

currentFrame * 525 * 4004 / 63 * CpuHz / 1000000

Now, one can plug this into a calculator and get a pretty accurate floating point result.  For example, if the CPU frequency is 1 MHz (1,000,000 Hz), the answer is 0 when currentFrame is 0, and 33,366.666666 (or 33,367 if one rounds it up) when currentFrame is 1.

However, doing math in this program the way a typical calculator does it is expensive, and on CPUs such as ARM that handle floating point operations (i.e. fractional numbers) poorly, floating-point math brings the CPU to its knees.

So it is much faster to do integer math.

However, when doing integer math, the result of any math operation discards any fraction.  So 3/2 is not 1.5 but instead 1.  That means in order to get the best accuracy, division should be saved until the last possible moment.

This means that the math should be performed in this order:

currentFrame * 525 * 4004 * CpuHz

and then divide the whole thing by 63 * 1000000

This makes for a very large number before the division takes place!
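
To make the effect of division order concrete, here is a minimal sketch comparing the two orderings.  The function names are made up for illustration and are not from the Daphne source, and the early-division version assumes the CPU speed is a whole number of MHz.

```cpp
#include <cstdint>

// Hypothetical helpers (not from Daphne); constants come from the formula above.

// Divides early: the fraction is thrown away before scaling by the CPU speed.
uint64_t CyclesDivideEarly(uint64_t frame, uint64_t cpuHz)
{
    uint64_t usec = frame * 525 * 4004 / 63;   // microseconds, truncated
    return usec * (cpuHz / 1000000);           // assumes cpuHz is a whole number of MHz
}

// Divides late: multiply everything first, then divide once at the very end.
uint64_t CyclesDivideLate(uint64_t frame, uint64_t cpuHz)
{
    return frame * 525 * 4004 * cpuHz / 63000000;
}
```

For frame 1 at 4 MHz the exact answer is 133,466.67 cycles.  The late division returns 133466, while the early division returns 133464 because the 0.67 microsecond fraction was discarded before it got multiplied by 4.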

I was noticing that Dragon's Lair was locking up after running for a while (meaning almost 24 hours) and I tracked it down to when the currentFrame value is greater than or equal to 2,193,848.  At 29.97 frames per second, this is 20.33 hours.

2,193,847 is still manageable.  2193847 * 525 * 4004 * 4000000 (Dragon's Lair CPU speed) is 0xFFFFFF20BC896C00 in hex (64-bit number).

However, 2193848 * 525 * 4004 * 4000000 when performed on calc.exe is 0x1DE62BA75C8000 in hex (64-bit number) which is less than the previous result.  This means that overflow has occurred.  So the solution here is to either try to do 128-bit math, which is expensive no matter what, or try to find another way to solve this problem :)
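
The wraparound can be reproduced in isolation.  This is only a sketch with hypothetical function names (nothing here is from the Daphne source); it relies on the fact that unsigned 64-bit arithmetic in C++ wraps modulo 2^64.

```cpp
#include <cstdint>

// Hypothetical helpers (not from Daphne) for poking at the overflow point.

// The same product the original code builds up before dividing by DIVIDER.
// Unsigned 64-bit math silently wraps around modulo 2^64 when it overflows.
uint64_t UndividedProduct(uint64_t frame, uint64_t cpuHz)
{
    return frame * 525 * 4004 * cpuHz;
}

// Largest frame index whose undivided product still fits in 64 bits.
uint64_t LastSafeFrame(uint64_t cpuHz)
{
    return UINT64_MAX / (525ULL * 4004ULL * cpuHz);
}
```

For a 4 MHz CPU, LastSafeFrame returns 2,193,847, which agrees with the lockup starting at frame 2,193,848.  One frame later, UndividedProduct comes back smaller than it was the frame before.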

I decided that the best solution to avoid overflow issues like this is to get rid of the multiplication entirely and just use addition (adding onto the previous result each new frame).  This is dangerous to do with integer math because it can lead to greater and greater inaccuracy over time.

However, I did some experimentation and found a pattern with part of the calculation.  currentFrame * 525 * 4004 / 63 has a repeating pattern which can be mimicked with integer math:

Frame: 0, time is 0.000000
Frame: 1, time is 33366.666667
Frame: 2, time is 66733.333333
Frame: 3, time is 100100.000000
Frame: 4, time is 133466.666667
Frame: 5, time is 166833.333333
Frame: 6, time is 200200.000000

Every 3rd value is a whole number.

I observed that this pattern can be approximated (without loss of accuracy over time) by adding 33367, 33366, and 33367 over and over again.

I've now implemented a new system that does just this: it adds 33367, 33366, 33367 in rotation, after a one-time calculation that multiplies each of these numbers by cpuHz/1000000.  This is now much faster, doesn't lose accuracy, and will still work naturally when the total cycle count overflows (which, since I am using a 64-bit number, will take quite a while to do).
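
For the curious, the idea can be sketched roughly like this.  This is an illustration only, not the actual Daphne code: the class and member names are hypothetical, and it assumes the CPU speed is a whole number of MHz so that the one-time scaling is exact.

```cpp
#include <cstdint>

// Hypothetical sketch of a running vblank cycle counter (not the Daphne class).
class VblankAccumulator
{
public:
    // Assumes cpuHz is a whole number of MHz so the scaling below is exact.
    explicit VblankAccumulator(uint32_t cpuHz)
    {
        const uint64_t cyclesPerUsec = cpuHz / 1000000;

        // 33367 + 33366 + 33367 = 100100 microseconds, which is exactly 3 frames,
        // so the rounding error cancels out every third frame.
        m_step[0] = 33367 * cyclesPerUsec;
        m_step[1] = 33366 * cyclesPerUsec;
        m_step[2] = 33367 * cyclesPerUsec;
    }

    // Advances by one frame and returns that frame's starting cycle count.
    uint64_t NextFrame()
    {
        m_cycle += m_step[m_idx];
        m_idx = (m_idx + 1) % 3;
        return m_cycle;
    }

private:
    uint64_t m_step[3];
    uint64_t m_cycle = 0;
    unsigned m_idx = 0;
};
```

At 1 MHz the first three calls to NextFrame return 33367, 66733 and 100100, matching the frame table above to within one microsecond, with every third value landing exactly on the true one.  Each frame costs one addition and one index update, and no multiplication or division is needed after construction.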

Monday, May 19, 2014

Dexter MACH 3 update

Well, got some less-than-great news about MACH 3.

The audio decoding issues probably can't be properly resolved with the current design of Dexter.  This is because Dexter relies on being able to drop both video and audio in order to stay in sync with the NTSC rhythm generated by the Raspberry Pi.  Audio output on modern PCs generally is pretty imprecise and thus Dexter needs to be able to adjust the contents of the audio stream dynamically in order to stay in sync.  For almost every game, this is something the human player will never notice, but it may mess up MACH 3's audio data decoding.  How much will it mess it up?  We don't yet know, and I don't think we can afford to hold Dexter rev3 back in order to take the time to find out.

So the current status of MACH 3 is that the "DISC ERROR STAY PUT" messages are not going to be going away.  Will the game still be playable?  Time will tell.

I am marking PR-8210 support as finished with a note about MACH 3 on the Dexter status page.  While Us Vs Them hasn't been tested, I am expecting it to work properly.

Monday, May 12, 2014

Did some more work with rev3 layout (and new Star Rider video)

Just iterating on rev3 layout...


And here's another Star Rider video I took recently.  I swapped some special chips between boards and got further.


Monday, May 5, 2014

Star Rider partially working!

Ed Beeler sent me some spare Star Rider CPU and VGG boards to help get me unblocked.  I swapped out the bad CPU board I had with one of his good ones and the game partially boots!  It gets as far as the ROM test, at which point it fails all the tests, as if the ROM board is unpowered and/or disconnected.  I am out of time tonight to troubleshoot any further, but I did manage to get a video capture of the game booting up this far.  NOTE that this is from real hardware, not Daphne or emulation.  I used an RGB->s-video converter board from jrok (as mentioned in a previous blog post) to make this happen.



Looking forward to getting this working fully and then testing it with Dexter rev3!

Sunday, May 4, 2014

Moved things around a little bit

I split the RCA audio output connector from a single tall wobbly one into two sturdier ones.  I also moved the Raspberry Pi outline around a little bit so that the mounting holes fall within the board instead of outside of it.  For now, I've decided that it doesn't matter if the Raspberry Pi's pin headers match up exactly with the ones on the Dexter board because this design will require a ribbon cable regardless, so as long as they are close, it should be okay.

I've got quite a bit of space in the lower-left section of the board now.  I am going to start bringing stuff down and packing it in.  Switching from through-hole components to surface mount is going to shrink the board size down quite a bit (which is a good thing).

One component that is not on the board yet is an audio amplifier to make MACH 3 work.  So I will need to leave some room for that, assuming we determine that we need one.

Friday, May 2, 2014

Figuring out where the Raspberry Pi's mounting holes should go on the Dexter board

I have overlaid an outline of the Raspberry Pi's shape (complete with mounting hole positions) on top of the current Dexter rev3 layout.  I'm glad I did this because I think it needs to be moved around a little bit.

I am inclined to move the pin headers that connect to the Pi up toward the top of the board.  And just to be clear, I don't plan on encouraging people to mount the Pi to the Dexter board because there isn't going to be enough clearance.  But if some clever person figures out a better way to do it later, I want the mounting holes to be there just in case.