I finished this a few days ago but forgot to post a picture.
Friday, August 29, 2014
Thursday, August 28, 2014
First Dexter rev 3b prototype finished
I've finished soldering the first rev 3b prototype and so far, I can't find any problems with it. I have 3 more of these boards to assemble before ordering more from the fabrication place.
Friday, August 22, 2014
Soldering the next round of Dexter prototypes
The new board without any soldering done.
The new board with all of the surface mount ICs soldered in. A few solder bridges, but no big deal. All in all, the soldering took me about 2 hours. I can't wait to have a machine assemble these :)
Thursday, August 14, 2014
Williams IC tester PCBs arrived, partially soldered
The Williams IC tester PCBs showed up today. I decided to try a new way to solder by just putting solder paste on the four corners of each IC. The purpose of this is twofold:
- Lay the solder paste down fast
- Have the ICs align themselves during the "melting" process
As you can see, they are aligned pretty well. They have solder bridges, as I expected, but these will actually be useful because I can take an iron and drag the bridges across the unsoldered pads to quickly get the rest of the pads soldered, then use a wick to clean up. I am liking this rapid DIY approach :)
Wednesday, August 13, 2014
Troubleshooting Galaxy Ranger
I've been trying to troubleshoot a non-working Dexter setup with Galaxy Ranger and ended up writing some Z80 assembly language to modify the ROM for greater visibility. This little code snippet will print the LD-V1000 status to the screen. I am putting it here so I can find it again later if I need it :)
The bytes at $D46 and $D47 should be changed from 0xCD 0x15 to 0x60 0x5B. The new code should be placed at $5B60 (which lives on the second EPROM), a region of memory that is filled with 0xFF on the stock ROMs.
; modify Galaxy Ranger to display LD-V1000 status code during "WARM-UP"
org 0x5b60
start:
; preserve registers
push hl
push bc
push af
; set target pointer
ld hl, 0xF169 ; right above $F189 'warming up' text (each row is 0x20$
ld a, (0xc800) ; LD-V1000 I/O flip-flop
ld b, a
; isolate top nibble
srl a ; A >>= 4
srl a
srl a
srl a
cp 0xA ; Accumulator - 0xA is what?
jr c, Its0to9 ; if carry is set, it means subtraction resulte$
; A-F (hex)
add 0x37 ; make it ASCII
jr StoreIt
Its0to9:
add 0x30 ; make it ASCII
StoreIt:
ld (hl), a
inc hl
; now grab lower nibble
ld a, 0xF
and b
cp 0xA
jr c, Its0to9_2
add 0x37
jr StoreIt2
Its0to9_2:
add 0x30
StoreIt2:
ld (hl), a
; restore registers
pop af
pop bc
pop hl
; go to where we were supposed to originally
jp 0x15CD
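For anyone who wants to sanity-check the patch off-target, here is a rough Python sketch of what the routine computes and of the two-byte pointer change. It is only a model, not something I ran against the real ROMs: the function names are made up for illustration, and `patch_word` assumes a flat 64 KB image of the Z80 address space rather than the actual per-EPROM files.

```python
def nibble_to_ascii(n: int) -> int:
    """Mirror the routine's branch: 0-9 get 0x30 added, 10-15 get 0x37."""
    if n < 0xA:            # 'cp 0xA' sets carry when A < 0xA
        return n + 0x30    # '0'..'9'
    return n + 0x37        # 'A'..'F'


def status_to_screen_bytes(status: int) -> bytes:
    """Return the two bytes the routine stores at $F169 and $F16A."""
    high = (status >> 4) & 0xF   # the four 'srl a' instructions
    low = status & 0xF           # 'ld a, 0xF' followed by 'and b'
    return bytes([nibble_to_ascii(high), nibble_to_ascii(low)])


def patch_word(image: bytearray, addr: int, old: int, new: int) -> None:
    """Swap a little-endian 16-bit word, refusing if the old value is not there.

    Assumption: 'image' is a flat copy of the Z80 address space.
    """
    current = image[addr] | (image[addr + 1] << 8)
    if current != old:
        raise ValueError(f"expected {old:#06x} at {addr:#06x}, found {current:#06x}")
    image[addr] = new & 0xFF
    image[addr + 1] = (new >> 8) & 0xFF
```

The old bytes 0xCD 0x15 are the little-endian form of $15CD, which is why the routine ends with `jp 0x15CD`, and the new bytes 0x60 0x5B point at $5B60 instead.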
Friday, August 8, 2014
The next generation of emulation design?
So as I sat here at 1 AM feeding my newborn infant (that makes 4 kids!), my mind drifted to Star Rider emulation once again and the inherent inaccuracies with the way that Daphne (and also MAME last time I checked) handle emulating the 6809E CPU, the PIA6821, and the Williams Special Chips.
Daphne emulates a multi-CPU game by having each CPU take turns and execute a small chunk of cycles before switching to the next CPU and doing the same. In the case of Star Rider, where each CPU runs at the same speed (1 MHz), Daphne actually executes a single instruction before switching to the next CPU, which gives a pretty dang accurate experience. But even this is not really good enough to achieve "perfect emulation" because Star Rider (and other Williams games) use a design which requires the CPU to be halted on a regular basis or to not run on a steady clock. Daphne assumes that the clock will always be steady and that the CPU will never be halted, so this presents a bit of a problem. To get perfect emulation, one needs to look to FPGA solutions and forget about desktop PC solutions. Or does one?
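To make the turn-taking concrete, here is a toy Python sketch of one-instruction-at-a-time interleaving. It is not Daphne's actual scheduler: `ToyCpu`, its list of per-instruction cycle costs, and `run_round_robin` are all invented for illustration. It just shows that even with identical clock speeds, the CPUs drift apart by a few cycles between turns because instructions take different numbers of cycles.

```python
from dataclasses import dataclass


@dataclass
class ToyCpu:
    """Stand-in CPU: each 'instruction' is just a cycle cost (hypothetical)."""
    name: str
    instruction_costs: list
    next_index: int = 0
    cycles: int = 0

    def step(self) -> int:
        """Execute one instruction and return how many cycles it took."""
        cost = self.instruction_costs[self.next_index % len(self.instruction_costs)]
        self.next_index += 1
        self.cycles += cost
        return cost


def run_round_robin(cpus, rounds: int) -> int:
    """Give every CPU one instruction per round; return the worst cycle skew seen."""
    max_skew = 0
    for _ in range(rounds):
        for cpu in cpus:
            cpu.step()
        counts = [cpu.cycles for cpu in cpus]
        max_skew = max(max_skew, max(counts) - min(counts))
    return max_skew
```

In this model the skew is bounded by instruction lengths, which is why the approach feels accurate, but nothing in it can represent a CPU being halted mid-stream or a clock that is not steady.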
I realized a few things tonight. If I am going to take Daphne to the "next level" and emulate a game like Star Rider "perfectly" so that real hardware and Daphne can run side by side and retain identical state over a period of days (with allowance for clock frequency drift between the two), I am going to have to ditch the traditional model of "execute N cycles as fast as possible in isolation".
Instead, a better way to emulate Star Rider and other Williams games is to emulate the actual clock that all of the CPUs use and sync each CPU to this clock. Star Rider has a single 24 MHz clock on the VGG board which is split up (via PROM) into 12 MHz, 6 MHz, 4 MHz, 2 MHz, and 1 MHz clocks. All of these clocks are in sync because they all derive from the parent 24 MHz clock. The PIF board has its own 4 MHz clock (which creates a 1 MHz clock for its CPU), and I haven't looked at the sound board enough to know where its clock is coming from, but it is probably also running at 1 MHz. It is quite convenient that the three 6809E CPUs for this game are all running at the same clock speed because that means they can all be clocked as if from a single source. Not all games will have this benefit, so this approach can't necessarily be applied to every scenario out there. But I think for Star Rider (and probably other Williams games) it would work to emulate the game on a clock level!
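As a quick illustration of why clocks derived from one parent stay in sync, here is a small Python sketch that models each derived clock as a function of the 24 MHz tick count. The real board does the split with a PROM, so the plain integer division and the phase I picked (high for the first half of each period) are assumptions, and the names are mine.

```python
MASTER_HZ = 24_000_000
DERIVED_HZ = [12_000_000, 6_000_000, 4_000_000, 2_000_000, 1_000_000]


def clock_level(master_tick: int, derived_hz: int) -> int:
    """Level (1 or 0) of a derived clock at a given 24 MHz tick.

    Assumed phase: high during the first half of each period.
    """
    period = MASTER_HZ // derived_hz      # master ticks per derived cycle
    return 1 if (master_tick % period) < period // 2 else 0


def rising_edges(derived_hz: int, ticks: int) -> list:
    """Master ticks (below 'ticks') where the derived clock goes low -> high."""
    return [
        t for t in range(1, ticks)
        if clock_level(t, derived_hz) == 1 and clock_level(t - 1, derived_hz) == 0
    ]
```

Under these assumptions every rising edge of the 1 MHz clock lands on a tick where all of the faster clocks also rise, so a single master tick counter is enough to drive everything on that board.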
One reason that I didn't even consider doing this in the past is performance. Emulating the clock of a CPU is a lot slower than just running a bunch of instructions and keeping track of how many cycles they are supposed to take. But modern CPUs keep adding greater capacity to execute multiple tasks in parallel (by adding CPU cores) and it has been a problem for me to figure out how to utilize this parallelism. The cool thing about emulating a CPU on a clock level is that multiple emulated CPUs can execute somewhat independently of each other, which is perfect for a multi-threaded architecture!
It basically would work something like this:
When the clock is about to do a Low->High transition, the state of all inputs would be sent to every thread (emulated CPU) and then each thread would process this clock transition in parallel and ignore propagation delay (which I think is okay to do since this is a digital clocked system). Each thread would report any change to the CPU's output (for example, the R/W pin changing or data being put on the bus, etc) to the master thread. Then all of the output changes would be consumed by the master thread, triggering any emulation-related callbacks (such as updating video output), and the master thread would figure out the new input state. This new input state would be sent to every thread (emulated CPU) and they'd all perform the High->Low transition in parallel and this process would repeat infinitely.
The idea is that as long as a thread has an input state, it can execute its own clock transition and figure out its own output state without having to wait for another thread to accomplish the same task.
This assumes that the hardware design is such that propagation delay can be completely ignored due to the clock speed(s) of the system being chosen conservatively enough.
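The loop described above could be sketched in Python roughly like this. Everything here is a stand-in: `ToyDevice` just counts rising edges instead of modelling a 6809E, the "halt" rule in `resolve_inputs` is invented, and Python threads won't give a real speedup because of the interpreter lock. It is only meant to show the shape of the snapshot / parallel edge / merge cycle.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDevice:
    """Stand-in for one clocked chip; counts rising edges unless halted."""

    def __init__(self, name: str, honors_halt: bool):
        self.name = name
        self.honors_halt = honors_halt
        self.counter = 0

    def clock_edge(self, rising: bool, inputs: dict) -> dict:
        """Process one clock transition from a frozen input snapshot."""
        halted = self.honors_halt and inputs["halt"]
        if rising and not halted:
            self.counter += 1
        return {self.name: self.counter}   # this device's output pins


def resolve_inputs(outputs: dict) -> dict:
    """Master-thread step: derive the next input state from all outputs.

    Hypothetical rule: halt the CPUs whenever the video counter is 3 mod 4.
    """
    return {"halt": outputs.get("video", 0) % 4 == 3}


def run_clocked(devices, cycles: int) -> dict:
    inputs = {"halt": False}
    outputs = {}
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        for _ in range(cycles):
            for rising in (True, False):      # Low->High, then High->Low
                snapshot = dict(inputs)       # every thread sees the same state
                results = pool.map(
                    lambda d: d.clock_edge(rising, snapshot), devices
                )
                for result in results:        # master consumes output changes
                    outputs.update(result)
                inputs = resolve_inputs(outputs)
    return outputs
```

Each device only touches its own state and only reads the shared snapshot, so no thread has to wait on another within a clock edge. The only synchronization point is the master merging outputs between edges.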
How would this type of system perform? I frankly have no idea! I don't even know if it would be worth using multiple threads to do this BUT my experience over the years tells me that there is a good chance that performance would be good enough that an emulated system like this would run at full speed on a modern PC.
Heck, at this level, even the video drawing hardware could be emulated! The possibilities! THE POSSIBILITIES!!! :)
As I alluded to in a previous blog post, I am working with Sean Riddle on an FPGA replacement design for the Williams Special Chip at the moment and once we have this thing perfected (hehe), someone could make an emulated version of it (FPGA isn't the same kind of emulation in my opinion) using the technique that I described above.
Sunday, August 3, 2014
Us vs Them working in Dexter!
Warren burned some Us vs Them ROMs for his MACH 3 hardware and confirmed that Dexter seems to be working perfectly with it! I tried to test Dexter on an Us vs Them at CAX, but I had bugs in the VBI injection firmware that I didn't find or fix until after CAX was over :(
Saturday, August 2, 2014
Making a new PCB to test/validate Williams ICs
Star Rider testing has got me so frustrated with suspicions of faulty hardware that I am making a little board to test the big ICs that Star Rider uses: the Motorola 6809E CPU, the Motorola PIA 6821, and the Williams Special Chips.