Saturday, November 20, 2021

My Dragon's Lair 2's RAM wasn't working. Here's how I [finally] discovered the problem!

I've had a DL2 boardset for a few years but have never tried to power it on.  I also did not have a harness for it.  Well, recently I decided to try to get it running.  After making a power harness and a speaker harness, and burning a v3.19 EPROM, it was time to give it a shot!

Well, after powering it on, nothing happened.  Three LEDs lit up, but there were no beeps and no I/O on the serial port that I could see.  Ugh.  After scanning some forums, I determined that the most common problem with getting DL2 working was the null modem adapter (it is non-standard).  I studied the schematics and owner's manual and determined that this was likely not my problem.

As I was basically blind, I decided to do what anyone would do who is a glutton for punishment: to sniff the pins of the CPU to see where it was getting stuck (or if it was even working).

The CPU is an 8088, which I was very unfamiliar with.  I mean, I knew the assembly language instruction set quite well, but when it came to what the actual pins on the CPU did, I was in the dark.

I pulled up the datasheet and was kinda mortified about how convoluted this CPU's design is.  Unlike other 8-bit CPUs of the era (Z80, 6809), the 8088 can address 1 MB of memory (instead of 64 KB), which you'd think would be a good thing.  But unfortunately, to cram this functionality into 40 pins, the lowest address lines are shared with the data bus and the highest ones with some status bits.  This makes decoding the CPU quite a challenge and often requires a separate IC near the CPU to present a normal address and data bus for the rest of the system.  Given all this, I am frankly a little shocked that Intel won the CPU wars of the '80s.

To make parsing a logic analyzer capture of the CPU more convenient, I needed to sniff a pin of said separate IC.  This pin is called the ALE (hehe) pin, or Address Latch Enable.  It pulses while the CPU is putting an address on the shared pins, so it tells you which values in the capture are addresses and which are data.  Studying a hard-to-read schematic of the DL2 PCB, I was barely able to make out that this signal is available on some empty pins that the DL2 PCB designers had kindly added to the PCB to make troubleshooting easier.  (Had they not done this, I would've had to figure out a way to attach to a 100-pin surface mount part.  Not my idea of fun!)
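For anyone following along, here is a rough Python sketch of how ALE can be used to separate addresses from data when post-processing a capture.  The sample format (tuples of ALE, RD#, and the value of the 20 address pins) and the function name are assumptions made up for illustration; a real logic analyzer export will look different.

```python
def demux_reads(samples):
    """Recover (address, data) pairs for read cycles from raw pin samples.

    Each sample is a hypothetical (ale, rd_n, pins) tuple, where pins is the
    20-bit value on the CPU's A19..A0 pins at that instant.
    """
    reads = []
    address = None
    prev_ale, prev_rd_n, prev_pins = 0, 1, 0
    for ale, rd_n, pins in samples:
        if prev_ale and not ale:
            # ALE just fell: the last value seen while it was high is the address.
            address = prev_pins & 0xFFFFF
        if not prev_rd_n and rd_n and address is not None:
            # RD# just rose: the low 8 pins were carrying data, not address.
            reads.append((address, prev_pins & 0xFF))
        prev_ale, prev_rd_n, prev_pins = ale, rd_n, pins
    return reads


# Made-up samples for a single read cycle (values chosen for illustration).
capture = [
    (1, 1, 0xFE54F),  # ALE high: pins carry the full 20-bit address
    (0, 1, 0xFE54F),  # ALE falls: this is the moment the address gets latched
    (0, 0, 0x0E5C3),  # RD# low: low 8 pins now carry data, top 4 carry status
    (0, 1, 0x0E5C3),  # RD# rises: the read is complete
]
print([(hex(a), hex(d)) for a, d in demux_reads(capture)])
```

The key idea is simply that the same pins mean different things at different times, and ALE is the signal that says "this one is an address."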


Here is the wire I soldered onto this pin in order to attach the logic analyzer pin.


I then used a 40-pin clip to attach my logic analyzer to the CPU itself.


After sniffing some random traffic, I noticed that the program flow was not leaving the "BIOS" section of the ROM.  I also noticed that the ROM seemed to be executing blank code or getting lost and branching/jumping to invalid code.

At this point, I was completely confused.  I disassembled the ROM in IDA and started setting triggers on the logic analyzer to see if program execution got to certain parts of the code.  This turned out to be much easier than just brute-forcing my way through arbitrary captured data.
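The same "did execution ever get here?" question can also be asked of a saved capture in software.  Below is a minimal sketch, assuming the capture has already been decoded into (status, address, data) tuples and that a status of 4 marks an instruction fetch.  The checkpoint names and the second address are hypothetical, not taken from the real disassembly.

```python
CODE_FETCH = 4  # 8088 bus status value for an instruction fetch


def reached_checkpoints(cycles, checkpoints):
    """Report which named ROM addresses were ever fetched as instructions."""
    fetched = {addr for status, addr, _data in cycles if status == CODE_FETCH}
    return {name: addr in fetched for name, addr in checkpoints.items()}


# Hypothetical decoded bus cycles: (status, address, data)
cycles = [
    (4, 0xFE54F, 0xC3),
    (5, 0x0FFF6, 0xF6),
    (5, 0x0FFF7, 0xF7),
]
# Hypothetical checkpoints picked out of a disassembly
checkpoints = {"end_of_first_sub": 0xFE54F, "somewhere_later": 0xFE600}
print(reached_checkpoints(cycles, checkpoints))
```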



Eventually, I narrowed my search down to this harmless looking subroutine.


It actually is the first subroutine that gets called after the ROM boots up (not a coincidence!).

Here's my logic analyzer capture of the 'retn' instruction (address 0xFE54F).


Here's my explanation of each numbered section in the screenshot:

  1. Status of 4 means that an instruction fetch is about to take place.
  2. 0xFE54F is the address of the instruction to be fetched.
  3. 0xC3 is the fetched instruction (0xC3 is the RETN instruction).
  4. Status of 5 means that this is a memory read.
  5. 0x0FFF6 is the address to read from (it should map to RAM).  This is reading from the stack to get the subroutine's return address.
  6. This is the data that the RAM returns.  But wait!  This isn't right.  It's giving us the same value in the lower byte as the address (0xF6).  The chances of this being the correct value are extremely low.
  7. Status of 5 means that another memory read is about to take place (the other byte of the return address).
  8. 0x0FFF7 is the address to read from, to get the return address's other byte.  (Wow, the 8088 was really messed up.  16-bit return addresses but the CPU has a 20-bit address bus?  Who designed this thing?  haha)
  9. This is the data that the RAM returns.  But again, this isn't right!  It's giving us the same value that was already on the address lines (0xF7).  In other words, nothing is driving the shared address/data lines when the RAM is supposed to be answering!

And sure enough, the new CPU address is now 0xFF7F6 (i.e., completely wrong):



Conclusion?  RAM is either bad, not being controlled correctly, or missing entirely.
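The arithmetic behind that bogus 0xFF7F6 address can be sketched in a few lines of Python.  This is only an illustration under two assumptions: that the code segment register holds 0xF000 (consistent with the numbers in the capture, but not something read directly off the analyzer), and that "data equals the low byte of the address" is a reasonable heuristic for an undriven bus.  The function names are made up.

```python
# 8088 bus status codes (S2..S0)
BUS_STATUS = {
    0: "interrupt acknowledge",
    1: "I/O read",
    2: "I/O write",
    3: "halt",
    4: "instruction fetch",
    5: "memory read",
    6: "memory write",
    7: "passive",
}


def return_target(low_byte, high_byte, cs):
    """Physical address the CPU jumps to after a near RETN."""
    ip = (high_byte << 8) | low_byte   # 16-bit return address popped off the stack
    return ((cs << 4) + ip) & 0xFFFFF  # segment * 16 + offset, wrapped to 20 bits


def looks_undriven(address, data):
    """Heuristic: the data read back is just the low byte of the address."""
    return data == (address & 0xFF)


print(BUS_STATUS[4], "/", BUS_STATUS[5])
print(hex(return_target(0xF6, 0xF7, cs=0xF000)))  # CS value is an assumption
print(looks_undriven(0x0FFF6, 0xF6), looks_undriven(0x0FFF7, 0xF7))
```

A single read that echoes its own address could be a coincidence, but two in a row on consecutive stack addresses is a strong hint that the RAM never put anything on the bus.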

At this point, I decided to inspect the DL2 PCB a little closer.

What the heck?  Some empty sockets where RAM is supposed to be installed?

People who own the game tell me that this is normal.  So I haven't completely solved the problem yet.  But I'm getting closer.

UPDATE: I can confirm that pin 16 on the RAM chips (Output Enable) is stuck high.

UPDATE: Looks like the problem may be pin 10 of "RP3" not making good contact with pin 59 ("CAS'") of U1, the 100-pin surface mount chip!  That pin is what eventually drives the RAM's "Output Enable" so that it actually pulses.


UPDATE: I decided to reflow the surface mount pins and that actually fixed the problem!  I can't believe it!  I would've never figured this out without schematics, a logic analyzer, and a logic probe!