Building a game console: how I assembled a retro console at home

Contents
  • How it all began
  • Assembling the console
  • Video signal
  • Suitable graphics system
  • CPU
  • Connecting CPU and PPU
  • Time for the first real game
  • Adding custom graphics
  • Finally, the sound
  • Final result
  • Architecture
  • Final characteristics
  • Development of software for the console
  • Memory and I/O mapping
  • PPU management
  • Code in assembler
  • Using the C tools
  • Custom graphics
  • Sound
  • Putting it all together
  • Using an emulator for development
  • Demonstrating the operation of the console
  • Conclusion

How it all began
My name is Sergio Vieira. I am Portuguese and grew up in the 80s and 90s. I have always been nostalgic for retro consoles, especially those of the third and fourth generations. A few years ago, I decided to take a deeper look at electronics and try to build my own video game console. I work as a software engineer and had no previous experience with electronics, apart from assembling and upgrading my desktop PC (which, of course, does not count). Despite the lack of experience, I said to myself "Why not?", bought several books and electronics kits, and began to study.
I wanted to build a console similar to the ones I've always loved - somewhere between the NES and the Super Nintendo, or between the Sega Master System and the Mega Drive. All of these consoles had a CPU, a custom video chip (it was not yet called a GPU), and an audio chip - either integrated or dedicated. Games for them were distributed on cartridges, which were usually hardware extensions with a ROM chip and sometimes other components.

My original plan was to build a console with the following specifications.
  • No emulation: all games and programs must run on real hardware. It does not have to be hardware "from those times", just hardware capable enough for the task.
  • A genuine, nostalgic retro CPU chip.
  • TV output (analog signal).
  • The ability to produce sound.
  • Support for two controllers.
  • Scrolling background and moving sprites.
  • Support for Mario-style platformers (and of course other types of games).
  • The ability to run games and programs from an SD card.

I decided to support SD cards instead of cartridges because running games from a card is much more practical, and copying files to it from a computer is easier. With cartridges, I would have had to build a lot more hardware, including a separate piece of hardware for each program.

Assembling the console

Video signal
I started by generating a video signal. All consoles of the era I was targeting had their own proprietary graphics chips, which gave them very different characteristics. For this reason, I did not use an off-the-shelf graphics chip: I wanted my console to have unique graphics capabilities. But since I could not fabricate my own chip and did not know how to use an FPGA, I chose to implement the graphics chip in software on a 20 MHz 8-bit microcontroller. This is not overkill: it has just enough performance to generate the kind of graphics I wanted.
So I started with an ATmega644 microcontroller running at 20 MHz, which sent a PAL signal to the TV. Since the microcontroller cannot produce such a signal on its own, an external DAC had to be added.
The microcontroller outputs 8-bit color (RGB332: three bits for red, three bits for green and two for blue), and a passive DAC converts it to analog RGB. Fortunately, in Portugal, external devices are most often connected to the TV via the SCART connector, and most TVs accept an RGB signal through it.
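To make the RGB332 format concrete, here is a minimal C sketch of packing and unpacking one color byte. The bit layout (RRRGGGBB, red in the top bits) and all function names are assumptions for illustration; the article only states that there are three bits for red, three for green and two for blue.

```c
#include <assert.h>
#include <stdint.h>

/* Build one RGB332 byte from channel levels: red 0..7, green 0..7, blue 0..3.
   Assumed layout (not confirmed by the article): RRRGGGBB. */
static uint8_t rgb332_from_levels(uint8_t r3, uint8_t g3, uint8_t b2)
{
    assert(r3 < 8 && g3 < 8 && b2 < 4);
    return (uint8_t)((r3 << 5) | (g3 << 2) | b2);
}

/* Reduce ordinary 8-bit-per-channel RGB to RGB332 by keeping the top bits. */
static uint8_t rgb332_from_rgb888(uint8_t r, uint8_t g, uint8_t b)
{
    return rgb332_from_levels((uint8_t)(r >> 5), (uint8_t)(g >> 5), (uint8_t)(b >> 6));
}

/* Expand each channel back to 0..255, e.g. for previewing graphics on a PC. */
static uint8_t rgb332_red(uint8_t c)   { return (uint8_t)(((c >> 5) & 7) * 255 / 7); }
static uint8_t rgb332_green(uint8_t c) { return (uint8_t)(((c >> 2) & 7) * 255 / 7); }
static uint8_t rgb332_blue(uint8_t c)  { return (uint8_t)((c & 3) * 255 / 3); }
```

With this layout, pure red is E0h, pure green is 1Ch and pure blue is 03h. Blue gets only two bits, so it has four levels where red and green have eight.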

Suitable graphics system
Since I wanted the first microcontroller (which I called the VPU, Video Processing Unit) to do nothing but send the signal to the TV, I decided to use double buffering for the graphics.
For the PPU (Picture Processing Unit) I took a second microcontroller, an ATmega1284, also at 20 MHz. It generates an image in one RAM chip (VRAM1) while the first microcontroller sends the contents of another RAM chip (VRAM2) to the TV. After each frame (two frames in PAL, or 1/25 of a second), the VPU switches the RAM chips and sends the image in VRAM1 to the TV, while the PPU renders a new one into VRAM2.
The video circuitry turned out to be quite complicated: I needed additional hardware to give both microcontrollers access to both RAM chips, and to speed up memory access, since the video signal is also output by bit-banging. To do this, I added several 74-series chips to the circuit: counters, line selectors, transceivers and more.
The firmware for the VPU, and especially for the PPU, also came out quite complex, because I had to write extremely efficient code to get all the graphics capabilities I was after. Initially I wrote everything in assembly; later I rewrote some of the code in C.
As a result, my PPU generates a 224x192 pixel image, which the VPU sends to the TV screen. The resolution may seem low, but it is only slightly lower than that of the consoles that inspired it, which usually had a resolution of 256x224. The lower resolution allowed me to squeeze more graphics features into the time available to render each frame.
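The buffer swap can be modelled in a few lines of C. This is only a sketch of the idea, not the actual firmware (in the real console the switching is done with external logic chips). The type vram_roles and the functions vram_init and vram_swap are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: which VRAM chip each unit owns during the current frame. */
typedef struct {
    uint8_t ppu_target;  /* chip the PPU renders into: 0 = VRAM1, 1 = VRAM2 */
    uint8_t vpu_source;  /* chip the VPU sends to the TV */
} vram_roles;

static vram_roles vram_init(void)
{
    vram_roles r = { 0, 1 };  /* PPU draws into VRAM1 while the VPU shows VRAM2 */
    return r;
}

/* Called once per frame: the two chips trade places. */
static void vram_swap(vram_roles *r)
{
    assert(r->ppu_target != r->vpu_source);  /* the units never share a chip */
    r->ppu_target ^= 1;
    r->vpu_source ^= 1;
}
```

Because the two units never touch the same chip during a frame, the TV always shows a complete image, even while the next one is only half drawn.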
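A rough back-of-the-envelope calculation shows why the lower resolution helps. The C sketch below divides the PPU's clock cycles per frame by the number of pixels. It assumes the whole 1/25 of a second is available for rendering, which is optimistic, and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one frame buffer at one RGB332 byte per pixel. */
static uint32_t framebuffer_bytes(uint32_t w, uint32_t h)
{
    return w * h;
}

/* Average clock cycles available per pixel for a unit clocked at clock_hz
   that must finish one frame every 1/fps seconds (integer division rounds down). */
static uint32_t cycles_per_pixel(uint32_t clock_hz, uint32_t fps, uint32_t w, uint32_t h)
{
    assert(fps > 0 && w > 0 && h > 0);
    return clock_hz / fps / (w * h);
}
```

At 20 MHz and 25 frames per second, cycles_per_pixel gives 18 cycles per pixel for 224x192 and 13 for 256x224. The frame buffer for 224x192 takes 43,008 bytes, which matches the figure given for VRAM1 and VRAM2 in the architecture list.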
Just like in the good old days, my PPU has "fixed" graphics capabilities that can be configured. The background is rendered from 8x8 pixel characters (sometimes called tiles), so the visible background is 28 × 24 tiles. For pixel-by-pixel scrolling and smooth background updates, I made four virtual screens of 28 × 24 tiles each; they are all adjacent and wrap around each other.
On top of the background, the PPU renders up to 64 sprites that are 8 or 16 pixels wide and 8 or 16 pixels high (that is, one, two, or four characters) and can be flipped vertically, horizontally, or both. You can also render an overlay above the background: a panel 28 × 6 tiles in size. It is useful for games that need interface elements on top of the main screen (a HUD) while the background scrolls and the sprites are needed for purposes other than displaying information.
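The wrap-around of the four virtual screens can be sketched as follows. Given a scroll position and a pixel on the visible screen, the function finds which virtual screen and which tile that pixel falls on. The numbering of the four screens (row by row, 0 to 3) and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TILE_SIZE = 8,
    SCREEN_TILES_X = 28,
    SCREEN_TILES_Y = 24,
    VIRTUAL_W = 2 * SCREEN_TILES_X * TILE_SIZE,  /* 448 pixels */
    VIRTUAL_H = 2 * SCREEN_TILES_Y * TILE_SIZE   /* 384 pixels */
};

typedef struct {
    uint8_t screen;          /* virtual screen 0..3 (row-major order is assumed) */
    uint8_t col, row;        /* tile position inside that screen */
    uint8_t fine_x, fine_y;  /* pixel position inside the tile */
} tile_ref;

static tile_ref locate_tile(uint16_t scroll_x, uint16_t scroll_y, uint16_t px, uint16_t py)
{
    assert(px < SCREEN_TILES_X * TILE_SIZE && py < SCREEN_TILES_Y * TILE_SIZE);

    /* Position in the 448x384 virtual space, wrapping at the edges. */
    uint16_t vx = (uint16_t)((scroll_x + px) % VIRTUAL_W);
    uint16_t vy = (uint16_t)((scroll_y + py) % VIRTUAL_H);
    uint16_t tcol = (uint16_t)(vx / TILE_SIZE);
    uint16_t trow = (uint16_t)(vy / TILE_SIZE);

    tile_ref t;
    t.screen = (uint8_t)((trow / SCREEN_TILES_Y) * 2 + tcol / SCREEN_TILES_X);
    t.col    = (uint8_t)(tcol % SCREEN_TILES_X);
    t.row    = (uint8_t)(trow % SCREEN_TILES_Y);
    t.fine_x = (uint8_t)(vx % TILE_SIZE);
    t.fine_y = (uint8_t)(vy % TILE_SIZE);
    return t;
}
```

For example, with a horizontal scroll of 440, the pixel at x = 10 wraps around to virtual x = 2, back on the first screen. This is what lets a game scroll endlessly while it rewrites the tiles that are currently off screen.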
Another "advanced" feature is the ability to scroll the background in different directions on separate lines, which allows you to add effects such as limited parallax scrolling or split screen.
There is also an attribute table that lets you assign each tile a value from 0 to 3. You can then, for example, map all tiles with a given value to a specific tile page, or increase their character number. This is useful when certain background elements change constantly: the CPU does not need to update each tile individually, it can simply issue a command like "all tiles with a value of 1 increase their character number by 2". This approach is used in various ways, for example in Mario games, where the question mark blocks animate in the background, or in other games with constantly flowing waterfalls.
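The effect of such a command can be modelled in C as below. This is a simplified sketch that rewrites the tile numbers in place; the real PPU may well apply the offset while rendering instead, and the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add `delta` to the character number of every tile whose attribute equals `attr`.
   `tiles` and `attrs` are parallel arrays with `count` entries each.
   Returns the number of tiles that were changed. */
static size_t bump_tiles_with_attr(uint8_t *tiles, const uint8_t *attrs,
                                   size_t count, uint8_t attr, uint8_t delta)
{
    assert(attr <= 3);  /* attributes are two-bit values, 0..3 */
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        if (attrs[i] == attr) {
            tiles[i] = (uint8_t)(tiles[i] + delta);
            changed++;
        }
    }
    return changed;
}
```

The point of the feature is that the CPU sends one short command per frame instead of rewriting every animated tile itself.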

CPU
Once the video circuitry was working, I started on the CPU. I chose the Zilog Z80 for my console. Besides being a cool retro processor, the Z80 has a 16-bit address space for memory and a separate 16-bit address space for I/O, which other similar 8-bit processors, such as the famous 6502, cannot boast. The 6502 has only a single 16-bit address space, which has to be shared between memory and peripheral devices: audio, video, input and others. With a separate I/O space, all external devices can live there, and the whole 16-bit memory space (that is, 64 KB of code and data) can be used for its intended purpose.
To get started, I connected the CPU to an EEPROM holding some test code. I also attached a microcontroller to the CPU through the I/O space; it communicates with a PC via RS-232, so I could check that the processor and all the connections were working properly. This microcontroller (an ATmega324 at 20 MHz) was to become the IO MCU (input/output microcontroller), responsible for access to the game controllers, the SD card, a PS/2 keyboard and RS-232 communication with a computer.
Then I attached a 128 KB RAM chip to the processor, of which 56 KB are accessible (this may seem wasteful, but I only had 128 KB and 32 KB RAM chips). Thus, the processor's memory consists of 8 KB of ROM and 56 KB of RAM.
Next, I updated the firmware of the I/O microcontroller with this library and added SD card support to it. Now the CPU could navigate the directories of the SD card, list their contents, and open and read files, all by reading and writing data at specific addresses in the I/O space.

Connecting CPU and PPU
It was time to implement the interaction between the CPU and the PPU. For this, I found a "simple solution": a dual-port RAM chip (that is, one that can be connected to two different buses at the same time). It saved me from adding more chips such as line selectors, and it also made access to the RAM practically simultaneous for both chips. In addition, the PPU signals the CPU directly every frame by triggering its non-maskable interrupt (NMI). This means the processor is interrupted once per frame, which is valuable for synchronization and for updating the graphics on time.

Each frame, the interaction between the CPU, PPU and VPU follows this sequence.
  • The PPU copies information from external RAM (labelled PPU RAM in the diagram) to its internal RAM.
  • The PPU sends a non-maskable interrupt to the CPU.
  • Then, at the same time:
      • the CPU immediately jumps to the non-maskable interrupt handler and updates the PPU RAM with the graphics state for the next frame (the program must return from the interrupt before the next frame starts);
      • the PPU renders an image, based on the information it copied earlier, into one of the two RAMs of the graphics system (VRAM1 or VRAM2);
      • the VPU sends the image in the other VRAM to the TV.

Around the same time, I added support for game controllers. I originally wanted to use Super Nintendo controllers, but their connector is proprietary and not easy to get. So I opted for 6-button Mega Drive / Genesis compatible controllers: they use standard, widely available DB-9 connectors.

Time for the first real game
I now had a CPU that could read game controllers, drive the PPU and load programs from an SD card, so it was time to make a game. I wrote it, of course, in Z80 assembly language; it took me a couple of days (game source code).

Adding custom graphics
Everything was fine and I had a working console, but it was not enough. Games could not yet use custom graphics, only the graphics stored in the PPU firmware, and the only way to change those built-in graphics was to update the firmware. So I decided to add a separate RAM chip for graphics (character RAM, or CHR RAM). It had to be accessible to the PPU and be loaded with graphics according to instructions from the CPU. At the same time, I had to use as few new components as possible, because the console was already quite large and complex.
I found the following solution: only the PPU has access to the new RAM, and the CPU transfers information to it through the PPU. While this data is being transferred, the new RAM is not used for graphics; the built-in graphics temporarily take over.
After the data transfer, the processor switches from the built-in graphics mode to character RAM mode (CHR RAM in the diagram), and the PPU can use the custom graphics. It may not be a perfect solution, but it works. In the end, the new RAM is 128 KB in size and can store 1024 characters of 8 × 8 pixels for the background and 1024 characters of the same size for sprites.

Finally, the sound
I left the sound for last. Initially, I was going to give my console the same audio capabilities as the Uzebox and build in a microcontroller generating four channels of PWM audio. However, I found out that vintage chips were relatively easy to get hold of, so I ordered several YM3438 chips, which work on the principle of frequency modulation synthesis (https://en.wikipedia.org/wiki/Frequency_modulation_synthesis). They are fully compatible with the YM2612 found in the Mega Drive / Genesis. With this chip installed, I get Mega Drive quality music alongside the sound effects produced by the microcontroller. The CPU controls the sound module (which I called the SPU, Sound Processing Unit; it issues commands to the YM3438 and also produces sounds itself), again through a small dual-port RAM, this time only 2 KB in size.
Just like the graphics module, the sound module has 128 KB of storage for sound patches and PCM samples. The processor uploads information into this memory via the SPU. The processor can then either tell the SPU to play instructions from this RAM, or update the instructions for the SPU every frame.
The CPU controls the four PWM channels through four circular buffers, which are in special RAM (SPU RAM in the diagram below). The SPU goes through these buffers and executes the instructions in them. Another circular buffer in the SPU RAM works in the same way - it serves the frequency modulation synthesis chip (YM3438).
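A circular buffer of this kind can be sketched in C as follows. The CPU acts as the producer and the SPU as the consumer. The buffer size, the one-byte instruction format and all names are assumptions for illustration; the article does not describe the actual layout in SPU RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { RING_SIZE = 16 };  /* hypothetical size; must be a power of two */

typedef struct {
    uint8_t data[RING_SIZE];
    uint8_t head;  /* next slot the producer (CPU) writes */
    uint8_t tail;  /* next slot the consumer (SPU) reads */
} ring;

/* Producer side: returns false when the buffer is full. */
static bool ring_put(ring *r, uint8_t value)
{
    uint8_t next = (uint8_t)((r->head + 1) & (RING_SIZE - 1));
    if (next == r->tail) {
        return false;  /* full: one slot always stays empty */
    }
    r->data[r->head] = value;
    r->head = next;
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool ring_get(ring *r, uint8_t *out)
{
    assert(out != NULL);
    if (r->tail == r->head) {
        return false;  /* empty */
    }
    *out = r->data[r->tail];
    r->tail = (uint8_t)((r->tail + 1) & (RING_SIZE - 1));
    return true;
}
```

Each side writes only its own index, so the CPU can queue instructions while the SPU works through the earlier ones at its own pace, which suits a dual-port RAM shared by two chips.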

The interaction between the processor and the sound module is similar to the graphics case and follows this sequence.
  • The SPU copies information from SPU RAM to its on-board RAM.
  • The SPU waits for the NMI signal from the PPU (for synchronization).
  • Then, at the same time:
      • the processor updates the buffers of the PWM channels and of the frequency modulation synthesis chip;
      • the SPU executes the instructions in the buffers, using the information stored in its on-board memory.
  • While all this is happening, the SPU continuously updates the 16 kHz PWM audio.

Final result
After developing all the modules, I moved some of them onto prototype boards. For the CPU module, I managed to design and order a custom board. I don't know if I'll do the same for the other modules; I think I was pretty lucky to get a working custom board on my first try. Only the sound module remains on a breadboard for now.
This is what the console looks like at the time of this writing.

Architecture
The diagram below illustrates which components each module contains and how they interact with each other. The only things not shown are the NMI signal that the PPU sends directly to the processor every frame and a similar signal that is sent to the SPU.

  • CPU: Zilog Z80, clocked at 10 MHz.
  • CPU-ROM: 8 KB EEPROM, contains the bootloader code.
  • CPU-RAM: 128 KB of RAM (of which 56 KB are used), contains code and data for programs and games.
  • IO MCU: ATmega324, serves as the interface between the CPU and RS-232, the PS/2 keyboard, the game controllers and the SD card file system.
  • PPU-RAM: 4 KB dual-port RAM, the interface RAM between the CPU and the PPU.
  • CHRRAM: 128 KB of RAM, contains custom background tiles and sprite graphics (each character is 8x8 pixels).
  • VRAM1, VRAM2: 128 KB of RAM each (43,008 bytes used), store the frame buffer; they are written by the PPU and read by the VPU.
  • PPU (Picture Processing Unit): ATmega1284, draws a frame and writes it to the frame buffer.
  • VPU (Video Processing Unit): ATmega324, reads the frame buffer and generates the RGB and PAL signal.
  • SPU-RAM: 2 KB dual-port RAM, serves as the interface between the CPU and the SPU.
  • SNDRAM: 128 KB of RAM, contains PWM patches, PCM samples and instruction blocks for frequency modulation synthesis.
  • YM3438: Yamaha frequency modulation synthesis chip.
  • SPU (Sound Processing Unit): ATmega644, generates PWM-based sound and controls the YM3438.

Final characteristics
CPU:
  • 8-bit Zilog Z80 CPU running at 10 MHz;
  • 8 KB of ROM for the bootloader;
  • 56 KB of RAM.

Input/output (I/O):
  • reading data from FAT16/FAT32-formatted SD cards;
  • reading from and writing to the RS-232 port;
  • two Mega Drive / Genesis compatible game controllers;
  • PS/2 keyboard.

Video:
  • resolution 224 × 192 pixels;
  • 25 frames per second;
  • 256 colors (RGB332 scheme);
  • a 2x2 virtual background space (448x384 pixels) with pixel-by-pixel scrolling in both directions, described by four name tables;
  • 64 sprites, each 8 or 16 pixels in width and height, which can be flipped both vertically and horizontally;
  • background and sprites composed of 8 × 8 pixel characters;
  • character RAM with 1024 background and 1024 sprite characters;
  • up to 64 independent horizontal background scroll values on custom lines;
  • up to 8 independent vertical background scroll values on custom lines;
  • a 224 × 48 pixel overlay panel, with or without transparency;
  • attribute table for the background;
  • RGB and PAL composite output via the SCART connector.

Sound:
  • PWM-generated 8-bit 4-channel audio with predefined waveforms (square, sine, sawtooth, noise, and so on);
  • 8-bit, 8 kHz PCM samples on one of the PWM channels;
  • YM3438 frequency modulation synthesis chip with updatable instructions at 50 Hz.

Development of software for the console
The first piece of software written for the console was the bootloader. It is stored in the processor's ROM and occupies up to 8 KB. It also uses the first 256 bytes of the processor's RAM. The bootloader is the first software the processor runs. Its purpose is to list the programs available on the SD card. These programs are stored in files that contain compiled code and can also contain custom graphics and sound data.
After a program is selected, it is loaded into the processor's RAM, the character RAM and the sound module RAM, and then executed. The code of a program can take up to 56 KB of memory, minus the first 256 bytes; of course, you also need to account for the stack and leave room for data.
Both the bootloader and the programs for this console are developed in a similar way. I will briefly explain how I made them.

Memory and I/O mapping
When developing for the console, special attention must be paid to how the CPU accesses the other modules, so the memory and I/O maps are critical.
The processor accesses the bootloader ROM and its RAM through the memory space. The memory map looks like this.
It accesses PPU-RAM and SPU-RAM, as well as the IO MCU, through the I/O space. The processor's I/O map is as follows.
Within the I/O map, the IO MCU, PPU and SPU each have their own specific addresses.

PPU management
We control the PPU by writing to PPU-RAM, and PPU-RAM, as we know from the table above, is accessed through the I/O space at addresses 1000h to 1FFFh.
This is what that address range looks like in more detail.

PPU Status can take the following values:
  • 0 - built-in graphics mode;
  • 1 - custom graphics mode;
  • 2 - character RAM write mode;
  • 4 - writing finished, waiting for confirmation from the CPU.

And here is an example of how to work with sprites. The console can render up to 64 sprites at a time. Information about these sprites is passed through addresses 1004h to 1143h (320 bytes), with 5 bytes of information per sprite (5 × 64 = 320 bytes):
  1. Miscellaneous byte (each bit is a flag: Active, Flipped_X, Flipped_Y, PageBit0, PageBit1, AboveOverlay, Width16 and Height16).
  2. Character byte (which character the sprite uses on the page selected by the page flags of the miscellaneous byte).
  3. Color key byte (which color is treated as transparent).
  4. Horizontal position byte (X axis).
  5. Vertical position byte (Y axis).

So, to make a sprite visible, we have to set its Active flag to 1 and give it a position where it can be seen (the coordinates x = 32 and y = 32 place a sprite in the upper left corner of the screen; with smaller values of x and y the sprite is partly or fully off screen). We can then assign a character to it and choose which color of the sprite is transparent.
For example, to make the tenth sprite visible, we set I/O address 4145 (1004h + (5 × 9)) to 1. Then we set the coordinates of the sprite, say x = 100 and y = 120, by setting address 4148 to 100 and address 4149 to 120.
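The address arithmetic above can be captured in a small C helper. The base address and the five-byte layout come from the description above; the constant and function names are made up for this sketch, and only the Active flag is defined because the bit positions of the other flags are not stated.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PPU_SPRITES  = 0x1004,  /* I/O address of the first sprite entry */
    SPRITE_BYTES = 5,       /* bytes per sprite entry */
    SPRITE_COUNT = 64
};

/* Byte offsets inside one sprite entry, in the order listed above. */
enum { SPR_FLAGS = 0, SPR_CHR = 1, SPR_COLORKEY = 2, SPR_X = 3, SPR_Y = 4 };

/* Writing 1 to the flags byte activates a sprite, so Active is bit 0. */
enum { SPR_ACTIVE = 1 << 0 };

/* I/O address of one field of sprite `index` (0-based, so the tenth sprite is 9). */
static uint16_t sprite_field_addr(uint8_t index, uint8_t field)
{
    assert(index < SPRITE_COUNT && field < SPRITE_BYTES);
    return (uint16_t)(PPU_SPRITES + index * SPRITE_BYTES + field);
}
```

For the tenth sprite this gives 4145 for the flags byte, 4148 for X and 4149 for Y, the same addresses as in the example above. The last byte of the table, the Y position of sprite 63, lands on 1143h.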

Code in assembler
One way to write a program for our console is to use assembly language.

Below is an example of code that makes the first sprite move and bounce off the edges of the screen:
Code:
ORG 2100h

PPU_SPRITES: EQU $1004
SPRITE_CHR: EQU 72
SPRITE_COLORKEY: EQU $1F
SPRITE_INIT_POS_X: EQU 140
SPRITE_INIT_POS_Y: EQU 124

jp main

DS $2166-$     ; pad with empty bytes so that the nmi routine starts at address 2166h
nmi:
    ld bc, PPU_SPRITES + 3
    ld a, (sprite_dir)
    and a, 1
    jr z, subX
    in a, (c) ; increment X
    inc a
    out (c), a
    cp 248
    jr nz, updateY
    ld a, (sprite_dir)
    xor a, 1
    ld (sprite_dir), a
    jp updateY
subX:
    in a, (c) ; decrement X
    dec a
    out (c), a
    cp 32
    jr nz, updateY   
    ld a, (sprite_dir)
    xor a, 1
    ld (sprite_dir), a
updateY:
    inc bc
    ld a, (sprite_dir)
    and a, 2
    jr z, subY
    in a, (c) ; increment Y
    inc a
    out (c), a
    cp 216
    jr nz, moveEnd
    ld a, (sprite_dir)
    xor a, 2
    ld (sprite_dir), a
    jp moveEnd
subY:
    in a, (c) ; decrement Y
    dec a
    out (c), a
    cp 32
    jr nz, moveEnd
    ld a, (sprite_dir)
    xor a, 2
    ld (sprite_dir), a
moveEnd:
    retn       ; return from the non-maskable interrupt

main:
    ld bc, PPU_SPRITES
    ld a, 1
    out (c), a  ; Set Sprite 0 as active
    inc bc
    ld a, SPRITE_CHR
    out (c), a  ; Set Sprite 0 character
    inc bc
    ld a, SPRITE_COLORKEY
    out (c), a  ; Set Sprite 0 colorkey
    inc bc
    ld a, SPRITE_INIT_POS_X
    out (c), a  ; Set Sprite 0 position X
    inc bc
    ld a, SPRITE_INIT_POS_Y
    out (c), a  ; Set Sprite 0 position Y
mainLoop:   
    jp mainLoop

sprite_dir: DB 0

Using the C tools
You can also write console programs in C using the SDCC compiler or other custom tools. Development goes faster this way, although the performance of the code, of course, drops.

As an example, here is C code that performs the same task as the assembly code above. To make access to the PPU easier, I used a library here.
Code:
#include <console.h>

#define SPRITE_CHR 72
#define SPRITE_COLORKEY 0x1F
#define SPRITE_INIT_POS_X 140
#define SPRITE_INIT_POS_Y 124

struct s_sprite sprite = {1, SPRITE_CHR, SPRITE_COLORKEY, SPRITE_INIT_POS_X, SPRITE_INIT_POS_Y};
uint8_t sprite_dir = 0;

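/* Called once per frame from the non-maskable interrupt. */
void nmi() {
    /* Bit 0 of sprite_dir: 1 = moving right, 0 = moving left. */
    if (sprite_dir & 1)
    {
        sprite.x++;
        if (sprite.x == 248)
        {
            sprite_dir ^= 1;
        }
    }
    else
    {
        sprite.x--;
        if (sprite.x == 32)
        {
            sprite_dir ^= 1;
        }
    }

    /* Bit 1 of sprite_dir: 1 = moving down, 0 = moving up. */
    if (sprite_dir & 2)
    {
        sprite.y++;
        if (sprite.y == 216)
        {
            sprite_dir ^= 2;
        }
    }
    else
    {
        sprite.y--;
        if (sprite.y == 32)
        {
            sprite_dir ^= 2;
        }
    }

    /* Send the updated sprite to the PPU. */
    set_sprite(0, sprite);
}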

void main() {
    while(1) {
    }
}

Custom graphics
The console has built-in read-only graphics that are stored in the PPU firmware (one tile page for the background and one graphic page for the sprites). However, custom graphics can also be used for programs.
The goal is to convert all the necessary graphics into binary form, in which the console's bootloader can load them into character RAM. To do this, I started with several images of the right size; in this case, they are backgrounds for several game situations.
Since the custom graphics consist of four pages of 256 characters of 8x8 pixels for the background and four such pages for the sprites, I used a special tool to convert the graphics from these images into one PNG file per page (discarding duplicate characters).
And then I used another tool to convert the result to a binary with 8x8 pixels in RGB332 color scheme.
The result is binary files consisting of 8x8 pixel characters (characters in memory are contiguous, each occupying 64 bytes).
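Since the characters are stored back to back at 64 bytes each, finding a pixel in such a binary is simple arithmetic. The C sketch below assumes that pixels inside a character are stored row by row and that pages follow one another in order; neither detail is stated above, and the names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum {
    CHR_W = 8,
    CHR_H = 8,
    CHR_BYTES = CHR_W * CHR_H,  /* 64 bytes: one RGB332 byte per pixel */
    CHRS_PER_PAGE = 256,
    CHR_PAGES = 4
};

/* Byte offset of pixel (x, y) of character `chr` on page `page`.
   Row-major pixel order and consecutive pages are assumptions. */
static uint32_t chr_pixel_offset(uint8_t page, uint8_t chr, uint8_t x, uint8_t y)
{
    assert(page < CHR_PAGES && x < CHR_W && y < CHR_H);
    uint32_t index = (uint32_t)page * CHRS_PER_PAGE + chr;
    return index * CHR_BYTES + (uint32_t)y * CHR_W + x;
}
```

Four pages of 256 characters take 4 × 256 × 64 = 65,536 bytes, so the background and sprite sets together fill exactly the 128 KB of character RAM.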

Sound
Sound wave samples are converted into 8-bit, 8 kHz PCM samples. PWM sound effects and music patches can be composed from predefined instructions. As for the Yamaha YM3438 frequency modulation synthesis chip, I found the DefleMask application for it. DefleMask produces PAL-synced music for the Genesis's YM2612 sound chip, which is compatible with our YM3438.
DefleMask converts music to VGM format, and then I use another special tool to turn VGM into a homemade audio binary.
Binaries with all three types of sounds are combined into one binary file, which the loader can then load into the sound module RAM (SNDRAM).

Putting it all together
The program binary, graphics and sound must be combined into a single PRG file. The PRG file has a header that says whether the program uses custom graphics and/or sound and how big each of these components is. It also contains all the other relevant binary data.
Then the resulting file can be placed on an SD card, the console loader will read it from there, send all the necessary information to the appropriate RAM and launch the program.

Using an emulator for development
To make it easier to develop software for the console, I wrote an emulator in C++ using wxWidgets. To emulate the processor, I used the libz80 library.
I added some debugging features to the emulator. In particular, I can stop at a breakpoint and step through the assembly instructions one at a time. The source code is also shown alongside if the program was compiled from C. As for the graphics, I can inspect what is stored in the tile pages and in the name tables (the background, laid out as four screens), as well as what is in the character RAM (CHRRAM).
Here's an example of how to run a program on an emulator and use some debugging tools.

Demonstrating the operation of the console
The videos in this section were captured from a CRT screen with a phone camera. I am sorry that the quality is not very high.
Launch with BASIC and a PS/2 keyboard. In this video, right after creating the first program, I write commands directly to the RAM of the graphics module (PPU-RAM) through the I/O space to enable and configure a sprite, and at the end I move it.
Demonstration of graphics capabilities. This video shows a program that displays 64 sprites of 16x16 pixels, a scrolling custom background, and an overlay panel that moves up and down, both in front of and behind the sprites.
The sound demo shows what the YM3438 can do when combined with playing PCM samples. Frequency modulation music along with PCM samples in this demo takes up almost all of the 128KB RAM of the sound module.
Tetris using almost exclusively background tiles for graphics, YM3438 for music, and PWM patches for sound effects.

Conclusion
This project has been a real dream come true. I have been working on it for several years, in my spare (and not so spare) time. I never thought I would get this far in trying to build my own retro game console. Of course, it is not perfect: I am still far from being an electronics expert. There are too many components in the console, and it could undoubtedly be made better and more efficiently, as someone reading this text surely thinks. However, while working on this project, I learned a lot about electronics, game consoles and computer design, assembly language and other interesting topics. And on top of that, I got great satisfaction from playing a game I made myself on hardware that I also designed and built myself.

I plan to build other consoles and computers as well. In fact, I have almost finished another game console. This is a simplified retro-style console based on a cheap FPGA board and a few other components (but obviously there are not as many of them as in the first project). It was originally conceived as cheap and replicable.
 