Programming for the NES

Posted by Astryl on Oct. 23, 2013, 6:39 a.m.

Copy-pasted from a hidden fortress deep under the sea

The NES is one of my favorite consoles; I grew up with it, and still have two clones and a boatload of cartridges sitting around on my shelves.

Of course, now that I have knowledge of the arcane and whatnot, I decided to actually do a bit of digging into the process of creating games for the NES.

My interest was initially piqued after I watched a video by the creator of Retro City Rampage, who ported his game to the NES (And managed to leave it more or less completely intact).

For three years now, I've been making music for the NES hardware, playing games on it, and designing my games after the same 'style' as popular NES titles. Actually creating a game for the machine is fun (And challenging). I'll write down my discoveries here, and other general information.

What I'm trying to accomplish

Basically, it started with this:

For this game, I used a portion of the NES palettes (More info on that below), and obviously began designing music in Famitracker.

But then a friend saw the game, and wondered aloud if it would be possible to do this on actual NES hardware.

As per usual, I couldn't resist a challenge, so I began my foray into the world of the 2A03.

What I have accomplished so far

More or less, I have everything initialized, I have my CHR table loaded, and I'm displaying a large sprite that moves across the screen:

The repeating pattern in the background is due to the nametable being zeroed out. More on that later.

The 2A03

The CPU at the core of the NES is not, as most people think, a stock 6502; it is the Ricoh 2A03. Many people think of the 2A03 as just the "music chip". They're right, in a way, but the 2A03 is also the main processor.

Of course, it is still a 6502, just modified in several ways.

The 2A03 is a rare type of processor, pairing both the APU and CPU functionality on one die.

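It has one accumulator (A), two index registers (X and Y), and operates on 8-bit binary values (the 6502's decimal mode is disabled). On an NTSC console the master clock runs at about 21.48 MHz; dividers bring this down to about 1.79 MHz for the CPU (master / 12) and about 5.37 MHz for the PPU (master / 4).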
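As a rough sketch of the divider arithmetic, assuming the commonly documented NTSC figures (a master clock of 21.477272 MHz, divided by 12 for the CPU and by 4 for the PPU). The function and constant names are my own, purely for illustration:

```c
#include <assert.h>

/* NTSC master crystal frequency in Hz (assumed value, about 21.48 MHz). */
#define NTSC_MASTER_HZ 21477272.0

/* Assumed dividers for the two main clock domains on an NTSC console. */
enum { CPU_DIVIDER = 12, PPU_DIVIDER = 4 };

/* Divide the master clock by a fixed divider to get a derived clock rate. */
double derived_clock_hz(double master_hz, int divider)
{
    assert(divider > 0);
    return master_hz / divider;
}
```

Calling derived_clock_hz(NTSC_MASTER_HZ, CPU_DIVIDER) gives roughly 1.79 MHz, and the PPU divider gives roughly 5.37 MHz, so the PPU gets three ticks for every CPU cycle.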

The processor is more or less the same as the Commodore 64's 6510 (itself a 6502 variant), with the exception of the added APU. Bank switching is not part of the 2A03; it is handled by mapper hardware on the cartridge.

The PPU

The PPU, or "Picture Processing Unit", was a Ricoh RP2C02 (NTSC) or RP2C07 (PAL), and was really advanced for its time.

Before the NES, many home machines had the CPU do most of the graphics work. Pixels were drawn manually into a framebuffer, so VRAM was the limiting factor (a 320x240 screen at only 8 bits per pixel already needs 76,800 bytes, or 75 KB, of RAM, something that was impossibly expensive for a console back then).

The NES used a few tricks to not only speed up graphics processing, but increase the resolution beyond what many contemporary machines had available to them (those being the Commodore series, the Atari machines and the ZX Spectrum).

The 'native' resolution of a NES console is 256x240, a really nice, nearly 'square' aspect ratio (256/240 is about 1.07).

To handle graphics in so little memory, the console used several concepts that became staples in the Nintendo product stable (up to the DS).

The first of these was the pattern table, an 8KB region (usually CHR ROM on the cartridge) where the game's graphics were stored. It is split into two 4KB tables, each capable of holding 256 tiles of 8x8 pixels at 2 bits per pixel (4 colors, 16 bytes per tile). Often, one table would be used to store sprites, the other for storing backgrounds.
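To make the tile format a little more concrete, here is a minimal sketch of reading one pixel out of a tile. It assumes the commonly documented layout: 16 bytes per tile, stored as two 8-byte bit planes, with the leftmost pixel in bit 7. The names are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

enum { TILE_BYTES = 16, TILES_PER_TABLE = 256 };

/* Return the 2-bit color index (0-3) of pixel (x, y) in one 16-byte tile. */
uint8_t tile_pixel(const uint8_t tile[TILE_BYTES], int x, int y)
{
    assert(x >= 0 && x < 8 && y >= 0 && y < 8);
    uint8_t lo = (tile[y] >> (7 - x)) & 1;      /* bit plane 0: low bit  */
    uint8_t hi = (tile[y + 8] >> (7 - x)) & 1;  /* bit plane 1: high bit */
    return (uint8_t)(lo | (hi << 1));
}
```

With 16 bytes per tile, 256 tiles come to exactly 4096 bytes, which is where the 4KB table size comes from. The value returned is only an index; the actual color comes from whichever palette the tile is drawn with.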

The next piece of the puzzle is the nametable. This is basically an array of 32x30 bytes (960 in total), each one containing a tile reference, followed by a 64-byte attribute table that selects palettes. It's usually used to draw anything that isn't a sprite.
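A small sketch of how a tile position maps to a nametable address, assuming the commonly documented layout of 32 columns by 30 rows, stored row by row, with the first nametable at PPU address $2000. The helper name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

enum {
    NT_COLS = 32,       /* tiles per row */
    NT_ROWS = 30,       /* rows of tiles */
    NT_BASE_0 = 0x2000  /* assumed PPU address of the first nametable */
};

/* PPU address of the nametable byte for the tile at (col, row). */
uint16_t nametable_addr(uint16_t base, int col, int row)
{
    assert(col >= 0 && col < NT_COLS && row >= 0 && row < NT_ROWS);
    return (uint16_t)(base + row * NT_COLS + col);
}
```

The last tile on screen, nametable_addr(NT_BASE_0, 31, 29), lands on $23BF; the attribute bytes start right after it at $23C0.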

In my screenshot above, the nametable is filled with zeroes, and tile zero is the top-left of the player sprite. So the background is filled with that.

Sprites are handled in a separate way, utilizing OAM (Object Attribute Memory). Each piece of "Object Memory" is basically a structure that looks like this in C:

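#include <stdint.h>

struct OAM_Entry
{
    uint8_t y;          /* Y position */
    uint8_t tile;       /* Tile index into the pattern table */
    uint8_t attributes; /* Palette, priority and flip bits */
    uint8_t x;          /* X position */
};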

This is the layout as it exists in memory. The PPU can handle up to 64 sprites on screen at a time, with up to eight per scanline (though tricks exist to work around this limit, such as cycling which sprites get drawn each frame).
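To illustrate the per-line limit, here is a simplified sketch that counts how many sprites cover a given line. It ignores hardware details (for instance, the PPU draws a sprite one line below its stored Y value), and the function names are my own:

```c
#include <assert.h>
#include <stdint.h>

enum { OAM_SPRITES = 64, MAX_PER_SCANLINE = 8 };

struct OAM_Entry {
    uint8_t y;
    uint8_t tile;
    uint8_t attributes;
    uint8_t x;
};

/* Count the sprites that cover `scanline`; `height` is 8 or 16 pixels. */
int sprites_on_scanline(const struct OAM_Entry *oam, int count,
                        int scanline, int height)
{
    int hits = 0;
    assert(oam != 0 && count >= 0 && count <= OAM_SPRITES);
    for (int i = 0; i < count; i++) {
        int top = oam[i].y;
        if (scanline >= top && scanline < top + height)
            hits++;
    }
    return hits;
}

/* Sprites beyond the eighth on a line are simply not drawn. */
int sprites_dropped(int hits)
{
    return hits > MAX_PER_SCANLINE ? hits - MAX_PER_SCANLINE : 0;
}
```

A check like this could be run on the development PC to spot enemy layouts that would flicker or vanish on real hardware.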

Each tile is 8x8 pixels, so to create my player sprite, I needed 8 tiles (Player sprite is 16x32). There are ways to reduce this, but I haven't written the code to handle this yet.

Anyway, this means that my player alone takes up 8 of the 64 sprites available. If I reduce it by using the 8x16 sprite mode, I can cut away four of those (and thus leave more sprite slots free for other large sprites).
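The bookkeeping for a multi-tile sprite can be sketched like this. It is only an illustration: it assumes the sprite's tiles are stored row by row starting at first_tile (a hypothetical parameter), uses palette 0 with no flipping, and the function name is made up:

```c
#include <assert.h>
#include <stdint.h>

struct OAM_Entry {
    uint8_t y;
    uint8_t tile;
    uint8_t attributes;
    uint8_t x;
};

/* Fill `out` with the 8x8 sprites that make up one larger sprite.
   Returns the number of OAM entries written. */
int build_metasprite(struct OAM_Entry *out, int max_entries,
                     uint8_t x, uint8_t y,
                     int tiles_wide, int tiles_high, uint8_t first_tile)
{
    int count = tiles_wide * tiles_high;
    assert(out != 0 && count <= max_entries);
    for (int row = 0; row < tiles_high; row++) {
        for (int col = 0; col < tiles_wide; col++) {
            struct OAM_Entry *e = &out[row * tiles_wide + col];
            e->y = (uint8_t)(y + row * 8);
            e->tile = (uint8_t)(first_tile + row * tiles_wide + col);
            e->attributes = 0;  /* palette 0, no flipping */
            e->x = (uint8_t)(x + col * 8);
        }
    }
    return count;
}
```

For a 16x32 player, build_metasprite(oam, 64, px, py, 2, 4, 0) writes 8 entries, matching the count above.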

The OAM is copied to the PPU at the beginning of a vblank period (Time when the beam moves from the bottom of the screen back to the top in a traditional TV/Monitor).

To do this, I use DMA: writing the high byte of a 256-byte RAM page to $4014 copies that whole page into OAM.

The APU

I haven't written much code dealing with the APU yet, though I have managed to 'pulse' it (Make a tonal noise).

There are many things to consider, though, when I do write my code for handling sound and music.

Naturally, I will be wanting to use either Famitracker or MML to create my music and sound effects. The problem lies in the fact that the "NSF" format isn't the "Native" NES sound format as some people believe.

It's the NES Sound Format, which was created as a unified way to store ripped music and sound data (along with the code that plays it) for emulators and players.

For my music, I have to create my own 'mixer' code, and define a system of reading data and playing it. So basically, anything goes; if I'm reading from my music banks and encounter a 0x32 byte, it could mean anything.

I'm probably going to make an opcode based system, where a specific byte is an 'op', and the next byte is the argument. This will allow me to implement most of the effects I use in Famitracker.
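Here is a minimal sketch of what such an op/argument stream could look like, written in C so it can be tested on a PC first. The opcodes, the ChannelState fields and the function name are all invented for illustration; a real engine would map them onto the APU registers and handle one wait per frame rather than just adding them up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; a real set would mirror the Famitracker effects used. */
enum { OP_END = 0x00, OP_NOTE = 0x01, OP_VOLUME = 0x02, OP_WAIT = 0x03 };

typedef struct {
    uint8_t note;     /* last note index set by OP_NOTE */
    uint8_t volume;   /* last volume set by OP_VOLUME */
    unsigned frames;  /* total frames of delay requested by OP_WAIT */
} ChannelState;

/* Run an (op, argument) byte stream until OP_END or the end of the buffer.
   Returns the number of bytes consumed. */
size_t run_stream(const uint8_t *data, size_t len, ChannelState *ch)
{
    size_t pos = 0;
    assert(data != NULL && ch != NULL);
    while (pos + 1 < len && data[pos] != OP_END) {
        uint8_t op = data[pos];
        uint8_t arg = data[pos + 1];
        switch (op) {
        case OP_NOTE:   ch->note = arg;    break;
        case OP_VOLUME: ch->volume = arg;  break;
        case OP_WAIT:   ch->frames += arg; break;
        default:        break;  /* unknown op: skip it and its argument */
        }
        pos += 2;
    }
    return pos;
}
```

Keeping every op the same size (two bytes) makes the reader trivial, at the cost of wasting a byte on ops that don't need an argument.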

I can hook my sound into the NMI (Non Maskable Interrupt), and process my music after the frame has rendered.

A bit of code

I originally started out on my game using CC65, a free compiler that was originally designed with the Commodore 64 in mind, but now has an NES expansion.

At first, it was great (And it allowed me to use C code), but problems with memory and banking made me move over to NESASM3 (An assembler specifically designed for producing NES ROM Code).

That means no C for me, unless I set up a convoluted compilation system where CC65 produces NESASM compatible assembly, then I assemble it with that.

Though honestly, I'm enjoying the assembly. Also, C isn't the best choice for the NES; speed is an issue, and CC65 doesn't produce excellent code.

I'm including my initialization code here for anybody to take a peek at:

; This is the 'iNES' header. Used by emulators to determine what 'virtual cartridge' to use.
; This would have absolutely no effect on an actual NES, where you'd have to use a cartridge board that
; matches these settings (In this case, 16KB of PRG ROM, 8KB of CHR ROM, no mapper chip, and vertical mirroring)
    .inesprg 1
    .ineschr 1
    .inesmap 0
    .inesmir 1

; Memory bank 0, the first 8KB of PRG ROM
    .bank 0
    .org $C000

    ; Include my palette data here
data_palette:
    .incbin "./palette_sprites.pal"
    .incbin "./palette_bg.pal"

initialize:
    .include "./src/vectors.asm"  ; Includes code for RESET, IRQ and NMI vectors.
    
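    ; Load the palettes first: $2007 should only be written while rendering is off.
load_palettes:
    lda $2002       ; Reset the PPU address latch
    lda #$3F
    sta $2006       ; PPU address high byte
    lda #$10
    sta $2006       ; PPU address low byte ($3F10, the sprite palettes)
    
    ldx #$00
load_palette_loop:
    lda data_palette, x
    sta $2007
    inx             ; Advance to the next palette entry
    cpx #$20
    bne load_palette_loop  ; Jump back to loop beginning if we haven't processed all 32 palette entries.

    ; Initialize PPU
    ; Enable PPU NMI
    lda #%10000000
    sta $2000
    ; Enable Sprite and Background rendering
    lda #%00011000
    sta $2001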

game_loop:
game_loop_vblank_1:
    bit $2002
    bpl game_loop_vblank_1
    
    ; Anything drawing related would go here

    ; Loop forever
    jmp game_loop

; Memory bank 1, the upper 8KB of the ROM.
; Need to set up vectors here.
    .bank 1
    .org $FFFA
    
    .dw NMI
    .dw RESET
    .dw IRQ

; Memory bank 2, CHR ROM
    .bank 2
    .org $0000

    .incbin "./chr/tiles.chr"

And with that, I'll leave this for now. I'll probably update this with little bits of information here and there, and more progress screenshots (And eventually videos) at the top.

Comments

SleepinJohnnyFish 10 years, 11 months ago

I've programmed directly onto a Gameboy Advance and this post was still a bit much for me. This is dedication.

Astryl 10 years, 11 months ago

@SleepinJohnnyFish I did that a while back too. The GBA has a similar system for handling sprites (Using an OAM and DMA transfer, as well as tiles).

The main difference here is the jump from ARM architecture, and the potential use of a C compiler, to working with 6502 code, and using plain assembly.

Though I will say, using assembly is oddly refreshing…

@Cyrus That's what I meant, really. I keep mixing these things up.

I'd love to try some effects like PureSabe is pulling off. Make a bullet hell or something.

Thanks for the doc link, missed that one :P

SleepinJohnnyFish 10 years, 11 months ago

I miss Assembly. I've used it at 0 of my actual jobs during my career, sadly.

Astryl 10 years, 11 months ago

It's been more or less phased out in mainstream programming. You'll rarely (If ever) see inline or linked assembly code in games, applications or anything really. It's a pity, since it's still the way to get the most speed out of the CPU.

Here we are, speaking of milking the NES for all it's worth; but what would happen if we tried the same with the modern PC? (Well, not accounting for the tons of demos out there already).

SleepinJohnnyFish 10 years, 11 months ago

Well, with certain technology coming out, high efficiency might become a big thing again. If something like the Oculus Rift or CastAR becomes massively popular in the future, people might start considering the difference some highly efficient code in certain areas of rendering could make…

One can only hope.

Astryl 10 years, 11 months ago

Well, just think of the typical Demo. Some of the stuff that Demo coders pull off in real-time gives Crytek a run for their money.

mrpete 10 years, 11 months ago

Quote: CyrusRoberto
I can only imagine this happening with consoles as they had pretty unique hardware configurations (or at least this generation and earlier), and the longer these consoles are on the market, the more time that developers will spend tweaking their code (which I always assumed was down to the ASM level). This is why the graphics you see on a console in its later years are always much better than in its earlier years, which is a trend I've seen from the Sega Genesis all the way to the PS3.

Though people have been working with x86 ASM for nearly 30 years, so this upcoming Gen of consoles might surprise me.

All console and arcade game development was done in assembler, mostly referred to back in the day as simply machine code (or machine language, though that's a misnomer for the actual programming convention). C wouldn't have been used because of the extra procedures and >MEMORY< needed (libraries, includes, etc.) before even coding. Plus, regardless of today's popularity, C is just another abstraction layered over the already existing low-level state communicating with the hardware (in other words, it's just another programming method). Hardware wasn't as diverse as many people might think in this contemporary age, though there were minute differences. It wasn't anything that a data sheet or a reference to the hardware memory map wasn't going to fix (which was a necessity unless one wanted to go mapping), though it required training time between actual development. One really got a feel for the processor and hardware they were developing on.

@ALL: I think the biggest thing that most people on this blog are missing is accessibility!!!! Meaning, there was NO INTERNET (more like MODEM). NINTENDO, SEGA, ATARI, CAPCOM, SNK, etc. tightly controlled all information pertaining to their interests, regardless of whether it was a console or an arcade machine (which had more memory, more expansion and nicer color palettes). Shit, getting 32 colors was a treat and getting 256 was the future (lol), especially if game development was the main focus.

Quote: Mega
load_palettes:
    lda $2002
    lda #$3F
    sta $2006
    lda #$10
    sta $2006

    ldx #$00
load_palette_loop:
    lda data_palette, x
    sta $2007
    cpx #$20
    bne load_palette_loop ; Jump back to loop beginning if we haven't processed all 32 palette entries.

; Enable PPU NMI
    lda #%10000000
    sta $2000

game_loop:
game_loop_vblank_1:
    bit $2002
    bpl game_loop_vblank_1

@Mega - Nice code. :] I think the coolest CPU that I ever worked with was the 68k series by Motorola. The thinking by that time was to give CPUs richer features that would make low-level programming more human-readable and much easier.

It's always nice to see people appreciate the past enough to want to know, regardless of what generation one is born into…

Astryl 10 years, 11 months ago

@mrpete Thanks.

With regards to the accessibility though, it's not as if it was a poorly documented circuit, and there were programming magazines by the dozen that had columns full of user code and tricks for all the 65XX series (And 68k series) devices, on both the professional and hobbyist levels.

The main reason people didn't start really pushing the NES until later in its lifetime (like HAL did with Kirby's Adventure) was because earlier on the various developers were targeting multiple consoles, and trying to push games out at a frenetic pace (more games is more money, or so they thought).

After a while, things slowed down and developers started really pushing their code to the limits.

Jabberwock 10 years, 11 months ago

Neat! I love that there are people doing stuff like this.

svf 10 years, 10 months ago

I've always been curious about making games for NES.

Mega, you are the only person I know who pursues interests with great zeal. I wish I had the will to do something just because.