NES Remote Procedure Call Library
---------------------------------
This library implements a remote procedure call mechanism for the
Nintendo NES. It allows a connected host PC to call routines on the NES,
using a high-level C API. Standard routines are provided for writing and
reading anywhere in CPU and PPU memory, including mappers. Custom
routines can be called as well, with their code either generated on-the-fly
by the host, or assembled into a routine library and called by name.

Using this library, flexible tools can be written that load programs
into a rewritable cartridge (RAM, Flash memory, etc.), or run small
programs out of NES RAM that probe NES hardware or the cartridge.
Programs can be chained, so that the NES doesn't have to be reset
between each run. Rather than have a large, complex routine on the NES
that can do everything, the task is broken into small pieces which are
put together by C code on the host side, where it's simpler to adjust
what pieces get put together, and in what order. Even the standard
routines are implemented this way, so that the set of routines available
is completely flexible.

The host connects to the NES via RS-232 serial, using a link cable to
the second controller port. The cable merely converts between TTL and
RS-232, while the NES implements a software-based UART to do the serial
encoding/decoding. Data is transferred at 5.8KB/sec, which has been
reasonably fast and reliable over years of use. All data is checksummed
before use, so that any errors are caught before they cause erratic
operation. A small loader program resides in the first 0.5KB of NES RAM,
leaving the remaining 1.5KB free for routines. The host loads each
routine just before calling it, so that each one has the full 1.5KB to
use for code and data.
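
The 5.8KB/sec figure follows from the serial settings used throughout
this document (57600 bps, 8 data bits, 1 stop bit, no parity): each byte
costs 10 bits on the wire once the start and stop bits are counted. A
small sketch of the arithmetic follows; the function names are
illustrative and not part of the library.

```c
/* Bytes per second for an asynchronous serial link.
   bits_per_frame = start bit + data bits + stop bits (10 for 8-N-1). */
double serial_bytes_per_sec( long bps, int bits_per_frame )
{
    return (double) bps / bits_per_frame;
}

/* Seconds needed to send size bytes, ignoring checksums, headers, and
   any delay padding. */
double transfer_seconds( long size, long bps, int bits_per_frame )
{
    return size / serial_bytes_per_sec( bps, bits_per_frame );
}
```

At 57600 bps this gives 5760 bytes/sec, so a 32KB program takes roughly
5.7 seconds to send before any overhead is added.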


Contents
--------
* Requirements
* Procedure
* Getting started
* Serial link cable
* Library API
* Calling standard routines
* Receiving from the NES
* Errors
* Calling custom routines
* Defining custom routines
* Routine libraries
* Custom routine environment
* Internal operation
* Notes


Requirements
------------
* Nintendo NES or Famicom, NTSC or PAL
* Programmable cartridge (PowerPak, EPROM cart, etc.)
* Serial cable that connects to second controller port
* Host PC with RS-232 serial port
* C/C++ compiler
* ca65 assembler (optional)


Procedure
---------
* Power up NES with standard boot loader
* Connect to host via serial cable
* Use this library to generate "recording" file
* Send recording to NES
* Receive result of any reads from NES memory

If running another recording afterwards, it might not even be necessary
to reset the NES, depending on whether the last routine returned to the
loader.


Getting started
---------------
* Connect host to NES
    - Connect serial cable between host and second controller port.
    - Configure host for 57600 bps, 8 bits, 1 stop bit, no parity,
      no flow control.

* Run boot loader on NES
    - Put included bootloader.nes on programmable cartridge (PowerPak,
      EPROM cartridge, etc.).
    - Power up NES. It should show ready message.
    - Send make_noise.bin. NES should execute received code and make
      a noise burst. On Linux, do this with the following (replace
      ttyUSB0 with your serial device name):

        stty -F /dev/ttyUSB0 sane
        stty -F /dev/ttyUSB0 raw 57600 cs8 -crtscts -parenb -cstopb
        cat make_noise.bin > /dev/ttyUSB0

* Run demo
    - If using a PAL NES, modify demo.c line 14 to set pal_nes to 1.
    - Compile source code: demo.c, and all source files in nrpc/. On
      Unix, execute the following
        
        g++ -o demo.exe demo.c nrpc/*.c*
    
    - Run demo.exe. This should produce demo.bin.
    - Send demo.bin. NES should print HELLO WORLD message.
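
On POSIX hosts, the same serial settings can also be applied from C
rather than with stty. The following is a minimal sketch using termios.
open_nes_serial() is a hypothetical helper, not part of this library,
and it assumes the platform defines B57600 (Linux and most Unix systems
do).

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Opens a serial device configured for 57600 bps, 8 data bits, no parity,
   1 stop bit, no flow control. Returns file descriptor, or -1 on failure. */
int open_nes_serial( const char* path )
{
    struct termios tio;
    int fd = open( path, O_RDWR | O_NOCTTY );
    if ( fd < 0 )
        return -1;

    if ( tcgetattr( fd, &tio ) != 0 )
    {
        close( fd );
        return -1;
    }

    /* Raw mode: no line editing, echo, or character translation */
    tio.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP | INLCR | IGNCR |
            ICRNL | IXON | IXOFF);
    tio.c_oflag &= ~OPOST;
    tio.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);

    /* 8 data bits, no parity, 1 stop bit */
    tio.c_cflag &= ~(CSIZE | PARENB | CSTOPB);
    tio.c_cflag |= CS8 | CREAD | CLOCAL;
#ifdef CRTSCTS
    tio.c_cflag &= ~CRTSCTS; /* no hardware flow control */
#endif

    tio.c_cc [VMIN]  = 1; /* read() blocks until at least one byte arrives */
    tio.c_cc [VTIME] = 0;

    if ( cfsetispeed( &tio, B57600 ) != 0 ||
         cfsetospeed( &tio, B57600 ) != 0 ||
         tcsetattr( fd, TCSANOW, &tio ) != 0 )
    {
        close( fd );
        return -1;
    }
    return fd;
}
```

The returned descriptor can then be used with write() to send a .bin
file's contents and with read() to receive any data the NES sends back.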


Serial link cable
-----------------
The serial link cable converts between the RS-232 and TTL voltage levels
to allow connecting the NES to the host using the second controller
port. It should use a MAX232, FTDI, or equivalent, inverting the RS-232
levels so that -12V converts to +5V to the NES, and +12V converts to 0V
to the NES, and vice-versa for the NES output. DO NOT connect the NES
directly to RS-232, as the voltage level difference could damage the
NES.

    RS-232                          NES

    TX (-12V/+12V) ---> (+5V/0V)   Data
    
    RX (-12V/+12V) <--- (+5V/0V) Strobe
    
    GND --------------------------- GND

For more information about the serial connection, see the standard NES
boot loader specification:

    http://blargg.parodius.com/nes-code/bootloader/spec.html


Library API
-----------
The library allows the host to generate a file that records one or more
NES routine calls. Several standard routines are provided which allow
reading and writing CPU and PPU memory, and custom routines can also be
defined and called. When the recording is sent to the NES, these routine
calls then occur. Any calls made to routines that read data back from
the NES send the data back to the host.

Use of the library follows this pattern:

    * Create nrpc recording object with nrpc_new().
    * Load the standard "nrpc.lib" with nrpc_load_library().
    * Load any of your own libraries as well.
    * Call routines with nrpc_call*(), nrpc_write_mem(), etc.
    * Save recording to a file with nrpc_save_recording().
    * Delete recording object from memory with nrpc_delete().
    * Send recording to NES over serial cable.
    * Receive any data requested from NES.

Be sure to check the error return value from nrpc_new(),
nrpc_load_library(), and nrpc_save_recording().

See nrpc/nrpc.h for full routine reference.

While the library is written in C++, it has a C API, and makes minimal
use of C++ features. In particular, it doesn't use exceptions or RTTI,
so those can be disabled.


Calling standard routines
-------------------------
Standard routines are provided to read and write CPU and PPU memory.

    Routine           Bus  Action
    - - - - - - - - - - - - - - - - - - - - - - - - - - -
    nrpc_write_byte   CPU  Writes single byte
    nrpc_write_mem    CPU  Writes bytes
    nrpc_fill_mem     CPU  Fills bytes with value
    nrpc_read_mem     CPU  Reads bytes
    
    nrpc_write_ppu    PPU  Writes bytes
    nrpc_fill_ppu     PPU  Fills bytes with value
    nrpc_read_ppu     PPU  Reads bytes
    
    nrpc_write_mmc1   CPU  Writes to MMC1 mapper register

Note that the CPU write routines use indexed write instructions, and
thus do a dummy read at the address or 256 bytes before the address
being written to.
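
This matters when writing to registers that have read side effects. On
the 6502, an indexed store first reads from the address formed by adding
the index to the low byte of the base only, before any carry into the
high byte is applied. The sketch below computes that address; the
function name is illustrative, and which base and index the library
routines actually use for a given write is not covered here.

```c
/* Address of the dummy read performed by a 6502 indexed store
   (e.g. STA base,X) before the actual write to base + index.
   The high byte is not yet adjusted for carry, so when the addition
   crosses a page, the dummy read lands 256 bytes below the address
   being written. */
unsigned dummy_read_addr( unsigned base, unsigned index )
{
    unsigned low = (base + index) & 0xFF;
    return (base & 0xFF00) | low;
}
```

For example, a store to $8110 done as base $80F0 plus index $20 causes a
dummy read of $8010, while base $8100 plus index $10 causes a dummy read
of $8110 itself.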

The nrpc_write_mem and nrpc_write_ppu routines optimize writing of
blocks that have long runs of the same byte by breaking the writes into
a combination of writes and fills. This is done completely
transparently, so it has no effect on behavior, only performance.
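
As a rough illustration of the idea (not the library's actual code), the
sketch below splits a block into fill segments for long runs of one byte
and write segments for everything else. The segment_t type, the
split_runs() name, and the min_run threshold are all assumptions made
for the example.

```c
#include <stddef.h>

typedef struct {
    int    is_fill; /* 1 = fill with data [offset], 0 = literal write */
    size_t offset;  /* start of segment within the block */
    size_t size;    /* number of bytes covered */
} segment_t;

/* Splits data into write and fill segments. Runs of at least min_run
   identical bytes become fills. Returns number of segments stored in out.
   Returns 0 if size is 0 or if more than max_out segments are needed. */
size_t split_runs( const unsigned char* data, size_t size, size_t min_run,
        segment_t* out, size_t max_out )
{
    size_t count = 0;
    size_t lit_start = 0; /* start of pending literal bytes */
    size_t i = 0;

    while ( i < size )
    {
        /* Measure run of identical bytes starting at i */
        size_t run = 1;
        while ( i + run < size && data [i + run] == data [i] )
            run++;

        if ( run >= min_run )
        {
            /* Flush literal bytes before the run, then emit the fill */
            if ( i > lit_start )
            {
                if ( count == max_out ) return 0;
                out [count].is_fill = 0;
                out [count].offset  = lit_start;
                out [count].size    = i - lit_start;
                count++;
            }
            if ( count == max_out ) return 0;
            out [count].is_fill = 1;
            out [count].offset  = i;
            out [count].size    = run;
            count++;
            lit_start = i + run;
        }
        i += run;
    }

    /* Flush trailing literal bytes */
    if ( size > lit_start )
    {
        if ( count == max_out ) return 0;
        out [count].is_fill = 0;
        out [count].offset  = lit_start;
        out [count].size    = size - lit_start;
        count++;
    }
    return count;
}
```

Each write segment would then map to a write call and each fill segment
to a fill call covering the same address range, so the bytes that end up
in memory are identical either way.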


Receiving from the NES
----------------------
The host can read data back from the NES in a limited fashion. To read
0x8000-0x8FFF back, call

    nrpc_read_mem( nes, 0x8000, 0x1000 );

When a recording of the above is later sent to the NES, it responds with
0x1000 bytes of data.

To verify that the correct data was received, append a CRC with

    nrpc_read_crc( nes );

This will append a 2-byte CRC. Verify the data and CRC with

    char received [0x1002];
    
    ... receive data and CRC ...
    
    if ( nrpc_calc_crc( nes, received, 0x1002, 0 ) != nrpc_correct_crc )
        error( "receive error" );

If it's not convenient to receive the CRC in the same block as the data,
check it separately with

    char received [0x1000];
    char crc [2];
    int check;
    
    ... receive data and crc ...
    
    check = 0;
    check = nrpc_calc_crc( nes, received, 0x1000, check );
    check = nrpc_calc_crc( nes,      crc,      2, check );
    
    if ( check != nrpc_correct_crc )
        error( "receive error" );

A similar approach can also be used if the data itself is in multiple
blocks.

This CRC approach allows choice over whether a CRC is sent at all and
how often it's checked. It allows multiple read requests to be checked
as a unit, or individually, allowing flexibility in how data is
processed by the host.
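
On the host side, a serial read often returns fewer bytes than were
requested, so the data and CRC need to be accumulated before they are
checked. A minimal POSIX sketch of such a receive helper follows.
read_exact() is a hypothetical name, and fd is assumed to be an already
configured serial port descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Reads exactly size bytes from fd into out, looping over short reads.
   Returns 0 on success, -1 on error or if the stream ends early. */
int read_exact( int fd, void* out, size_t size )
{
    char* p = (char*) out;
    while ( size > 0 )
    {
        ssize_t n = read( fd, p, size );
        if ( n < 0 )
        {
            if ( errno == EINTR )
                continue; /* interrupted by signal; retry */
            return -1;
        }
        if ( n == 0 )
            return -1; /* end of stream before all bytes arrived */
        p    += n;
        size -= (size_t) n;
    }
    return 0;
}
```

A buffer filled this way, for instance the 0x1002-byte received array,
can then be passed to nrpc_calc_crc() for checking.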


Errors
------
If an error occurs, the loader sends an 'E' followed by a digit, and
attempts to change the screen color and play a weird sound, then it
stops responding to further routine requests. It still, however,
functions as a standard boot loader.

E1: Data from host to NES failed checksum test, due most likely to
corruption.

E2: Garbage data was received instead of a header. This is most likely
due to needing a longer delay between routines (nrpc_delay).
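
A host tool can watch incoming bytes for this two-byte error report. The
sketch below is illustrative only: the function names are hypothetical,
and it assumes the digit arrives as an ASCII character. Since the same
two bytes could also occur inside data read back from the NES, the check
is only meaningful at points where no read data is expected.

```c
/* Returns error number (1-9) if the two bytes form a loader error report
   ('E' followed by an ASCII digit), or 0 if they don't. */
int parse_loader_error( const unsigned char* bytes )
{
    if ( bytes [0] == 'E' && bytes [1] >= '1' && bytes [1] <= '9' )
        return bytes [1] - '0';
    return 0;
}

/* Describes the error codes documented above */
const char* loader_error_str( int code )
{
    switch ( code )
    {
    case 1:  return "data from host failed checksum";
    case 2:  return "garbage received instead of header";
    default: return "unknown error";
    }
}
```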


Calling custom routines
-----------------------
Custom routines can also be called. A routine's code can either be
provided directly in the call, or assembled into a named routine, put
into a library file, and then called by name. In both cases, the
routine's code is loaded into the NES and then executed.

When calling a routine, four 16-bit parameters are passed into arg0
through arg3 in zero-page. Extra bytes can also be passed and appended
to the routine's code, to allow passing more data to the routine,
without it needing to receive it as a separate data block.

    nrpc_jsr         JSRs to routine; doesn't load anything
    nrpc_call        Calls named routine from library
    nrpc_call_extra  Appends extra data to named routine before calling
    nrpc_call_code   Loads arbitrary code and extra data, then executes

If a routine will take more than a hundred or so cycles to complete, the
host must call one of the nrpc_delay functions. For example, if
"my_routine" may take up to 1000 cycles, do the following:

    nrpc_call( nes, "my_routine", 0, 0, 0, 0 );
    nrpc_delay_cycles( nes, 1000 );

If the routine will be sending N bytes of data back to the host, use
nrpc_delay_bytes().

It's OK if the delay passed is greater than what the routine actually
takes. The delay is necessary because no serial flow control is used. It
inserts padding bytes so that the next routine block from the host
doesn't arrive until the current routine has finished.
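
To get a feel for how much padding a delay needs, the sketch below
estimates the number of padding bytes that cover a given number of CPU
cycles. It is not the library's implementation. It assumes 10 bits per
byte at 57600 bps and the nominal CPU clocks of 1789773 Hz (NTSC) and
1662607 Hz (PAL), and the function name is hypothetical.

```c
/* Estimates how many padding bytes at 57600 bps (8-N-1, 10 bits per byte)
   take at least the given number of CPU cycles to transmit. */
unsigned long long padding_bytes_for_cycles( unsigned long long cycles,
        int pal )
{
    const unsigned long long cpu_hz = pal ? 1662607ULL : 1789773ULL;
    const unsigned long long bytes_per_sec = 57600ULL / 10;

    /* bytes = ceil( cycles * bytes_per_sec / cpu_hz ) */
    return (cycles * bytes_per_sec + cpu_hz - 1) / cpu_hz;
}
```

Under these assumptions one byte lasts about 311 CPU cycles on NTSC, so
a 1000-cycle routine needs about 4 bytes of padding.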


Defining custom routines
------------------------
A custom routine is a piece of code that is loaded into NES memory at a
specific address, then executed.

The simplest way to define a custom routine is to put its code into an
array and pass that to nrpc_call_code(). The following defines a simple
infinite loop that is loaded at address 0x200 and executed there.

    unsigned char code [] = {
        0x4C,0x00,0x02  /* JMP $200 */
    };
    
    nrpc_call_code( nes, 0x200, code, sizeof code, 0x200, NULL, 0,
            0, 0, 0, 0 );

Code can also be assembled into a named routine and put into a file.
This makes it easy to write and update the routine without having to
paste its machine code into the C code. The following source file,
"loop_forever.s", defines an infinite loop.

    .define ROUTINE_NAME "loop_forever"
    
    .include "routine.inc"
    
    main:
        jmp main
    
    .include "routine_end.s"

To assemble it, put it in routines/ and switch to that directory, then
execute the following.

    ca65 -I common loop_forever.s
    ld65 -C routine.cfg -o loop_forever.bin loop_forever.o

This generates the assembled version, loop_forever.bin. This can then be
loaded and used as follows.

    error( nrpc_load_library( nes, "loop_forever.bin" ) );
    nrpc_call( nes, "loop_forever", 0, 0, 0, 0 );

Note that nrpc_load_library() merely loads it into memory in the host
program, not the NES. Regardless of how a routine is defined, when
executed on the NES, it is always (re)loaded just before being executed.


Routine libraries
-----------------
A routine is a named piece of code that can be loaded into the NES at a
specific address and executed. It consists of a name string, a block of
code/data, an address it must be loaded at, and an execution address.

A routine library is zero or more routines concatenated into a file. If
more than one routine in a library is named the same, only the last one
with that name is used.

More than one library can be loaded at once. The behavior of libraries A
and B loaded in that order is equivalent to the concatenation of the two
files, with B at the end. Thus, when a routine of the same name is in A
and B, the one in B is used.
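
The override rule can be pictured as a lookup that scans the loaded
routines from last to first. This is a sketch of the semantics only. The
routine_t layout and find_routine() are invented for the example and
don't reflect the library's internal structures or its file format.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char*          name;      /* routine name */
    const unsigned char* code;      /* code/data block */
    size_t               size;      /* size of block in bytes */
    unsigned             load_addr; /* address block must be loaded at */
    unsigned             exec_addr; /* address execution starts at */
} routine_t;

/* Finds routine by name among count routines listed in load order.
   Searching backwards makes the most recently loaded definition win.
   Returns NULL if not found. */
const routine_t* find_routine( const routine_t* routines, size_t count,
        const char* name )
{
    while ( count > 0 )
    {
        count--;
        if ( strcmp( routines [count].name, name ) == 0 )
            return &routines [count];
    }
    return NULL;
}
```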


Custom routine environment
--------------------------
Once a custom routine begins, it can do anything it wants. The loader
provides several optional services. A routine doesn't need to use them,
and if it won't be returning to the loader, it can overwrite anything in
memory.

On entry, A, X, and Y are cleared, and the stack pointer is at $FD. The
routine can return to the loader at any time by using RTS, which is
equivalent to jumping to exit_routine. The four 16-bit arguments are
available in zero-page at $30-$37 (arg0 through arg3), with the low byte
of each value first.
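
For example, passing 0x1234 as arg0 leaves $34 at $30 and $12 at $31.
The sketch below shows the byte layout from the host's point of view;
encode_args() is an illustrative name, not a library function.

```c
/* Stores four 16-bit arguments as 8 bytes, low byte first, matching the
   layout of arg0 through arg3 at $30-$37. */
void encode_args( const unsigned args [4], unsigned char out [8] )
{
    int i;
    for ( i = 0; i < 4; i++ )
    {
        out [i * 2]     = args [i] & 0xFF;        /* low byte  */
        out [i * 2 + 1] = (args [i] >> 8) & 0xFF; /* high byte */
    }
}
```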

The loader reserves $38-$1EF, leaving 0-$2F and $1F0-$7FF free for the
routine to use. The loader never writes to 0-$2F and $200-$7FF, except
if instructed to. Normally a routine begins at $200.
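
A host tool that generates code on the fly may want to confirm that a
block stays inside the main free region before sending it. The check
below is a sketch with a hypothetical name. It only considers $200-$7FF
and ignores the smaller free areas at 0-$2F and $1F0-$1FF.

```c
/* Returns 1 if a block of size bytes loaded at addr lies entirely within
   $200-$7FF, the region the loader never writes to on its own. */
int routine_fits( unsigned addr, unsigned size )
{
    const unsigned first = 0x200;
    const unsigned end   = 0x800; /* one past $7FF */
    return size > 0 && addr >= first && addr < end && size <= end - addr;
}
```

The region is 0x600 bytes long, which matches the 1.5KB available to
each routine.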

The loader provides several services that the routine can call. See
common/routine.inc for a reference.


Internal operation
------------------
* NES first starts up with standard boot loader.
* nrpc_boot loader routine is sent to NES and begins running.
* Loader waits to receive routine block from PC, then executes it.
* If routine will take a while, host sends delay bytes.
* When routine exits, loader waits for next routine from PC.

The loader also behaves as a standard boot loader, allowing a boot
program block to be sent at any time as well. See the following for more
information about the standard NES boot loader:

    http://blargg.parodius.com/nes-code/bootloader/spec.html


Notes
-----
* I'm not versed enough in makefiles to write one that builds everything
automatically. I was able to write one to generate nrpc.lib.
* This could be adapted to other host connection types, for example one
with an MCU buffering serial data, or one that doesn't even use a serial
connection. Contact me if attempting this.

-- 
Shay Green <gblargg@gmail.com>
