Tuesday 18 April 2017

More details

At some point during development, it struck me how complicated the scroll engine was becoming. The initial concept could be summarised in a few words and was supposed to be a simple matter of moving pointers around circular buffers, copying data from one place to another. But now there seemed to be an endless list of details to consider.

In some ways, the scrolling really is simple. It does pretty much boil down to pointers and copying. Where it gets complicated is figuring out which pieces of tile to draw, and a lot of that complication comes from the interaction between horizontal and vertical scrolling. One direction messes up the other.

But I do like a challenge, and I really wanted to see this thing working...

For horizontal scrolling we need to draw a one byte wide vertical stripe of tile pieces. Right away there are a couple of problems. The first is caused by the tiles being two bytes wide and the second is caused by vertical scrolling.

The first problem is easily solved by modifying the 'tile zero' base address we use to calculate the tile image addresses. By adding one to the base address, we automatically access the right half of each tile. If the tile image data starts at an even address, then the least significant bit of the address can be set to achieve the same effect.

The information to decide if we are accessing the left or right half of a tile comes from the map column pointer. This pointer is incremented or decremented at the same time as the buffer pointer, meaning the least significant bit indicates which tile half is being pointed to.
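
To make the half-tile selection concrete, here is a small Python model of the idea (not the actual 6809 code). The names `tile_base`, `map_col` and `tile_id` are made up for the sketch, and it assumes each tile is stored as eight rows of two bytes, which matches the two-byte width and eight-pixel height described in this post.

```python
TILE_WIDTH_BYTES = 2   # tiles are two bytes wide
TILE_ROWS = 8          # assumed: tiles are eight pixel rows high
TILE_SIZE = TILE_WIDTH_BYTES * TILE_ROWS


def half_tile_base(tile_base, map_col):
    """Return the 'tile zero' base address to use for the current stripe.

    The low bit of the map column pointer says which half of the tile
    is being pointed to: 0 = left byte, 1 = right byte.
    """
    return tile_base + (map_col & 1)


def tile_row_address(tile_base, map_col, tile_id, row):
    """Address of the image byte for one pixel row of the selected tile half."""
    base = half_tile_base(tile_base, map_col)
    return base + tile_id * TILE_SIZE + row * TILE_WIDTH_BYTES
```

When `tile_base` is even, `tile_base | (map_col & 1)` gives the same result as the addition, which is the "set the least significant bit" variant mentioned above.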

The second problem is a bit more involved. It means we have to deal with partial tiles at the top and bottom of the display. This divides the vertical stripe into three sections: a partial tile at the top, a run of complete tiles (well, half tiles), and a partial tile at the bottom.

Sometimes the tiles will line up neatly with the top and bottom edges of the display, meaning no partial tiles, and fewer visible tiles. How does this work exactly? The screen is 12 tiles high, but if partially scrolled, 13 will be visible. We can define things a bit more precisely to help shape the algorithm:

  • The top tile will have some or all pixel rows visible.
  • The run of complete tiles will always be the same number of tiles (11 if part of a full height display).
  • The bottom tile will have some pixel rows visible, or none.

That puts the bottom tile in The Occasionally Disappearing 13th Row*. It simply gets skipped on those frames where the tiles have perfect vertical alignment with the screen.

Take it from the top


Let's look at the top tile first. When it is partially visible, the bit that is missing is the top part. We can use the vertical pixel counter to figure out the parameters. This counter is incremented each time we need to move the screen contents up one pixel. The tiles are eight pixels high, so we're interested in the bottom three bits of the counter, which gives us a number in the range zero to seven. When it is zero, all of the tile rows are visible, and when it is seven, just the bottom row of the tile is visible. So what we need is to subtract this number from eight to give us the number of rows of pixels to draw.

The other thing we need to determine is an offset into the tile image data so that we draw the correct part of the tile. Just like we did for vertical scrolling, we can take the bottom three bits of the vertical counter and multiply by two to create an offset for the tile image data.
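
The two top-tile parameters can be worked out like this. It is only a Python sketch of the arithmetic described above; `top_tile_params` and `v_counter` are hypothetical names, not taken from the real routine.

```python
TILE_ROWS = 8          # tiles are eight pixels high
TILE_WIDTH_BYTES = 2   # tiles are two bytes wide


def top_tile_params(v_counter):
    """Return (rows_to_draw, image_offset) for the partial top tile."""
    fine = v_counter & 7                    # bottom three bits: 0..7
    rows_to_draw = TILE_ROWS - fine         # 8 when aligned, 1 when fine == 7
    image_offset = fine * TILE_WIDTH_BYTES  # skip the hidden rows of image data
    return rows_to_draw, image_offset
```

For example, a counter value of 11 has its bottom three bits equal to 3, so five rows are drawn, starting six bytes into the tile image.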

The list of hoops to jump through before we can start drawing looks like this:

  • Determine where in the buffer to start drawing using the logic discussed in Scrolling 101
  • Determine where in the map we will start reading using the map row and column pointers discussed in Details
  • Modify the tile image base address to select the left half or right half of each tile. (i.e. add one if the bottom bit of the map column pointer is set)
  • Determine the parameters for the partial top tile using the vertical counter

That sets us up for the top tile. We use the map pointer to give us the tile ID which in turn allows us to calculate the address of the image data. We can then copy tile image bytes to the buffer.

After each tile image byte is drawn into the buffer, we need to advance the buffer destination pointer to the next row and check that it hasn't crossed the end of the buffer. If it has, then the buffer size should be subtracted from the pointer so that drawing continues from the top of the buffer.

Take it to the bridge


After the top tile, we need to draw the run of full height tiles. These are relatively easy as they are all fixed height and can be drawn with two loops: an inner loop to output the fixed number of bytes per tile, and an outer loop to output the fixed number of tiles. We continue advancing and wrapping the buffer destination pointer for each byte written, and similarly advance and wrap the map pointer for each tile produced.

Having to check the buffer destination address for every byte written consumes a lot of cycles. It looks like this piece of code:

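    cmpx #buffer_end      ; has the destination pointer reached the end of the buffer?
    blo no_adjust         ; no, carry on with the next byte
    leax -buffer_size,x   ; yes, wrap back towards the top of the buffer
no_adjust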

As the end of the buffer can only be crossed once, this code does very little useful work. It nearly always executes just the cmpx and the blo, but that's still 96 rows x 7 cycles = 672 cycles for a full height draw.

It would be nice to avoid as many of those checks as possible. The approach I've used is to check the buffer pointer before drawing each full tile. If there is room to draw a tile without reaching the end of the buffer then it draws the tile using an unrolled loop with no pointer check. Otherwise the tile is drawn byte by byte in a loop with the pointer check. That trims out a lot of cycles without adding a lot of complexity.
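
Here is roughly how that check-once-per-tile idea looks as a Python model. It is a sketch under assumptions rather than the real routine: the buffer is a `bytearray` indexed from zero, `row_stride` is the number of bytes between buffer rows, a plain loop stands in for the unrolled copy, and the exact "is there room?" test is my own conservative choice.

```python
TILE_ROWS = 8          # tiles are eight pixels high
TILE_WIDTH_BYTES = 2   # tiles are two bytes wide


def draw_half_tile(buf, ptr, image, src, row_stride):
    """Copy one full-height half tile into the circular buffer.

    Returns (new_ptr, used_fast_path). src indexes the first image byte
    of the wanted tile half.
    """
    size = len(buf)
    if ptr + TILE_ROWS * row_stride < size:
        # Fast path: the whole tile fits before the end of the buffer,
        # so no per-byte pointer check is needed.
        for row in range(TILE_ROWS):
            buf[ptr + row * row_stride] = image[src + row * TILE_WIDTH_BYTES]
        return ptr + TILE_ROWS * row_stride, True
    # Slow path: check and wrap the pointer after every byte written.
    for row in range(TILE_ROWS):
        buf[ptr] = image[src + row * TILE_WIDTH_BYTES]
        ptr += row_stride
        if ptr >= size:
            ptr -= size
    return ptr, False
```

Only the one tile that straddles the end of the buffer (or finishes right on it) takes the slow path, so nearly all of the per-byte checks disappear.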

Throw it in the river


Finally we reach the partial tile at the bottom. This is easier to deal with than the top tile. Firstly, the part of the tile that is missing is the lower part of the tile, so there's no need to offset the image address. Secondly, the number of pixel rows we need to draw is simply the bottom three bits of the vertical pixel counter. If it's zero, we don't need to draw the tile at all as we've already reached the bottom of the screen.
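
Putting the three sections together, a quick Python check (hypothetical names again, assuming the 12-tile-high screen mentioned earlier) shows that the row counts always add up to the full 96 rows, whatever the vertical counter is doing.

```python
TILE_ROWS = 8       # tiles are eight pixels high
SCREEN_TILES = 12   # the screen is 12 tiles high


def stripe_sections(v_counter):
    """Return (top_rows, middle_rows, bottom_rows) for one vertical stripe."""
    fine = v_counter & 7
    top_rows = TILE_ROWS - fine                    # 1..8 rows of the top tile
    middle_rows = (SCREEN_TILES - 1) * TILE_ROWS   # always 11 complete tiles
    bottom_rows = fine                             # 0 means skip the bottom tile
    return top_rows, middle_rows, bottom_rows
```

With perfect alignment the split is 8 + 88 + 0, and the 13th row is skipped; with the counter's bottom bits at 3 it becomes 5 + 88 + 3.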


What we have so far is pixel-by-pixel vertical scrolling, but the horizontal scroll is still only byte-by-byte. To get fast horizontal scrolling working at the pixel level, we need to bring in additional buffers and expand the drawing routines to include pixel shifting. Another layer of complexity. But at least the scroll engine will then be complete. It couldn't get any more complicated than that. Could it? To be continued...

(Spoiler alert: Yeah, it could)



* The Occasionally Disappearing 13th Row is possibly a British movie of the "I say, that's inconvenient!" disaster movie sub-genre, starring Timothy Dalton as a cheesy airline boss; Bill Nighy, apparently legally required to be in every British movie; and Martin Freeman as Tim from The Office. Again.
