Programming:Fast Sprites

2,560 bytes added, 02:12, 13 July 2006
INC L<br>
OR (HL)<br>
INC L<br>LD (DE),A<br>INC E</tt>;Total = 11 us per byte
But in most cases, you don't have that much memory to store all your graphics. Instead, you can create a 256-byte AND mask table, and possibly also a 256-byte OR mask table. This is the quickest way to mask the data while still saving some memory.
OR (HL)<br>
DEC H<br>
LD (DE),A<br>INC E</tt>;Total = 15 us per byte
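The tables have to be built once before any sprite is drawn. The following is only a minimal sketch, assuming that ink 0 is the transparent colour, that the screen is in MODE 0 (left pixel in bits 7, 5, 3, 1 and right pixel in bits 6, 4, 2, 0), and that the OR table sits on the page directly after the AND table, as the <tt>DEC H</tt> in the routine above suggests. The label names (<tt>make_masks</tt>, <tt>mask_table</tt>) and the leading-dot label syntax are assumptions for illustration, not part of the original routine.
<tt>.make_masks<br>
LD HL,mask_table ;hypothetical table address, must be page-aligned (low byte = 0)<br>
.mm_loop<br>
LD A,L ;L = sprite byte value = table index<br>
LD B,0 ;B builds the AND mask<br>
AND #AA ;isolate the left pixel<br>
JR NZ,mm_left ;left pixel is not ink 0, so it is solid<br>
LD B,#AA ;left pixel transparent: keep the screen's left pixel<br>
.mm_left<br>
LD A,L<br>
AND #55 ;isolate the right pixel<br>
JR NZ,mm_right ;right pixel is not ink 0, so it is solid<br>
LD A,B<br>
OR #55 ;right pixel transparent: keep the screen's right pixel<br>
LD B,A<br>
.mm_right<br>
LD (HL),B ;store the AND mask<br>
INC H<br>
LD (HL),L ;plain OR table: the sprite byte itself<br>
DEC H<br>
INC L<br>
JR NZ,mm_loop ;256 entries, stops when L wraps to 0<br>
RET</tt>
With the plain OR table the result is the same as OR-ing the sprite byte directly; the second table only pays for itself once it holds something other than the identity.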
The other advantage of using this method is that with a small change to the above code you can quite easily use different mask tables. In ZACK, I use a reversing table which has the two MODE 0 pixels reversed, essentially flipping the byte left to right. By then changing the INC C to a DEC C (and starting at a different offset in the sprite data) it's easy to flip a sprite, saving memory for sprites which can face both ways. Another table has the OR masks all set to ink 15 for every non-transparent pixel. This can then be masked again with a colour to create a solid colour version of a sprite.
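A reversing table of this kind can be generated rather than typed in. This is a sketch under the same MODE 0 assumption (left pixel in bits 7, 5, 3, 1, right pixel in bits 6, 4, 2, 0); <tt>make_flip</tt> and <tt>flip_table</tt> are hypothetical names, and this is not necessarily how the table in ZACK was built.
<tt>.make_flip<br>
LD HL,flip_table ;hypothetical table address, must be page-aligned (low byte = 0)<br>
.mf_loop<br>
LD A,L ;L = sprite byte value = table index<br>
AND #AA ;isolate the left pixel<br>
RRCA ;shift it into the right pixel's bits (bit 0 was clear)<br>
LD B,A<br>
LD A,L<br>
AND #55 ;isolate the right pixel<br>
RLCA ;shift it into the left pixel's bits (bit 7 was clear)<br>
OR B ;combine the two swapped pixels<br>
LD (HL),A<br>
INC L<br>
JR NZ,mf_loop ;256 entries, stops when L wraps to 0<br>
RET</tt>
The AND table paired with it needs the same swap applied to each mask, so that the mask still lines up with the flipped pixels. If ink 0 is the transparent colour, the ink 15 table is simply the complement (<tt>CPL</tt>) of the ordinary AND mask, since ink 15 sets every bit of a pixel.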
If you don't need to either paint the sprite in a solid colour, or flip the sprite left to right (or you've got enough memory to store a flipped version), then you can optimise the routine to plot a single byte even further:
<tt>LD A,(BC)<br>
INC C<br>
LD L,A<br>
LD A,(DE)<br>
AND (HL)<br>
OR L<br>
LD (DE),A<br>
INC E</tt> ;Total = 12 us per byte
As you can see, this is now down to 12 microseconds per byte, only 1 microsecond slower than storing the mask with the data, but the sprite data is half the size, so you can store twice as much graphics data in your left-over 2K of memory once you've got the double-buffered screens, music and sound fx code and game logic in place.
'''Clipping'''
Depending on the routine you use, clipping can be quite difficult or quite simple. Clipping vertically is simply a matter of adjusting the start offset in the sprite and the number of rows (passed in E in the example code).
Clipping horizontally presents more of a problem. Simply adjusting the start offset and the width (passed in D in the example) won't do the job, because the <tt>INC C</tt> won't happen enough times to increment the sprite offset to the next row. To get around this, either store the value of C in a spare register at the start of the loop (I think there's 2 left in the example: LY and I), then at the end of the horizontal loop add the width to the value (remember the width passed in is not the width of the data!), or precalculate the difference between the actual sprite width and the displayed sprite width, and add this value. This is probably the preferred method.
For example, at the start of the code, pass in the clipped width in E and the sprite width in A:
<tt>SUB E:DB #FD:LD L,A</tt> ;LY = sprite width - displayed width
Then at the end of the loop (just after <tt>RET Z</tt>):
<tt>LD A,C:DB #FD:ADD L:LD C,A</tt>
'''One last thought'''
Earlier on in this document, I mentioned not using index registers because they are slow. There could, however, be some merit in using them, especially for unrolled loops to replace the BC register above.
Using the index register may remove the need for the INC C and the LD L,A above, for example:
<tt>LD L,(IY + 0)<br>
LD A,(DE)<br>
AND (HL)<br>
OR L<br>
LD (DE),A<br>
INC E</tt> ;Total = 13 us per byte
This code is only 1 microsecond slower than the previous, but unrolled (using (IY + 1), (IY + 2) etc.) it won't destroy the value in IY, and it also leaves BC free for loop counters or perhaps extra masks. Not destroying the value in IY means you can simply add the width in bytes to LY even when clipping to ensure the sprite data points to the next valid byte, but remember that adding a value to LY will take at least 5 microseconds, plus one extra microsecond per byte in the loop.
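As a rough illustration of the unrolled form, here is one row of a sprite displayed two bytes wide. It is only a sketch: it assumes H already holds the mask table page, DE points at the screen, IY points at the first displayed byte of the row, C holds the full sprite width in bytes (my choice of register, since BC is now free), and the sprite data does not cross a 256-byte page.
<tt>LD L,(IY + 0) ;first sprite byte<br>
LD A,(DE)<br>
AND (HL)<br>
OR L<br>
LD (DE),A<br>
INC E<br>
LD L,(IY + 1) ;second sprite byte, IY itself is unchanged<br>
LD A,(DE)<br>
AND (HL)<br>
OR L<br>
LD (DE),A<br>
INC E<br>
DB #FD:LD A,L ;A = LY<br>
ADD A,C ;add the sprite width in bytes (assumed to be in C)<br>
DB #FD:LD L,A ;LY = start of the next sprite row</tt>
Because the displacements are fixed, the same three instructions at the end advance to the next row whether or not the sprite is clipped; only the number of unrolled byte groups and the starting value of IY change. Keeping the width in a register is what should allow the addition to fit in the 5 microseconds mentioned above.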
[[User:Executioner|Executioner]] 19:55, 12 July 2006 (CDT)