Alrighty, update time!
I've spent the last fortnight completely rewriting the sprite processing code.
The new algorithm is much more efficient, and faster as a result.
I don't see how (with my current knowledge) I can speed it up any further.
Of course, every time I think a routine is done, I think of some other crazy idea to try!
Anyway, I've also implemented two ideas for speeding up the screen drawing, and they are working really well.
They were actually much easier to implement than the sprite processing code, but the results are far more dramatic.
The first is to only draw rows that a sprite has written to in either the previous or the current frame (the screen is drawn as 50 rows of chunky pixels, so there are 50 rows to draw or ignore).
This results in a VERY large speedup when the sprite (in this case giant Savage) moves partly off the screen.
In fact the speedup might be too much, as it would cause large framerate (and game-logic-timing) fluctuations in-game.
I don't really know what to do about this… although in the game I'm planning, it may not be a problem, as the largest sprites would only appear on-screen very briefly. Not sure yet…
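For anyone following along, here's a rough Python sketch of the row-skipping idea. It's not the actual assembly routine, just an illustration of the rule. The names (`mark_sprite`, `rows_to_draw`) and the one-flag-per-row lists are made up for the example; only the 50-row figure comes from the description above.

```python
ROWS = 50  # the screen is drawn as 50 rows of chunky pixels

def mark_sprite(dirty, top, height):
    """Flag every on-screen row covered by a sprite (hypothetical helper).

    `top` may be negative or run past the bottom when the sprite is
    partly off the screen; those rows are simply clipped away.
    """
    for row in range(max(top, 0), min(top + height, ROWS)):
        dirty[row] = True

def rows_to_draw(prev_dirty, curr_dirty):
    """A row is drawn if a sprite wrote to it in the previous OR current frame.

    The previous frame matters because the old sprite image still has to
    be overwritten on screen.
    """
    return [r for r in range(ROWS) if prev_dirty[r] or curr_dirty[r]]

# Fully on-screen sprite, 30 rows tall, moving down by 2 rows.
prev, curr = [False] * ROWS, [False] * ROWS
mark_sprite(prev, 10, 30)
mark_sprite(curr, 12, 30)
on_screen = rows_to_draw(prev, curr)

# Same sprite mostly off the top edge: far fewer rows need drawing.
prev, curr = [False] * ROWS, [False] * ROWS
mark_sprite(prev, -18, 30)
mark_sprite(curr, -20, 30)
off_screen = rows_to_draw(prev, curr)

print(len(on_screen), len(off_screen))
```

The second case touches far fewer rows than the first, which is why the sprite speeds up so much as it leaves the screen.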
The second optimisation is to only draw columns that have had a sprite write to them.
I currently process the screen as 8 columns (8 groups of 10 bytes for each pixel row).
I haven't figured out the optimal number of column groups to use yet, but I suspect 8 groups might be about right.
Having more groups means I'd need to do more stack pointer hijinx for each pixel row, and more SET/RES stuff. More overhead, in other words.
Having fewer groups would be very difficult now, as I had to rewrite some parts of the screen-redraw code to accommodate these two optimisations. I had to let go of a couple of registers in the push/pop cycle.
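The column idea can be sketched the same way, again in Python rather than the real routine. The 8 groups of 10 bytes come from the description above. Keeping one bit per group in a single byte is my assumption for the example (it seems a natural fit for SET/RES-style bit twiddling), and `column_mask` and `bytes_drawn_per_row` are hypothetical names.

```python
GROUPS = 8            # column groups per pixel row
BYTES_PER_GROUP = 10  # bytes in each group
ROW_BYTES = GROUPS * BYTES_PER_GROUP

def column_mask(x_byte, width_bytes):
    """Return a bitmask with bit g set if the sprite touches column group g.

    `x_byte` is the sprite's left edge as a byte offset within the row;
    it may be negative or past the right edge when the sprite is partly
    off the screen.
    """
    first = max(x_byte, 0)
    last = min(x_byte + width_bytes, ROW_BYTES) - 1
    if last < first:
        return 0  # sprite is entirely off-screen horizontally
    mask = 0
    for group in range(first // BYTES_PER_GROUP, last // BYTES_PER_GROUP + 1):
        mask |= 1 << group
    return mask

def bytes_drawn_per_row(mask):
    """Bytes actually copied for one pixel row, given the group mask."""
    return bin(mask).count("1") * BYTES_PER_GROUP

# A sprite 20 bytes wide starting at byte 25 straddles groups 2, 3 and 4.
mask = column_mask(25, 20)
print(f"{mask:08b}", bytes_drawn_per_row(mask))
```

This also shows the trade-off with group count: a sprite that straddles a group boundary drags the whole neighbouring group in with it, so finer groups waste fewer bytes but cost more per-row bookkeeping.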
These optimisations came from the ashes of the unfinished rewrite of my 3d-maze conversion, where I was researching the idea of only drawing "changed" parts of the screen. They have been significantly refined since then though.
Anyway, I'm very chuffed with the results.
Please download the updated version and have a play.
It's attached to the original post in this thread.
You'll notice that as Savage moves towards any of the 4 screen edges, he speeds up a lot (particularly as there is less of him displayed vertically). As mentioned above, this may be ok for my game design.
There's one more nutty optimisation idea I'm going to try… it may be a really silly idea, and fail miserably. Or it might be awesome. I don't know yet.
After that, it'll be time to write some code that shows what sort of game this will be.