CPCWiki forum

General Category => Programming => Topic started by: ervin on 04:13, 22 April 24

Title: triple buffering
Post by: ervin on 04:13, 22 April 24
Hi folks.

I'm experimenting with using the extra RAM in the 6128, and I've read a few posts about triple buffering.
I'm trying to figure out how I could benefit from triple buffering, and how to use it sensibly.
Can anyone offer some tips?
Or share any experiences with triple buffering?

Thanks!
Title: Re: triple buffering
Post by: andycadley on 07:31, 22 April 24
The problem with triple buffering on the CPC comes down to only having 64K of usable video memory and each screen display being 16K. If you can reduce the usable area down to 8K, I guess it would be more manageable although your screen would be tiny. 

There's also the issue of finding a convenient memory arrangement that lets you access the right pages of RAM both for reading graphics data and writing to whichever target screen you need at the time. The 128K banking arrangements aren't terribly helpful in that regard, especially given you need suitable locations for things like the interrupt handler and stack too.

It's probably less of an issue in a GX/Plus cartridge game, because you can bank ROM much more flexibly and RAM is a lot more plentiful when you don't need to store code in it. You also have the benefit of the screen split hardware, which could help you avoid having to replicate things like a status bar in different banks.
Title: Re: triple buffering
Post by: McArti0 on 07:56, 22 April 24
You can create a system with software only for addresses 4000-7FFF based on jumps to other banks and use the C0, C4-7 settings
Then you have access to 3 buffers bank 0,2,3. Of course the firmware doesn't work. Mode 2 interrupts, common stack in each bank.
Title: Re: triple buffering
Post by: McArti0 on 08:25, 22 April 24
But we've already talked about this:
https://www.cpcwiki.eu/forum/index.php?msg=235994

Maybe tell us where you have doubts?
Title: Re: triple buffering
Post by: ervin on 14:25, 22 April 24
Thanks for your replies guys.
I've managed to make stuff work with different ram banking schemes (thanks to the discussions in that other thread), but I haven't tried any code related to triple buffering yet.
I'm still at the thinking and understanding stage.
Assuming I can figure out how to do it, what would the benefit be?

@abalore has mentioned that triple buffering was one of the most important optimisations in Alcon.
https://www.cpcwiki.eu/forum/programming/hyperdrive-development/msg223048/#msg223048

And @arnoldemu mentioned a long time ago that by using triple buffering, there is no need to sync with frame flyback.
https://www.cpcwiki.eu/forum/amstrad-cpc-hardware/60hz-cpc/msg4856/#msg4856

@gerald mentioned something similar here.
https://www.cpcwiki.eu/forum/programming/frame-locking/msg65913/#msg65913

Is that potentially the main benefit?
Would I do something like the following?

buffer A is currently visible.
buffer B has been drawn completely.
buffer C is being drawn to.

When buffer C is finished, buffer A goes to the back of the queue, buffer B is made visible, and buffer C moves up. Buffer A then starts getting drawn to.

buffer B is currently visible.
buffer C has been drawn completely.
buffer A is being drawn to.

When buffer A is finished, buffer B goes to the back of the queue, buffer C is made visible, and buffer A moves up. Buffer B then starts getting drawn to.

etc.

Is that the idea?
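
The rotation described above can be sketched as a three-slot queue. This is only an illustration in Python rather than Z80 code: the class and method names are made up for the example, and it flips as soon as a frame is finished, exactly as in the description, without modelling frame flyback.

```python
from collections import deque


class TripleBuffer:
    """Three buffers held in queue order: visible, ready, drawing."""

    def __init__(self, names=("A", "B", "C")):
        self.queue = deque(names)

    @property
    def visible(self):
        return self.queue[0]

    @property
    def ready(self):
        return self.queue[1]

    @property
    def drawing(self):
        return self.queue[2]

    def finish_frame(self):
        # The visible buffer goes to the back of the queue and becomes the
        # new draw target; the ready buffer becomes visible; the buffer that
        # was just drawn moves up to "ready".
        self.queue.rotate(-1)


buffers = TripleBuffer()
for _ in range(3):
    print(buffers.visible, buffers.ready, buffers.drawing)
    buffers.finish_frame()
```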
Title: Re: triple buffering
Post by: GUNHED on 15:32, 22 April 24
Quote from: McArti0 on 07:56, 22 April 24
You can create a system with software only for addresses 4000-7FFF based on jumps to other banks and use the C0, C4-7 settings
Then you have access to 3 buffers bank 0,2,3. Of course the firmware doesn't work. Mode 2 interrupts, common stack in each bank.
A common stack in each bank? So you copy the stack when switching the bank? Won't work in real life. Better to use a separate stack for every bank.
Title: Re: triple buffering
Post by: GUNHED on 15:35, 22 April 24
The whole triple buffer thing only makes sense if the time needed to draw one frame varies hugely from screen to screen.

Or you use two screens and one buffer for the background. That only makes sense if you work with static pictures, without scrolling.
Title: Re: triple buffering
Post by: McArti0 on 15:48, 22 April 24
Quote from: GUNHED on 15:32, 22 April 24
A common stack in each bank? So you copy the stack when switching the bank?
No. I mean that SP doesn't change when switching banks.
Title: Re: triple buffering
Post by: andycadley on 16:05, 22 April 24
Quote from: GUNHED on 15:32, 22 April 24
Quote from: McArti0 on 07:56, 22 April 24
You can create a system with software only for addresses 4000-7FFF based on jumps to other banks and use the C0, C4-7 settings
Then you have access to 3 buffers bank 0,2,3. Of course the firmware doesn't work. Mode 2 interrupts, common stack in each bank.
A common stack in each bank? So you copy the stack when switching the bank? Won't work in real life. Better to use a separate stack for every bank.
It can work as long as you're very, very confident about when bank switching will occur. It's pretty much a sure fire route to shooting yourself in the foot unless you're enormously careful though.
Title: Re: triple buffering
Post by: andycadley on 16:11, 22 April 24
The goal of triple buffering is never to be stalled waiting for a display flip to happen (unless rendering is so quick it takes less than a frame).

It's a more complex setup though and RAM gets very tight. More often than not you can get away with just double buffering if you can keep rendering time down to a minimum. It's really only when it takes between a frame and a frame and a half to render on average that you're really likely to be winning.
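
As a rough back-of-envelope check of that claim, the sketch below compares steady-state frame rates. It assumes a constant render time, a 50Hz display, and flips that only happen at frame flyback; the function name is invented for the example, and real games will not be this regular.

```python
def effective_fps(render_us, refresh_hz=50):
    """Return (double_buffered_fps, triple_buffered_fps) for a constant render time."""
    period_us = 1_000_000 // refresh_hz  # 20000 us per frame at 50Hz

    # Double buffering: the renderer idles until the flip, so every frame
    # occupies a whole number of display periods.
    periods_per_frame = -(-render_us // period_us)  # ceiling division
    double_fps = refresh_hz / periods_per_frame

    # Triple buffering: the renderer only idles if it outruns the display,
    # so throughput is limited by render time or by the refresh rate.
    triple_fps = min(float(refresh_hz), 1_000_000 / render_us)
    return double_fps, triple_fps


# A frame that takes 25ms (a frame and a quarter) to render:
print(effective_fps(25_000))  # (25.0, 40.0)
```

Under these assumptions the two schemes tie whenever the render time is under one frame or an exact multiple of a frame, which fits the remark that double buffering is usually enough if rendering can be kept short.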
Title: Re: triple buffering
Post by: abalore on 16:20, 22 April 24
I don't know if you would call it triple buffering. Alcon had two switching visible buffers and a third invisible buffer which holds a clean copy of the background. That makes the sprite erasing in the other two buffers a lot faster.
Title: Re: triple buffering
Post by: andycadley on 16:39, 22 April 24
Yeah, that's not triple buffering, it's just using three buffers. The clean background doesn't even need to be accessible to the Gate Array, which simplifies things even further.
Title: Re: triple buffering
Post by: McArti0 on 17:04, 22 April 24
I've read it and I understand it now. Triple buffering is useful when most frames render within 20ms but some suddenly take slightly longer, or one frame even takes twice as long.
Then there is still time to render it, because two finished frames are waiting to be displayed. Of course, this costs a lag of 40ms.

On a CPC with a 512kB expansion, we can even create a quadruple buffer with the C1+ and C3+ settings, because many banks are available in place of bank 7 and the entire 64kB can serve as VRAM.
Title: Re: triple buffering
Post by: djaybee on 19:11, 22 April 24
So, yeah, triple-buffering has the advantage that you (OP) describe.

You can work with a single buffer, but it's got to remain clean enough at all times (e.g. constantly working incrementally).

You can work with a double buffer, where one buffer (front) is clean and gets displayed while you work into the other buffer (back). If you can't immediately page-flip (either because the hardware doesn't support it or because you don't want the tearing to be visible), you end up in a situation where you've finished drawing into your back buffer and you have to wait for the page-flip before you can draw your next frame.

With a triple buffer, you have 2 back buffers: once you're done drawing into one of them, you can immediately start drawing into the other one, so you get a higher frame rate than with a double buffer since you can always start drawing (with the understanding that there'll be some judder). This is not useful if you can draw faster than your refresh rate, but that's a good problem to have.

Double- and Triple-buffering on the CPC face 3 main challenges:
-all the buffers must fit in the low 64kB of RAM.
-if you use the firmware, it carves out 2 chunks of RAM in those low 64kB.
-bank mapping only has a small number of configurations, which creates constraints. E.g. using banks 1 and 3 for buffers feels natural, until you realize that no memory config maps bank 1 at the same time as banks 4-6, so you can't directly copy graphics from banks 4-6 to bank 1.

For triple-buffering specifically, the smaller your screen, the easier things get. Below 5.33kB, 3 buffers fit in 1 bank, which is very easy. Below 8kB, 3 buffers fit in 2 non-contiguous banks, which is very feasible and very similar to regular double-buffering. Below 10.66kB, 3 buffers fit in 2 contiguous banks. Up to 16kB, 3 buffers need 3 banks. Beyond that and up to 21.33kB, you need all 4 banks and madness lies ahead (you can do resolutions like 312x280, 352x248, 376x232 in mode 1).
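
Those thresholds follow from dividing 16kB banks between three buffers. A small sketch of the arithmetic, assuming mode 1 packs 4 pixels per byte and ignoring any CRTC layout constraints (both function names are invented here):

```python
BANK = 16 * 1024  # bytes in one 16kB bank


def banks_for_triple_buffer(screen_bytes):
    """Describe the smallest arrangement that holds 3 buffers of this size."""
    if 3 * screen_bytes <= BANK:
        return "1 bank"
    if 2 * screen_bytes <= BANK:
        return "2 banks, not necessarily contiguous"
    if 3 * screen_bytes <= 2 * BANK:
        return "2 contiguous banks"
    if screen_bytes <= BANK:
        return "3 banks"
    if 3 * screen_bytes <= 4 * BANK:
        return "4 banks"
    return "does not fit in 64kB"


def mode1_bytes(width, height):
    """Screen size in bytes for a mode 1 display (4 pixels per byte)."""
    return width // 4 * height


for width, height in ((312, 280), (352, 248), (376, 232)):
    size = mode1_bytes(width, height)
    print(width, "x", height, "=", size, "bytes:", banks_for_triple_buffer(size))
```

All three of the quoted mode 1 resolutions come out just under 64kB / 3 = 21845 bytes, so they land in the "4 banks" case.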
Title: Re: triple buffering
Post by: ervin on 01:24, 23 April 24
Thanks everyone for your replies.
All of that gives me a lot to think about.
Title: Re: triple buffering
Post by: Anthony Flack on 22:52, 23 April 24
I'm not aware of any CPC games that use triple buffering, are there? 

I am doing the exact same thing as abalore; I have a front buffer, a back buffer and a clean buffer for restoring the background. I guess we both independently concluded this was fastest. 
Title: Re: triple buffering
Post by: ervin on 00:08, 24 April 24
Thanks Anthony.
It sounds like it might be the most useful idea for me as well.
Title: Re: triple buffering
Post by: Anthony Flack on 09:58, 24 April 24
If you want to do something along these lines, the memory arrangement I used:

Main code goes in bank 0.

Front/back buffers are in bank 2 and 3 at &8000 and &c000.

Clean buffer, compiled sprites, and any other code all swap in to bank 1 at &4000, so that you can copy from any of these banks into either screen buffer. 
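
For the front/back flip in an arrangement like this, the value written to CRTC register 12 selects which 16kB page is displayed. A hedged sketch, assuming a standard 16kB screen with no hardware scroll offset (so register 13 stays at 0); the function names are made up for illustration:

```python
SCREEN_BASES = (0x8000, 0xC000)  # the two screen buffers from the layout above


def crtc_r12(base):
    """CRTC register 12 value for a 16kB screen starting at `base`."""
    if base % 0x4000 or not 0 <= base <= 0xC000:
        raise ValueError("screen base must be &0000, &4000, &8000 or &C000")
    # Bits 5-4 of R12 carry address bits 15-14 of the screen base.
    return (base >> 10) & 0x30


def flip(front_index):
    """Swap front and back; return (new_front_index, R12 value to program)."""
    new_front = 1 - front_index
    return new_front, crtc_r12(SCREEN_BASES[new_front])


front = 1  # start by displaying &C000
front, r12 = flip(front)
print(front, hex(r12))
```

On real hardware the register write would normally be done at frame flyback so the change is not visible mid-screen.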
Title: Re: triple buffering
Post by: ervin on 10:44, 24 April 24
Sounds like a good scheme.
Thanks!
Title: Re: triple buffering
Post by: djaybee on 13:02, 25 April 24
Deep inside, I can't stop thinking about a closely related question: what situations would benefit from triple-buffering, i.e. what are the types of graphics where the performance gains of triple-buffering outweigh the memory costs?

Typically, triple-buffering results in an unsteady frame rate, which is especially visible at high frame rates, such that it's not necessarily a good idea for situations where the level of complexity is similar from frame to frame (which is the case for anything that's heavy on sprites and background graphics). On the other hand, 3D games, especially those drawn with polygons, might have a slower and inherently unsteady frame rate, such that those might be more appropriate situations for triple-buffering. In good news, those games might rely less on backgrounds and sprites and bitmaps in general, so the memory pressure from having 3 buffers might not be so high.

No, I don't have time to try this (I already have a lot of code on my plate), but, if I did, especially on 6128, I'd be using a 136x160 mode 0 display, with my buffers in banks 2 and 3, core code in bank 0, and banks 1 and 4-7 for situations where code and data can be paged in and out. In such a scheme, I would use memory modes 0 and 4 through 7, but I also note that memory mode 2 could be useful if there's some compressed data that needs to be decompressed on demand (with all the usual caveats about memory mode 2 and interrupts, of course).
Title: Re: triple buffering
Post by: McArti0 on 14:18, 25 April 24
Quote from: djaybee on 13:02, 25 April 24
Deep inside, I can't stop thinking about a closely related question: what situations would benefit from triple-buffering, i.e. what are the types of graphics where the performance gains of triple-buffering outweigh the memory costs?
Fast gameplay 50fps with huge explosions. You need to copy 10kB to screen.
Title: Re: triple buffering
Post by: andycadley on 14:22, 25 April 24
Triple buffering doesn't necessarily produce a more variable frame rate, indeed the main reason for doing it is a more consistent frame rate than double buffering in any case where double buffering can stall due to excessive waiting.

Typically it works best when render time for a frame is a little over a frame (or a little over two frames etc). And, of course, you can still rate limit actual frame swaps to make things more consistent.

I don't think it's worth the pain on a standard CPC. As I said it might be more interesting on a GX4000 game as you have a lot more free RAM to play with, a lot more flexibility in terms of screen splitting and you can run much more from ROM to give you a much larger effective address space (assuming your screen buffers can be write only).
Title: Re: triple buffering
Post by: GUNHED on 15:04, 25 April 24
Sorry, the GX4000 has only half the RAM compared to the CPC6128. Furthermore the CRTC RAM is 64 KB in both cases. ROM is another thing. The GX4000 itself has none.
Title: Re: triple buffering
Post by: andycadley on 16:05, 25 April 24
Quote from: GUNHED on 15:04, 25 April 24
Sorry, the GX4000 has only half the RAM compared to the CPC6128. Furthermore the CRTC RAM is 64 KB in both cases. ROM is another thing. The GX4000 itself has none.
It doesn't work like that in practice.

When you write cartridge software, you don't typically need much actual RAM for storing stuff (because 99% of code and assets can be in ROM), thus you typically have almost all of the 64K free to dedicate to whatever the CRTC needs. And you can be a lot smarter about arranging things by leaving ROMs paged in and relying on write-through for updates (assuming you don't need masking etc) which gives you an effective usable address space of 96K (and obviously up to 512K of code/data space in total).

On a 128K CPC most of the RAM tends to end up storing code + assets and you have to work with just 64K of effective address space (assuming you're not ROM software).
Title: Re: triple buffering
Post by: djaybee on 19:44, 25 April 24
Quote from: McArti0 on 14:18, 25 April 24
Fast gameplay 50fps with huge explosions. You need to copy 10kB to screen.
Oh, interesting, I hadn't considered that explicitly. Essentially, if most frames take well under 20ms but some take more than that, having a triple buffer lets you "borrow" time from a short frame for a later long one.
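
The "borrowing" effect can be seen in a toy timing model. The sketch below assumes a 50Hz display, flips only at frame flyback (at most one per flyback), and a renderer that starts the next frame as soon as a buffer is free; the function name and the sample render times are invented for the example.

```python
def flip_times(render_us, n_buffers, period_us=20_000):
    """Return the time (in us) at which each rendered frame becomes visible."""
    done, flips = [], []
    for i, cost in enumerate(render_us):
        start = done[-1] if done else 0
        # A buffer must be free: frame i reuses the buffer released when
        # frame (i - n_buffers + 1) became visible.
        released_by = i - n_buffers + 1
        if released_by >= 0:
            start = max(start, flips[released_by])
        finished = start + cost
        # First flyback at or after the frame is finished...
        flyback = -(-finished // period_us) * period_us
        # ...but never the same flyback as the previous flip.
        if flips:
            flyback = max(flyback, flips[-1] + period_us)
        done.append(finished)
        flips.append(flyback)
    return flips


# Six 15ms frames with one 30ms spike in the middle.
frames = [15_000, 15_000, 15_000, 30_000, 15_000, 15_000, 15_000]
print("double:", flip_times(frames, 2))
print("triple:", flip_times(frames, 3))
```

In this model the double-buffered version misses the flyback at 80ms and stays one frame behind afterwards, while the triple-buffered version shows a new frame every 20ms because it had already drawn ahead before the spike. The price is that frames are displayed somewhat later after they are finished.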

I think I once did something like that in a demo for the Atari ST: my code took near-constant time, but the music player I used didn't, and I made it so that the average would fit in my 20ms budget (40064 NOPs on the ST). Overall, the delay never added up to more than the size of the bottom + top borders, so my frame boundary never crossed into the visible part of the display.
Title: Re: triple buffering
Post by: roudoudou on 20:30, 25 April 24
in a demo, lag has no meaning

in a game... ;D

that's why emulator authors try to reduce lag...

...sometimes :P
Title: Re: triple buffering
Post by: McArti0 on 21:00, 25 April 24
In Jet planes you always have a lot of lag. So it must be Jet simulator games.  ;D
Title: Re: triple buffering
Post by: Anthony Flack on 22:39, 25 April 24
Any kind of screen buffering on the GX is also complicated by the hardware sprites, which will have to have their data buffered as well, or else they'll get ahead of everything. But a sprite game on the GX should rightly be aiming to hit 50fps anyway.
Title: Re: triple buffering
Post by: andycadley on 23:03, 25 April 24
Quote from: Anthony Flack on 22:39, 25 April 24
Any kind of screen buffering on the GX is also complicated by the hardware sprites, which will have to have their data buffered as well, or else they'll get ahead of everything. But a sprite game on the GX should rightly be aiming to hit 50fps anyway.
Depends what you're doing, if you were aiming for something 3D like Castle Master you might not have many sprites (or just use it for things like a cursor that just needs positioning). And that's where triple buffering is most likely to be useful. It'd be fascinating to see what a Mode 0 GX Freescape would be like, especially given how much cart space you could dedicate to massive look up tables to speed up the maths...

If you're doing something sprite heavy and scrolling, you'd be more likely to lean on the hardware features and avoid even double buffering entirely (at most keeping a secondary clean background buffer).
Title: Re: triple buffering
Post by: lmimmfn on 01:54, 26 April 24
Very few Amiga games are 50FPS even with all the hardware, games are normally 25FPS so I don't know why GX games should be 50FPS.

You can only move around X amount of screen data at 50FPS, at 25FPS you can double it.
Title: Re: triple buffering
Post by: Anthony Flack on 05:19, 26 April 24
Well, it depends what your priorities are. The GX is a console after all, and 50fps gives you that arcade/console feel. It's a worthy if restrictive ambition even for the stock CPC, and the GX is plenty capable of doing what the Commodore 64 can do, so I'd definitely shoot for that. I did say a sprite game, so something equivalent to what you might see on the NES or Master System. 

I know it cuts your drawing time in half but hopefully you don't need to spend all that time masking and drawing and erasing sprites either, and it's the difference between smooth arcade feel and slightly juddery home computer feel.

The CPC endured such rough FPS back in the day, it always feels really pleasing to see it hitting a clean 50. 

Title: Re: triple buffering
Post by: roudoudou on 07:26, 26 April 24
it's always possible (like Prehistorik II or Super Cauldron) to have some parts @50Hz and some animations @5Hz ;D

and it's also possible to have a dedicated engine to manage any framerate for any animation in the screen (some @50Hz, some @25Hz, ...)

even better, an automatic distribution, with priorities for main character and lower priority for background, as you must know time consumption for each element
Title: Re: triple buffering
Post by: Anthony Flack on 21:26, 26 April 24
Dragon Attack is a beautiful example of updating the player directly onto the front buffer at 50Hz, with the bullets updating every third frame. It works very well. You can get away with jerky enemies if the player is smooth and responsive.

You could try other things like only drawing half your enemies every other frame, or even go as far as the old Space Invaders trick of moving only one enemy each frame. On the Plus, updating sprite graphics takes time, so you can stick all your animation frame change requests into a buffer and only process a few each frame, or stagger them so that a quarter of the sprites update regularly every 4th frame, for example.
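
The staggering idea can be sketched as a simple round-robin over sprite indices. This is only an illustration with an invented function name; a real engine would more likely keep per-sprite counters or a request queue.

```python
def sprites_due(frame, sprite_count, groups=4):
    """Indices of the sprites whose graphics get refreshed on this frame."""
    return [i for i in range(sprite_count) if i % groups == frame % groups]


# With 16 sprites, each frame refreshes a different quarter of them,
# and every sprite is refreshed once every 4 frames.
for frame in range(4):
    print(frame, sprites_due(frame, 16))
```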

You can buffer other changes, too. It doesn't matter if the score panel or some of the numbers and other widgets change a frame or two later, so I make score panel updates happen over several frames rather than all at once. 