CPCWiki forum

General Category => Programming => Topic started by: cpcitor on 21:46, 18 January 13

Title: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 21:46, 18 January 13
Hi,

Remember one scan line is usually 80 bytes.
It is often reduced to 64 for byte alignments (performance or speccy ports).
It would be interesting to have a high-performance wide (or even overscan) mode.
Thanks CPCLER, Executioner, Octoate  for Programming:Overscan - CPCWiki (http://www.cpcwiki.eu/index.php/Programming:Overscan) .
The natural way would be 128 bytes lines.

I'm testing this with my usual emulator arnold (thanks again Kevin) compiled from arnold-nurgle-2009-03-17.tar.bz2 (thanks Andreas Micklei) and do not get immediately satisfying results.

Turn on emulator, issue :

border 0 : out &bc00,1 : out &bd00,64

Is the following result what a real CPC would produce? What do other emulators do?

[attachimg=1]
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: db6128 on 21:52, 18 January 13
Horizontal widths must be OUTed minus one, i.e. you would want 63. This is already the "horizontal total".

IIRC, there is some technical reason that "horizontal displayed" cannot exceed 48 or so, so it's probably not possible anyway.

Besides, you would be wasting about 28 of the bytes. From the very page you linked:
Quote
a full width screen would be around 48 characters, but you may like to use 50 to make sure you cover the left/right edges of the screen.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 22:14, 18 January 13
Thank you for your quick answer.

Quote from: db6128 on 21:52, 18 January 13
Horizontal widths must be OUTed minus one, i.e. you would want 63. This is already the "horizontal total".

Yes for R0, but not for R1. Just OUT 40 and see that nothing changes. OUT 39 and see that the lines have become 39 chars wide.


border 0 : out &bc00,1 : out &bd00,39


[attachimg=1]

Quote from: db6128 on 21:52, 18 January 13
IIRC, there is some technical reason that "horizontal displayed" cannot exceed 48 or so, so it's probably not possible anyway.

Value offered in Programming:Overscan - CPCWiki (http://www.cpcwiki.eu/index.php/Programming:Overscan) is 50, and it works (just tested).
In arnold it works up to and including 63. Only 64 produced this strange result.

Quote from: db6128 on 21:52, 18 January 13
Besides, you would be wasting about 28 of the bytes. From the very page you linked:

It's okay to waste some bytes at each scanline for some speed boost. Plus those bytes are still available for some hardware scrolling.  ;)

What actually puzzles me is that after I played with R0, arnold did produce something correct with R1=64, even after I restored what CRTC - CPCWiki (http://www.cpcwiki.eu/index.php/CRTC#The_6845_Registers) lists as the default values.

Does anyone have experience with wide screens on another emulator, or even on a real CPC?


border 0 : out &bc00,1 : out &bd00,64

Title: Re: Wide and high-performance : 128-bytes line mode
Post by: ralferoo on 23:29, 18 January 13
Quote from: cpcitor on 22:14, 18 January 13
Does anyone have experience with wide screens on another emulator, or even on a real CPC?
I did quite a lot of experimenting with emulators and real CPCs when doing this for my FPGA implementation.

What seems to happen is this (or at least, this is how I emulate it, and it seems to be correct).

There's a "current memory pointer" and a "start of line memory pointer".
There's a "current character position" counter.
There's the "current pixel line" counter that provides the RA lines.
At each CRTC clock, the current memory pointer is incremented by 1.
If the current character count equals R1 and the current pixel line equals R9, the "start of line memory pointer" is set to the "current memory pointer".
The current character position is incremented, or reset to 0 if it equals R0. When it is reset (end of the scan line), the current pixel line is also incremented, or reset to 0 if it equals R9. At the end of the scan line the "start of line memory pointer" is also copied to the "current memory pointer", so that the same addresses are used with a different line number. When the pixel line is reset, the character line counter is incremented and similar things happen there.

So, the upshot of this is:

If you set R1=63, you will get 126 byte lines. If you set R1=64 (or higher) while R0 is still 63, the character position never reaches R1, so the "start of line memory pointer" will never be updated and the line will repeat down the screen.
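
The counter model described above can be sketched in Python. This is only an illustration of the description in this post, not a cycle-exact 6845 model: the function name `row_starts` and the variables `ma`, `cc` and `ra` are hypothetical, and addresses are counted in CRTC characters (2 bytes each on the CPC).

```python
def row_starts(r0, r1, r9, rows, start=0):
    """Return the CRTC address (in characters) at which each character row starts.

    r0: horizontal total minus one, r1: horizontal displayed,
    r9: maximum raster address (pixel lines per row minus one).
    """
    ma = row_start = start   # current / start-of-line memory pointers
    cc = ra = 0              # character position / pixel line counters
    starts = [start]
    while len(starts) < rows:
        # Latch the address for the next row at the end of the displayed
        # part of the last pixel line of the current row.
        if cc == r1 and ra == r9:
            row_start = ma
        ma += 1
        if cc == r0:
            # End of scan line: rewind to the latched start address.
            cc = 0
            ma = row_start
            if ra == r9:
                ra = 0
                starts.append(row_start)
            else:
                ra += 1
        else:
            cc += 1
    return starts


if __name__ == "__main__":
    for r1 in (40, 63, 64):
        print(r1, row_starts(63, r1, 7, 3))
```

In this sketch, with R0=63 the row start advances by 40 characters for R1=40 and by 63 characters (126 bytes) for R1=63, while R1=64 returns the same address for every row, which is the repeated line seen in the screenshot.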
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: db6128 on 00:45, 19 January 13
Ahhh, so that's why the maximal counters are subtracted by 1. :) Want a two-character screen: display pair of bytes 0, display pair of bytes 1, check if =1, reset.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: ralferoo on 02:22, 19 January 13
Like a lot of chips of that vintage, it's all about how to implement something most easily. Comparing if all bits are equal is easy with a few gates. Actually, I suspect that's probably not exactly how it works. I suspect it actually loads R1 into a counter and counts down and resets on carry (which is just a single bit ripple) - the carry flag could be the selector bit fed into a MUX to select the subtracted result or the new R1 value.

The way to test this is to set the value and change it mid-line. I just haven't done this kind of test yet... ;)
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: Executioner on 02:23, 16 October 13
I haven't been here for a long time and I'm just catching up on some posts. There is a way to make the display 64 (MODE 1) characters wide and do overscan with 128 byte wide display, but you need to tweak the horizontal total register by 1 in order to achieve it. ie. R0=64 rather than R0=63. This way you can set R1 to 64 and the CRTC should be able to increment the base address properly.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: TFM on 03:54, 16 October 13
Quote from: cpcitor on 21:46, 18 January 13
The natural way would be 128 bytes lines.

Ehm... NO! It would be 96 writable bytes and smart coding ;)




That's all I say  8)
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: ssr86 on 08:16, 16 October 13
Why 96 bytes? Could you expand? :)
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 08:50, 16 October 13
Quote from: TFM on 03:54, 16 October 13
Ehm... NO! It would be 96 writable bytes and smart coding ;)
That's all I say  8)

Kevin Thacker uses the same value in Unofficial Amstrad WWW Resource (http://www.cpctech.org.uk/source/os_spr.html) but reading the code there's no obvious reason. He only says "(best value for crtc type 2)". Kevin, any hint ?

;; This example demonstrates how to draw a sprite on an overscan
;; screen.
(...)
scr_height_chars equ 35 ;; scr height in chars (best value to fill screen)
char_height_lines equ 8
scr_width_chars equ 48 ;; width of screen in chars (best value for crtc type 2)
scr_offset equ 208 ;; scr offset - setup so that the "bad" address
;; is located on the left side of the screen
;; this simplifies "scr next byte" and means sprites
;; can be drawn a bit quicker
sprite_height equ 16 ;; sprite height in lines
sprite_width_pixels equ 16 ;; sprite width in mode 0 pixels
sprite_width_bytes equ sprite_width_pixels/2 ;; sprite width in bytes

Title: Re: Wide and high-performance : 128-bytes line mode
Post by: arnoldemu on 09:34, 16 October 13
CRTC type 2 has a "bug" or "feature".

It depends on the hsync value (R2) and the hsync width (R3 lower 4 bits) and R0 (the line length).
If R2+R3>R0 then either no HSYNCs or no VSYNCs are generated. Either way, the CPC doesn't generate an interrupt, or doesn't see VSYNC, and there is no keyboard.

Best to choose values that work on type 2.
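
A minimal sketch of that rule as a check, taking the condition exactly as stated in this post (R2 plus the low 4 bits of R3 must not exceed R0). The function name `type2_sync_ok` is hypothetical, and the firmware defaults R0=63, R2=46, R3=&8E are used as sample input.

```python
def type2_sync_ok(r0, r2, r3):
    """True if R2 + HSYNC width (low 4 bits of R3) does not exceed R0."""
    hsync_width = r3 & 0x0F
    return r2 + hsync_width <= r0


if __name__ == "__main__":
    # Firmware defaults: 46 + 14 = 60, which is within R0 = 63.
    print(type2_sync_ok(63, 46, 0x8E))
    # Moving HSYNC to 50 with the same width: 50 + 14 = 64 > 63.
    print(type2_sync_ok(63, 50, 0x8E))
```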
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: arnoldemu on 09:36, 16 October 13
Quote from: cpcitor on 21:46, 18 January 13
Turn on emulator, issue :

border 0 : out &bc00,1 : out &bd00,64

Is the following result the same as a real CPC would do ? What do other emulators do ?

[attachimg=1]
Yes this is correct.

It should be the same in my WIP code.

I tested lots of things on real cpcs. I have cpcs with all the different crtc types.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: fano on 09:45, 16 October 13
Quote from: Executioner on 02:23, 16 October 13
I haven't been here for a long time and I'm just catching up on some posts. There is a way to make the display 64 (MODE 1) characters wide and do overscan with 128 byte wide display, but you need to tweak the horizontal total register by 1 in order to achieve it. ie. R0=64 rather than R0=63. This way you can set R1 to 64 and the CRTC should be able to increment the base address properly.
The problem with R0=64 is that you'll get a 312*65µs frame, so your frame timings will not be correct, and some displays will not accept that  :(
Another problem with R1=64 is that you'll waste 1/4 of VRAM, as only about 48 chars are visible on screen; not a very 'clean' solution just to save a bit of speed.
If you don't use hardware scrolling (and even if you do, though it is a bit tricky), you know where the boundary crossing will occur and can perhaps find a way to avoid it...
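
The frame timing point can be worked through numerically. This is a rough sketch assuming the CRTC is clocked at 1 MHz (one character per microsecond) and the frame stays at 312 scan lines; the function name `frame_timing` is hypothetical.

```python
def frame_timing(r0, lines=312, char_us=1.0):
    """Return (line time in us, frame time in us, frame rate in Hz)."""
    line_us = (r0 + 1) * char_us   # R0 holds the horizontal total minus one
    frame_us = line_us * lines
    return line_us, frame_us, 1e6 / frame_us


if __name__ == "__main__":
    for r0 in (63, 64):
        line_us, frame_us, hz = frame_timing(r0)
        print(f"R0={r0}: {line_us:.0f} us/line, {frame_us:.0f} us/frame, {hz:.2f} Hz")
```

Under these assumptions R0=63 gives a 64 µs line and a 19968 µs frame (about 50.08 Hz), while R0=64 gives a 65 µs line and a 20280 µs frame (about 49.31 Hz).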
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: arnoldemu on 09:49, 16 October 13
Quote from: ralferoo on 23:29, 18 January 13
If you set R1=63, you will get 126 byte lines. If you set R1=64 (or higher), the "start of line memory pointer" will never be updated and the line will repeat down the screen.
The next best is spectrum sized. R1=32. With 64 bytes per line.
But then it's bigger borders ;)

After that it's R1=16, with 32 bytes per line and much larger borders...

and then it's

R1 = 8, with 16 bytes per line and very thin graphics  :laugh: :laugh: :laugh: :laugh:
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: arnoldemu on 09:49, 16 October 13
Quote from: ssr86 on 08:16, 16 October 13
Why 96 bytes? Could you expand? :)
48 chars wide, 2 bytes per char.
96 bytes.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: arnoldemu on 09:51, 16 October 13
Quote from: cpcitor on 22:14, 18 January 13
Thank you for your quick answer.
Be careful experimenting with longer lines on arnold, the monitor emulation is very poor.
On a normal cpc the screen will get distorted.
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: Executioner on 10:17, 16 October 13
Quote from: fano on 09:45, 16 October 13
The problem with R0=64 is that you'll get a 312*65µs frame, so your frame timings will not be correct, and some displays will not accept that  :(

That all depends on what you're designing the game/demo for. If it's to run on real hardware (CTM, GT) or a decent emulator, R0=64 is fine. I believe it should probably work with most VGA/SCART/S-Video/RF modulators also, but it would be worth testing.

Quote
Another problem with R1=64 is that you'll waste 1/4 of VRAM, as only about 48 chars are visible on screen; not a very 'clean' solution just to save a bit of speed.
If you don't use hardware scrolling (and even if you do, though it is a bit tricky), you know where the boundary crossing will occur and can perhaps find a way to avoid it...

Yes, but you wouldn't be scrolling it horizontally, since that defeats the purpose of using 128 byte wide screens to remove boundary crossings. The extra data would therefore be at a consistent location, so you could put graphics data or something else in there. There is, however, one other problem with this: you need to either use a 32K screen or do a split in the middle to get more than 16 characters (128 scan lines) in height, since that's exactly how many character rows fit into the 2K limit.

@Kev: Setting R0=64 and R1=64 shouldn't cause any problems with CRTC type 2 so long as you reduce the HSYNC width slightly to make sure you don't get the 1 char in the middle of the display.

The long and short of it is that it CAN be done, but there are a few limitations and possible problems.
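
The 2K limit mentioned above can be checked with a few lines of arithmetic. This sketch assumes a standard 16K screen in which each of the 8 pixel lines of a character row addresses its own 2K block, and it ignores wrap-around inside a block; the names `BLOCK_BYTES` and `max_rows_16k` are hypothetical.

```python
BLOCK_BYTES = 2048  # bytes reachable per pixel-line block in a 16K screen


def max_rows_16k(r1, r9=7):
    """Return (character rows, scan lines) that fit without wrapping."""
    bytes_per_row = 2 * r1             # the CPC fetches 2 bytes per CRTC character
    rows = BLOCK_BYTES // bytes_per_row
    return rows, rows * (r9 + 1)


if __name__ == "__main__":
    for r1 in (40, 48, 64):
        print(r1, max_rows_16k(r1))
```

With these assumptions R1=64 fits 16 rows (128 scan lines), R1=48 fits 21 rows (168 scan lines) and the standard R1=40 fits 25 rows (200 scan lines).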
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 10:57, 16 October 13
Quote from: arnoldemu on 09:34, 16 October 13
CRTC type 2 has a "bug" or "feature".

It depends on the hsync value (R2) and the hsync width (R3 lower 4 bits) and R0 (the line length).
If R2+R3>R0 then either no HSYNCs or no VSYNCs are generated. Either way, the CPC doesn't generate an interrupt, or doesn't see VSYNC, and there is no keyboard.

Best to choose values that work on type 2.

Interesting. But in Unofficial Amstrad WWW Resource (http://www.cpctech.org.uk/source/os_spr.html) you set :

crtc_vals:
defb &3f ;; R0 - Horizontal Total
defb scr_width_chars ;; R1 - Horizontal Displayed
defb 48 ;; R2 - Horizontal Sync Position
defb &86 ;; R3 - Horizontal and Vertical Sync Widths


0x86 + 48 = 182 = 0xB6 which is much larger than 0x3F. Does it mean that this source fails on CRTC type 2 ?

Oh, you wrote "lower 4 bits".

0x6 + 48 = 54 = 0x36 compared to 0x3F.

But that does not seem related to setting R1=64, does it?
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: Executioner on 11:14, 16 October 13
Quote from: cpcitor on 10:57, 16 October 13
But that does not seem related to setting R1=64, does it?

No, but overscan requires the HSYNC to be moved further to the right in order to remove the border on the left side of the display, and if you use the default R3=#8E (ie. HSYNC width = 14) , you can't set it higher than about 50.
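
As a rough illustration of that limit, the largest HSYNC position can be derived from the rule quoted earlier in the thread (R2 plus the low 4 bits of R3 must not exceed R0). The function name `max_hsync_pos` is hypothetical, and the result is only as good as that rule.

```python
def max_hsync_pos(r0, r3):
    """Largest R2 for which R2 + HSYNC width (low 4 bits of R3) <= R0."""
    return r0 - (r3 & 0x0F)


if __name__ == "__main__":
    print(max_hsync_pos(63, 0x8E))  # default HSYNC width of 14
    print(max_hsync_pos(63, 0x86))  # HSYNC width reduced to 6
```

With R0=63 this gives 49 for the default R3=&8E, in line with "about 50", and 57 once the HSYNC width is reduced to 6.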
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 11:28, 16 October 13
Quote from: Executioner on 10:17, 16 October 13
The long and short of it is that it CAN be done, but there are a few limitations and possible problems.

Thank you all.

So, for a CPU-intensive project that should work on all CRTC models without hassle, doing overscan following Unofficial Amstrad WWW Resource (http://www.cpctech.org.uk/source/os_spr.html) seems good.

Better, it's not even sensitive to the exact width. Routines are the same. The "complicated" CPC screen structure kind of simplifies scr_next_line because most of the time (average 7 out of 8) you can do it with 8-bit computation on the high byte.
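
A sketch of why the next-line step is usually cheap. It assumes a single 16K screen based at &C000 (a 32K overscan screen needs an extra page-crossing case that is not modelled here), and the function name `scr_next_line` and its parameters are hypothetical.

```python
def scr_next_line(addr, bytes_per_row=96, base=0xC000):
    """Return (address of the byte one scan line below, slow_path_used)."""
    nxt = addr + 0x0800                 # fast path: only the high byte changes
    if nxt < base + 0x4000:
        return nxt, False
    # Slow path: wrapped past the last pixel line of the character row,
    # so go back to the first 2K block and move on to the next row.
    return nxt - 0x4000 + bytes_per_row, True


if __name__ == "__main__":
    addr, slow_steps = 0xC000, 0
    for _ in range(8):
        addr, slow = scr_next_line(addr)
        slow_steps += slow
    print(hex(addr), slow_steps)
```

Starting from &C000 and stepping down 8 scan lines, only one step takes the slow path, which matches the "7 out of 8" figure, and the width only appears as a constant in the slow path.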

Does anyone have a faster scheme for fast graphics? Have you found a specific width that allows going faster? How?
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: TFM on 16:46, 16 October 13
Quote from: ssr86 on 08:16, 16 October 13
Why 96 bytes? Could you expand? :)


If you use 96 bytes horizontal, then you only have to deal with two columns (one byte in X) where you can not use INC L instead of INC HL (to move to next byte).


So you have a system as efficient as 256*256, but with X overscan.
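
The "two columns" claim can be checked by brute force. This sketch assumes the screen starts at &C000 with no extra offset and counts every byte column whose address has a low byte of &FF on some row, i.e. where INC L would wrap instead of carrying into H. The function name `crossing_columns` is hypothetical, and a different screen offset shifts the columns.

```python
def crossing_columns(bytes_per_row, rows, base=0xC000):
    """Byte columns x where stepping from x to x+1 crosses a 256-byte page."""
    cols = set()
    for row in range(rows):
        line_start = base + row * bytes_per_row
        # The last column is skipped: there is no next byte on that line.
        for x in range(bytes_per_row - 1):
            if (line_start + x) & 0xFF == 0xFF:
                cols.add(x)
    return sorted(cols)


if __name__ == "__main__":
    print(crossing_columns(96, 21))
    print(crossing_columns(128, 16))
    print(crossing_columns(64, 32))
    print(crossing_columns(80, 25))
```

Under these assumptions a 96-byte line has crossings only after columns 31 and 63, the 128-byte and 64-byte layouts have none, and the standard 80-byte layout has four.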
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: cpcitor on 17:15, 16 October 13
Quote from: TFM on 16:46, 16 October 13
If you use 96 bytes horizontal, then you only have to deal with two columns (one byte in X) where you can not use INC L instead of INC HL (to move to next byte).

So you have a system as efficient as 256*256, but with X overscan.

Interesting, but a little short.

To summarize another way, when width is 128 bytes, 64 or 32 bytes you never have this problem and can always use INC L.

When using 96, if you know in advance your sprite does not cross those boundaries, then you can replace INC HL with INC L. I assume you have as usual dedicated routines for each sprite. Does that mean also one routine with INC L and one with INC HL, for the case where the sprite crosses ? That would start to be complicated.

Title: Re: Wide and high-performance : 128-bytes line mode
Post by: fano on 17:30, 16 October 13
What TFM means is that when your sprite starts on an even address, you can alternate INC x and INC xx, and do the reverse when your sprite starts on an odd address, so you need only 2 different routines (or only one with a bit of self-modifying code). That saves half as much as using INC x everywhere, but it is better than nothing.
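
The alternation can be modelled in a few lines. This is a sketch, not actual sprite code: the helper names are hypothetical, and the costs of 1 µs for INC L and 2 µs for INC HL are stated as assumptions about CPC timing. An even address never has a low byte of &FF, so INC L is always safe there.

```python
def inc_l(addr):
    """INC L: only the low byte changes, so &xxFF wraps to &xx00."""
    return (addr & 0xFF00) | ((addr + 1) & 0x00FF)


def inc_hl(addr):
    """INC HL: full 16-bit increment."""
    return (addr + 1) & 0xFFFF


def walk_row(start, n):
    """Advance n bytes, using INC L on even addresses and INC HL on odd ones."""
    addr, cost = start, 0
    for _ in range(n):
        if addr % 2 == 0:
            addr, cost = inc_l(addr), cost + 1   # assumed 1 us
        else:
            addr, cost = inc_hl(addr), cost + 2  # assumed 2 us
    return addr, cost


if __name__ == "__main__":
    print(hex(inc_l(0xC0FF)))        # INC L alone goes wrong here
    print(walk_row(0xC0FC, 8))       # crosses the page boundary correctly
```

In this model 8 steps cost 12 µs, against 16 µs with INC HL everywhere and 8 µs with INC L everywhere, so half of the possible saving is kept.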
Title: Re: Wide and high-performance : 128-bytes line mode
Post by: TFM on 19:04, 16 October 13
Thanks, guys, for the better explanation. (I'm still suffering from the weekend flu and have a hard time concentrating.)


In addition: imagine you have a game with a turret, for example. Paint that turret over the INC rr boundaries, so all sprite routines can work with INC r only.


In the end it all depends on the type of game you are trying to make.


However 128 writeable bytes per line is overkill and I really don't suggest it.