Wide and high-performance : 128-bytes line mode

Started by cpcitor, 21:46, 18 January 13


cpcitor

Hi,

Remember one scan line is usually 80 bytes.
It is often reduced to 64 for byte alignments (performance or speccy ports).
It would be interesting to have a high-performance wide (or even overscan) mode.
Thanks to CPCLER, Executioner and Octoate for Programming:Overscan - CPCWiki.
The natural way would be 128 bytes lines.

I'm testing this with my usual emulator arnold (thanks again Kevin) compiled from arnold-nurgle-2009-03-17.tar.bz2 (thanks Andreas Micklei) and do not get immediately satisfying results.

Turn on emulator, issue :

border 0 : out &bc00,1 : out &bd00,64

Is the following result the same as what a real CPC would produce? What do other emulators do?

[attachimg=1]
Had a CPC since 1985, currently software dev professional, including embedded systems.

I made in 2013 the first CPC cross-dev environment that auto-installs C compiler and tools: cpc-dev-tool-chain: a portable toolchain for C/ASM development targetting CPC, later forked into CPCTelera.

db6128

Horizontal widths must be OUTed minus one, i.e. you would want 63. This is already the "horizontal total".

IIRC, there is some technical reason that "horizontal displayed" cannot exceed 48 or so, so it's probably not possible anyway.

Besides, you would be wasting about 28 of the bytes. From the very page you linked:
Quote
a full width screen would be around 48 characters, but you may like to use 50 to make sure you cover the left/right edges of the screen.
Quote from: Devilmarkus on 13:04, 27 February 12
Quote from: ukmarkh on 11:38, 27 February 12
[The owner of one of the few existing cartridges of Chase HQ 2] mentioned to me that unless someone could find a way to guarantee the code wouldn't be duplicated to anyone else, he wouldn't be interested.
Did he also say things like "My treasureeeeee" and is he a little grey guy?

cpcitor

Thank you for your quick answer.

Quote from: db6128 on 21:52, 18 January 13
Horizontal widths must be OUTed minus one, i.e. you would want 63. This is already the "horizontal total".

Yes for R0, but not for R1. Just OUT 40 and see that nothing changes. OUT 39 and see that the lines have become 39 characters wide.


border 0 : out &bc00,1 : out &bd00,39


[attachimg=1]

Quote from: db6128 on 21:52, 18 January 13
IIRC, there is some technical reason that "horizontal displayed" cannot exceed 48 or so, so it's probably not possible anyway.

The value suggested in Programming:Overscan - CPCWiki is 50, and it works (just tested).
In arnold it works up to and including 63. Only 64 produces this strange result.

Quote from: db6128 on 21:52, 18 January 13
Besides, you would be wasting about 28 of the bytes. From the very page you linked:

It's okay to waste some bytes on each scanline for a speed boost. Plus, those bytes are still available for hardware scrolling. ;)

What actually puzzles me is that after I played with R0, arnold did produce something correct with R1=64, even after I restored the values that CRTC - CPCWiki lists as defaults.

Does anyone have experience with wide screens on another emulator, or even on a real CPC?


border 0 : out &bc00,1 : out &bd00,64


ralferoo

Quote from: FindYWay on 22:14, 18 January 13
Does anyone have experience with wide screens on another emulator, or even on a real CPC?
I did quite a lot of experimenting with emulators and real CPCs when doing this for my FPGA implementation.

What seems to happen is this (or at least, this is how I emulate it, and it seems to be correct).

There's a "current memory pointer" and a "start of line memory pointer".
There's a "current character position" counter.
There's the "current pixel line" counter that provides the RA lines.
At each CRTC clock, the current memory pointer is incremented by 1.
If the current character count equals R1 and the current pixel line equals R9, the "start of line memory pointer" is set to the "current memory pointer".
The current character position is incremented, or reset to 0 if it equals R0. In this case, the current pixel line is also incremented, or reset to 0 if it equals R9. In the latter case, it also copies the "start of line memory pointer" to the "current memory pointer" so that the same addresses are used with a different line number, and it increments the character line counter and does similar things there.

So, the upshot of this is:

If you set R1=63, you will get 126 byte lines. If you set R1=64 (or higher), the "start of line memory pointer" will never be updated and the line will repeat down the screen.
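The counter behaviour described above can be tried out with a small Python model. This is only a sketch of that description, not of real CRTC internals: the function names are made up for illustration, the model assumes the current pointer is reloaded from the start-of-line pointer at the end of every scan line, and it ignores address wrap-around. Addresses are counted in CRTC characters (2 bytes each).

```python
def char_row_starts(r0, r1, r9=7, rows=4, base=0):
    """Return the start address (in CRTC characters) of each character row."""
    start_of_line = base
    starts = []
    for _ in range(rows):
        starts.append(start_of_line)
        for pixel_line in range(r9 + 1):
            current = start_of_line          # assumption: reloaded on every scan line
            for char_pos in range(r0 + 1):   # char_pos runs 0..R0, then resets
                if char_pos == r1 and pixel_line == r9:
                    start_of_line = current  # latch the start of the next row
                current += 1                 # one increment per CRTC clock
    return starts


def bytes_per_row(r0, r1):
    """Distance in bytes between two consecutive character rows."""
    starts = char_row_starts(r0, r1, rows=2)
    return 2 * (starts[1] - starts[0])


print(bytes_per_row(63, 40))  # 80
print(bytes_per_row(63, 63))  # 126
print(bytes_per_row(63, 64))  # 0
```

With R0=63 the character counter only takes the values 0 to 63, so in this model R1=64 is never matched and every character row starts at the same address.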

db6128

Ahhh, so that's why the maximal counters are subtracted by 1. :) Want a two-character screen: display pair of bytes 0, display pair of bytes 1, check if =1, reset.

ralferoo

Like a lot of chips of that vintage, it's all about how to implement something most easily. Comparing if all bits are equal is easy with a few gates. Actually, I suspect that's probably not exactly how it works. I suspect it actually loads R1 into a counter and counts down and resets on carry (which is just a single bit ripple) - the carry flag could be the selector bit fed into a MUX to select the subtracted result or the new R1 value.

The way to test this is to set the value and change it mid-line. I just haven't done this kind of test yet... ;)

Executioner

I haven't been here for a long time and I'm just catching up on some posts. There is a way to make the display 64 (MODE 1) characters wide and do overscan with 128 byte wide display, but you need to tweak the horizontal total register by 1 in order to achieve it. ie. R0=64 rather than R0=63. This way you can set R1 to 64 and the CRTC should be able to increment the base address properly.

TFM

Quote from: cpcitor on 21:46, 18 January 13
The natural way would be 128 bytes lines.

Ehm... NO! It would be 96 writable bytes and smart coding ;)




That's all I say  8)
TFM of FutureSoft
Also visit the CPC and Plus users favorite OS: FutureOS - The Revolution on CPC6128 and 6128Plus

ssr86

Why 96 bytes? Could you expand? :)


cpcitor

Quote from: TFM on 03:54, 16 October 13
Ehm... NO! It would be 96 writable bytes and smart coding ;)
That's all I say  8)

Kevin Thacker uses the same value in Unofficial Amstrad WWW Resource, but from reading the code there's no obvious reason. He only says "(best value for crtc type 2)". Kevin, any hint?

;; This example demonstrates how to draw a sprite on an overscan
;; screen.
(...)
scr_height_chars equ 35 ;; scr height in chars (best value to fill screen)
char_height_lines equ 8
scr_width_chars equ 48 ;; width of screen in chars (best value for crtc type 2)
scr_offset equ 208 ;; scr offset - setup so that the "bad" address
;; is located on the left side of the screen
;; this simplifies "scr next byte" and means sprites
;; can be drawn a bit quicker
sprite_height equ 16 ;; sprite height in lines
sprite_width_pixels equ 16 ;; sprite width in mode 0 pixels
sprite_width_bytes equ sprite_width_pixels/2 ;; sprite width in bytes


arnoldemu

CRTC type 2 has a "bug" or "feature".

It depends on the hsync position (R2), the hsync width (R3 lower 4 bits) and R0 (the line length).
If R2+R3>R0 then either no HSYNCs or no VSYNCs are generated. Either way, the CPC doesn't generate an interrupt, or doesn't see VSYNC, and there is no keyboard.

Best to choose values that work on type 2.
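The rule above is easy to check before writing values to the CRTC. The following Python sketch only encodes the condition as stated in this post. The helper name `hsync_fits` is made up, and the first call uses the usual firmware defaults (R0=63, R2=46, R3=&8E) as sample input.

```python
def hsync_fits(r0, r2, r3):
    """True if HSYNC position plus HSYNC width does not exceed R0."""
    hsync_width = r3 & 0x0F      # only the lower 4 bits of R3 are the HSYNC width
    return r2 + hsync_width <= r0


print(hsync_fits(63, 46, 0x8E))  # True  (46 + 14 = 60)
print(hsync_fits(63, 52, 0x8E))  # False (52 + 14 = 66)
```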
My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

arnoldemu

Quote from: cpcitor on 21:46, 18 January 13
Hi,

Remember one scan line is usually 80 bytes.
It is often reduced to 64 for byte alignments (performance or speccy ports).
It would be interesting to have a high-performance wide (or even overscan) mode.
Thanks to CPCLER, Executioner and Octoate for Programming:Overscan - CPCWiki.
The natural way would be 128 bytes lines.

I'm testing this with my usual emulator arnold (thanks again Kevin) compiled from arnold-nurgle-2009-03-17.tar.bz2 (thanks Andreas Micklei) and do not get immediately satisfying results.

Turn on emulator, issue :

border 0 : out &bc00,1 : out &bd00,64

Is the following result the same as a real CPC would do ? What do other emulators do ?

[attachimg=1]
Yes, this is correct.

It should be the same in my WIP code.

I tested lots of things on real CPCs. I have CPCs with all the different CRTC types.

fano

Quote from: Executioner on 02:23, 16 October 13
I haven't been here for a long time and I'm just catching up on some posts. There is a way to make the display 64 (MODE 1) characters wide and do overscan with 128 byte wide display, but you need to tweak the horizontal total register by 1 in order to achieve it. ie. R0=64 rather than R0=63. This way you can set R1 to 64 and the CRTC should be able to increment the base address properly.
The problem with R0=64 is that you'll get a 312*65µs frame, so your frame timings will not be correct, and some displays will not accept that :(
Another problem with R1=64 is that you'll waste 1/4 of VRAM, as there are only about 48 visible chars on screen. It is not a very 'clean' solution for saving a bit of speed.
If you do not use hardware scrolling (and even if you do, though that is a bit trickier), you know where the boundary break will occur, and maybe you can find a way to avoid it...
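To put numbers on the frame timing remark, here is a small Python calculation. It assumes 312 scan lines per frame (as in this post) and 1 µs per CRTC character, which is how the CRTC is clocked on the CPC. The function name `frame_timing` is made up for illustration.

```python
def frame_timing(r0, scan_lines=312, char_time_us=1.0):
    """Return (line time in us, frame time in us, refresh rate in Hz)."""
    line_us = (r0 + 1) * char_time_us    # R0 holds the horizontal total minus one
    frame_us = scan_lines * line_us
    refresh_hz = 1_000_000 / frame_us
    return line_us, frame_us, refresh_hz


for r0 in (63, 64):
    line_us, frame_us, hz = frame_timing(r0)
    print(f"R0={r0}: {line_us:.0f} us/line, {frame_us:.0f} us/frame, {hz:.2f} Hz")
```

With these assumptions, R0=63 gives about 50.08 Hz and R0=64 about 49.31 Hz, with a line frequency that drops from 15625 Hz to roughly 15385 Hz.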
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

arnoldemu

Quote from: ralferoo on 23:29, 18 January 13
If you set R1=63, you will get 126 byte lines. If you set R1=64 (or higher), the "start of line memory pointer" will never be updated and the line will repeat down the screen.
The next best is spectrum sized. R1=32. With 64 bytes per line.
But then it's bigger borders ;)

After that it's R1=16, with 32 bytes per line and much larger borders...

and then it's

R1 = 8, with 16 bytes per line and very thin graphics  :laugh: :laugh: :laugh: :laugh:
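The list above can be enumerated mechanically. This Python sketch assumes, following ralferoo's description, that R1 must not exceed R0 for the row address to advance, and that each CRTC character is 2 bytes. The function name is made up for illustration.

```python
def power_of_two_widths(r0=63):
    """List (R1, bytes per line) pairs whose line length is a power of two."""
    result = []
    for r1 in range(1, r0 + 1):          # R1 above R0 is never reached by the counter
        width = 2 * r1                   # 2 bytes per CRTC character
        if width & (width - 1) == 0:     # power-of-two test
            result.append((r1, width))
    return sorted(result, reverse=True)


print(power_of_two_widths()[:4])  # [(32, 64), (16, 32), (8, 16), (4, 8)]
```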

arnoldemu

Quote from: ssr86 on 08:16, 16 October 13
Why 96 bytes? Could you expand? :)
48 chars wide, 2 bytes per char.
96 bytes.

arnoldemu

Quote from: cpcitor on 22:14, 18 January 13
Thank you for your quick answer.
Be careful when experimenting with longer lines on arnold: the monitor emulation is very poor.
On a normal CPC the screen will get distorted.

Executioner

Quote from: fano on 09:45, 16 October 13
Problem with R0=64 is you'll get a 312*65µs frame so your frame timings will be not correct , some displays will not accept that  :(

That all depends on what you're designing the game/demo for. If it's to run on real hardware (CTM, GT) or a decent emulator, R0=64 is fine. I believe it should probably work with most VGA/SCART/S-Video/RF modulators also, but it would be worth testing.

Quote
Another problem with R1=64 is you'll waste 1/4 of vram as there are something close to 48 visible chars on screen, not a very 'clean' solution to save a bit of speed.
IF you have not hardware scroll (and if you have too but it is a bit tricky) you can know where the boundary break will occur and maybe you can find a solution to avoid it...

Yes, but you wouldn't be scrolling it horizontally, since that defeats the purpose of using 128 byte wide screens to remove boundary crossings, so the extra data would be at a consistent location and you could put graphics data or something else in there. There is, however, one other problem with this... You need to either use a 32K screen or do a split in the middle to get more than 16 characters (128 scan lines) in height, since that's exactly how many character rows fit into the 2K limit.

@Kev: Setting R0=64 and R1=64 shouldn't cause any problems with CRTC type 2 so long as you reduce the HSYNC width slightly to make sure you don't get the 1 char in the middle of the display.

The long and short of it is that it CAN be done, but there are a few limitations and possible problems.
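The 2K limit and the wasted VRAM mentioned in this exchange come down to simple arithmetic. The Python sketch below assumes the usual 16K screen layout, where each of the 8 scan lines of a character row lives in its own 2K block, and takes 48 visible characters as the figure quoted in the thread. The function names are made up for illustration.

```python
def rows_per_bank(bytes_per_line, bank_bytes=2048):
    """Whole character rows that fit in one 2K block of a 16K screen."""
    return bank_bytes // bytes_per_line


def wasted_fraction(r1, visible_chars=48):
    """Share of each line that is fetched but (assumed) not visible."""
    return max(0, r1 - visible_chars) / r1


for width in (80, 96, 128):
    print(width, rows_per_bank(width))   # 80 25, 96 21, 128 16
print(wasted_fraction(64))               # 0.25
```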

cpcitor

Quote from: arnoldemu on 09:34, 16 October 13
CRTC type 2 has "bug" or "feature".

It depends on the hsync value (R2) and the hsync width (R3 lower 4 bits) and R0 (the line length).
If R2+R3>R0 then either no HSYNCS or no VSYNCS are generated. Either way, CPC doesn't generate an interrupt, or doesn't see VSYNC and there is no keyboard.

Best to make values that work on type 2.

Interesting. But in Unofficial Amstrad WWW Resource you set :

crtc_vals:
defb &3f ;; R0 - Horizontal Total
defb scr_width_chars ;; R1 - Horizontal Displayed
defb 48 ;; R2 - Horizontal Sync Position
defb &86 ;; R3 - Horizontal and Vertical Sync Widths


0x86 + 48 = 182 = 0xB6 which is much larger than 0x3F. Does it mean that this source fails on CRTC type 2 ?

Oh, you wrote "lower 4 bits".

0x6 + 48 = 54 = 0x36 compared to 0x3F.

But that does not seem related to setting R1=64, does it?

Executioner

Quote from: cpcitor on 10:57, 16 October 13
But that does not seem related to setting R1=64, does it?

No, but overscan requires the HSYNC to be moved further to the right in order to remove the border on the left side of the display, and if you use the default R3=#8E (i.e. HSYNC width = 14), you can't set the HSYNC position higher than about 50.
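The "about 50" figure follows from arnoldemu's R2+R3>R0 rule quoted earlier in the thread. As a rough Python sketch (the function name is made up, and the rule is taken as stated rather than verified on hardware):

```python
def max_hsync_position(r0, r3):
    """Largest R2 for which R2 + HSYNC width still does not exceed R0."""
    return r0 - (r3 & 0x0F)   # HSYNC width is the low nibble of R3


print(max_hsync_position(63, 0x8E))  # 49
print(max_hsync_position(63, 0x86))  # 57
```

Reducing the HSYNC width from 14 to 6, as in the R3=&86 value quoted above, leaves room to move the HSYNC position further right.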

cpcitor

Quote from: Executioner on 10:17, 16 October 13
The long and short of it is that it CAN be done, but there are a few limitations and possible problems.

Thank you all.

So, for a CPU-intensive project that should work on all CRTC models without hassle, doing overscan following Unofficial Amstrad WWW Resource seems good.

Better, it's not even sensitive to the exact width. Routines are the same. The "complicated" CPC screen structure kind of simplifies scr_next_line because most of the time (average 7 out of 8) you can do it with 8-bit computation on the high byte.

Does anyone have a faster scheme for fast graphics? Have you found a specific width that allows going faster? How?
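The "7 out of 8" remark about scr_next_line can be illustrated with a Python model of the address arithmetic. This is a sketch under stated assumptions, not Kevin's routine: it assumes a 16K screen at &C000 and ignores the wrap inside a 2K block (the "bad" address mentioned in the quoted source).

```python
def scr_next_line(addr, bytes_per_line):
    """Return (address one scan line below, True if the fast path was used)."""
    high = (addr >> 8) + 0x08            # fast path: add 8 to the high byte only
    if high <= 0xFF:
        return (high << 8) | (addr & 0xFF), True
    # slow path: left the 16K screen, wrap back and move to the next character row
    return (addr + 0x0800 + 0xC000 + bytes_per_line) & 0xFFFF, False


addr, fast = 0xC000, 0
for _ in range(8):
    addr, was_fast = scr_next_line(addr, 96)
    fast += was_fast
print(hex(addr), fast)  # 0xc060 7
```

The line width only appears in the slow path, which is why the same routine works for any width.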

TFM

Quote from: ssr86 on 08:16, 16 October 13
Why 96 bytes? Could you expand? :)


If you use 96 bytes horizontal, then you only have to deal with two columns (one byte in X) where you can not use INC L instead of INC HL (to move to next byte).


So you have a system as efficient as 256*256, but with X overscan.
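The "two columns" claim can be checked by brute force. The Python sketch below lists the byte columns after which the low address byte wraps from &FF to &00, which is where INC L would fail. It assumes the first line starts on a 256-byte boundary (a different screen offset shifts the columns), and the function name is made up for illustration.

```python
def carry_columns(bytes_per_line, rows=32, base=0):
    """Columns x where stepping from byte x to x+1 crosses a 256-byte page."""
    cols = set()
    for row in range(rows):
        start = base + row * bytes_per_line
        for x in range(bytes_per_line - 1):   # steps inside one line only
            if (start + x) & 0xFF == 0xFF:    # low byte would wrap on increment
                cols.add(x)
    return sorted(cols)


print(carry_columns(96))   # [31, 63]
print(carry_columns(128))  # []
print(carry_columns(80))   # [15, 31, 47, 63]
```

Under these assumptions a 96-byte line has two critical columns, power-of-two widths have none, and the standard 80-byte line has four.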

cpcitor

Quote from: TFM on 16:46, 16 October 13
If you use 96 bytes horizontal, then you only have to deal with two columns (one byte in X) where you can not use INC L instead of INC HL (to move to next byte).

So you have a system as efficient as 256*256, but with X overscan.

Interesting, but a little short.

To summarize another way, when the width is 128, 64 or 32 bytes you never have this problem and can always use INC L.

When using 96, if you know in advance that your sprite does not cross these boundaries, then you can replace INC HL with INC L. I assume you have, as usual, dedicated routines for each sprite. Does that also mean one routine with INC L and one with INC HL, for the case where the sprite crosses a boundary? That would start to be complicated.


fano

What TFM means is that when your sprite starts on an even address, you can alternate INC x and INC xx, and do the reverse when your sprite starts on an odd address, so you need only 2 different routines (or only one with a bit of self-modifying code). That saves half as much as using INC x everywhere, but it is better than nothing.
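The even/odd trick works because the low byte can only wrap when the current address ends in &FF, which is odd. A short Python model of the idea, with made-up helper names standing in for the Z80 INC L and INC HL instructions:

```python
def inc_l(addr):
    """Model of INC L: only the low byte changes, no carry into the high byte."""
    return (addr & 0xFF00) | ((addr + 1) & 0x00FF)


def inc_hl(addr):
    """Model of INC HL: full 16-bit increment."""
    return (addr + 1) & 0xFFFF


def walk_row(start, width):
    """Visit `width` bytes, using INC L after even addresses and INC HL after odd ones."""
    addr = start
    visited = [addr]
    for _ in range(width - 1):
        addr = inc_l(addr) if addr % 2 == 0 else inc_hl(addr)
        visited.append(addr)
    return visited


row = walk_row(0xC0FC, 8)
print([hex(a) for a in row])
print(row == list(range(0xC0FC, 0xC0FC + 8)))  # True
```

The walk crosses the &C0FF/&C100 page boundary correctly even though half of the steps use the 8-bit increment.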
"NOP" is the perfect program : short , fast and (known) bug free


TFM

Thanks, guys, for the better explanation. (I am still suffering from the weekend flu and have a hard time concentrating.)


In addition: imagine you have a game with a turret, for example. Paint that turret over the INC rr boundaries, so that all sprite routines can work with INC r only.


In the end, it all depends on the type of game you are trying to make.


However 128 writeable bytes per line is overkill and I really don't suggest it.