drawing lines with cpcTelera / SDCC (V1.0 RELEASE)

Started by ervin, 12:57, 19 July 16


andycadley

Bear in mind all those timings are wrong for the CPC and optimizations that work on other machines may hurt performance on the CPC. It's better to base everything on NOP timing.


Docent

#77
Quote from: andycadley on 19:41, 28 August 16
Bear in mind all those timings are wrong for the CPC and optimizations that work on other machines may hurt performance on the CPC. It's better to base everything on NOP timing.

It's always good to check before giving false opinions :)
Of course you're wrong: while the single-instruction timing doesn't show CPC-adjusted clock cycles, the total sum does account for the wait states introduced by the gate array. Here's an excerpt from the posted source:

ld hl,#8045 ;10T         12T 21 45 80
ld b,(hl) ;7T          8T 46
                                                         ^^              ^^


btw: the way adjusted timing is displayed is configurable and I prefer to see the original instruction timing for single execution.
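
To make the two timing columns in the excerpt concrete: on the CPC every instruction ends up taking a whole number of 1 µs "NOP" slots (4 T-states at 4 MHz), which is why 10T shows up as 12T and 7T as 8T. Below is a minimal sketch in C covering only the two instructions quoted above. The table and the function name are hypothetical, and the NOP counts are simply the adjusted figures from the excerpt divided by four.

```c
#include <stddef.h>
#include <string.h>

/* One CPC "NOP" slot is 4 T-states (1 microsecond at 4 MHz). */
#define T_PER_NOP 4

struct timing {
    const char *mnemonic;  /* instruction text, as in the excerpt      */
    int z80_t;             /* T-states on a Z80 without wait states    */
    int cpc_nops;          /* NOP slots on the CPC (adjusted column/4) */
};

/* Only the two instructions from the excerpt; not a complete table. */
static const struct timing timing_table[] = {
    { "ld hl,nn",  10, 3 },
    { "ld b,(hl)",  7, 2 },
};

/* Returns the CPC-adjusted T-state count, or -1 if the mnemonic
   is not in the (deliberately tiny) table. */
int cpc_t_states(const char *mnemonic)
{
    size_t i;
    for (i = 0; i < sizeof timing_table / sizeof timing_table[0]; i++) {
        if (strcmp(timing_table[i].mnemonic, mnemonic) == 0)
            return timing_table[i].cpc_nops * T_PER_NOP;
    }
    return -1;
}
```

Calling cpc_t_states("ld hl,nn") gives the 12T figure from the second column of the excerpt.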


ervin

#78
Alrighty, using suggestions from @Docent has led to a small but measurable speed improvement.

Before the EX AF,AF' implementation, it took 36 seconds to draw the eye 30 times in mode 0.
Now it takes 35.4 seconds!
Not a huge difference (about 1.7%), but I'll take it - every little bit counts!

Unfortunately the suggestion to flip the JRs led to a *very* small slowdown, so I ended up not using that.
This slowdown is due to the flow of code... flipping the JRs would actually force an extra jump every 8th check.

Nonetheless, my program is now overall a bit faster, so I'm chuffed!

I've uploaded the latest version to the first post in this thread, as bres_3.zip.
It contains the dsk with the mode 0 executable, and also all the mode 0 source code (cpctelera/sdcc) for anyone interested.
It's a bit horrifying (due to the nasty optimisations that I've implemented), but oh well.

Test A draws the eye 30 times.
Tests B to J exercise each line routine against a clipping window. These tests use double-buffering, but without vsync, so there is a little flickering.
Test K is an early version of pixel plotting! It works well, but is rather slow. Speeding up plotting will be my next focus, and the other routines will benefit a little bit as well, due to (hopefully) improved setup code for each line.

For tests B to K, you can use the cursor keys to move the lines around inside the clipping window.
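
The thread doesn't say which clipping method the routines use, so the following is only an illustration of the general idea, not the code in the zip: a Cohen-Sutherland style outcode test, which lets a line be accepted or rejected cheaply before any per-pixel work is done. All names here are hypothetical.

```c
/* Clipping window with inclusive edges. */
struct clip_window { int xmin, ymin, xmax, ymax; };

enum { OUT_LEFT = 1, OUT_RIGHT = 2, OUT_TOP = 4, OUT_BOTTOM = 8 };

/* Bit mask saying on which sides of the window the point lies. */
int outcode(const struct clip_window *w, int x, int y)
{
    int code = 0;
    if (x < w->xmin)      code |= OUT_LEFT;
    else if (x > w->xmax) code |= OUT_RIGHT;
    if (y < w->ymin)      code |= OUT_TOP;
    else if (y > w->ymax) code |= OUT_BOTTOM;
    return code;
}

/* Returns  1 if the line is entirely inside the window,
           -1 if both ends are beyond the same edge (nothing to draw),
            0 if the line needs real clipping. */
int classify_line(const struct clip_window *w,
                  int x0, int y0, int x1, int y1)
{
    int c0 = outcode(w, x0, y0);
    int c1 = outcode(w, x1, y1);
    if ((c0 | c1) == 0) return 1;
    if ((c0 & c1) != 0) return -1;
    return 0;
}
```

Moving a line around with the cursor keys would push it through all three cases: fully visible, partly visible, and completely outside the window.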

Incidentally, test B looks like it is a plotting test, but it actually isn't - it's simply a line routine designed to draw a line that is only one pixel in size. Once the plotting routines (used in test K) are faster, I'll simply call the plot routine in the single pixel line routine (used in test B).

Another thing I need to get started on is to very carefully put the latest round of optimisations into the mode 1 and mode 2 code...
Quite frankly, that is kind of scary!

Gryzor

Guys, just a small heads up - there's a nice formatting command, [code ] - [/ code] (remove spaces) to make it all more readable:




   exx                  ;4T       757680T   D9
   ld a,(hl)               ;7T      1515360T   7E
   and b                  ;4T       757680T   A0
   or c                  ;4T       757680T   B1
   ld (hl),a               ;7T      1515360T   77
   ld a,ixl               ;0T      1515360T   DD 7D
   cp #07                  ;7T      1515360T   FE 07
   jr nz,l9589               ;7/12T      2178360T   20 0C
   ld de,#c850               ;10T       284040T   11 50 C8
   add hl,de               ;11T       284040T   19
   ld de,#0800               ;10T       284040T   11 00 08
   ld ixl,#00               ;0T       284040T   DD 2E 00
   jr l958c               ;12T       284040T   18 03
l9589:
   add hl,de               ;11T      1989000T   19
   inc ixl                  ;0T      1326000T   DD 2C
l958c:


with
l9572:
   exx                  ;4T       757680T   D9
   ld a,(hl)               ;7T      1515360T   7E
   and b                  ;4T       757680T   A0
   or c                  ;4T       757680T   B1
   ld (hl),a               ;7T      1515360T   77
   ld a,ixl               ;0T      1515360T   DD 7D
   cp #07                  ;7T      1515360T   FE 07
   jr z,nextblock
   ^^^
l9589:
   add hl,de               ;11T      1989000T   19
   inc ixl                  ;0T      1326000T   DD 2C
l958c:


...
nextblock:
   ld de,#c850               ;10T       284040T   11 50 C8
   add hl,de               ;11T       284040T   19
   ld de,#0800               ;10T       284040T   11 00 08
   ld ixl,#00               ;0T       284040T   DD 2E 00
   jr l958c               ;12T       284040T   18 03
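
For anyone puzzled by the two constants in the listing: with the default CPC screen layout (base &C000, 80 bytes per character row, 8 pixel lines per character row spaced &800 apart), stepping down one pixel line adds &0800. After the 8th line of a character row, adding &C850 in 16-bit arithmetic is the same as subtracting &3800 and adding &50. Here is a rough C model of that, assuming the default layout; the function names are made up for the example and are not from the source.

```c
#include <stdint.h>

/* Address of the first byte of pixel line y (0..199), default layout. */
uint16_t line_address(unsigned y)
{
    return (uint16_t)(0xC000u + (y >> 3) * 80u + (y & 7u) * 0x0800u);
}

/* Step one pixel line down, mirroring the listing. `row` plays the
   role of IXL: the pixel line within the current character row (0..7). */
uint16_t step_down(uint16_t addr, uint8_t *row)
{
    if (*row == 7) {
        *row = 0;
        /* Wraps modulo 65536: effectively -0x3800 + 0x50. */
        return (uint16_t)(addr + 0xC850u);
    }
    (*row)++;
    return (uint16_t)(addr + 0x0800u);
}
```

Starting from line_address(0) with row = 0 and calling step_down repeatedly walks through the same addresses that line_address computes directly.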

SRS

#80
@ervin:

A recommendation from a first look at the source: you could change labels like 00800$: to descriptive labels like drawxy1:, which would make the code easier to read and maintain.

More to come :)

ervin

Yep, I tried that, but SDCC wouldn't compile with such labels.
In fact the documentation says that asm labels must be in the form 00xxx$.
:(

Docent

Quote from: Gryzor on 13:19, 29 August 16
Guys, just a small heads up - there's a nice formatting command, [code ] - [/ code] (remove spaces) to make it all more readable:

Thanks for the tip - I updated my messages and they are much more readable now

SRS

#83
Quote from: ervin on 23:33, 29 August 16
Yep, I tried that, but SDCC wouldn't compile with such labels.
In fact the documentation says that asm labels must be in the form 00xxx$.
:(

Now this is weird. I just tried it (sdcc-3.5.5) and got this ASM (excerpt): see the label "neun", which compiled without errors.


;src/main.c:393: void initMasksMode0()__naked{
;    ---------------------------------
; Function initMasksMode0
; ---------------------------------
_initMasksMode0::
;src/main.c:439: __endasm;
    ld    hl,#_colourMask
    ld    (hl),#0x55
    xor    a
    ld    hl,#_lineColour
    bit    0,(hl)
    jr    Z,neun
    ld    a,#0x80
     neun:
    bit    1,(hl)
    jr    Z,00901$
    set    3,a
     00901$:
    bit    2,(hl)
    jr    Z,00902$
    set    5,a
     00902$:
    bit    3,(hl)
    jr    Z,00903$
    set    1,a
     00903$:
    ld    (#_lineMask),a
    ret
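
For readers who find the listing hard to follow, here is a plain C rendering of what it appears to compute: colourMask is set to 0x55, and each of the low four bits of lineColour is moved to bit 7, 3, 5 or 1 of lineMask. This is just a reading of the assembly above, not code from the project, and the function name is hypothetical.

```c
#include <stdint.h>

uint8_t colourMask;
uint8_t lineColour;
uint8_t lineMask;

/* C equivalent of the assembly listing, one `if` per BIT/JR pair. */
void init_masks_mode0_c(void)
{
    uint8_t a = 0;                      /* xor a                          */

    colourMask = 0x55;                  /* ld (hl),#0x55                  */
    if (lineColour & 0x01) a = 0x80;    /* bit 0 -> bit 7 (ld a,#0x80)    */
    if (lineColour & 0x02) a |= 0x08;   /* bit 1 -> bit 3 (set 3,a)       */
    if (lineColour & 0x04) a |= 0x20;   /* bit 2 -> bit 5 (set 5,a)       */
    if (lineColour & 0x08) a |= 0x02;   /* bit 3 -> bit 1 (set 1,a)       */
    lineMask = a;                       /* ld (#_lineMask),a              */
}
```

With lineColour set to 15, all four branches fire and lineMask comes out as 0xAA, the complement of the 0x55 stored in colourMask.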

ervin

That is indeed weird!
I don't know why that's happening...


Docent

#85
Quote from: SRS on 21:15, 30 August 16
Now this is weird. I just tried it (sdcc-3.5.5) and got this ASM (excerpt): see the label "neun", which compiled without errors.


;src/main.c:393: void initMasksMode0()__naked{
;    ---------------------------------
; Function initMasksMode0
; ---------------------------------
_initMasksMode0::
;src/main.c:439: __endasm;
    ld    hl,#_colourMask
    ld    (hl),#0x55
    xor    a
    ld    hl,#_lineColour
    bit    0,(hl)
    jr    Z,neun
    ld    a,#0x80
     neun:
    bit    1,(hl)
    jr    Z,00901$
    set    3,a
     00901$:
    bit    2,(hl)
    jr    Z,00902$
    set    5,a
     00902$:
    bit    3,(hl)
    jr    Z,00903$
    set    1,a
     00903$:
    ld    (#_lineMask),a
    ret


You can use labels with more meaningful names between the __asm/__endasm directives, and it should work IF the compiler does not generate any temporary local labels for the function code before your asm code. Otherwise, the definition of your label will limit the scope of the compiler-generated temporary local label, and the generated asm code won't assemble.
In other words: if you have a C function containing only assembly code, you can use normal label names in your code.
If you have a C function containing mixed C source and asm, stick to labels of the form n$, where n<100.


ervin

Quote from: Docent on 04:52, 31 August 16
You can use labels with more meaningful names between the __asm/__endasm directives, and it should work IF the compiler does not generate any temporary local labels for the function code before your asm code. Otherwise, the definition of your label will limit the scope of the compiler-generated temporary local label, and the generated asm code won't assemble.
In other words: if you have a C function containing only assembly code, you can use normal label names in your code.
If you have a C function containing mixed C source and asm, stick to labels of the form n$, where n<100.


Ah, that makes sense.
Thanks.


ervin

#87
Hi folks.

Alrighty, the mode 1 and 2 sources have both been updated with all the optimisations from the mode 0 code.
They are much faster now than they used to be.
8)

Some quick benchmarks (to draw the eye 30 times):
mode 0: 35.0 secs
mode 1: 48.7 secs
mode 2: 74.8 secs

Considering that mode 1 has twice as many pixels horizontally (compared with mode 0), and mode 2 has 4 times as many, those times are pretty good!
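
Putting numbers on that comparison: a quick calculation in C of how the times scale relative to mode 0, next to how the horizontal pixel count scales. The figures are the benchmark times listed above; the struct and function names are made up for the example.

```c
#include <stdio.h>

struct bench {
    int mode;          /* screen mode                               */
    int pixel_factor;  /* horizontal pixels relative to mode 0      */
    double secs;       /* time to draw the eye 30 times             */
};

static const struct bench results[] = {
    { 0, 1, 35.0 },
    { 1, 2, 48.7 },
    { 2, 4, 74.8 },
};

/* Time for results[i] relative to the mode 0 time. */
double time_factor(int i)
{
    return results[i].secs / results[0].secs;
}

/* Prints one line per mode: pixel factor next to time factor. */
void print_scaling(void)
{
    int i;
    for (i = 0; i < 3; i++) {
        printf("mode %d: %dx pixels, %.2fx time\n",
               results[i].mode, results[i].pixel_factor, time_factor(i));
    }
}
```

Mode 1 comes out at roughly 1.39 times the mode 0 time for twice the horizontal pixels, and mode 2 at roughly 2.14 times for four times the pixels.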

Also, line clipping has been sped up by around 13% in all 3 modes, thanks to a small (and laughably simple) idea I had while sitting in a taxi!

TO DO:
- speed up plotting (test K)
- shrink the binary file
- make a game with this stuff!

I've attached bres4.zip to the first post in this thread.
It contains 3 dsk files - one for each screen mode.

You can use the arrow keys to move lines around inside the clipping window in all tests except A.
This will demonstrate how the clipping looks.

Mode 0
https://www.dropbox.com/s/11dtwjf7076wdg2/eye0.png?raw=1

Mode 1
https://www.dropbox.com/s/f2tpt4idg1ws57y/eye1.png?raw=1

Mode 2
https://www.dropbox.com/s/78szhub0uztfhg0/eye2.png?raw=1

ervin

#88
I don't believe it... I just found another optimisation, which speeds up unclipped lines by at least 5%!

Latest times (to draw the eye 30 times):
mode 0: 32.3 secs
mode 1: 45.5 secs
mode 2: 71.1 secs

TFM

How are you doing it? Making dots, or composing lines out of short horizontal and vertical lines? I use the latter one.

ervin

Quote from: TFM on 18:02, 02 September 16
How are you doing it? Making dots, or composing lines out of short horizontal and vertical lines? I use the latter one.

Only out of dots.
I did briefly look into the segments technique, but didn't really understand it.
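
For anyone unfamiliar with the dot-by-dot approach: judging by the bres_*.zip file names it is presumably Bresenham's algorithm, which moves one pixel at a time and uses an integer error term to decide when to step along each axis. A generic C sketch follows. It is the textbook form and not the optimised Z80 code from this thread; plot_fn, dot_record and the other names are invented for the example.

```c
#include <stdlib.h>

/* Callback invoked once per dot. */
typedef void (*plot_fn)(int x, int y, void *ctx);

/* Draws a line dot by dot; returns the number of dots plotted,
   both endpoints included. */
int draw_line_dots(int x0, int y0, int x1, int y1, plot_fn plot, void *ctx)
{
    int dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
    int dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy;   /* combined error term for both axes */
    int e2;
    int count = 0;

    for (;;) {
        plot(x0, y0, ctx);
        count++;
        if (x0 == x1 && y0 == y1)
            break;
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }   /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }   /* step in y */
    }
    return count;
}

/* Example callback: counts the dots and remembers the last one. */
struct dot_record { int n; int last_x; int last_y; };

void record_dot(int x, int y, void *ctx)
{
    struct dot_record *r = (struct dot_record *)ctx;
    r->n++;
    r->last_x = x;
    r->last_y = y;
}
```

A line whose start and end coincide plots exactly one dot. The segment technique would instead emit short horizontal or vertical runs rather than individual dots.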

ervin

Hi folks.

Plotting is now nice and fast, and lines that are only one pixel long are now drawn using the plot routine.

There has been a nice side-effect of the improved plotting routine. The setup code for each line to be drawn has been sped up a *little* as a result of using code from the plot setup routine. It's not really noticeable, but is measurable when drawing the eye 30 times.

There are a couple more line setup optimisations to put in based on the plotting code, and I think they will make a measurable difference, but I think the speed is almost at the limit of my ability.
;D

ervin

#92
Woohoo!!!
I've found an optimisation for the line clipping routines.
I've got some more testing to do, but if it works, the routines in question will be almost 15% faster!

BUT I'm more excited about an improvement in the line drawing routines, after applying a similar optimisation.
Drawing the (mode 0) eye 30 times now takes 30.2 seconds (5% speed improvement)! Almost down to 1 second per eye!
REALLY REALLY happy with that!
;D

roudoudou

I'm looking forward to playing with your routines!

ervin

#94
Quote from: roudoudou on 07:55, 09 September 16
I'm looking forward to playing with your routines!

I'll be releasing the source code soon.
:)

Unfortunately it isn't commented very well at all, so the code is kind of useless for anyone else to learn from (especially after all the very low-level optimisation that makes the code very hard to follow).

Oh well, I guess the point is just to provide code that works (and hopefully works well)!

ervin

Hi folks.

Alrighty then!
I'm ready to release the source code, as well as the test program.

I'm *very* happy with the speed, and the size is good as well.

For anyone interested, please download the 3 files in the original post of this thread (bres0.zip, bres1.zip and bres2.zip).
They each contain source code, a CDT file and a DSK file.

In each of the 3 test programs:
- A draws the eye
- B to J draw various lines inside a clipping window (try the arrow keys!)
- K plots a bunch of pixels inside a clipping window (again, try the arrow keys!)

This little project is the culmination of something I've wanted to create for a very long time, and I'm tremendously happy with the results.

If anyone wants to use these plotting/drawing routines with CPCtelera, feel free!
8)

Enjoy!

ervin

Hi everyone.

Well, here it is.
The final version of my line drawing program.

The attachments in the original post have been updated to v1.1.
DSK, CDT and SDCC source files are included.

Changes in this version include:
- some minor optimisations
- new test [L] - a cube! (use the left/right arrow keys to spin the cube)

My 3D code isn't particularly great, so perhaps the 3D calculations aren't as fast as they could be, but the line drawing itself is very fast.
8)

It has been a great pleasure to work on this project, and I hope that someone out there finds my code useful.

Now it's time to work on something more modern!
Perhaps something for PCs, or smartphones... not sure yet.
Regardless, I've just bought the GameMaker package from Humble Bundle, and I'm *really* excited to dive in!

Arnaud

Great work, you really should submit your code as a pull request to the CPCtelera GitHub repository.

Besides, line drawing is an open issue:
Draw line · Issue #21 · lronaldo/cpctelera · GitHub

ervin

Quote from: Arnaud on 17:05, 22 September 16
Great work, you really should submit your code as a pull request to the CPCtelera GitHub repository.

Besides, line drawing is an open issue:
Draw line · Issue #21 · lronaldo/cpctelera · GitHub

Thanks Arnaud.

I'd *love* to add my code to cpctelera, but there are 2 reasons that I haven't yet:
- I don't know how to add it
- It's very messy code; not well commented at all (I was more concerned with as much low-level optimisation as possible, so the code stopped being neat and tidy)

If anyone else is interested in adding the code to cpctelera, please go ahead.
:)

ervin

Hi folks.

Just a quick update.

I've re-organised parts of the code, and also added some comments which should hopefully make this stuff easier to use.
Please see the first post of this thread for the new download.

If anyone wants to use this code, and runs into any trouble with it, please let me know, and I'll try to help out.
It's all very easy to use once you know what to do.
8)
