Fast MODE 2 text printing routines

Started by opqa, 12:30, 25 January 15

pmeier

Great, you gave me many keywords to improve my project... and finally you say I helped you... cool  ;D
Thanks for your help... put you on my credits list...

ervin

Quote from: ronaldo on 21:52, 22 May 18
I reviewed CPCtelera string drawing routines and I worked out new ways to improve and make them much faster. Your comment gave me a great idea. Thank you :)


This sounds fantastic!
What sort of improvements have you discovered?

SRS

#27
Quote from: ervin on 12:58, 23 May 18
This sounds fantastic!
What sort of improvements have you discovered?

A simple (too simple, I guess) "solution": assume char 32 is always blank (CPCtelera reads its chars from ROM, so it must be blank) ...

So:

_cpct_drawStringM2::
drsm2_nextChar:
   cp   #32                            ;; Is this a space (char 32)?
   jr   z, drsm2_save                  ;; Yes: save the print (skip drawing it)

   push hl                             ;; [11] Save HL and DE to the stack before calling draw char
   push de                             ;; [11]
   ld   b, a                           ;; [ 4] B = Next character to be drawn
   call cpct_drawCharM2_asm            ;; [17] Draw next char
   pop  de                             ;; [10] Recover HL and DE from the stack
   pop  hl                             ;; [10]

drsm2_save:
   ;; Increment pointer values
   inc  de                             ;; [ 6] DE += 1 (point to next position in video memory, 8 pixels to the right)
   inc  hl                             ;; [ 6] HL += 1 (point to next character in the string)

Tried it, you can compare to the original "Easy Strings Example" with this:


ronaldo

Quote from: SRS on 20:38, 23 May 18
Simple (to simple i guess) "solution": Assume Char 32 always blank (as cpctelera reads chars
Well, the idea is not so simple. In fact, your idea seems right until you think twice about it. Spaces have to be printed for two reasons: they are not always the same colour as the background (you can pick a paper colour when you call drawString and drawChar), and you cannot safely assume that your background will be empty.


So, I'm afraid to say that this optimization cannot be used, even if it gains some cycles.
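The two objections can be illustrated with a small host-side model. This is only a sketch, not CPCtelera code: the `Cell` type and `draw_string` function are hypothetical names invented for the illustration, and the screen is reduced to one row of cells that remember a glyph and a paper colour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one text cell: the glyph shown and its paper colour. */
typedef struct {
    char          glyph;
    unsigned char paper;
} Cell;

/* Draw `str` into `row` starting at column 0, using `paper` as background.
   If skip_spaces is non-zero, char 32 is not drawn at all (the shortcut
   discussed above). Returns the number of cells actually written. */
size_t draw_string(Cell *row, size_t width, const char *str,
                   unsigned char paper, int skip_spaces)
{
    size_t drawn = 0;
    assert(row != NULL && str != NULL);
    for (size_t col = 0; col < width && str[col] != '\0'; ++col) {
        if (skip_spaces && str[col] == ' ')
            continue;               /* cell keeps whatever was there before */
        row[col].glyph = str[col];
        row[col].paper = paper;
        ++drawn;
    }
    return drawn;
}
```

If the row previously held other glyphs on paper 0 and "A B" is drawn with paper 3, the skipping variant leaves the middle cell with its old glyph and old paper, while the non-skipping variant overwrites it with a blank on paper 3. On a freshly cleared screen with an unchanged paper colour, both variants give the same picture.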

pmeier

#29
Right, I clear the screen beforehand. The speed gain is measurable and also noticeable. So thank you. Maybe it's not a candidate for the CPCtelera API, but my private API uses the current pen/paper colours and the blank optimization, together with a screen clear and a wait for frame flyback.

SRS

Quote from: ronaldo on 07:34, 27 May 18
Well, the idea is not so simple. In fact, your idea seems right until you think twice about it. Spaces have to be printed for two reasons: they are not always the same colour as the background (you can pick a paper colour when you call drawString and drawChar), and you cannot safely assume that your background will be empty.

So, I'm afraid to say that this optimization cannot be used, even if it gains some cycles.

As said: too simple ;)

But not if you know you do not need to change pen/paper for the blank parts. So there it may save some cycles.

But not in general, absolutely agree with you.

ronaldo

Today I had some time to put ideas into code and I have a working version for mode 0. I have made a test and these are the results:

   // Testing with 2 simple strings                          | New ver. | Old ver. |
   cpct_drawStringM0("Hello World!", (u8*)0xC000, 3, 5);  // | 10391 us | 14734 us | => ~42% faster
   cpct_drawStringM0("Dolly man",    (u8*)0xC0A0, 1, 9);  // |  7829 us | 11062 us | => ~41% faster

I still have to finish documentation before pushing up the code to development branch, but the improvement is quite remarkable. The improvement comes at a cost of +13 bytes for the total space taken by function code. Although there is still some little room for improvement, it won't be so significant.
In my mental designs, Mode 1 function should improve more than Mode 0. We'll see :)
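For anyone checking the percentages: they match the ratio of old time to new time (a throughput gain), not the fraction of time saved, which would be about 29% for the first string. A minimal sketch of that arithmetic follows; `speedup_percent` is a hypothetical helper name, not part of CPCtelera.

```c
#include <assert.h>

/* Percentage by which the new routine is faster, taken as throughput gain:
   (old_us / new_us - 1) * 100, rounded to the nearest integer.
   Integer-only arithmetic, so it would also suit a compiler without floats. */
unsigned speedup_percent(unsigned long old_us, unsigned long new_us)
{
    assert(new_us > 0 && old_us >= new_us);
    return (unsigned)(((old_us - new_us) * 100UL + new_us / 2UL) / new_us);
}
```

With the timings from the table, `speedup_percent(14734, 10391)` gives 42 and `speedup_percent(11062, 7829)` gives 41.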

ervin

Quote from: ronaldo on 13:36, 10 June 18
Today I had some time to put ideas into code and I have a working version for mode 0. I have made a test and these are the results:

   // Testing with 2 simple strings                          | New ver. | Old ver. |
   cpct_drawStringM0("Hello World!", (u8*)0xC000, 3, 5);  // | 10391 us | 14734 us | => ~42% faster
   cpct_drawStringM0("Dolly man",    (u8*)0xC0A0, 1, 9);  // |  7829 us | 11062 us | => ~41% faster

I still have to finish documentation before pushing up the code to development branch, but the improvement is quite remarkable. The improvement comes at a cost of +13 bytes for the total space taken by function code. Although there is still some little room for improvement, it won't be so significant.
In my mental designs, Mode 1 function should improve more than Mode 0. We'll see :)

WOW!!!
I'm really looking forward to trying the new routine!

Any chance you could show the improved code?
;D

ronaldo

Quote from: ervin on 13:51, 10 June 18
Any chance you could show the improved code?
Yes, of course :) . As I said before, I just needed some time to finish the documentation ;) .

You may want to check the Strings code folder under CPCtelera's development branch.

Functions improved include:

Hope you enjoy it ;)

ronaldo

And there you go, optimized versions for cpct_drawStringM1 are now ready and pushed to CPCtelera's development branch:

   // Testing with a 40-character Mode 1 string.
   // cpct_drawStringM1_f ==> Old fast version  (379 bytes in total)
   // cpct_drawStringM1   ==> New version       (216 bytes in total, 163 bytes less, 43% less space)
   cpct_drawStringM1_f("0123456789012345678901234567890123456789", (u8*)0xC0A0, 3, 5);  // | 24486 us |
   cpct_drawStringM1  ("0123456789012345678901234567890123456789", (u8*)0xC000, 3, 5);  // | 19501 us | => ~25% faster

As you can see, the new version takes 163 bytes fewer than the previous fast version and is ~25% faster.
Moreover, there is an interesting new side effect: it is easy to add a new version of the same function that uses your own character set. You only need to place your character set at 0x3800 and then remove the lines in any cpct_drawString/cpct_drawChar function that enable and disable the ROM. Drawing is now decoupled from the cpct_drawString/cpct_drawChar functions (it is implemented in the cpct_drawCharMx_inner_asm functions), which also enables their direct use. You may create your own versions of the cpct_drawString/cpct_drawChar functions that call the inner functions, without the cost of including their code in your binary.
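To make the 0x3800 layout concrete, here is a host-side sketch in plain C. It is not the CPCtelera implementation: `glyph_address` and `draw_char_m2` are hypothetical names, the video memory is a simulated 16 KB array, and the sketch assumes the usual firmware-style layout (8 bytes per character, 256 characters) plus the standard Mode 2 screen, where one byte holds 8 pixels and the 8 pixel lines of a character row are 0x800 bytes apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHARSET_BASE   0x3800u  /* where the character set is expected to live */
#define BYTES_PER_CHAR 8u       /* assumed: one byte per pixel line, 8 lines   */
#define LINE_STRIDE    0x800u   /* assumed: gap between pixel lines of a row   */
#define VIDEO_SIZE     0x4000u  /* simulated 16 KB of video memory             */

/* Address of the 8-byte definition of character c. */
uint16_t glyph_address(uint8_t c)
{
    return (uint16_t)(CHARSET_BASE + (uint16_t)c * BYTES_PER_CHAR);
}

/* Copy one glyph into the simulated video buffer at byte offset `offset`.
   Pen 1 on paper 0 only, so the glyph bytes are copied unchanged.
   `charset` points to a local copy of the set (index 0 = character 0). */
void draw_char_m2(uint8_t *video, uint16_t offset,
                  const uint8_t *charset, uint8_t c)
{
    const uint8_t *glyph;
    assert(video != NULL && charset != NULL);
    assert(offset < LINE_STRIDE);   /* first pixel line of a character row */
    glyph = charset + (size_t)c * BYTES_PER_CHAR;
    for (unsigned line = 0; line < BYTES_PER_CHAR; ++line)
        video[offset + line * LINE_STRIDE] = glyph[line];
}
```

Under these assumptions the space character (32) sits at 0x3900 and the last character (255) at 0x3FF8, so a full custom set occupies 0x3800 to 0x3FFF.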

ervin

Fantastic!
Thanks so much for your work on this!

Widukind

#36
[ot]What an interesting topic this is. I am new to retro programming, but as a former Z80 and ARM assembler programmer I enjoy reading such topics a lot, including Prof Ronaldo's work. We can still learn a lot in retrospect! So please continue. :-)

Quote
I was worried about the decreasing knowledge my students shown about how programs actually work. Then I thought that asking them to create games for the Amstrad CPC could be a great way to force them to deal with low-level stuff and learn from it. I also tend to create ways to transform assignments into real world projects. I don't want my students to hand me what they think I expect from them: I want them to develop real projects for real people.
[/ot]
