Topic: Craving for speed ? A visual cheat sheet to help optimizing your code to death.

Offline cpcitor

  • The user previously known as FindYWay
  • CPC6128
  • ****
  • Posts: 238
  • Country: fr
  • My heart still runs on traditional CPC.
    • My code for the CPC.
  • Liked: 112
  • Likes Given: 255
Hello,

Thank you @db6128 for the link to the timings "Tested and verified on real hardware by Kevin (arnoldemu) and Richard (Executioner)", posted in the thread "[Closed] Fastest cycles/byte memory write rate : answer 2µs per byte with PUSH".

Thanks also to the people who replied on that thread and on "Chase HQ : how did they manage ?".
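
As a rough illustration of where the "2µs per byte with PUSH" figure comes from, here is a small Python sketch that compares a few ways of writing to memory. The per-instruction costs (in µs, i.e. NOP-equivalents on the CPC) are assumptions taken from the commonly quoted CPC timing tables, so please check them against the cheat sheet before relying on them. The names COST_US and us_per_byte are made up for this example.

```python
# Assumed CPC costs in microseconds (1 NOP = 1 us). Verify against the cheat sheet.
COST_US = {
    "PUSH HL": 4,    # writes 2 bytes at once
    "LD (HL),A": 2,  # writes 1 byte
    "INC HL": 2,     # advances the pointer, writes nothing
    "LDI": 5,        # copies 1 byte and advances both pointers
}


def us_per_byte(sequence, bytes_written):
    """Total cost of an instruction sequence divided by the bytes it writes."""
    return sum(COST_US[instruction] for instruction in sequence) / bytes_written


rates = {
    "PUSH HL": us_per_byte(["PUSH HL"], 2),
    "LD (HL),A + INC HL": us_per_byte(["LD (HL),A", "INC HL"], 1),
    "LDI": us_per_byte(["LDI"], 1),
}

# Print the candidates from fastest to slowest.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name:20s} {rate:.1f} us/byte")
```

Under those assumed costs, PUSH comes out ahead because a single instruction stores two bytes; the price is that SP has to be hijacked as the write pointer.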

To write a really optimized routine for specific needs, I think one needs to get familiar with timings. The Unofficial Amstrad WWW Resource is an interesting reference, but its layout is basic.
If laid out more visually, it can be even more useful.

So I copy-pasted that into a spreadsheet program and moved around and adjusted mercilessly until it made sense.

We all benefit from previously published work, so let's improve and publish again.
See link at end of post : "Z80 timings on Amstrad CPC - Cheat sheet" in PDF.

It is not intended to be printed but read and browsed visually while looking for variants of instructions.

For example : open, press Ctrl-F (for search) and type RLC, see all RLC variants highlighted so that you can quickly take that into account and figure out the best compromise for your context.

Also: publishing implies crediting previous authors, so I have to choose a definitive nickname. I'm hesitating about the best nickname to take. Quick poll: do you prefer "findyway" (which is actually the short name for my current project) or "cpcitor" ? What do they bring to your mind ?

All feedback welcome !

EDIT: small fix in attached PDF
EDIT: fix typo in post
EDIT: update PDF (POP IX/IY were missing.)
EDIT 20131019 : update PDF (All instructions available on HL, DE and BC, mentioned here as rp, are also available on SP.)
« Last Edit: 10:26, 19 October 13 by cpcitor »
Had a CPC since 1985, currently software dev professional, including embedded systems.

I made the first CPC cross-dev environment that auto-installs C compiler and tools: cpc-dev-tool-chain: a portable toolchain for C/ASM development targetting CPC.

Offline Sykobee (Briggsy)

  • 6128 Plus
  • ******
  • Posts: 666
  • Country: gb
  • Liked: 221
  • Likes Given: 328
Apparently I don't have access to download the file? :-(


Edit: Works now.

Offline McKlain

  • 6128 Plus
  • ******
  • Posts: 867
  • Country: es
  • Programmable Sound Generator
    • www.mcklain.com
  • Liked: 338
  • Likes Given: 950
Well, findyway says nothing to me, and cpcitor sounds very funny in spanish  ;D

Offline fano

  • Supporter
  • 6128 Plus
  • *
  • Posts: 830
  • Country: fr
  • Easter Egg Programmer
    • Easter Egg
  • Liked: 267
  • Likes Given: 594
Nice chart ! It can be very useful ;)
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

Offline db6128

  • 464 Plus
  • *****
  • Posts: 316
  • Country: gb
  • We don’t speak 8080 in this house.
  • Liked: 71
  • Likes Given: 44
Quote
Thank you @db6128 for the link to timings
Glad to help, and thanks for this very useful edit!
 
Quote
Craving for speed ? A visual cheat sheet to help optimizing your code to death.
I was, but instead, I optimised my code to Life. ;)
 
Quote
Also: publishing implies crediting previous authors, so I have to choose a definitive nickname. I'm hesitating about the best nickname to take. Quick poll: do you prefer "findyway" (which is actually the short name for my current project) or "cpcitor" ? What do they bring to your mind ?
At least in my current state of having no idea what FindYWay means :P, I much prefer cpcitor. To me, it suggests ‘CPC operator’ (software), mixed with capacitor (hardware), mixed a little bit with a robot… so I’d say you have all the bases covered. :D
[The owner of one of the few existing cartridges of Chase HQ 2] mentioned to me that unless someone could find a way to guarantee the code wouldn't be duplicated to anyone else, he wouldn't be interested.
Did he also say things like "My treasureeeeee" and is he a little grey guy?

Offline TFM

  • Visit the mysteries of the CPC at www.futureos.de
  • Supporter
  • 6128 Plus
  • *
  • Posts: 9.899
  • Country: aq
  • Space Chicken for FutureOS is free!
    • index.php?action=treasury
    • FutureOS - The revolution on CPC!
  • Liked: 1977
  • Likes Given: 4650
The PDF is great, now add the undocumented and illegal commands and it will be perfect :)
TFM of FutureSoft
Also visit the CPC and Plus users favorite OS: FutureOS - The Revolution on CPC6128 and 6128Plus

Offline db6128

I agree, and hahaha, I just saw TFM’s custom text under his avatar!  :laugh: :laugh:  I’ve never thought of myself as a trend-setter  :laugh:

Offline TFM

Trend mode = ON!
This may help...

Offline TFM

You guys are great !!!

Offline cpcitor

Hey, I just noticed that although SDCC uses ADD HL,SP on a regular basis, Arnoldemu's page Unofficial Amstrad WWW Resource and hence this document did not explicitly mention it.

The conclusion of the analysis below is :

Every instruction listed in the PDF with operand "rp" is available not only for rp=BC,DE,HL but also for rp=SP, except PUSH and POP.


Below is the analysis that led to that summary.

From the downloadable spreadsheet at Complete Z80 instruction set - ticalc.org it seems that ADD HL,SP has the same timings as e.g. ADD HL,DE.
Other instructions mentioning rp are applicable to SP. Here's the full list:
Code: [Select]
LD SP,nn
ADD HL,SP ; same timing as with HL
INC SP ; same timing as with HL
DEC SP ; same timing as with HL
LD SP,HL  ;(already in PDF)
EX (SP),HL  ;(already in PDF)
ADD IX,SP ; same timing as with HL
LD SP,IX  ;(already in PDF)
EX (SP),IX  ;(already in PDF)
SBC HL,SP  ;(already in PDF)
ADC HL,SP  ;(already in PDF)
LD (nn),SP  ; same timing as with HL
LD SP,(nn)  ; same timing as with HL

Ok so when they exist and are not already mentioned in Kevin's doc and the PDF, instructions with SP have same timing as their HL counterparts.

Checking every occurrence of "rp" in Kevin's doc and the PDF, I see that all variants exist except PUSH SP and POP SP (would they make sense? Perhaps, after all).
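
To make the PUSH/POP exception a bit more concrete, here is a minimal Python sketch of how the register-pair field is laid out in the basic (unprefixed) opcodes. In the standard Z80 encoding, the same 2-bit field selects BC, DE, HL or SP for most 16-bit instructions, while for PUSH and POP the fourth slot means AF instead of SP. The names RP, RP2, BASES and opcode_table are made up for this example, and only a handful of instruction groups are covered.

```python
RP = ["BC", "DE", "HL", "SP"]   # register-pair field for most 16-bit instructions
RP2 = ["BC", "DE", "HL", "AF"]  # same field as interpreted by PUSH and POP

# Base opcode (register-pair field = 0) and the table used to name the operand.
BASES = {
    "LD {},nn": (0x01, RP),
    "INC {}": (0x03, RP),
    "ADD HL,{}": (0x09, RP),
    "DEC {}": (0x0B, RP),
    "POP {}": (0xC1, RP2),
    "PUSH {}": (0xC5, RP2),
}


def opcode_table():
    """Build {opcode: mnemonic}; the register-pair field sits in bits 4-5."""
    table = {}
    for template, (base, pairs) in BASES.items():
        for index, pair in enumerate(pairs):
            table[base | (index << 4)] = template.format(pair)
    return table


table = opcode_table()
for opcode in sorted(table):
    print(f"{opcode:02X} = {table[opcode]}")
```

So there is simply no slot left for PUSH SP or POP SP: the encoding that would be theirs is taken by PUSH AF and POP AF.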

I have updated the PDF.

Offline TFM

Quote
ADD IX,SP ; same timing as with HL


Nope, ADD HL,SP is one microsecond faster. Using IX or IY instead of HL always costs one more byte and one more µs.


If you talk about ADD IX,HL - sorry, that won't work.


Offline cpcitor

Quote
Nope, ADD HL,SP is one microsecond faster. Using IX or IY instead of HL always costs one more byte and one more µs.

If you talk about ADD IX,HL - sorry, that won't work.

You're totally right, TFM.
That's the problem when addressing an audience of masters.  ;)

I was focused on checking equivalence between SP and others and forgot for a moment that the DD prefix replaces HL by IX in operations.

It should read: ADD IX,SP ; same timing as ADD IX,DE or ADD IX,BC.

Code: [Select]
09 = ADD HL,BC
19 = ADD HL,DE
29 = ADD HL,HL
DD,09 = ADD IX,BC
DD,19 = ADD IX,DE
DD,29 = ADD IX,IX
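
The pattern in that list can be sketched in a few lines of Python: for this ADD group, a DD prefix in front of the opcode turns every HL into IX (and FD does the same with IY). The names UNPREFIXED, INDEX_PREFIX and decode are made up for this example, and the plain text substitution is only a fair model for this ADD group; other prefixed instructions follow more involved rules.

```python
# Unprefixed 16-bit ADD opcodes (standard Z80 encoding).
UNPREFIXED = {
    0x09: "ADD HL,BC",
    0x19: "ADD HL,DE",
    0x29: "ADD HL,HL",
    0x39: "ADD HL,SP",
}

# Prefix byte -> index register that takes the place of HL.
INDEX_PREFIX = {0xDD: "IX", 0xFD: "IY"}


def decode(code):
    """Decode one ADD instruction given as a sequence of bytes."""
    code = list(code)
    if code[0] in INDEX_PREFIX:
        # Every HL becomes the index register, source operand included.
        return UNPREFIXED[code[1]].replace("HL", INDEX_PREFIX[code[0]])
    return UNPREFIXED[code[0]]


for code in ([0x29], [0xDD, 0x29], [0xDD, 0x39], [0xFD, 0x19]):
    print(",".join(f"{byte:02X}" for byte in code), "=", decode(code))
```

Because the substitution hits both operands, DD,29 ends up as ADD IX,IX, and no byte sequence in this group decodes to ADD IX,HL, which matches TFM's remark.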

Thanks TFM for spotting that.

I wish the PDF could be a reference not only for the timings but also for the available instructions.

The Z80 logic makes that ideal a bit complicated. Let's keep it simple at the moment.
« Last Edit: 18:48, 20 October 13 by cpcitor »

Offline TFM

Don't thank me. Thanks to YOU for compiling all that valuable data.  :)