CPCWiki forum

General Category => Programming => Topic started by: cpcitor on 12:17, 16 January 13

Title: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: cpcitor on 12:17, 16 January 13
Hello,

Thank you @db6128 for the link to the timings "Tested and verified on real hardware by Kevin (arnoldemu) and Richard (Executioner)." See [Closed] Fastest cycles/byte memory write rate : answer 2µs per byte with PUSH. (http://www.cpcwiki.eu/forum/programming/fastest-cyclesbyte-memory-write-rate/msg56281/#msg56281)

Thanks also to the people who replied in that thread and in Chase HQ : how did they manage ? (http://www.cpcwiki.eu/forum/programming/chase-hq-how-did-they-manage/msg56190/#msg56190)

To write a really optimized routine for specific needs, I think one needs to get familiar with instruction timings. Unofficial Amstrad WWW Resource (http://www.cpctech.org.uk/docs/instrtim.html) is a useful reference, but its layout is basic.
Laid out more visually, it could be even more useful.

So I copy-pasted it into a spreadsheet program, then moved things around and adjusted them mercilessly until it made sense.

We all benefit from previously published work, so let's improve and publish again.
See link at end of post : "Z80 timings on Amstrad CPC - Cheat sheet" in PDF.

It is meant to be read and browsed on screen rather than printed, for instance when looking for variants of an instruction.

For example: open it, press Ctrl-F (search) and type RLC. All RLC variants are highlighted, so you can quickly compare them and work out the best compromise for your context.

Also: publishing implies crediting previous authors, so I have to choose a definitive nickname. I'm undecided about which nickname to use. Quick poll: do you prefer "findyway" (which is actually the short name for my current project) or "cpcitor" ? What do they bring to mind ?

All feedback welcome !

EDIT: small fix in attached PDF
EDIT: fix typo in post
EDIT: update PDF (POP IX/IY were missing.)
EDIT 20131019 : update PDF (all instructions listed with operand rp, i.e. BC, DE and HL, are also available with SP, except PUSH and POP.)
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: Sykobee (Briggsy) on 12:51, 16 January 13
Apparently I don't have access to download the file? :-(


Edit: Works now.
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: McKlain on 12:59, 16 January 13
Well, findyway says nothing to me, and cpcitor sounds very funny in Spanish  ;D
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: fano on 15:12, 16 January 13
Nice chart ! It can be very useful ;)
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: db6128 on 17:49, 16 January 13
Quote
Thank you @db6128 for the link to timings
Glad to help, and thanks for this very useful edit!
 
Quote
Craving for speed ? A visual cheat sheet to help optimizing your code to death.
I was, but instead, I optimised my code to Life. ;) (http://www.cpcwiki.eu/forum/programming/amstrifejohn-conways-game-of-life-for-the-cpcfast-many-features-on-the-way!/msg54680/#msg54680)
 
Quote
Also: publishing implies crediting previous authors, so I have to choose a definitive nickname. I'm undecided about which nickname to use. Quick poll: do you prefer "findyway" (which is actually the short name for my current project) or "cpcitor" ? What do they bring to mind ?
At least in my current state of having no idea what FindYWay means :P, I much prefer cpcitor. To me, it suggests ‘CPC operator’ (software), mixed with capacitor (hardware), mixed a little bit with a robot… so I’d say you have all the bases covered. :D
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: TFM on 23:36, 16 January 13
The PDF is great, now add the undocumented and illegal commands and it will be perfect :)
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: db6128 on 01:50, 17 January 13
I agree, and hahaha, I just saw TFM’s custom text under his avatar!  :laugh: :laugh:  I’ve never thought of myself as a trend-setter  :laugh:
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: TFM on 03:32, 17 January 13
Trend mode = ON!
This may help...
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: db6128 on 15:55, 17 January 13
Hey, I have a much newer version :P
http://z80.info/zip/z80-documented.pdf
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: cpcitor on 16:20, 17 January 13
Why not link to the official source ? Msx (and other) Emulation Info (http://www.myquest.nl/z80undocumented/)
http://www.myquest.nl/z80undocumented/z80-documented-v0.91.pdf
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: TFM on 00:28, 18 January 13
You guys are great !!!
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: cpcitor on 10:20, 19 October 13
Hey, I just noticed that although SDCC uses ADD HL,SP on a regular basis, Arnoldemu's page Unofficial Amstrad WWW Resource (http://www.cpctech.org.uk/docs/instrtim.html) and hence this document did not explicitly mention it.

The conclusion of the analysis below is:

Every instruction listed in the PDF with operand "rp" is available not only for rp=BC, DE, HL but also for rp=SP, except PUSH and POP.


Here is the analysis behind that summary.

From the downloadable spreadsheet at Complete Z80 instruction set - ticalc.org (http://www.ticalc.org/archives/files/fileinfo/195/19571.html), it seems that ADD HL,SP has the same timings as e.g. ADD HL,DE.
The other instructions listed with rp also apply to SP. Here's the full list:
Code: [Select]
LD SP,nn
ADD HL,SP ; same timing as with HL
INC SP ; same timing as with HL
DEC SP ; same timing as with HL
LD SP,HL ; (already in PDF)
EX (SP),HL ; (already in PDF)
ADD IX,SP ; same timing as with HL
LD SP,IX ; (already in PDF)
EX (SP),IX ; (already in PDF)
SBC HL,SP ; (already in PDF)
ADC HL,SP ; (already in PDF)
LD (nn),SP ; same timing as with HL
LD SP,(nn) ; same timing as with HL

OK, so when they exist and are not already mentioned in Kevin's doc and the PDF, instructions with SP have the same timing as their HL counterparts.

Checking every occurrence of "rp" in Kevin's doc and the PDF, I see that all of them exist with SP except PUSH SP and POP SP (would those even make sense ? Perhaps, after all).

I have updated the PDF.
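
One way to see why SP fits everywhere except PUSH and POP is to look at the opcode bytes: the register pair is a 2-bit field in bits 5-4, and PUSH/POP reuse value 3 for AF instead of SP. The Python sketch below illustrates this for a handful of unprefixed instructions only (the ED-prefixed ones such as SBC HL,rp are left out). The names `RP`, `RP_STACK`, `BASES`, `pairs_for` and `encode` are made up for this illustration, and it says nothing about timings.

```python
# Sketch of the Z80 register-pair field (bits 5-4 of the opcode).
RP = ["BC", "DE", "HL", "SP"]        # used by LD rp,nn / INC / DEC / ADD HL,rp
RP_STACK = ["BC", "DE", "HL", "AF"]  # used by PUSH / POP: value 3 means AF

# Opcodes with the pair field set to 0 (i.e. the BC variant).
BASES = {
    "LD rp,nn": 0x01,
    "INC rp": 0x03,
    "ADD HL,rp": 0x09,
    "DEC rp": 0x0B,
    "POP rp": 0xC1,
    "PUSH rp": 0xC5,
}


def pairs_for(template):
    """Return the four register pairs the given instruction accepts."""
    return RP_STACK if template.startswith(("PUSH", "POP")) else RP


def encode(template, pair):
    """Return the opcode byte, or raise ValueError if no encoding exists."""
    names = pairs_for(template)
    if pair not in names:
        raise ValueError(f"no encoding for {template.replace('rp', pair)}")
    return BASES[template] | (names.index(pair) << 4)


if __name__ == "__main__":
    # Print every variant, e.g. "39 = ADD HL,SP" and "F5 = PUSH AF".
    for template in BASES:
        for pair in pairs_for(template):
            print(f"{encode(template, pair):02X} = {template.replace('rp', pair)}")
```

Calling `encode("PUSH rp", "SP")` raises `ValueError`, because the slot SP would occupy is taken by AF.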
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: TFM on 22:01, 19 October 13
Quote
ADD IX,SP ; same timing as with HL


Nope, ADD HL,SP is one microsecond faster. Using IX or IY instead of HL always costs one byte and one µs more.


If you talk about ADD IX,HL - sorry, that won't work.

Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: cpcitor on 18:45, 20 October 13
Quote
Nope, ADD HL,SP is one microsecond faster. Using IX or IY instead of HL always costs one byte and one µs more.

If you talk about ADD IX,HL - sorry, that won't work.

You're totally right, TFM.
That's the problem when addressing an audience of masters.  ;)

I was focused on checking equivalence between SP and the others and forgot for a moment that the DD prefix replaces HL with IX in operations.

It should read: ADD IX,SP ; same timing as ADD IX,DE or ADD IX,BC.

Code: [Select]
09 = ADD HL,BC
19 = ADD HL,DE
29 = ADD HL,HL
DD,09 = ADD IX,BC
DD,19 = ADD IX,DE
DD,29 = ADD IX,IX

Thanks TFM for spotting that.
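
The prefix rule can be sketched in a few lines of Python. This is only an illustration of the encoding discussed above, not a full assembler: `PAIRS`, `PREFIXES` and `encode_add` are hypothetical names, and the FD prefix for IY is included on the assumption that it behaves like DD does for IX. It shows why ADD IX,SP exists, why ADD IX,HL does not, and where the extra byte comes from. Timings are not modelled.

```python
# Sketch: how a DD/FD prefix turns ADD HL,rp into ADD IX,rp / ADD IY,rp.
PAIRS = ["BC", "DE", "HL", "SP"]                      # 2-bit source field, bits 5-4
PREFIXES = {"HL": (), "IX": (0xDD,), "IY": (0xFD,)}   # prefix bytes per destination


def encode_add(dest, src):
    """Return the bytes of ADD dest,src where dest is HL, IX or IY."""
    if dest not in PREFIXES:
        raise ValueError(f"unsupported destination {dest}")
    # The prefix replaces HL everywhere in the instruction, so the HL slot
    # of the source field now names the index register itself.
    slots = [dest if pair == "HL" else pair for pair in PAIRS]
    if src not in slots:
        raise ValueError(f"ADD {dest},{src} cannot be encoded")
    opcode = 0x09 | (slots.index(src) << 4)
    return bytes(PREFIXES[dest]) + bytes([opcode])


if __name__ == "__main__":
    for dest in PREFIXES:
        for src in ["BC", "DE", dest, "SP"]:
            code = ",".join(f"{b:02X}" for b in encode_add(dest, src))
            print(f"{code} = ADD {dest},{src}")
```

`encode_add("IX", "HL")` raises `ValueError`, which matches TFM's remark that ADD IX,HL won't work, and every IX or IY form is one byte longer than its HL counterpart.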

I would like the PDF to be a reference not only for the timings but also for the available instructions.

The Z80 logic makes that ideal a bit complicated. Let's keep it simple for the moment.
Title: Re: Craving for speed ? A visual cheat sheet to help optimizing your code to death.
Post by: TFM on 22:22, 21 October 13
Don't thank me. Thanks to YOU for compiling all that valuable data.  :)