Using an emulator for performance measurement and profiling?

Started by cpcitor, 23:15, 29 October 13


cpcitor

Hello,

When optimizing software seriously, precise and reproducible overall measurements are important.

Duration measurement on a real CPC

When optimizing a demo running at native framerate, the traditional trick is to change the border color at the end of the computation and see on which screen line the color changes. Adjust the code, run it again, and check whether the color change has moved up or down. It's relatively precise but not scriptable.

For computations that last longer than a frame, this cannot work. You can count the number of interrupts at 300 Hz instead, but that is much less precise.
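To put rough numbers on that loss of precision, here is a small sketch. It assumes the standard 50 Hz CPC display timing (312 scanlines of 64 µs per frame, six interrupts per frame, one NOP per microsecond); the function and constant names are made up for illustration.

```python
# Timing of a standard 50 Hz CPC display (assumed, see text above).
LINE_NOPS = 64              # one scanline lasts 64 microseconds, i.e. 64 NOPs
LINES_PER_FRAME = 312
INTERRUPTS_PER_FRAME = 6    # the "300 Hz" interrupt

frame_nops = LINE_NOPS * LINES_PER_FRAME
interrupt_period_nops = frame_nops // INTERRUPTS_PER_FRAME


def interrupts_to_nops(ticks):
    """Return (low, high) bounds in NOPs for a duration during which
    `ticks` interrupts were counted. The true duration lies between them."""
    low = max(0, ticks - 1) * interrupt_period_nops
    high = (ticks + 1) * interrupt_period_nops
    return low, high


print(frame_nops, interrupt_period_nops, interrupts_to_nops(10))
```

Under these assumptions a single interrupt tick covers 3328 NOPs, so a count of 10 ticks only says the code took somewhere between 29952 and 36608 NOPs, while the border trick resolves to a scanline (64 NOPs) or better.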

Are there other options on a real CPC?

On an emulator

Emulators (with cycle-accurate Z80 emulation) open up interesting possibilities.

I imagine a setup:


  • Run the emulator from a script, with these arguments:
      • a binary to run, one that runs for a finite time,
      • an instruction to gather performance measurements and save them on exit in some readable file format (e.g. text),
      • an instruction to quit on some event, for example when the Z80's PC register reaches 0 (which means the binary asked to reboot the CPC).
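The setup above could be driven by a script along these lines. This is only a sketch of the idea: the three command-line flags (`--run`, `--stats-file`, `--quit-on-pc`) are hypothetical and do not belong to any emulator I know of, and the emulator and file names are placeholders.

```python
import subprocess


def build_command(emulator, binary, report_path):
    """Build the emulator command line. All three flags are HYPOTHETICAL."""
    return [
        emulator,
        "--run", binary,              # hypothetical: binary to load and run
        "--stats-file", report_path,  # hypothetical: save measurements here on exit
        "--quit-on-pc", "0",          # hypothetical: quit when the Z80's PC reaches 0
    ]


def run_once(emulator, binary, report_path, timeout_s=600):
    """Run one unattended measurement and return the report text."""
    subprocess.run(
        build_command(emulator, binary, report_path),
        check=True,         # raise if the emulator exits with an error
        timeout=timeout_s,  # safety net in case the binary never reboots
    )
    with open(report_path, encoding="utf-8") as report:
        return report.read()
```

A caller would then loop over several builds, for example `run_once("some-emulator", "build_a.bin", "a.txt")`, and compare the saved reports. The timeout matters because the "finite time" assumption can fail while experimenting.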

Which performance measurements?


  • (1) Total time spent running the binary (e.g. in NOPs).
  • (2) Number of times each of the 65536 (or 131072, or ...) bytes of memory was read/written by the Z80.
  • others ?

Benefits


  • Comparing (1) between two runs with different binaries tells which one runs faster. For example, should I compile with SDCC's options --(no-)callee-saves, or --(no-)omit-frame-pointer? How many cycles do I win or lose?
  • A completely unattended, scriptable setup allows interesting scenarios (like testing many alternatives and their combinations).
  • Looking at the values of (2) in the CODE memory area would help tremendously to see where the Z80 spends its time. Basically it would say: "See, shaving even one CPU cycle in this loop yields 100 times more speedup than in that loop."
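As an illustration of the last point, here is a minimal sketch of how the per-address counts from (2) could be turned into a hotspot list. It assumes the emulator's report has already been parsed into a plain list of read counts indexed by address; that format and the function name are assumptions, not something an existing emulator is known to produce.

```python
def hottest_addresses(read_counts, code_start, code_end, top=5):
    """Return up to `top` (address, count) pairs from the code area
    [code_start, code_end), most frequently read first.
    Addresses that were never read are left out."""
    region = range(code_start, code_end)
    ranked = sorted(region, key=lambda addr: read_counts[addr], reverse=True)
    return [(addr, read_counts[addr]) for addr in ranked[:top] if read_counts[addr] > 0]


# Synthetic example: a tight loop at 0x4000-0x4001 and a rarely used routine.
counts = [0] * 65536
counts[0x4000] = 100
counts[0x4001] = 100
counts[0x4100] = 1

for addr, count in hottest_addresses(counts, 0x4000, 0x4200, top=3):
    print(f"{addr:#06x}: {count} reads")
```

Mapping the hottest addresses back to source lines would still need the linker's map file, which this sketch leaves out.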

Are you aware of any emulator that can do such a thing, or even part of it?
Had a CPC since 1985, currently software dev professional, including embedded systems.

I made in 2013 the first CPC cross-dev environment that auto-installs C compiler and tools: cpc-dev-tool-chain: a portable toolchain for C/ASM development targetting CPC, later forked into CPCTelera.

ralferoo

WinAPE has a cycle counter. I used to use it by clearing it at a breakpoint, running to the next breakpoint and just reading off the total cycles elapsed between the two.
