Is a SNES mode 7 impossible?

Started by Trebmint, 17:27, 18 December 09

Trebmint

Hey, just found the wiki after a long while without CPC Zone. Just thought I'd start my first post with a question for all those Z80 guys who can optimise better than I can.

Is a SNES Mode 7 impossible to achieve at a decent speed? I've been thinking about it, and I'm no longer sure, although I'd previously thought it was probably impossible. Has it been done in a demo already?

Here's how I think it can be done, and I just wonder how many T cycles you Z80 geniuses might get it down to. Here's a quick little image.

Okay, basically I imagine the visible floor area for a Mario Kart game to be 128 pixels wide by 80 high. The map uses tiles. That's 128 tiles of 16x16 pixels, each tile taking exactly 256 bytes, with each byte being one pixel in the mode 0 layout 12121212. That's 16x16x128 = 32K in total.

Here's a pic.


Trebmint

The picture represents a viewing cone sitting on a 256x256 map. That map is 16x16 tiles, so there are 256 tiles in the map. The blob in the middle is the near point of the screen (the character position). Points A and B are the furthest viewpoints, C and D the nearest. Point A is the far horizon at the left of the screen, B the far horizon at the right, C the bottom left and D the bottom right.
If we assume the blob is at 130,130 on the map, we can see that A is at 6,116 or thereabouts and B at 129,4. So the X difference between A and B is 129-6 = 123. Since we will show 128 pixels horizontally, each screen pixel is 123/128 of a map unit apart. So starting at Ax = 6, each successive pixel adds 123/128.
Obviously we can't easily do 123/128 on a Z80, so we move the coordinate into the high byte of a 16-bit register:
Ax = 6*256 = 1536, Bx = 129*256 = 33024
33024 - 1536 = 31488
So the difference in X between Ax and Bx is 31488. Now let's divide that by 128, the number of pixels being displayed: 31488/128 = 246
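
The arithmetic above can be checked with a few lines of Python (used here only for illustration, not Z80 code). This is a minimal sketch; the names `SCREEN_WIDTH` and `fixed_step` are made up for the example and do not come from the post.

```python
# 8.8 fixed point: the high byte holds the whole map coordinate,
# the low byte holds the fraction.
SCREEN_WIDTH = 128  # pixels shown per line, as stated in the post


def fixed_step(start, end, steps=SCREEN_WIDTH):
    """Per-pixel increment, in 8.8 fixed point, between two whole map coordinates."""
    return ((end << 8) - (start << 8)) // steps


step_x = fixed_step(6, 129)   # Ax = 6, Bx = 129
step_y = fixed_step(116, 4)   # Ay = 116, By = 4
print(step_x)  # 246
print(step_y)  # -224
```

The Y step comes out negative because B is higher up the map than A; on a Z80 that would be a plain 16-bit two's complement add.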

Trebmint

So now all we have to do, starting at Ax (6*256), is add 246 for each of the 128 pixels to get the correct X value across the line.
The same is done for Y, using the difference between Ay and By. In effect, starting from A we have one value to add for the X increment and one for the Y increment.
Interestingly, this means that for X the high byte gives us the tile across (0-15) in its top nibble and the unit across within that tile (0-15) in its lower nibble. The low byte just holds the fraction.
Combined with the Y value this gives us the tile: xHighNibble with yHighNibble gives the tile address, and xLowNibble with yLowNibble gives the pixel data within that tile.
This can be slightly altered for Y for additional benefit. If Y is actually 16 times smaller, i.e. uses only 12 bits rather than 16, we can then add xHighNibble+yHighNibble immediately to give the tile number.
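
One way to read the nibble trick, sketched in Python for illustration. The function name `tile_and_offset` is hypothetical, and the resulting orderings (tile column in the high nibble of the tile number, pixel row in the high nibble of the offset) are assumptions that simply fall out of this particular bit arrangement; the post does not spell them out.

```python
def tile_and_offset(x_fixed, y_fixed):
    """Split 8.8 fixed-point map coordinates into (tile number, offset inside tile).

    X stays 16-bit: high byte = tile nibble | pixel nibble, low byte = fraction.
    Y is scaled down 16 times to 12 bits: tile nibble, pixel nibble, 4-bit fraction.
    """
    x_hi = (x_fixed >> 8) & 0xFF
    y12 = (y_fixed >> 4) & 0x0FFF
    y_hi, y_lo = y12 >> 8, y12 & 0xFF

    # The two tile nibbles no longer overlap, so a plain add combines them.
    tile_number = (x_hi & 0xF0) + y_hi
    # Same idea for the pixel position inside the 16x16 tile.
    pixel_offset = (y_lo & 0xF0) + (x_hi & 0x0F)
    return tile_number, pixel_offset


print(tile_and_offset(130 << 8, 130 << 8))  # (136, 34)
```

The price of the 12-bit Y is that only 4 fraction bits remain for it, so the Y increment is coarser than the X increment.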

Trebmint

This tile number (0-255) indexes a lookup table that gives the tile reference. Because each tile is 256 bytes, and assuming the tile data is stored in 256-byte chunks, we get the address immediately from this.
Similarly, the lower nibbles form an offset that points straight to the pixel data inside the tile:
xLowNibble+yLowNibble = (0-255) location of the pixel data inside the tile.
This is then done 128 times per horizontal line and 80 times vertically, moving from point A to C and from B to D in 80 steps (the visible height, though allowing for the foreshortening effect).
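
Putting the steps together, the whole routine might look roughly like this in Python. It is a sketch under several assumptions, not the poster's code: `render_floor`, `edge`, `tile_map` and `tiles` are hypothetical names, the rows are stepped linearly from A to C and B to D (real foreshortening needs uneven row spacing), and the tile and pixel ordering is one possible choice.

```python
WIDTH, HEIGHT = 128, 80  # visible floor area from the post


def edge(p, q, row):
    """8.8 fixed-point coordinate of an edge running from p (row 0) towards q."""
    return (p << 8) + ((q - p) << 8) * row // HEIGHT


def render_floor(tile_map, tiles, a, b, c, d):
    """tile_map: 256 tile references; tiles: 256-byte tiles; a..d: (x, y) corners."""
    frame = []
    for row in range(HEIGHT):
        lx, ly = edge(a[0], c[0], row), edge(a[1], c[1], row)  # left edge, A -> C
        rx, ry = edge(b[0], d[0], row), edge(b[1], d[1], row)  # right edge, B -> D
        step_x, step_y = (rx - lx) // WIDTH, (ry - ly) // WIDTH
        x, y = lx, ly
        line = bytearray(WIDTH)
        for col in range(WIDTH):
            mx, my = (x >> 8) & 0xFF, (y >> 8) & 0xFF     # whole map coordinates
            tile = tile_map[(mx & 0xF0) + (my >> 4)]      # lookup table -> tile reference
            line[col] = tiles[tile][((my & 0x0F) << 4) + (mx & 0x0F)]
            x += step_x
            y += step_y
        frame.append(line)
    return frame


# Dummy data: every map cell uses tile 0, whose bytes equal their own offset.
demo_map = [0] * 256
demo_tiles = [bytes(range(256))]
# A and B are the values from the post; C and D are invented for the demo.
frame = render_floor(demo_map, demo_tiles, (6, 116), (129, 4), (120, 135), (135, 120))
print(len(frame), len(frame[0]), frame[0][0])  # 80 128 70
```

The inner loop is the part that matters for speed: two 16-bit adds, two table lookups and a store per pixel.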

I hope this is clear. There is a lot more detail but I thought I'd start with this. Could this little routine be quick enough just by sticking to 256 boundaries etc?

fano

In principle, I'd say it is not possible at a decent speed, but after seeing the Wolf3D prototype by Richard, I may change my judgement lol

As English is not my native language, I'm not sure I understand everything, but your idea is interesting and the algorithm seems well thought out. The only demo I remember with this type of software effect is "Ecole Buissonière", which has a very impressive rotozoom. Mode 7 seems to be like a rotozoom but with perspective correction.

To save some time, you could store precomputed combinations of perspective + rotation for a fixed number of angles (something like 16 or 32) in a table for each line, as computing these would eat a lot of time.
Another problem is the CPC video memory layout. With a 32-char screen width, each line is contained in a single page (256 bytes).

Also, if you use 256 tiles of 32 bytes (4*8 pixels), so 8K, you can interleave the data to compute tile addresses very quickly (load the page address high byte into the high byte and the tile number into the low byte :D)
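
One possible reading of the interleaving idea, sketched in Python for illustration: byte k of every tile is stored in page k, indexed by tile number, so an address is just two byte loads. The names `interleave` and `tile_byte_address`, and the base page 0x40, are assumptions made up for this example.

```python
TILE_BYTES = 32    # bytes per tile, from the post
TILE_COUNT = 256   # tiles, so 32 * 256 = 8K in total


def interleave(tiles):
    """Lay tiles out so that byte k of tile t lives at offset k*256 + t."""
    data = bytearray(TILE_BYTES * TILE_COUNT)
    for t, tile in enumerate(tiles):
        for k in range(TILE_BYTES):
            data[(k << 8) | t] = tile[k]
    return data


def tile_byte_address(base_page, k, tile_number):
    """High byte = page of byte k, low byte = tile number (like H and L of HL)."""
    return ((base_page + k) << 8) | tile_number


# Dummy tiles whose contents are easy to check.
tiles = [bytes((t + k) & 0xFF for k in range(TILE_BYTES)) for t in range(TILE_COUNT)]
data = interleave(tiles)
print(len(data))                                 # 8192
print(hex(tile_byte_address(0x40, 5, 200)))      # 0x45c8
```

With this layout, stepping to the next byte of the same tile is a single increment of the high byte, with no multiply or shift on the tile number.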

Some demos have a depth effect using line split techniques, but it is done only on the Y axis (the last part of From Scratch is a great example).

For measuring program performance, I'd recommend the WinAPE debugger, which has a NOP counter (1 NOP = 4 Z80 cycles = 1µs = 1 mode 1 char width; on CPC all Z80 instruction timings are multiples of 1 NOP; a 50Hz frame is about 20000µs), and this (incomplete) timing chart.

Anyway, good luck with your project, as there is a lot of work and I am not sure it is possible to achieve this at a decent speed on a decent screen area.
"NOP" is the perfect program : short , fast and (known) bug free


arnoldemu

I thought about the same idea a couple of times.

My thinking was to calculate data for every other line and then copy the line above to fill in. This would mean a chunky looking floor but square looking pixels. It may just look OK on the CPC.

In terms of drawing, it is a matter of interpolation, drawing one scanline at a time (I've done this before in Java for this kind of floor).
I agree, 8:8 fixed point is the best way to do it.

To draw each line, you need to know the x,y interpolation factor, interpolate this for each pixel on the screen and look up into the map as you've already described.

I worked something out using lots of tables but ran out of RAM. I've not looked at it since.

In terms of rotating the viewpoint, you're probably best to rotate those 4 points and compute the interpolation values, otherwise you're going to lose some RAM.
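
A rough Python illustration of that last idea: rotate the corner points around the viewer, then derive the 8.8 fixed-point start and per-pixel increments for a scanline from the rotated points. The helper names `rotate_point` and `line_steps` are hypothetical, and floating-point `sin`/`cos` stand in for whatever small sine table a real Z80 version would use.

```python
import math


def rotate_point(p, centre, angle_deg):
    """Rotate map point p around centre by angle_deg degrees."""
    ang = math.radians(angle_deg)
    dx, dy = p[0] - centre[0], p[1] - centre[1]
    return (centre[0] + dx * math.cos(ang) - dy * math.sin(ang),
            centre[1] + dx * math.sin(ang) + dy * math.cos(ang))


def line_steps(left, right, width=128):
    """8.8 fixed-point start (x, y) and per-pixel increments for one scanline."""
    lx, ly = round(left[0] * 256), round(left[1] * 256)
    rx, ry = round(right[0] * 256), round(right[1] * 256)
    return lx, ly, (rx - lx) // width, (ry - ly) // width


# Unrotated far edge, with A and B as given in the first post.
print(line_steps((6, 116), (129, 4)))  # (1536, 29696, 246, -224)

# The same edge after turning the view 10 degrees around the viewer at 130,130.
viewer = (130, 130)
a_rot = rotate_point((6, 116), viewer, 10)
b_rot = rotate_point((129, 4), viewer, 10)
print(line_steps(a_rot, b_rot))
```

Only four points are rotated per frame, so the trigonometry costs almost nothing compared with the per-pixel loop, and no per-angle tables have to be stored.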

I planned to work out the code using C, optimise it with CPC in mind and convert to asm, but in reality I didn't get that far.


Is it fast on CPC? I never found out.

Trebmint

Just wrote some code which I think would work, and the main loop seems to be 100 T-states or so, which is pretty large. Can't see it running better than 5-6fps, so it's probably too slow, if an interesting experiment.
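
For a rough feel of how the loop cost turns into a frame rate, here is a back-of-envelope Python sketch using the figures quoted earlier in the thread (4 Z80 cycles per microsecond). The post does not say what one pass of the main loop produces, so the loop count per frame is an assumption; `frames_per_second` is a made-up name.

```python
CPU_HZ = 4_000_000  # 4 T-states per microsecond, per the figures quoted in the thread


def frames_per_second(t_states_per_loop, loops_per_frame):
    """Upper bound on frame rate if the main loop were the only cost."""
    return CPU_HZ / (t_states_per_loop * loops_per_frame)


# Assumption 1: one loop pass per pixel over the full 128x80 area.
print(frames_per_second(100, 128 * 80))  # 3.90625
# Assumption 2: one loop pass per mode 0 byte (two pixels), so 64x80 passes.
print(frames_per_second(100, 64 * 80))   # 7.8125
# Assumption 3: one pass per pixel, with both dimensions halved.
print(frames_per_second(100, 64 * 40))   # 15.625
```

The 5-6fps estimate sits between the first two cases, which suggests the result depends heavily on how much work each pass of the loop covers and how large the drawn area is.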

cpcitor

Quote from: fano on 11:45, 19 December 09
For measuring program performance, I'd recommend the WinAPE debugger, which has a NOP counter (1 NOP = 4 Z80 cycles = 1µs = 1 mode 1 char width; on CPC all Z80 instruction timings are multiples of 1 NOP; a 50Hz frame is about 20000µs), and this (incomplete) timing chart.

The link is very interesting. Trying to find a good place in the wiki to put it, the best I could find was
CPC old generation - CPCWiki

Quote from: Trebmint on 00:08, 22 December 09
Just wrote some code which I think would work, and the main loop seems to be 100 T-states or so, which is pretty large. Can't see it running better than 5-6fps, so it's probably too slow, if an interesting experiment.

5-6 fps is not 50, but it can be enough for a game. What was the area in your estimate? Half the screen (100 lines)?
Reducing the area can speed things up a lot, just like Doom did. Just halve the x and y dimensions and you get 5 or 6*2*2, which makes 20 to 24fps, which would be *really* impressive (for a CPC).