Disassembly with descrambling

Started by SerErris, 22:50, 25 November 22

SerErris

Hi, I have a challenge for you Z80 machine code guys.

I have a scrambled ROM where some data bits are scrambled by XORing them with some address bits.

Data bits 3 and 5 that are sent to the CPC are XORed with address bits A2 and A4.

Inside the ROM it looks like the scrambling changes every 4 bytes (because of A2) and changes again every 16 bytes (because of A4).

You can also XOR the data byte based on the address like this:
Bytes        XOR value (hex)
00-03       00
04-07       20
08-0B       00
0C-0F       20
10-13       08
14-17       28
18-1B       08
1C-1F       28

Then the whole thing repeats. 
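
For anyone who wants to play with this, the table can be written as a small function. This is only a sketch derived from the table above: the name `xor_mask` is made up, and the pairing (A2 flips data bit 5, A4 flips data bit 3) is read off the table values rather than from a schematic.

```python
def xor_mask(address: int) -> int:
    """XOR mask applied to a non-M1 read at `address` (per the table above)."""
    mask = 0
    if address & 0x04:   # A2 set: the table shows &20, i.e. data bit 5
        mask |= 0x20
    if address & 0x10:   # A4 set: the table shows &08, i.e. data bit 3
        mask |= 0x08
    return mask


# Print one 32-byte period in the same layout as the table.
for start in range(0x00, 0x20, 4):
    print(f"{start:02X}-{start + 3:02X}       {xor_mask(start):02X}")
```

Only A2 and A4 matter, so the pattern repeats every 32 bytes regardless of where the ROM is mapped.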

There is just one issue. The module in question here (Vortex X-Module) can turn this scrambling (XOR) on and off by checking the M1 line. So if an opcode is read by the CPU (M1 cycle), the scrambling is off. If other bytes are read, the scrambling is on.

In essence, it means that we have the following situation:

C006    C3 C9 C1       JP &C1C9

However, as C9 is at position 07 and C1 is at position 08, C9 needs to be XORed with &20 and C1 with &00 (so no change). The unscrambled ROM would be this:

C006   C3 E9 C1        JP &C1E9     
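
The same example worked through in code, as a rough sketch. The helper names are invented for illustration, and it assumes only the first byte of the instruction is fetched with M1 (true for an unprefixed opcode like JP nn).

```python
def xor_mask(address: int) -> int:
    """XOR mask for a non-M1 read: A2 flips bit 5, A4 flips bit 3 (from the table)."""
    return (0x20 if address & 0x04 else 0) | (0x08 if address & 0x10 else 0)


def descramble_operands(address: int, raw: bytes, opcode_len: int = 1) -> bytes:
    """Leave the first `opcode_len` bytes alone and XOR the remaining operand bytes."""
    out = bytearray(raw)
    for i in range(opcode_len, len(out)):
        out[i] ^= xor_mask(address + i)
    return bytes(out)


# The JP from the listing above: stored as C3 C9 C1 at &C006.
fixed = descramble_operands(0xC006, bytes([0xC3, 0xC9, 0xC1]))
target = fixed[1] | (fixed[2] << 8)   # Z80 addresses are little-endian
print(fixed.hex(" "), f"-> JP &{target:04X}")
```

The byte at &C007 picks up the &20 mask and the byte at &C008 is unchanged, which gives the JP &C1E9 shown above.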

So, as I know all of that, my question is - how could I ever read that ROM and convert it to an unscrambled one?

My problem is that a byte in the ROM itself does not tell you whether it is an opcode or not, so a normal disassembler cannot correct the operands and will fail to follow the code after the first jump.

I would most likely start to write my own disassembler - kind of a Z80 CPU emulator that applies the XOR to all bytes that are not read in an M1 cycle.

Any ideas on how to achieve that?
Proud owner of 2 Schneider CPC 464, 1 Schneider CPC 6128, GT65 and lots of books
Still learning all the details on how things work.

martin464

if the rom is defective you could dump the contents of the rom and work on the data from there
but it sounds like the module is doing this intentionally as part of some internal mechanism?

i thought you meant a defective rom at first, in which case you could try different ways to pull data off it:
OR (HL), LD A,(HL), LD A,(nnnn), POP HL, LD HL,(nnnn), LD IX,(nnnn), LDI, LD r,(IX+d), even BIT n,(HL) or BIT n,(IX+d)
but i don't think you mean this... so i guess you could work on an emulator, or find an existing one and tweak it. there must be one out there already so you don't have to write your own, but that would be a fun project. i had a javascript one for the 8080 somewhere but i'm not sure if it's lost or if i still have it
CPC 464 - 212387 K31-4Z

"One essential object is to choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the calculation." Ada Lovelace

pelrun

The rom isn't defective, it's deliberately scrambled as copy protection.

To unscramble this needs a tracing disassembler that's been instrumented to take the scrambling into account - instead of blindly disassembling a binary from start to finish it starts at a known entry point and follows every path of execution. It can then mark which bytes are opcodes and which are data, although it's an interactive process and certain tricks can frustrate it (for instance if the ROM uses an address as both an opcode and an operand at different times).

Personally I'd probably take Ghidra's Z80 processor spec and modify it to mark opcodes and decode operands, but that's an industrial-strength tool and has a learning curve to it - writing a simple disassembler from scratch is fairly straightforward by comparison.

eto

Quote from: pelrun on 14:16, 29 November 22: "The rom isn't defective, it's deliberately scrambled as copy protection."
and pretty effective... Extremely clever protection that makes it so hard to decrypt that it's just not worth it for a competitor or cracker.

Quote from: pelrun on 14:16, 29 November 22: "for instance if the ROM uses an address as both an opcode and an operand at different times"
that would be extremely nasty and would make the process even harder. And you either need to patch the ROM or keep the original circuit to use the ROM.

BSC

Quote from: SerErris on 22:50, 25 November 22: "Hi, I do have a challenge for you Z80 machine code guys."
That sounds like a really sophisticated protection scheme. Apart from the easy part, i.e. unscrambling based on the address, I think I would use one of the many existing Z80 emulation engines as a starting point and just add some callback which gets executed when an M1 is encountered. The tricky part seems to be finding all code paths. I wonder if a heuristic approach would work, like after adding the M1 callback, you let your simulator jump into random positions within the ROM to have a piece of code executed until a RET is encountered and repeat this a lot and in a way to ensure all the ROM is eventually covered. If you then, for each memory location, store the decoded value and a counter for that value (so you essentially collect probabilities for each value at each location), you might end up with some kind of probability distribution for each location where the value with the highest probability might be the actual decrypted byte.
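
The bookkeeping for that voting idea could look roughly like this. It is only a sketch of the counting part; `record` and `most_likely` are made-up names, and the emulator with its M1 callback that would feed it is not shown.

```python
from collections import Counter, defaultdict

# address -> Counter of the decoded values observed at that address
votes = defaultdict(Counter)


def record(address: int, value: int) -> None:
    """Call this from the emulator's read hook with the value it decoded."""
    votes[address][value] += 1


def most_likely(address: int):
    """Return (value, share of observations), or None if nothing was recorded."""
    seen = votes.get(address)
    if not seen:
        return None
    value, count = seen.most_common(1)[0]
    return value, count / sum(seen.values())


# Example: three runs read &C007 as an operand, one run hit it as an opcode.
for value in (0xE9, 0xE9, 0xE9, 0xC9):
    record(0xC007, value)
print(most_likely(0xC007))
```

Addresses where the winning share stays close to 50% would be the ones worth checking by hand.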

Longshot

You need to distinguish the code areas from the data areas.

The encryption of the operands can complicate this task, but a priori, you know the algorithm to correctly decrypt the "real" branch addresses.

With a modified Z80A simulator, it is possible to follow all possible paths when branches depend on the flags in the F register (JP cc, JR cc, CALL cc, RET cc).
This involves managing a tree by storing the branches that still have to be followed, and using a check for areas that have already been processed as the stopping condition.
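
A bare-bones sketch of that kind of walk, assuming the XOR table from the first post. The opcode table `OPS` is a deliberately tiny, hypothetical subset (a real tool needs the full Z80 set, relative jumps and prefixed instructions), and all names are invented for illustration.

```python
def xor_mask(address: int) -> int:
    """XOR mask for a non-M1 read: A2 flips bit 5, A4 flips bit 3."""
    return (0x20 if address & 0x04 else 0) | (0x08 if address & 0x10 else 0)


# opcode -> (length, kind); illustrative subset only
OPS = {
    0x00: (1, "next"),    # NOP
    0x3E: (2, "next"),    # LD A,n
    0x21: (3, "next"),    # LD HL,nn
    0xC8: (1, "next"),    # RET Z (may fall through)
    0xC2: (3, "branch"),  # JP NZ,nn
    0xCA: (3, "branch"),  # JP Z,nn
    0xCD: (3, "branch"),  # CALL nn (execution continues after it on return)
    0xC3: (3, "jump"),    # JP nn
    0xC9: (1, "stop"),    # RET
}


def trace(rom: bytes, base: int, entries):
    """Follow every path from `entries`; return (opcode addrs, operand addrs)."""
    opcodes, operands = set(), set()
    todo = list(entries)              # branches still to be followed
    end = base + len(rom)
    while todo:
        pc = todo.pop()
        # Stop when leaving the ROM or reaching an area already processed.
        while base <= pc < end and pc not in opcodes:
            op = rom[pc - base]       # M1 fetch: byte is used as stored
            if op not in OPS:
                break                 # unknown to this sketch: give up on this path
            length, kind = OPS[op]
            if pc + length > end:
                break
            opcodes.add(pc)
            arg_addrs = range(pc + 1, pc + length)
            args = [rom[a - base] ^ xor_mask(a) for a in arg_addrs]  # non-M1 reads
            operands.update(arg_addrs)
            if kind in ("jump", "branch"):
                todo.append(args[0] | (args[1] << 8))
            if kind in ("jump", "stop"):
                break
            pc += length
    return opcodes, operands


# Tiny test image at &C000; the high byte of the JP Z target sits at &C00C,
# so it is stored as &E0 and only descrambles to &C0 with the mask applied.
rom = bytearray([0xFF] * 0x20)
rom[0x00:0x03] = bytes([0xC3, 0x08, 0xC0])   # JP &C008
rom[0x08:0x0A] = bytes([0x3E, 0x01])         # LD A,1
rom[0x0A:0x0D] = bytes([0xCA, 0x14, 0xE0])   # JP Z,&C014 (scrambled operand)
rom[0x0D] = 0xC9                             # RET
rom[0x14] = 0xC9                             # RET
code, args = trace(bytes(rom), 0xC000, [0xC000])
print(sorted(f"{a:04X}" for a in code))
```

Without the mask the JP Z would appear to go to &E014, outside the ROM, and the RET at &C014 would never be reached. Jumps through address tables or calculated pointers are not found by a walk like this and still need manual entry points.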

It must be a little more complex when the addresses are contained in tables, but it would have been counterproductive for this protection system, because the addresses would not be encrypted.

Finally, you probably need to check afterwards that the code does not read bytes of its own executable code as data.
In a decrypted ROM those bytes would no longer return the value that the original hardware delivers for a "data" read.
Since the pointer into the ROM can be calculated (LD A,(HL)) rather than given by direct addressing (LD A,(xxxx)), I think a simulator would make the job easier.

If the ROM does check itself in this way, the only solution would be to patch the ROM so that it gets the "good" data from elsewhere.

If you're interested, you can find a Z80A simulator here, written in Z80A assembly.
Basically I wrote it to calculate the CPU between 2 pointers, but you can get inspiration from it.
http://logonsystem.fr/down/CalcCpuv4.asm
Rhaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!!

SerErris

I thought about a hardware solution... kind of applying a handbrake (Raspberry Pi or Arduino) to the real CPC and then sniffing the address and data bus.

Or the Raspberry Pi doing the whole job (it holds the ROM and plays it back). So it could react to the CPC's requests and deliver the correct output, and at the same time write the same output into a memory file.

I could then run through all the entry points (they are luckily known) and get it going. Unfortunately some of the entries are not obvious and would not work that way.

I have now taken another approach:
1. I used z80dismblr https://github.com/maziac/z80dismblr
2. I learned TypeScript :-)
3. I changed the code to do the descrambling part - vscode + TypeScript is actually fantastic, as you can immediately see any errors you made (syntax errors, that is).
4. Then this tool accepts jump entries and labels, walks the code automatically from all entry points given and disassembles it. That gives you 95% of the job, I would say.
5. The rest now needs to be done in WinAPE or any other emulator, checking function by function whether they work.
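
For the last step, a plain image for an emulator could be produced from the opcode map roughly as follows. This is a sketch and not what z80dismblr does internally: `descramble_rom` is a made-up name, the set of opcode addresses is assumed to come from the walk, and every byte the walk never reached is treated as data.

```python
def xor_mask(address: int) -> int:
    """XOR mask for a non-M1 read: A2 flips bit 5, A4 flips bit 3."""
    return (0x20 if address & 0x04 else 0) | (0x08 if address & 0x10 else 0)


def descramble_rom(rom: bytes, base: int, opcode_addrs) -> bytes:
    """Copy of `rom` as the CPU would see it: bytes at `opcode_addrs` stay as
    stored (M1 fetches), every other byte gets the address-dependent XOR."""
    out = bytearray(rom)
    for offset in range(len(out)):
        address = base + offset
        if address not in opcode_addrs:
            out[offset] ^= xor_mask(address)
    return bytes(out)


# Six filler bytes, then the JP from the first post at &C006.
raw = bytes([0xFF] * 6 + [0xC3, 0xC9, 0xC1])
plain = descramble_rom(raw, 0xC000, {0xC006})
print(plain[6:9].hex(" "))
```

Any opcode the walk missed gets XORed as if it were data, which is one reason the result still has to be checked function by function in the emulator.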

So that worked to get the ROM init process working - which is promising. It did of course not work directly, as the CPC BASIC 1.0 runs the CAS_IN_ABANDON routine and that call into the ROM had not been processed. So it ends up in a strange "press play on tape" crash. It was not a real crash, but once the CPC is in "press play on tape" nothing works anymore, as interrupts get disabled.

But thanks for your insights. I was hoping for a simpler solution. This looks like it will work.

It was about 2 days of work to get the tools set up. Now it will take some more days to get the full decryption done. As a side effect I will have a complete listing of the ROM afterwards, and the tool I used is also great for this kind of work. When you identify information, you document it in another file, and at the next disassembly run the tool takes all of that information and puts it into the source code as comments.

Also it creates graphs and you can look at the flow and the connections ... 
Get ready for getting a bigger printer :-) This is just a tiny fraction from a massive diagram.
[Attachment: excerpt of the flow diagram]

Again, thank you all for your input. It really helped me to find the right tool.
