Programming:Fast Square Root

From CPCWiki - THE Amstrad CPC encyclopedia!
Revision as of 21:03, 18 September 2007 by Executioner (Talk | contribs) (Faster Again)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Input: A = The number you want to find the square root of

Output: A = Square root of the number supplied in A

Destroys: A,B,DE

Call: CALL SqrtA

;this routine written 10 - 28 - 2003
;ported from the 68k version (by Frank Yaul) by konrad meyer
;this routine is not as memory efficient as it could be; this is
;done to save cycles

;note: this routine is using the fastest available method
;and will come within 1 below the #
;note: this routine is inaccurate in #s below 4

SqrtA:
	LD	(Asqr),A
	SRL	A
	JR	DataOver
Asqr:
	.DB	0
Bsqr:
	.DB	0
Csqr:
	.DB	0
DataOver:
	LD	(Bsqr),A
	LD	B,A
	LD	(Csqr),A
iterate:
	LD	A,(Asqr)
	LD	B,(Bsqr)
	LD	D,A
	LD	E,B
divideDbyEreturnA:
	RL	D
	RLA
	SUB	E
	JR	nc,$+3
	ADD	A,E
	LD	E,A
	LD	A,D
	CPL
	LD	B,(Bsqr)
	ADD	A,B
	SRL	A
	LD	(Bsqr),A
	LD	A,(Csqr)
	DEC	A
	LD	B,(Bsqr)
	CP	B
	JR	z,done
	LD	(Csqr),B
	JR	iterate
done:
	LD	A,(Bsqr)
	RET

Faster Again

Since there are only 15 possible results for the square root of an 8 bit number, it's really just a series of comparisons, optimised to use a binary comparison. This routine also leaves all other registers intact:

SqrtA:	CP 8 * 8
	JR NC,ge8
	CP 4 * 4
	JR NC,ge4
	CP 2 * 2
	JR NC,ge2
	OR A
	RET Z
	LD A,1
	RET
ge2:	CP 3 * 3
	SBC A
	ADD 3
	RET
ge4:	CP 6 * 6
	JR NC,ge6
	CP 5 * 5
	SBC A
	ADD 5
	RET
ge6:	CP 7 * 7
	SBC A
	ADD 7
	RET
ge8:	CP 12 * 12
	JR NC,ge12
	CP 10 * 10
	JR NC,ge10
	CP 9 * 9
	SBC A
	ADD 9
	RET
ge10:	CP 11 * 11
	SBC A
	ADD 11
	RET
ge12:	CP 14 * 14
	JR NC,ge14
	CP 13 * 13
	SBC A
	ADD 13
	RET
ge14:	CP 15 * 15
	SBC A
	ADD 15
	RET

The average time for this routine, including CALL and RET is 26.9us.