I'm not 100% sure, but I think the firmware DRAW routines are quite fast and don't use a pixel PLOT routine at all. Rather, they use what's called Bresenham's line drawing algorithm: they first determine whether the line is to be stepped horizontally or vertically, then use single-pixel mask rotates and address increments, which is much faster than recalculating the address and mask for each individual pixel. So the firmware draw could be much faster than using my fast plot routine.
That's the way the firmware draws lines.
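For anyone who wants to see the idea before touching assembly, here is a rough sketch in Python of how a Bresenham-style line breaks down into single-pixel moves. It is only a model of the technique, not the firmware's actual code, and the name `line_moves` is made up for illustration. Each move it yields is a step to a neighbouring pixel, so a drawing routine would never need to recalculate a position from scratch.

```python
def line_moves(x0, y0, x1, y1):
    """Yield one (x_step, y_step) move per pixel after the start pixel.

    Hypothetical helper: it only decides WHICH neighbour to move to.
    Turning a move into a mask rotate or an address change is a
    separate job.
    """
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    if dx >= dy:
        # Mostly horizontal: step in x every time, in y only sometimes.
        err = dx // 2
        for _ in range(dx):
            err -= dy
            if err < 0:
                err += dx
                yield (sx, sy)      # diagonal move
            else:
                yield (sx, 0)       # plain horizontal move
    else:
        # Mostly vertical: step in y every time, in x only sometimes.
        err = dy // 2
        for _ in range(dy):
            err -= dx
            if err < 0:
                err += dy
                yield (sx, sy)      # diagonal move
            else:
                yield (0, sy)       # plain vertical move


moves = list(line_moves(0, 0, 4, 2))
print(moves)
```

The decision about the major axis is made once, before the loop, and the loop itself only does an add, a compare and a neighbour move.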
As I said, there's no need to determine the new pixel position by recalculating it completely. Instead you "move" from the current pixel to a neighbouring one. Usually that means either moving horizontally, which involves determining the new pixel mask and perhaps adding a byte to the screen address, or moving up one line (usually by subtracting &800) and retaining the mask, since the next pixel is directly above the current one. In the worst case you'll have to do both, because the line is diagonal.
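The two moves can be modelled in a few lines of Python. This is a sketch under assumptions: the standard, unscrolled screen at &C000, Mode 1 (four pixels per byte, masks &88, &44, &22, &11 from left to right), and no clipping at the screen edges. The function names are invented for illustration and are not firmware routines.

```python
def step_right(addr, mask):
    """Move one Mode 1 pixel to the right; returns (addr, mask)."""
    mask >>= 1                  # &88 -> &44 -> &22 -> &11
    if mask == 0x08:            # &11 shifted once more: we left the byte
        mask = 0x88             # leftmost pixel ...
        addr += 1               # ... of the next byte
    return addr, mask


def step_up(addr):
    """Move one pixel line up; the mask stays the same."""
    addr -= 0x800               # the usual case, inside a character row
    if addr < 0xC000:           # we were on the top line of a character row
        addr += 0x3FB0          # land on the bottom line of the row above
    return addr


# A diagonal move (up and to the right) is simply both steps together.
addr, mask = 0xC800, 0x11
addr, mask = step_right(addr, mask)
addr = step_up(addr)
print(hex(addr), hex(mask))
```

The "usually" in "usually subtracting &800" is the second branch of `step_up`: once every eight lines you cross a character row and need a different offset.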
@hal 6128: I wouldn't recommend trying to implement a Bresenham algorithm yet, because it is more advanced and not suited to a beginner. You could easily get discouraged. Try horizontal, vertical and diagonal lines first, until you're sure you've completely understood them.
Now, if you look at the fast pixel plot routine, the first part determines the screen address (up to the line with the comment "+ HL = Screenaddress"). Once you have that address, you don't need to recalculate it. Instead, just add or subtract an offset to find the next byte whenever you need to move vertically.
The second part of the fast pixel plot determines the exact pixel position within that screen address. You have to change that part whenever you try to move the pixel horizontally.
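To make the two parts concrete, here is a small Python model of the same split. It is not a translation of the fast plot routine. It assumes the standard screen at &C000 in Mode 1, with x running from 0 to 319 and y counted as a pixel row from the top (0 to 199) rather than in firmware graphics coordinates. `pixel_location` is a made-up name.

```python
SCREEN_BASE = 0xC000    # assumed default screen start
BYTES_PER_ROW = 80      # one character row is 80 bytes wide


def pixel_location(x, y):
    """Return (screen address, pixel mask) for a Mode 1 pixel."""
    # Part 1: the screen address of the byte holding the pixel.
    addr = (SCREEN_BASE
            + (y // 8) * BYTES_PER_ROW   # which character row
            + (y % 8) * 0x800            # which pixel line inside that row
            + x // 4)                    # which byte along the line
    # Part 2: the pixel position within that byte.
    mask = 0x88 >> (x % 4)               # &88, &44, &22 or &11
    return addr, mask


addr, mask = pixel_location(5, 9)
print(hex(addr), hex(mask))
```

Part 1 only changes when you move vertically or cross a byte boundary, and part 2 only changes when you move horizontally, which is why stepping is so much cheaper than calling the whole thing again.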
When you need to move diagonally, you need to combine both of the above.
And for all that you really need to get the hang of the CPC's screen layout!
If all that seems too difficult at first, try to write a BASIC version that does the same thing, and if that works, convert that to assembly.