Hi all:
I am coding a library in ASM using SDCC/ASZ80. In some routines, I want data to be aligned to different boundaries (4 bytes, 8 bytes, 16 bytes). ASZ80 has a .bndry directive, which seems to do that, but it has a very big problem:
Quote
The relocation and/or concatenation of an area containing .bndry directives to place code at specific boundaries will NOT maintain the specified boundaries. When relocating such code areas you must specify the base addresses to the linker manually and/or you must pad the allocated space of an area to match the boundary condition.
This means that having something like this:
.bndry 16
my_aligned_buffer: .ds 10
will only be aligned if this code section is never relocated/concatenated. That is not possible for library code, as it is relocated and concatenated with the main program that uses it, effectively destroying memory alignment.
I have tried several things, using directives to try to force the memory location to satisfy "address % 16 = 0", with no luck at all. The only kind of "hack" I can think of is something really weird like this:
my_aligned_buffer: .ds 10 + 16
my_aligned_buffer_start: .dw my_aligned_buffer
align_buffer_start::
LD BC, #16
LD HL, (my_aligned_buffer_start)
LD A, L
AND #0b11110000 ; clear the low 4 bits (16-byte boundary; 0b11111100 would only give 4-byte alignment)
LD L, A
ADD HL, BC
LD (my_aligned_buffer_start), HL
RET
This is really weird, as I need to reserve a lot of useless bytes (just to force padding in all possible cases), plus code to do the alignment, plus wasted clock cycles.
Is there any better way to force the assembler/linker to align a piece of data or code at build time, even if it is relocated?
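A slightly tighter variant of the same run-time fix, as a sketch only (it reuses the labels from above and assumes ASZ80 accepts the 0x hex prefix here): rounding up with (address + 15) AND 0xFFF0 needs 15 bytes of padding instead of 16, and calling the routine a second time leaves an already aligned pointer unchanged.

```asm
my_aligned_buffer:       .ds 10 + 15            ; 10 usable bytes + worst-case padding
my_aligned_buffer_start: .dw my_aligned_buffer  ; patched at run time to the aligned address

align_buffer_start::
   LD   HL, (my_aligned_buffer_start)
   LD   BC, #15
   ADD  HL, BC                    ; HL = address + 15 (carry propagates into H)
   LD   A, L
   AND  #0xF0                     ; clear the low 4 bits: round down to a multiple of 16
   LD   L, A
   LD   (my_aligned_buffer_start), HL
   RET
```

For example, a buffer assembled at 0x4003 ends up with its start pointer at 0x4010, and one already at 0x4010 stays at 0x4010. It still costs padding bytes and a call at start-up, so it does not remove the underlying problem.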
I have used the following to align to a 256 byte boundary:
.module _cpc_IM2Table
.globl _cpc_IM2Table
.area _IM2TABLE (PAG)
_cpc_IM2Table:
.ds 257
This does work.
256 is a multiple of 16, so maybe you could also use this and pack many 16-byte regions together?
EDIT: I didn't try to align to smaller boundaries. I think it's not possible for library code.
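The packing idea could look roughly like the sketch below. The area name and buffer labels are hypothetical, and it only holds under two assumptions: the area really does land on a 256-byte boundary, as reported above, and no other module places data in the same area ahead of these buffers.

```asm
.area _ALIGNED16 (PAG)      ; hypothetical area name, assumed to start at a multiple of 256

_buf_a:: .ds 10             ; offset 0x00 from the area start
         .ds 6              ; pad up to the next multiple of 16
_buf_b:: .ds 10             ; offset 0x10 from the area start
         .ds 6              ; pad again so anything that follows stays on a 16-byte step
```

Each buffer sits at an offset that is a multiple of 16 from a base that is a multiple of 256, so its final address is a multiple of 16 as well.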
Thank you for your reply, @arnoldemu (http://www.cpcwiki.eu/forum/index.php?action=profile;u=122). I knew about the .area directive, but it seems odd to have to pack all aligned data into blocks of 256 bytes. For instance, if I have, say, 4 or 5 buffers across the library, it forces me to pack all of them into one data area. The problem is that the buffers belong to different features and, if the user of the library does not use a given feature, its buffer(s) should not be included in the final binary. Packing all the buffers into one area would force all of that data into every binary that links with the library and uses at least one routine associated with any of the buffers.
Maybe I am starting to get too mad about optimizations. Call me finicky, but I have even considered picking up another compiler for this reason :)