Difference between revisions of "SP0256 Voice Generator"

From CPCWiki - THE Amstrad CPC encyclopedia!
Latest revision as of 17:49, 19 December 2010

Voice Generator

The voice generator relies on the Amplitude, Pitch, F0..F5, and B0..B5 registers, which are processed like so:

 Amplitude   --> F0 --> F1 --> F2 --> F3 --> F4 --> F5 --> PWM --> External
 Pitch/Noise     B0     B1     B2     B3     B4     B5             5kHz Filter

Another important register is the Repeat counter, which determines when the next opcode is executed (that opcode may then load new values into the above registers).

Sample Rate and Repeat Timings

The SP0256 is (usually) driven by a 3.12MHz oscillator and uses 7bit PWM output, which is clocked at 3.12MHz/2. To obtain a 10kHz sample rate, the chip issues some dummy steps at constant LOW level in addition to the 128 steps needed for 7bit PWM, making a total of 156 steps per sample.

 Sample Rate = 3.12MHz/2/156 = 10.0kHz   ;100us per sample

This means one sample is 100us long. Multiplying that value by 64 or 91 gives the following timings per repeat:

 6.4ms per repeat (noise and pause), or
 9.1ms per repeat (tone with pitch=91)

Note: Some speech interfaces have the chip overclocked to 4MHz, resulting in higher pitch and sample rate, and shorter timings than with the normal 3.12MHz.
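
The timing arithmetic above can be checked with a short Python sketch. The function name `timings` and its return layout are made up for illustration; the divider values (2 and 156) and the per-repeat sample counts (64 and 91) are taken from the text.

```python
from fractions import Fraction

PWM_CLOCK_DIVIDER = 2      # PWM is clocked at oscillator/2
STEPS_PER_SAMPLE = 156     # 128 PWM steps plus the dummy LOW steps

def timings(osc_hz):
    """Return (sample_rate_hz, noise_repeat_ms, tone_repeat_ms) as exact fractions."""
    sample_rate = Fraction(osc_hz, PWM_CLOCK_DIVIDER * STEPS_PER_SAMPLE)
    sample_ms = 1000 / sample_rate          # duration of one sample in ms
    return sample_rate, 64 * sample_ms, 91 * sample_ms

for osc in (3_120_000, 4_000_000):
    rate, noise_ms, tone_ms = timings(osc)
    print(f"{osc / 1e6:.2f}MHz: {float(rate):.1f}Hz, "
          f"{float(noise_ms):.3f}ms (noise/pause), {float(tone_ms):.3f}ms (pitch=91)")
```

For the 4MHz case this works out to roughly 12.8kHz, with 4.992ms per repeat for noise and pause, and 7.098ms per repeat for pitch=91.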

Amplitude/Pitch/Repeat

The 8bit amplitude register defines the volume in floating point form:

 Amplitude = lower5bit SHL upper3bit
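
As a minimal Python sketch of that formula (the function name `decode_amplitude` is hypothetical, and it assumes "upper3bit" means bits 5..7 of the register):

```python
def decode_amplitude(reg):
    """Decode the 8bit amplitude register: lower 5 bits shifted left by the upper 3 bits."""
    mantissa = reg & 0x1F          # lower5bit
    exponent = (reg >> 5) & 0x07   # upper3bit (assumed to be bits 5..7)
    return mantissa << exponent

print(decode_amplitude(0x1F))  # mantissa 31, exponent 0
print(decode_amplitude(0xFF))  # mantissa 31, exponent 7
```

With this layout the decoded amplitude ranges from 0 up to 31 SHL 7 = 3968.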

The pitch defines the frequency, counted in numbers of samples per period. For pitch=91, one HIGH sample (amplitude) is output, followed by 90 zero samples (null). That pattern is repeated as many times as specified in the repeat count, for example, with repeat=3:

                                 __ Amplitude level (+)
 |        |        |
 |________|________|________     __ Zero level             PITCH
 
 <-Pitch->                       __ Amplitude level (-)
 <--------repeat=3--------->

As shown above, the generated waveform is NOT a square wave (which would have 50% high, and 50% low). After applying filters, the final waveform may look somewhat like so:

                                 __ Amplitude level (+)
 |        |        |
 |_|_.____|_|_.____|_|_.____     __ Zero level             PITCH+FILTERS
  | |      | |      | |
  |        |        |            __ Amplitude level (-)

Note that (aside from noise) the AL2 ROM uses only one pitch value: 5Bh aka 91 decimal (meaning that all vowels are using the same base frequency, and they differ only by using different filter settings).
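
The unfiltered pulse pattern described in this section can be sketched in Python as follows. The function name `tone_excitation` is made up for illustration, and the sketch covers only pitch values of 1 or more (pitch=0 selects noise instead).

```python
def tone_excitation(amplitude, pitch, repeat):
    """Return the unfiltered samples: one HIGH sample, then pitch-1 zero samples, repeat times."""
    if pitch < 1:
        raise ValueError("pitch=0 selects noise, not a tone")
    samples = []
    for _ in range(repeat):
        samples.append(amplitude)           # single HIGH sample
        samples.extend([0] * (pitch - 1))   # remaining samples of the period are NULL
    return samples

pulses = tone_excitation(amplitude=100, pitch=91, repeat=3)
print(len(pulses))                                  # 3 periods of 91 samples
print([i for i, s in enumerate(pulses) if s != 0])  # positions of the HIGH samples
```

At the 10kHz sample rate, pitch=91 corresponds to a base frequency of 10000/91, or about 110Hz.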

Amplitude/Noise/Repeat

Noise is activated by setting pitch=0. The timings are then the same as for pitch=64, but instead of outputting HIGH and NULL levels, the hardware randomly outputs HIGH or LOW levels. For example, with pitch=0 and repeat=5:

                                 __ Amplitude level (+)
 ||| || |  | |  || |   || ||| |
 |||_|| |__|_|__||_|___|| |||_|  __ Zero level             NOISE
    |  | || | ||  | |||  |   |
 <-64->| || | ||  | |||  |   |   __ Amplitude level (-)
 <----------repeat=5---------->

The exact random algorithm is unknown (probably some shift/xor stuff?). The random levels seem to be output on each sample (not only on the first sample of a repeat). Like the normal pitch pulses, the noise is passed through the 6 filters.
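
Purely as an illustration of what "shift/xor stuff" could look like, here is a Python sketch of the noise excitation. The 16bit Galois LFSR with taps 0xB400, the seed, and the name `noise_excitation` are all placeholders chosen for this example; they are not the chip's real algorithm, which is unknown.

```python
def noise_excitation(amplitude, repeat, seed=0xACE1):
    """Return repeat*64 samples, each randomly +amplitude (HIGH) or -amplitude (LOW).

    The random source is a generic 16bit Galois LFSR (taps 0xB400), used
    here only as a stand-in for the SP0256's unknown random algorithm.
    """
    lfsr = (seed & 0xFFFF) or 1             # LFSR state must never be zero
    samples = []
    for _ in range(repeat * 64):            # 64 samples per repeat, as for pitch=64
        lsb = lfsr & 1
        lfsr >>= 1
        if lsb:
            lfsr ^= 0xB400                  # xor in the feedback taps
        samples.append(amplitude if lsb else -amplitude)
    return samples

noise = noise_excitation(amplitude=100, repeat=5)
print(len(noise))   # 5 repeats of 64 samples
```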

Pause/Repeat

The pause command sets amplitude=0. The timings are then the same as for pitch=64, but the output is always NULL. For example, with pause and repeat=5:

                                 __ Amplitude level (+)
 
 ______________________________  __ Zero level             PAUSE (SILENCE)
 
 <-64->                          __ Amplitude level (-)
 <----------repeat=5---------->

Pause does reset the filters to 0, so the silence is not affected by filters.

Digital Filters

As shown above, the amplitude/pitch/noise output is passed through six digital filter stages (using the F0..F5 and B0..B5 registers). Each stage looks like so:

                     _____                      _____
 ------------------>|     |------------------->|     |-----+----->
          _____     | SUB |          ______    | SUB |     |
    +--->| *B  |--->|_____|    +--->| *2*F |-->|_____|     |
    |    |_____|     _____     |    |______|    _____      |
    +---------------|OLDER|<---+---------------| OLD |<----+
                    |_____|                    |_____|

I.e. the incoming samples are adjusted like so:

 for i=0 to 5                                         ;filter number
   sample = sample - quant_table[F.i] * OLD.i * 2     ;F0..F5 registers
   sample = sample - quant_table[B.i] * OLDER.i       ;B0..B5 registers
   OLDER.i = OLD.i
   OLD.i   = sample
 next i

Here, quant_table is a non-linear translation table that translates the signed 8bit registers to signed 10bit factors (with a 9bit fractional part, i.e. 511 means 511/512, roughly 0.998), with the following entries:

 0  ,9  ,17 ,25 ,33 ,41 ,49 ,57 ,65 ,73 ,81 ,89 ,97 ,105,113,121
 129,137,145,153,161,169,177,185,193,201,209,217,225,233,241,249
 257,265,273,281,289,297,301,305,309,313,317,321,325,329,333,337
 341,345,349,353,357,361,365,369,373,377,381,385,389,393,397,401
 405,409,413,417,421,425,427,429,431,433,435,437,439,441,443,445
 447,449,451,453,455,457,459,461,463,465,467,469,471,473,475,477
 479,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495
 496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511

The table above shows only the positive values, for index 0..127. Values for index -1..-128 should be 0..-511, or maybe -9..-512.
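
The filter loop and table can be put together in a small Python sketch. Several things here are assumptions rather than documented behaviour: the names `coefficient` and `FilterBank` are made up, the factors are scaled by 1/512 and applied in floating point (the chip presumably uses fixed-point arithmetic), and negative register values are mapped to 0..-511, which is only one of the two guesses mentioned above.

```python
QUANT_TABLE = [
    0,   9,   17,  25,  33,  41,  49,  57,  65,  73,  81,  89,  97,  105, 113, 121,
    129, 137, 145, 153, 161, 169, 177, 185, 193, 201, 209, 217, 225, 233, 241, 249,
    257, 265, 273, 281, 289, 297, 301, 305, 309, 313, 317, 321, 325, 329, 333, 337,
    341, 345, 349, 353, 357, 361, 365, 369, 373, 377, 381, 385, 389, 393, 397, 401,
    405, 409, 413, 417, 421, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445,
    447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477,
    479, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495,
    496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511,
]

def coefficient(reg):
    """Translate a signed 8bit register (-128..127) to a fractional factor.

    Assumption: index -1..-128 mirrors the table as 0..-511.
    """
    if reg >= 0:
        return QUANT_TABLE[reg] / 512.0
    return -QUANT_TABLE[-reg - 1] / 512.0

class FilterBank:
    """Six cascaded filter stages using the F0..F5 and B0..B5 registers."""

    def __init__(self, f_regs, b_regs):
        self.f = [coefficient(r) for r in f_regs]
        self.b = [coefficient(r) for r in b_regs]
        self.old = [0.0] * 6      # OLD.i   (previous output of stage i)
        self.older = [0.0] * 6    # OLDER.i (output of stage i before that)

    def process(self, sample):
        """Pass one excitation sample through all six stages."""
        for i in range(6):
            sample = sample - self.f[i] * self.old[i] * 2
            sample = sample - self.b[i] * self.older[i]
            self.older[i] = self.old[i]
            self.old[i] = sample
        return sample

# With all registers at 0 the filters are transparent.
bank = FilterBank([0] * 6, [0] * 6)
print([bank.process(x) for x in (5, -3, 0)])
```

Feeding a pulse train or noise sequence through `process` one sample at a time gives the values that would then go to the PWM stage.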