Speech! from Superior Software was such a ground-breaking piece of software, that it is difficult to overestimate the impression that it made. It was featured on television programs - for instance Micro Live presenter Fred Harris demonstrated it on the BBC 1 Saturday morning children's show "Saturday Superstore". And it also starred on records - most famously on Roger Water's (ex-Pink Floyd) solo LP "Radio Chaos". "Radio Chaos" was a concept album about someone that creates a computerised voice (a person receives a credit for operating a Master 128 on the sleeve!).
At only £9.95 on tape (or £11.95 on disk), Speech! killed Acorn's own speech synthesis hardware, which they had gone to a great deal of effort to develop. A sad side-effect of this was that because Acorn bundled their speech synthesis hardware with their ROM cartridge socket (designed to go on the left hand side of the BBC Micro's keyboard), this upgrade also effectively died too. Acorn's stock of Speech Synthesis upgrades came to a sad end - being sold for peanuts in the Micro User's Reader Offers page.
A huge amount of work and ingenuity went into writing this Speech!. Check out the following texts:
From Your Computer, January 1986, "Software Shortlist" - page 42:
Speech without a speech chip. It's hard to think of a more entertaining utility than this one. You can easily insert the program's 7K of code on someone else's disc, write a short Basic loader, and ask them to boot it up. Lo! a spoken message. The surprise effect is worth the price; like seeing colour on a ZX-81 screen.
Speech's speech may not always be intelligible if the words are not printed on the screen at the same time. But it is unusually easy to use. *SAY voices any English sentence, while *PITCH sets the pitch. If you are willing to put a bit of effort into it you can improve the program's enunciation by using *SPEAK which directly accesses a set of 49 phonemes. One of the demo programs will even take a text file - from Wordwise or View, for example - and painstakingly talk it through - if you can bear to listen to it. Although speech has been generated before on the CBM-64 and Spectrum, through the sound chip alone, this is the first time on the BBC. Elsewhere in this issue Speech's author, David Hoskins, explains how.
From BBC English, Your Computer, January 1986 - pages 84-86:
The BBC has very easy to use and versatile sound facilities. The envelope command enables you to change the pitch and volume of any sound played. But there is one draw-back, you cannot change the actual waveform of the sound. So you always get that spikey, tinny kind of sound that's always recognisable as a BBC.
First, set the sound chip to it's highest possible frequency, which is inaudible to the human ear. The more channels used the louder the waveform will be. The waveform which varies in amplitude from 0 to 15 is then sent nibble by nibble to the volume control of each channel. The fastest speed the BBC can do this is about 10000 times a second, the faster you go the clearer the sound will become. What was needed now was the waveform data to feed the program. To do this I used the Interbeeb module from DCP Microdevelopments which sampled - recorded - data at a rate of 10000 times pre second, this was perfect for the job. After many weeks of frustration and fiddling I eventually had a result...
"It worked!" I exclaimed and ran downstairs to tell everyone. The scope for experiment was awesome. I spent a long time recording and playing sounds, at different speeds, backwards, with an echo, while mixing with other sounds, making drum beats, and even playing tunes with samples like some £280000+ synthesisers !
I had digitised about five seconds of music at a high sample rate, and it was now playing through the BBC's speaker under complete software control! To digitise speech I use a sampling rate of about 5.5khz and a filter to cut out any unwanted sounds above that frequency. I speak the words into a microphone which are converted from analogue form to digital by the Interbeeb, and sent to the BBC which in turn stacks the data in memory. The data can then be saved and played back at any time without using any hardware.
The answer is in phonemes: the sounds that we string together in order to make speech. Take the words "mat" and "cat" for example, they both use exactly the same "a" sound. So in the sentence "The cat sat on the mat" all the "a" sounds can be just one block of speech data. The same thing applies to the "t"'s, "th"'s, "e"'s, and so on until you have covered as many phonemes needed for understandable speech. In Speech! I have used 49 different phonemes, which are merged to form whatever word or sentence you like.
Unfortunately to convert English phrases into phoneme data is a little more complicated. The trouble with the English language is that it is occasionally very illogical. For instance, the word "laughter" almost completely changes pronunciation when an "s" is prefixed to make "slaughter". Why is the word "rough" different from "bough" and "tower" different from "mower" merely because of the prefixing consonant? The reason why we know the differences and exceptions in the English language is because we have had years of training. The computer hasn't, and all it had to rely on is logic. Basically there can be at least 400 rules for text/speech conversion with a few thousand exceptions! Obviously to fit Speech! into 7.5K I have cut down on the exceptions, but any mispronounced words can be easily altered with a little thought.
The biggest problem with sampling and repeating waveforms is that it does not work with fricatives. These are phonemes which use "white noise" or resonated hiss e.g. "S" as in "Salt". Voiced fricatives use the resonant hiss but are also overlayed by a glottal pulse waveform. You cannot use the BBC's own white noise generator because it has only one set sound and doesn't resemble any of the phonemes.
To get around this problem I digitised 128 samples of spoken fricative, which fitted nicely into the rest of the data and could be easily tabled. To play the fricative Speech! picks a random number between 0 and 127 along the data of the sound and then plays a waveform block of 16 samples from that point. The process is then repeated by choosing the random number again. This is continued until the desired time of the phoneme is over. The result is quite and an accurate reproduction of the original fricative. The only problem left (soundwise) was the voiced fricatives.
The result is a gradually varied and fairly accurate reproduction of speech. An example of Speech! can be heard on two games from Superior Software: Citadel and Repton 2. Speech! and six programs and utilities can be bought at £9.95 on tape and £11.95 on disk.
10 REM Waveform Encoded strange sounds program 20 REM (C) 1985 DAVID J HOSKINS 30 REM for YOUR COMPUTER 40 REM line 110 inserts a sinewave into waveform table 50 REM this can be changed to create any sound 60 REM once data is in memory line may be skipped 70 80 MODE4 90 PROCinit 100 MOVE0,448 110 FORA%=0TO255:Y%=7.9+8*SINRAD(360/256*A%) 120 DRAWA%*4,Y%*64:A%?table=Y%:NEXT 130 CALLstart 140 END 150 160 DEFPROCinit :REM assemble code 170 temp=&71:table=&4000:sped=&70 180 tim=&72:coun=&74 190 FOR pass=0 TO 2 STEP2:P%=&4100 200 210 [OPT pass 220 230 .start 240 250 SEI 260 LDA #193:JSR in:LDA #0:JSR in \ set up sound channels 270 LDA #129:JSR in:LDA #0:JSR in \ " " " " 280 LDA #161:JSR in:LDA #0:JSR in \ " " " " 290 LDA #0 300 STA tim 310 STA tim+1 320 STA sped 330 340 .loop 350 LDY#0 360 STY coun 370 380 .loop2 390 LDA table,Y 400 AND #15 410 JSR send 420 LDA tim 430 AND #127 440 CLC 450 ADC#1 460 TAX 470 .wait 480 DEX 490 BNE wait 500 TYA 510 CLC 520 ADC sped 530 TAY 540 CLC 550 ADC #3 560 STA tim 570 DEC coun 580 BNE loop2 590 LDA sped 600 AND #7 610 ADC tim+1 620 ADC tim 630 STA sped 640 INC tim+1 650 BNE loop 660 CLI 670 RTS 680 690 .send \send wave form to sound chip 700 710 EOR #15 720 STA temp 730 ORA #208 740 JSR in 750 LDA temp 760 ORA #144 770 JSR in 780 LDA temp 790 ORA #176 800 JSR in 810 RTS 820 830 .in 840 850 LDX #255 860 STX &FE43 870 STA &FE41 880 INX 890 STX &FE40 900 LDA &FE40 910 ORA #8 920 STA &FE40 930 RTS 940 950 ] 960 NEXT 970 ENDPROC