☢ Configuration (OTOing) ☢
All screenshots are taken in setParam, but these settings apply to any configuration tool.
Each reclist comes with a base oto to make the process faster and easier, but this is not a substitute for manual configuration. If you want your voicebank to synthesize well, you'll need to go through each sample and adjust parameters if necessary.
It's also possible to use slightly different oto techniques and still achieve good quality results; I trust that if you know what you're doing, you don't need me to point out what the alternate methods are, so I will only be going over how I do it.
☆ CVs ☆
Medial CVs
Alias: [CV]
The offset parameter should go at the point where the previous nucleus ends. The white area between the consonant and cutoff parameters should cover a section of the nucleus that will loop or stretch smoothly.
The preutterance should go directly between the onset and nucleus. The overlap should be about a quarter to a third of this value.
For consonants that only occur in codas [N, 5], treat the consonant as if it were an onset in configuration.
For unaspirated plosives [k, t, p], the offset goes after the [s]. They are otherwise the same.
For affricates [tS, dZ], the offset should go at the beginning of the fricative release.
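As a rough sketch of the quarter-to-third overlap guideline for medial CVs, the helper below turns a preutterance into a suggested overlap range. The function name and the use of milliseconds are assumptions made for illustration; they are not part of setParam or any other configuration tool.

```python
def overlap_range(preutterance_ms):
    """Return the (low, high) overlap suggested by the quarter-to-third rule."""
    if preutterance_ms <= 0:
        raise ValueError("preutterance must be positive")
    return preutterance_ms / 4, preutterance_ms / 3


# Example with a hypothetical preutterance of 90 ms.
low, high = overlap_range(90)
print(f"overlap between {low:.1f} and {high:.1f} ms")
```

Treat the result as a starting point only; the final overlap still depends on how the sample sounds.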
Initial CVs
Alias: [- CV]
The preutterance and overlap have fixed values of 200 and 50. The offset should be adjusted so that the preutterance is directly between the onset and nucleus.
The consonant and cutoff are the same as medial CV.
The preutterance shouldn't need to be adjusted unless there is noise before the CV or the onset is unusually long.
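Because the preutterance is fixed for initial CVs, the offset follows from wherever the onset/nucleus boundary sits in the sample. The sketch below assumes that the boundary position is measured in milliseconds from the start of the sample and that the preutterance is measured from the offset; `initial_cv_offset` and `boundary_ms` are hypothetical names used only for this illustration.

```python
def initial_cv_offset(boundary_ms, preutterance_ms=200):
    """Offset that puts a fixed preutterance on the onset/nucleus boundary.

    boundary_ms: assumed position of the boundary from the start of the sample.
    preutterance_ms: fixed preutterance, measured from the offset.
    """
    offset = boundary_ms - preutterance_ms
    if offset < 0:
        raise ValueError("boundary is earlier than the fixed preutterance allows")
    return offset


# A boundary found at 1450 ms gives an offset 200 ms earlier.
print(initial_cv_offset(1450))
```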
☆ VCs ☆
Transitional VCs
Alias: [V C]
The preutterance should be 200 msec, and go at the point where the previous nucleus ends, the same place as the offset for medial CVs. The overlap should be 50 msec and be in the stable section of the previous vowel.
If the overlap ends up in an unstable section of the nucleus, such as the semivowel portion of a diphthong, decrease the offset until the overlap is in the stable section, and increase the preutterance to be in the correct position.
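The adjustment for an unstable overlap can be sketched as a paired change: whatever is subtracted from the offset is added to the preutterance, so the preutterance marker stays at the end of the nucleus. This assumes both preutterance and overlap are measured from the offset, in milliseconds; the function name and the sample numbers are hypothetical.

```python
def shift_offset_earlier(offset_ms, preutterance_ms, shift_ms):
    """Move the offset earlier while the preutterance stays at the same spot."""
    if shift_ms < 0 or shift_ms > offset_ms:
        raise ValueError("shift must be between 0 and the current offset")
    return offset_ms - shift_ms, preutterance_ms + shift_ms


offset, preutterance = shift_offset_earlier(1000, 200, 60)
# offset + preutterance is unchanged, so the preutterance marker has not moved,
# while an overlap measured from the offset now sits 60 ms earlier in the vowel.
print(offset, preutterance)
```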
For most consonants, the white area between the consonant and cutoff parameters should cover a relatively stable section of the consonant.
For plosives [p, b, t, d, k, g, '], the white area should cover the quietest part of the airflow gap, because this is the portion of the consonant we use as the blending point. The plosive release should be cropped out.
For affricates [tS, dZ], the blending point is the fricative release, so the white area should cover a stable portion of that.
Final VCs
Alias: [V C-]
The offset, preutterance, and overlap are the same as transitional VCs.
The consonant parameter should extend to cover the entire coda so that it does not get looped or stretched, and the cutoff can be placed at any point between there and the next sample as long as there is minimal noise.
☆ Vs & VVs ☆
Initial Vowels
Alias: [- V]
Like initial CVs, the preutterance and overlap have fixed values of 200 and 50. The offset should be adjusted so that the preutterance is at the point where the vowel begins.
The consonant and cutoff are the same as CVs, with the white area covering a stable portion of the vowel.
Glottal Stops
Alias: [V']['V]
The phonemic glottal stop samples are treated like medial CVs and transitional VCs.
Sustained Vowels
Alias: [V]
Sustained vowels are meant to be blended with a previous sample containing the same vowel for longer notes or pitch changes.
The preutterance is 300 and the overlap is 100; these should not need to be changed.
Like CVs and initial Vs, the white area should cover a stable portion of the vowel, but this section will likely be larger than that of other sample types.
The position of the offset doesn't matter much as long as it is within the vowel and the sample as a whole is relatively stable.
Vowel Ends
Alias: [V -]
These are essentially the same as final VCs, with the preutterance at 200, the overlap at 50, and the offset adjusted so the preutterance is at the end of the vowel.
If there is a breath at the end, extend the consonant parameter to cover it; otherwise, the placement of the consonant and cutoff doesn't really matter as long as there's minimal noise.
As with any VC, make sure the overlap is in the stable portion of the vowel, and adjust the offset and preutterance if necessary.
Vowel Blends
Alias: [V V]
Vowel blends are exactly the same as those in a VCV. The preutterance is 300 and the overlap is 100, and the offset should be adjusted so that the preutterance occurs at the moment where the second vowel begins. This can be difficult to see, so use the spectrogram to look for changes in the frequencies to help.
As with vowel ends, make sure the overlap is in the stable portion of the vowel, and adjust the offset and preutterance if necessary.
The white area should contain a stable portion of the second vowel.
☆ Cs & CCs ☆
None of these samples should contain any portion of the dummy vowel.
Initial Consonants
Alias: [- C]
These have a preutterance of 100 and an overlap of 25. The offset should be adjusted so that the preutterance goes at the moment where the onset begins. Provided there is no noise, the preutterance and overlap should not need to be changed.
The white area between the consonant and cutoff should contain a relatively stable section of the consonant.
These samples are not present for plosives, as they would only contain silence.
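The fixed preutterance and overlap values quoted in this guide can be collected into one lookup, which may be handy when checking a base oto. This is only an illustrative sketch: the dictionary and function names are hypothetical, and alias types that the guide places by hand (such as medial CVs) are deliberately left out.

```python
# Fixed (preutterance, overlap) pairs quoted in this guide, keyed by alias pattern.
FIXED_VALUES = {
    "- CV": (200, 50),
    "V C": (200, 50),
    "V C-": (200, 50),
    "- V": (200, 50),
    "V": (300, 100),
    "V -": (200, 50),
    "V V": (300, 100),
    "- C": (100, 25),
}


def starting_values(alias_pattern):
    """Return the guide's fixed (preutterance, overlap), or None if placed by hand."""
    return FIXED_VALUES.get(alias_pattern)


print(starting_values("V V"))
print(starting_values("CV"))  # medial CVs have no fixed values in this guide
```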
Final Consonants
Alias: [C -]
Place the offset at the point where the coda is stable, and place the preutterance at the point where it ends. The overlap should be about a quarter to a third the length of the preutterance.
The white area should be a section of the silence between the coda and the start of the next syllable.
Standard Consonant Blends
Alias: [C C]
The offset and overlap should go in the same place as a final consonant. The preutterance should go at the start of the second consonant.
The consonant and cutoff parameters should be placed so that the white area covers the stable section of the second consonant, since this is the one we want to act as the blending point.
This also means that plosives and affricates follow the same principles as their transitional VC counterparts, using the airflow gap or fricative release as the blending point.
Plosive-plosive CCs are excluded since they would be silent.
Medial Clusters
Onset Cluster Alias: [CC][CCC]
Coda Cluster Alias: [C CC]
Functionally the same as a standard blend, just with different aliases to indicate their type.
For clusters with three or more consonants, the preutterance is still placed before the last consonant, so any middle consonant(s) will be between that and the overlap.
Initial Onset Clusters
Alias: [- CC][- CCC]
Same as an initial C, but with the preutterance placed at the start of the first consonant of the cluster, and the last consonant acting as the blending point. All else will fall into the consonant area.
Final Coda Clusters
Alias: [C C-][C CC-]
Same as a final C, but with the preutterance being placed at the end of the first consonant in the cluster. All else will fall into the consonant area.