Effect of Co-Articulation on One-Third-Octave Spectral Amplitudes of Vowel /i/

Hypernasality is one of the core characteristics observed in the speech of children with repaired cleft lip and palate (RCLP). One-third-octave analysis has been considered a potential tool for measuring the acoustic correlates of hypernasality in the speech of individuals with RCLP. However, the acoustic characteristics of speech are influenced by contextual effects. Hence, the present study aimed to determine the difference in one-third-octave spectral amplitudes of vowel /i/ across various contexts in children with RCLP and typically developing children (TDC). A total of 24 participants (12 RCLP, 12 TDC) in the age range of 4-12 years were considered for the study. The speech sample consisted of repetitions of the isolated vowel /i/ and of vowel /i/ in the phonetic contexts /pit/ and /tip/. The one-third-octave spectral amplitudes were measured for all the stimuli using MATLAB and compared across the groups. The results indicated that the energy concentration over the one-third-octave spectrum was greater in the RCLP group than in the control group for the stimuli /i/, /pit/, and /tip/. For the isolated vowel /i/, spectral energy at low frequencies (97 Hz, 125 Hz, and 157.5 Hz) was significantly higher in the RCLP group than in the control group. The study also found higher spectral amplitudes across frequencies for vowel /i/ in the contexts /pit/ and /tip/ than for the isolated vowel /i/ in both groups. The differences were attributed to the influence of phonetic context on the spectral amplitude of vowel /i/.


Introduction
Hypernasality is a perceptual quality associated with excessive nasal resonance caused by velopharyngeal incompetence [1]. It is one of the major speech deviations exhibited by individuals with cleft lip and palate (CLP). The hypernasal speech of children with repaired cleft lip and palate (RCLP) can be evaluated using various methods. Acoustic analysis is one of the objective techniques and is intended to study the speech production mechanism directly. This technique is advantageous because it is non-invasive and cost-effective, and because it can be applied to speakers of various ages and genders and to speech impairments of different etiologies [2]. Nasalization has characteristic acoustic features that affect the speech signal: the signal is influenced by damping and by antiformants, both of which have a major impact on the acoustic signal [1]. Kent, Liss and Philips [3] and Chen [4] described the acoustic correlates of nasalized vowels in spectrograms. They reported an increase in formant bandwidths and an overall reduction in the amplitude of the vowel. They also noted low energy in the upper formants as a result of the presence of antiformants. Among the acoustic measures, one-third-octave spectral analysis is a spectral measure considered a potential tool for measuring the acoustic correlates of hypernasality in the speech of individuals with RCLP. The one-third-octave interval was chosen because it compares well with the critical bandwidth of the ear's analyzing mechanism [5]. The power spectrum extracted from digitized samples was analyzed at every one-third-octave band to calculate the mean power level of each band. These levels were then normalized relative to the amplitude of the band that contained the fundamental frequency [5]. This tool has been shown to quantify the degree of hypernasality [6].
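The normalization step described above can be illustrated with a minimal Python sketch. It assumes that band levels are already expressed in dB, so that normalizing relative to the band containing the fundamental frequency amounts to a subtraction, and that each band spans one sixth of an octave on either side of its centre. The function name `normalize_to_f0_band`, the dictionary layout, and the example values are hypothetical and are not taken from [5].

```python
def normalize_to_f0_band(levels_db, f0_hz):
    """Re-express band levels relative to the band that contains f0.

    levels_db maps a band centre frequency (Hz) to its mean level (dB).
    Assumption: each band spans centre * 2**(-1/6) to centre * 2**(1/6).
    """
    half_step = 2.0 ** (1.0 / 6.0)
    for centre, level in levels_db.items():
        if centre / half_step <= f0_hz < centre * half_step:
            # Subtracting in dB is equivalent to dividing linear power.
            return {c: v - level for c, v in levels_db.items()}
    raise ValueError("no analysed band contains the fundamental frequency")


# Hypothetical levels for three bands and a hypothetical f0 of 260 Hz.
example_levels = {250.0: 60.0, 500.0: 55.0, 1000.0: 40.0}
relative = normalize_to_f0_band(example_levels, 260.0)
print(relative)
```

After this step the band containing the fundamental frequency sits at 0 dB, and every other band is read as a level above or below it.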
Recent studies focusing on the spectral features of hypernasal speech have used one-third-octave analysis to investigate hypernasality in the speech of children and young adults with cleft palate and cleft lip and in adults following maxillectomy.
Kataoka et al. [7] aimed to correlate one-third-octave spectral measures with perceived nasality in children with cleft palate and in controls. When the two groups were compared, the spectrum of the hypernasal group was marked by increased spectral amplitudes between F1 and F2 and reduced spectral amplitudes around the F2 region, which differentiated the two groups. Using multiple regression analysis, they obtained a highly significant correlation (r = 0.84) between perceptual ratings and the amplitudes of the one-third-octave bands at 1 kHz, 1.6 kHz, and 2.5 kHz. Along the same lines, Navya [7] measured the one-third-octave band spectrum of vowels /a/ and /i/ and examined its sensitivity and specificity in differentiating a hypernasal group from a control group. The results indicated increased amplitudes at frequencies below 1000 Hz, with a significant difference between the two groups. Another major finding of that study was that sensitivity and specificity were high in the frequency region between 998 Hz and 2663 Hz, which was therefore better at differentiating the two groups using one-third-octave spectral analysis. However, spectral characteristics vary with speaker and phonetic context [8,9].
The majority of studies have used vowels as the stimulus for acoustic analysis of hypernasal speech. Among vowels, /i/ was chosen as the optimal stimulus for determining nasality because high vowels are produced with greater velar height and require relatively less nasal coupling than low vowels to be perceived as nasal [10][11][12]. However, the acoustic properties of a vowel are influenced by coarticulatory effects. Coarticulation is regarded as a process whereby the properties of a segment are altered by the influence of neighbouring segments. Several studies have documented variation in the acoustic features of vowels as a function of consonantal context. Lindblom [13] observed the effect of three consonantal contexts (including /bVb/ and /dVd/) on eight Swedish vowels and compared them with the production of the same vowels in isolation. The results revealed that, in the context of consonants, the formant frequency of a given vowel fails to reach the target value it attains in a neutral context. It was therefore concluded that the vowel varies as a function of consonantal context, and this effect was termed formant undershoot. Hence, the present study aimed to evaluate the difference in one-third-octave spectral amplitudes of vowel /i/ across various contexts in children with RCLP and typically developing children.
Aim of the study: To evaluate the one-third-octave spectral amplitudes of vowel /i/ in isolation and in the contexts /pit/ and /tip/ in children with RCLP and typically developing children.

Participants
The present study included 24 children in the age range of six to ten years: 12 children with repaired cleft lip and palate and no associated anomalies, and 12 age- and gender-matched typically developing children who served as controls. The control participants had no history of ear, nose, or throat infections, and all participants in both groups were screened for hearing loss prior to inclusion. Informed consent was obtained from the parents/caretakers of the participants.

Procedure
The speech of all participants was recorded in a sound-treated room. The participants were seated comfortably in an upright position, and the sound level meter (SLM) was placed 2 centimeters away from each participant. The speech stimuli consisted of sustained phonation of vowel /i/ and repetition of the CVC words /pit/ and /tip/. Before the actual recording, the task was demonstrated to the children, who were asked to phonate or repeat each stimulus at a comfortable pitch and loudness level. They were instructed to phonate vowel /i/ and repeat each word three times with an inter-stimulus interval of three seconds. The recorded samples were later retrieved from the SLM and saved on a laptop. Praat software was used to extract a 500-millisecond steady-state portion of the sustained vowel /i/ and a 50-millisecond portion of the vowel /i/ in the contexts /pit/ and /tip/ for further spectral analysis.

One third octave spectral analysis
The spectral band energy at every one-third-octave interval from 100 Hz to 16,000 Hz was analyzed [5]. In the present study, the edited data were analyzed in MATLAB to obtain the one-third-octave spectral amplitudes of the vowel across the stimuli for the two groups. In total, spectral amplitudes were obtained at 23 one-third-octave bands over a frequency range of 100-16,000 Hz. However, only specific frequency bands were selected on the basis of the findings of earlier studies, and spectral amplitudes were analyzed at 396 Hz, 500 Hz, 630 Hz, 793 Hz, 1000 Hz, 1259 Hz, 1587 Hz, 2000 Hz, 2519 Hz, 3174 Hz, and 4000 Hz. The one-third-octave spectral amplitudes at these bands were calculated and compared across the stimuli (/i/, /pit/, and /tip/) for both groups to determine the effect of coarticulation on the spectral amplitudes of vowel /i/.
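As a rough illustration of the band analysis described above, the following Python sketch generates base-2 one-third-octave centre frequencies and computes a mean power level per band from a short segment. It is not the MATLAB routine used in the study, which the text does not reproduce: the 1000 Hz reference frequency, the band-edge convention, the naive DFT, the 8000 Hz sampling rate, the synthetic test tone, and all function names are assumptions made for illustration. Under those assumptions, indices k = -10 to 12 give 23 bands from roughly 100 Hz to 16,000 Hz, and the centres between 396 Hz and 4000 Hz agree with the eleven bands listed above to within about 1 Hz.

```python
import cmath
import math


def third_octave_centers(k_min=-10, k_max=12, ref_hz=1000.0):
    """Base-2 one-third-octave centre frequencies: ref_hz * 2**(k/3)."""
    return [ref_hz * 2.0 ** (k / 3.0) for k in range(k_min, k_max + 1)]


def band_edges(fc):
    """Lower and upper edge of the one-third-octave band centred on fc."""
    return fc * 2.0 ** (-1.0 / 6.0), fc * 2.0 ** (1.0 / 6.0)


def power_spectrum(samples):
    """Naive DFT power spectrum for bins 0..N//2 (adequate for short segments)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2 / n ** 2)
    return spectrum


def band_levels_db(samples, fs, centers):
    """Mean power level (dB) per band; None if no DFT bin falls in a band."""
    n = len(samples)
    power = power_spectrum(samples)
    levels = []
    for fc in centers:
        lo, hi = band_edges(fc)
        in_band = [p for k, p in enumerate(power) if lo <= k * fs / n < hi]
        if in_band:
            mean_power = sum(in_band) / len(in_band)
            # Small offset avoids log10(0) for empty-looking bands.
            levels.append(10.0 * math.log10(mean_power + 1e-12))
        else:
            levels.append(None)
    return levels


centers = third_octave_centers()
selected = [c for c in centers if 390.0 < c < 4100.0]

# Hypothetical test signal: a 50 ms, 1000 Hz tone sampled at 8000 Hz.
fs = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(400)]
levels = band_levels_db(tone, fs, selected)
for fc, level in zip(selected, levels):
    print(f"{fc:8.1f} Hz  {level:7.1f} dB")
```

For the synthetic tone, the band centred on 1000 Hz carries the highest level, which is a quick sanity check before applying the same functions to a real vowel segment.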

Statistical analysis
The spectral amplitude data obtained for both groups across stimuli were tabulated and subjected to descriptive statistical analysis using the Statistical Package for the Social Sciences (SPSS), Version 17.0. The mean amplitude and its standard deviation were obtained for the stimuli /i/, /pit/, and /tip/ across the frequencies and groups. The normality of the data was assessed using the Shapiro-Wilk test; because normality was not achieved, a non-parametric test, the Mann-Whitney U test, was carried out to evaluate the null hypothesis that there is no difference in participants' amplitude values across the stimuli and between the groups.

Table 1 and Figure 1 show the mean and standard deviation for the isolated vowel /i/ and for /i/ in the contexts /pit/ and /tip/ across the frequencies and groups. Figure 1 shows the variation in energy concentration with frequency in the RCLP and control groups for the different stimuli. In general, amplitudes rose with increasing frequency in both groups across the stimuli. Frequencies from 12.4 Hz to 31.3 Hz showed an overall increase in amplitude in both groups for all stimuli. However, at 39.4 Hz there was a sudden drop in amplitude in both groups, followed by a gradual rise from 49.6 Hz to 198.4 Hz. There was then a further significant increase in energy concentration at 250 Hz and 315 Hz. Another major finding is that, across frequencies and in both the RCLP and TDC groups, the mean amplitudes of /pit/ and /tip/ were higher than those of the isolated /i/ stimulus, which indicates the effect of phonetic context on the spectral features of /i/.
From Figure 1, it was apparent that the relative differences in spectral amplitudes between the groups were larger for the isolated vowel /i/ than for vowel /i/ in the contexts /pit/ and /tip/. The isolated vowel /i/ exhibited increased spectral amplitudes in the RCLP group compared with the control group. For vowel /i/ in the /pit/ and /tip/ contexts, the one-third-octave spectral amplitudes of the two groups overlapped and therefore did not differentiate the groups.

Descriptive and non-parametric statistical test results
Table 2 and Figure 2 present the amplitudes at frequencies from 396 Hz to 8 kHz for the stimuli /i/, /pit/, and /tip/ in the two groups. Figure 2 shows a gradual reduction in spectral amplitudes from 793 Hz to 2519 Hz in both groups. At 3174 Hz and 4000 Hz the spectral amplitudes rise, followed by a gradual reduction towards 8000 Hz. For the isolated vowel /i/, the frequencies 1000 Hz, 1259.9 Hz, 1587.4 Hz, and 2000 Hz demonstrated a significant increase in amplitudes across the groups. For vowel /i/ in the contexts /pit/ and /tip/, the spectral energy in the frequency bands between 630 Hz and 2519 Hz was higher in the RCLP group than in the TDC group.

Global Journal of Otolaryngology

To check normality, the Shapiro-Wilk (S-W) test was applied. The S-W test of the one-third-octave spectral amplitudes indicated a skewed distribution of the data in the RCLP and TDC groups. Following the normality test, the data were subjected to a non-parametric test, the Mann-Whitney U test, to evaluate the null hypothesis that there is no difference in participants' amplitude values across the stimuli and between the groups (Table 3).

In the present study, the overall energy concentration over the one-third-octave spectrum for the stimuli /i/, /pit/, and /tip/ was greater in the RCLP group than in the TDC group. Table 3 shows that one-third-octave spectral amplitudes at 1259 Hz, 1587 Hz, and 2000 Hz differed significantly between the control and RCLP groups for all three stimuli (/i/, /pit/, and /tip/). This indicates that, irrespective of context, these mid frequencies were sensitive enough to discriminate the two groups. For the isolated vowel /i/, significant differences in spectral energy between the RCLP and control groups were found at lower frequencies (97 Hz, 125 Hz, and 157.5 Hz).
However, no significant difference in spectral energy was found at the same frequencies for vowel /i/ in the presence of a context (/pit/, /tip/). For /pit/, the frequencies 793.7 Hz, 1000 Hz, 1259 Hz, 1587 Hz, and 8000 Hz differentiated the control group from the RCLP group. For the stimulus /tip/, significant differences were found at 1000 Hz, 1259 Hz, 1587 Hz, 2519 Hz, and 8000 Hz.
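The group comparison described above can be sketched in plain Python. This is a minimal illustration of the Mann-Whitney U statistic with a two-sided p-value from the normal approximation, without tie or continuity correction; it is not the SPSS procedure used in the study, and its p-values may differ slightly from SPSS output. The function names and the amplitude values are hypothetical and do not come from Tables 1 to 3.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    u = min(u_a, n1 * n2 - u_a)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical band amplitudes (dB) at one frequency for two small groups.
rclp_db = [62.1, 64.5, 63.0, 66.2, 65.4]
tdc_db = [58.3, 59.9, 57.2, 60.4, 61.0]
u_stat, p_value = mann_whitney_u(rclp_db, tdc_db)
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```

In an analysis like the one reported here, the test would be repeated for each frequency band and stimulus, which is why a per-band table such as Table 3 is a natural way to report the outcome.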

Discussion
The present study aimed to determine the differences in spectral amplitudes between children with RCLP and typically developing children (TDC) across stimuli. One of the major findings is that the energy concentration over the one-third-octave spectrum was greater in the RCLP group than in the TDC group across the stimuli. This result agrees with the findings of Navya et al. [13,14], who reported higher spectral amplitudes at all one-third-octave frequencies for the vowels /a/ and /i/ in RCLP than in controls. The result is also supported by Kataoka et al. [10], who found increased amplitudes for isolated vowels in the hypernasal group compared with the control group. A possible explanation for the increased spectral energy in the vowel production of children with RCLP is the presence of reinforced harmonics at frequencies where energy is not normally expected. However, contradictory findings were reported in the majority of studies, which found a reduction in the amplitude of all formants in hypernasal speech [4,14,11].

The second major finding is that the isolated vowel /i/ showed significant differences in spectral energy between the RCLP and control groups at lower frequencies (97 Hz, 125 Hz, and 157.5 Hz). This result is consistent with the earlier findings of Navya [14], who reported that, for vowel /i/, amplitudes below 1 kHz were increased and were sensitive in differentiating the two groups. This was explained by the potential effect of the introduction of pole-zero pairs in the transfer function due to the coupling of the nasal tract to the main vocal tract. When the high vowel /i/ becomes nasalized, the pole-zero pair emerges in the high-frequency region. As a result, the amplitude of F1 is not attenuated.
Glass and Zue [15] also reported noticeable differences in spectral magnitude between nasalized and non-nasalized vowels in the low-frequency region, which highlights the importance of low frequencies in differentiating the control group from the hypernasal group.
Another finding of the current study is that the spectral amplitude of vowel /i/ in the contexts /pit/ and /tip/ differentiated the control group from the RCLP group at 793.7 Hz, 1000 Hz, 1259 Hz, 2519 Hz, and 8000 Hz, with increased amplitudes in the RCLP group. Similar results were obtained by Lee, Ciocca, & Whitehill [16], who also used non-nasal words in consonant-vowel-consonant (CVC) combinations (e.g., /pit/, /tip/) and found that participants with hypernasal speech tended to have higher intensity levels in the bands centered at 630 Hz, 800 Hz, and 1000 Hz, as well as lower intensity levels in the band centered at 2.5 kHz, compared with speakers with normal resonance. However, the authors did not provide a justification for their findings or an explanation of the contextual effect.
The study also found higher amplitudes for /pit/ and /tip/ than for the isolated vowel /i/ across frequencies in both the RCLP and TDC groups. This finding correlates well with coarticulatory studies [11,13,17-22]. These studies focused on the effect of consonantal context on the spectral and temporal characteristics of vowels and showed that vowels undergo phonetic reduction owing to the influence of consonantal context. The phonetic reduction of a vowel entails a reduction in the acoustic duration of the vowel and formant undershoot (i.e., a shift of formant frequencies away from their ideal target values). It can therefore be inferred that these potential spectral modifications of a vowel due to phonetic context might alter amplitude-related information, and the frequency-specific differences can be explained on the same basis. Thus, the study concluded that coarticulation results in higher spectral amplitudes for vowel /i/ in the contexts /pit/ and /tip/ than for the isolated vowel /i/, and this difference was attributed to the phonetic reduction of vowel /i/. Overall, the RCLP group exhibited higher one-third-octave spectral amplitudes than the control group.