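The CR model is also known as Fujisaki's model (Fujisaki, 1983). It represents an F0 contour on a logarithmic scale as the sum of three components: (a) a baseline $\ln F_b$, (b) phrase components, and (c) accent/tone components. The phrase and accent/tone components are the responses of second-order critically damped linear systems to phrase commands and accent/tone commands, respectively. Phrase components capture the slow, phrase-level F0 variations, while accent/tone components capture faster F0 variations. The model is given by the following equations (Gu et al., 2007):

$$\ln F_0(t) = \ln F_b + \sum_{i} A_{pi}\, G_p(t - T_{0i}) + \sum_{j} A_{tj} \left[ G_t(t - T_{1j}) - G_t(t - T_{2j}) \right]$$

$$G_p(t) = \begin{cases} \alpha^2 t\, e^{-\alpha t}, & t \ge 0 \\ 0, & t < 0 \end{cases}$$

$$G_t(t) = \begin{cases} \min\left[ 1 - (1 + \beta t)\, e^{-\beta t},\ \gamma \right], & t \ge 0 \\ 0, & t < 0 \end{cases}$$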
where $G_p(t)$ is the impulse response function of the phrase control mechanism, $G_t(t)$ is the step response function of the accent/tone control mechanism, and $F_b$ is the baseline F0. The parameters $A_{pi}$ and $T_{0i}$ denote the magnitude and time of the i-th phrase command, while $A_{tj}$, $T_{1j}$ and $T_{2j}$ denote the amplitude, onset time and offset time of the j-th accent/tone command, respectively. The constants α, β and γ have default values of 3.0 Hz, 20.0 Hz, and 0.9, respectively.
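As an illustration of how the three layers combine, the following minimal Python sketch evaluates the model at a set of time points. It assumes the standard forms of the two response functions, Gp(t) = α²t·exp(−αt) and Gt(t) = min[1 − (1 + βt)·exp(−βt), γ] for t ≥ 0 (both zero for t < 0), and uses the default constants given above. The function names and the command values in the usage lines are hypothetical and chosen only for illustration; they are not taken from the cited papers.

```python
import math

ALPHA = 3.0   # phrase-control constant (default value from the text)
BETA = 20.0   # accent/tone-control constant (default value from the text)
GAMMA = 0.9   # ceiling of the accent/tone response (default value from the text)


def phrase_response(t, alpha=ALPHA):
    """Impulse response Gp(t) of the phrase control mechanism (assumed standard form)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response Gt(t) of the accent/tone control mechanism (assumed standard form)."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Return F0 in Hz at each time in `times` (seconds).

    phrase_cmds: list of (Ap, T0) pairs.
    accent_cmds: list of (At, T1, T2) triples.
    """
    contour = []
    for t in times:
        ln_f0 = math.log(fb)  # baseline layer
        # Phrase layer: one impulse response per phrase command.
        ln_f0 += sum(ap * phrase_response(t - t0) for ap, t0 in phrase_cmds)
        # Accent/tone layer: step response switched on at T1 and off at T2.
        ln_f0 += sum(
            at * (accent_response(t - t1) - accent_response(t - t2))
            for at, t1, t2 in accent_cmds
        )
        contour.append(math.exp(ln_f0))
    return contour


# Illustrative (made-up) commands: one phrase command and one accent command.
times = [i * 0.01 for i in range(200)]  # 0 to 1.99 s in 10 ms steps
f0 = f0_contour(times, fb=100.0,
                phrase_cmds=[(0.5, 0.0)],
                accent_cmds=[(0.4, 0.3, 0.6)])
```

Because the layers are summed in the log domain, each command scales F0 multiplicatively: long after all commands have decayed, the contour returns to the baseline Fb.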
References