Command-Response (CR)

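The CR model is also known as Fujisaki's model (Fujisaki, 1983). It represents an F0 contour on a logarithmic scale as the sum of three components: (a) a baseline ln Fb, (b) phrase components and (c) accent/tone components. Phrase and accent/tone components are generated by second-order critically damped linear systems in response to phrase and accent/tone commands, respectively. Phrase components represent the slow, phrase-level F0 variations, while accent/tone components represent faster, local F0 variations. The model is represented by the following equations (Gu et al., 2007):

$$
\ln F_0(t) = \ln F_b + \sum_{i} A_{pi}\, G_p(t - T_{0i}) + \sum_{j} A_{tj} \left\{ G_t(t - T_{1j}) - G_t(t - T_{2j}) \right\}
$$
$$
G_p(t) = \begin{cases} \alpha^2 t \exp(-\alpha t), & t \ge 0 \\ 0, & t < 0 \end{cases}
$$
$$
G_t(t) = \begin{cases} \min\left[ 1 - (1 + \beta t) \exp(-\beta t),\ \gamma \right], & t \ge 0 \\ 0, & t < 0 \end{cases}
$$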
where Gp(t) represents the impulse response function of the phrase control mechanism, Gt(t) represents the step response function of the accent/tone control mechanism, and Fb is the baseline F0. The parameters Api and T0i denote the magnitude and time of the i-th phrase command, while Atj, T1j and T2j denote the amplitude, onset time and offset time of the j-th accent/tone command, respectively. The constants α and β are the natural angular frequencies of the phrase and accent/tone control mechanisms, and γ is the ceiling level of the accent/tone components; their default values are 3.0 Hz, 20.0 Hz and 0.9, respectively.
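As an illustration, the model can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: it assumes the standard forms of the two response functions, Gp(t) = α² t exp(-αt) and Gt(t) = min[1 - (1 + βt) exp(-βt), γ] for t ≥ 0 and zero otherwise. The function names (`phrase_response`, `tone_response`, `f0_contour`) and all command values in the example are hypothetical and chosen only for demonstration.

```python
import math

# Default constants as given in the text.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism
BETA = 20.0   # natural angular frequency of the accent/tone control mechanism
GAMMA = 0.9   # ceiling level of the accent/tone components


def phrase_response(t, alpha=ALPHA):
    """Impulse response Gp(t) of the phrase control mechanism."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def tone_response(t, beta=BETA, gamma=GAMMA):
    """Step response Gt(t) of the accent/tone control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, tone_cmds):
    """Return F0 in Hz at each time point (seconds).

    phrase_cmds: list of (Ap, T0) pairs
    tone_cmds:   list of (At, T1, T2) triples
    """
    contour = []
    for t in times:
        ln_f0 = math.log(fb)
        # Sum of phrase components.
        ln_f0 += sum(ap * phrase_response(t - t0) for ap, t0 in phrase_cmds)
        # Sum of accent/tone components (onset response minus offset response).
        ln_f0 += sum(
            at * (tone_response(t - t1) - tone_response(t - t2))
            for at, t1, t2 in tone_cmds
        )
        contour.append(math.exp(ln_f0))
    return contour


# Illustrative (made-up) commands: one phrase command and two tone commands.
times = [i * 0.01 for i in range(200)]  # 0.00 s to 1.99 s in 10 ms steps
f0 = f0_contour(
    times,
    fb=100.0,
    phrase_cmds=[(0.5, 0.0)],
    tone_cmds=[(0.4, 0.3, 0.6), (-0.2, 0.9, 1.2)],
)
```

Under these assumptions, a phrase component reaches its maximum 1/α seconds (about 0.33 s with the default α) after its command, while each accent/tone component rises quickly after T1j and decays after T2j. With no commands at all, the contour stays flat at Fb.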


Fujisaki, H. (1983). Dynamic characteristics of voice fundamental frequency in speech and singing. In P. F. MacNeilage (Ed.), The Production of Speech (pp. 39-55). New York: Springer-Verlag.

Gu, W., Hirose, K., & Fujisaki, H. (2007). Analysis of tones in Cantonese speech based on the command-response model. Phonetica, 64, 29-62.