Target Approximation (TA) – Common Prosody Platform

The basic idea of PENTA (Parallel Encoding and Target Approximation) is that surface F₀ results from syllable-synchronized sequential target approximation, in which each target is an underlying linear trajectory jointly specified by multiple communicative functions (Xu, 2005). PENTA has been implemented with both local and global optimization methods (Prom-on, Xu & Thipakorn, 2009; Prom-on, Liu & Xu, 2011). The detailed implementation of PENTA with global optimization is given in Prom-on et al. (2009). Target approximation (TA) in PENTA is mathematically realized as a third-order critically damped linear system driven by pitch targets:

$$F_0(t) = (mt + b) + (c_1 + c_2 t + c_3 t^2)\,e^{-\lambda t}$$

Here the first term in parentheses is the pitch target, while the second is the natural response of the system to that target. m and b denote the slope and height of the target, respectively. The target can therefore be static (m = 0) or dynamic (m ≠ 0), and can be higher than, lower than, or level with the reference F₀ baseline, depending on the target height. λ is an empirically derived rate of target approximation, which indicates how fast F₀ approaches the target. The coefficients c₁, c₂, and c₃ are determined by solving the initial value problem given the initial F₀ level, velocity, and acceleration obtained directly from the data. At the end of each syllable, the final F₀ dynamic state (level, velocity, and acceleration) is transferred to become the initial condition of the next syllable. This guarantees continuity of the F₀ contour and of its first and second derivatives across syllable boundaries.
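The coefficient solution and the state transfer can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the TA form F₀(t) = (mt + b) + (c₁ + c₂t + c₃t²)e^(−λt), with t measured from syllable onset, and obtains c₁, c₂, and c₃ by matching F₀, its velocity, and its acceleration at t = 0. The function names (ta_coefficients, ta_state, synthesize) and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def ta_coefficients(m, b, lam, f0, v0, a0):
    """Solve the initial value problem at t = 0 for c1, c2, c3.

    f0, v0, a0 are the initial F0 level, velocity and acceleration.
    """
    c1 = f0 - b
    c2 = v0 + c1 * lam - m
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0
    return c1, c2, c3


def ta_state(t, m, b, lam, coeffs):
    """Return (F0, velocity, acceleration) at time t after syllable onset."""
    c1, c2, c3 = coeffs
    decay = math.exp(-lam * t)
    p = c1 + c2 * t + c3 * t * t      # polynomial part of the natural response
    dp = c2 + 2.0 * c3 * t            # its first derivative
    ddp = 2.0 * c3                    # its second derivative
    f = m * t + b + p * decay
    v = m + (dp - lam * p) * decay
    a = (ddp - 2.0 * lam * dp + lam ** 2 * p) * decay
    return f, v, a


def synthesize(syllables, init_state, step=0.01):
    """Chain syllables; each is a (m, b, lam, duration) tuple (assumed layout).

    The final state of one syllable becomes the initial state of the next.
    Returns the sampled contour as (time, F0) pairs and the final state.
    """
    contour = []
    state = init_state
    onset = 0.0
    for m, b, lam, dur in syllables:
        coeffs = ta_coefficients(m, b, lam, *state)
        for i in range(int(round(dur / step))):
            t = i * step
            contour.append((onset + t, ta_state(t, m, b, lam, coeffs)[0]))
        state = ta_state(dur, m, b, lam, coeffs)  # transfer to next syllable
        onset += dur
    return contour, state


if __name__ == "__main__":
    # Illustrative values only: a static target followed by a falling target.
    syllables = [(0.0, 95.0, 40.0, 0.2), (-10.0, 90.0, 40.0, 0.2)]
    contour, final_state = synthesize(syllables, init_state=(90.0, 0.0, 0.0))
    for time, f0 in contour[::5]:
        print(f"{time:.2f}\t{f0:.3f}")
```

Because the second syllable starts from the state that ta_state returns at the end of the first, the sampled contour has no jump in level, velocity, or acceleration at the boundary, even though the two targets differ.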

References

Prom-on, S., Liu, F. & Xu, Y. (2011). Functional modeling of tone, focus and sentence type in Mandarin Chinese. In Proceedings of the 17th International Congress of Phonetic Sciences, Hong Kong.

Prom-on, S., Xu, Y. & Thipakorn, B. (2009). Modeling tone and intonation in Mandarin and English as a process of target approximation. Journal of the Acoustical Society of America 125: 405-424.

Xu, Y. (2005). Speech melody as articulatorily implemented communicative functions. Speech Communication 46: 220-251.