Voice Onset Time (VOT)


Voice onset time (VOT) is the interval between the release of a plosive consonant and the moment the vocal folds start vibrating, usually measured in milliseconds (ms). The duration of VOT depends on the type of plosive (aspirated or unaspirated, voiced or voiceless) and on the phoneme that follows it.

There are three main types of VOT, according to the sequence of sounds:

  • Zero VOT: this occurs when an unaspirated voiceless plosive is followed by a sonorant and voicing begins immediately after the release of the plosive. Because the vocal folds start vibrating almost at once, the VOT is close to 0 ms.
  • Positive VOT: aspirated stops followed by a voiced sonorant have a longer, positive VOT. This means that more time elapses between the release of the stop and the onset of voicing in the following segment.
  • Negative VOT: voiced plosives have a “negative” VOT. That is, the vocal folds begin vibrating during the closure, before the plosive is released, and continue vibrating into the following sonorant segment.
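
The three categories above can be sketched in code by treating VOT as the voicing onset time minus the release time. This is only an illustrative sketch: the function names and the `zero_band_ms` tolerance (how close to 0 ms still counts as "zero" VOT) are assumptions made for the example, not standard values from the phonetics literature.

```python
def vot_ms(release_ms: float, voicing_onset_ms: float) -> float:
    """Return VOT in ms: time of voicing onset minus time of plosive release."""
    return voicing_onset_ms - release_ms


def classify_vot(vot: float, zero_band_ms: float = 10.0) -> str:
    """Label a VOT value as 'negative', 'zero', or 'positive'.

    zero_band_ms is a hypothetical tolerance chosen for illustration only;
    values within +/- zero_band_ms of 0 are treated as 'zero' VOT.
    """
    if vot < -zero_band_ms:
        return "negative"  # voicing starts before the release
    if vot > zero_band_ms:
        return "positive"  # voicing starts well after the release
    return "zero"          # voicing starts at (almost) the release


# Made-up time stamps (ms) for three tokens, each released at 100 ms
print(classify_vot(vot_ms(100.0, 105.0)))  # voicing right after release
print(classify_vot(vot_ms(100.0, 170.0)))  # long lag, as after aspiration
print(classify_vot(vot_ms(100.0, 20.0)))   # voicing during the closure
```

In practice the boundary between "zero" and "positive" VOT varies by language and place of articulation, so the tolerance would need to be chosen for the data at hand.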

In a waveform, VOT appears as the distance along the time axis (x-axis) between the release of the plosive and the start of vocal fold vibration. This can be observed in the following diagram, which illustrates the three types of VOT. The straight lines represent voicelessness and the waves represent voicing, or vocal fold vibration.
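
To make the waveform reading concrete, here is a minimal sketch that builds a schematic trace like the one described (a flat line for voicelessness, a wave for voicing) and reads the VOT off it. Everything here is an assumption made for illustration: the function names, the 1000 Hz sampling rate, the 100 Hz wave, and the amplitude threshold are hypothetical. Real recordings also contain release bursts and aspiration noise, so this simple threshold approach would not work on them as is.

```python
import math


def schematic_trace(voicing_onset_ms, total_ms=200, rate_hz=1000, f0_hz=100.0):
    """Build a schematic trace: 0.0 while voiceless, a cosine wave once voiced."""
    n_samples = int(total_ms * rate_hz / 1000)
    samples = []
    for i in range(n_samples):
        t_ms = i * 1000 / rate_hz
        if t_ms >= voicing_onset_ms:
            # Cosine starts at full amplitude, so the onset sample is non-zero
            phase = 2 * math.pi * f0_hz * (t_ms - voicing_onset_ms) / 1000
            samples.append(math.cos(phase))
        else:
            samples.append(0.0)  # straight line: no vocal fold vibration
    return samples


def measure_vot_ms(samples, release_ms, rate_hz=1000, threshold=0.5):
    """Return the time of the first sample above threshold, relative to the release.

    Returns None if no sample exceeds the threshold (no voicing found).
    """
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            return i * 1000 / rate_hz - release_ms
    return None


release = 100.0  # made-up release time in ms
for onset in (100.0, 170.0, 20.0):  # zero, positive, and negative VOT cases
    print(onset, measure_vot_ms(schematic_trace(onset), release))
```

The measured value is simply the horizontal distance between the release point and the first wave, which is negative when the wave starts before the release.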

The differences in waveforms can be observed for each type of VOT.