Class
Segmenter
In: rbSnack/tkSnackUtil.rb
Parent: Object

A class used to bundle segmentation routines

Methods

doubleSeg, getNextSeg, new, singleSeg,
Attributes

 [RW]  :bkgrnd_power
Public Class methods
new( logPow, bkgrnd_power=nil) src
Public Instance methods
singleSeg(startPos=0, endPos=nil, minPower=nil, minDuration=nil) src

returns an array of time intervals on which the power exceeds the min (background) power for a period longer than a min duration, with a minimum peak value More explicitly. Time intervals are created by consider those periods where the sound is above a given min threshold. However, since we are dealing with voice, intervals which are too short in duration are discarded. Likewise, we assume the speaker is not whispering, so portion of the sound made by the speaker should be reasonably loud, thus intervals which do not contain a point of a minimum power peak are discarded. Intervals left over are retained. This algorithm was originally intend to isolate phrases where there was a constant background hum (due to my server). However, by varying minPower, phazes can be further split into words and syllables, although not perfectly.

getNextSeg(startPos=0, minPower=nil, minDuration=nil) src

if found word, but did not find end, returns start, nil if found word, and found end returns start, end if nothing found, returns nil, nil

doubleSeg(startPos=0, endPos=nil, minPower=nil, minPower2=nil, minDuration=nil, minDuration2=nil) src

This is used to further refine the segmentation technique given by singleSeg. What it does is to make 2 passes, the first pass invoking singleSeg isolates the intervals of speech from the background (there is assume a some constant minor background noise). The second pass invokes singleSeq upon each interval of speech obtained from the first pass, with a higher power requirement. Thus drops in power between words and syllables are detected, and we form intervals where the power has dropped. Within each of those we search for the point where the power is at a minimum and use that as a segementation boundry point.