A sound object consists of the sound and the methods to play, record,
analyse and edit.
+,
[],
[]=,
cget,
changed,
concatenate,
configure,
copy,
crop,
cut,
dBPowerSpectrum,
destroy,
filter,
flush,
formant,
get_sample,
info,
insert,
length,
length=,
load,
max,
min,
new,
pause,
pitch,
play,
power,
read,
record,
reverse,
section_to_canvas,
set_sample,
spectrogram_to_canvas,
stop,
waveform_to_canvas,
write,
[R] |
:name |
Options is a hash of constructor options
|
Creates a new Sound object. Valid options are
- load fileName
- reads file filename into self
- file fileName
- links on-disk file to sound
- channel channelName
- links sound to channel. Used for audio streaming or playing large files.
Audio data not loaded into memory.
- rate sampleRate
- specifies the sample rate, a positive integer which should be supported by
the hardware. For speech applications 16000 is most common.
- channels channelSpec
- Valid values are mono, stereo, and integers greater than
0. The default is 'mono'.
- encoding sampleFormat
- The sample encoding format: legal values for sampleFormat are
_Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float,
Alaw, or Mulaw. Default encoding is _Lin16_.
- skiphead noOfBytes
- Used to skip an unknown file header of size noOfBytes bytes
- byteorder endianess
- Valid values are littleEndian and bigEndian
- guessproperties boolean
- Valid values are 1, 0, or the Ruby true and false.
Specifies that Snack should try to infer
properties, such as byte order, sample encoding format, and sample rate for
a raw file by analyzing the contents of the file.
- precision precision
- Specifies the precision of the buffer data. Valid values are single
and double.
- buffersize size
- Specifies the internal buffer size used for samples obtained through a
channel.
- fileformat RAW
- Sets the file format. Legitimate formats currently supported are WAV, MP3,
AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command returns the file
format detected. It is possible to force a file to be read as RAW using
fileformat RAW. In this case the properties of the sound
data should be specified by hand
- changecommand cmd
- Specifies a procedure to be called whenever a sound property is changed
- debug level
- Controls the level of debugging. Valid values are 0 through 5.
Retrieves the current value of a given sound option. Permissible values for
the option argument are
- load
- returns the name of the file which was loaded into self
- file
- returns file linked to sound
- channel
- returns the channel which the sound is linked to (Used for audio streaming or
playing large files. Audio data not loaded into memory.)
- load fileName
- reads file filename into self
- file fileName
- links on-disk file to sound
- channel channelName
- links sound to channel. Used for audio streaming or playing large files.
Audio data not loaded into memory.
- rate sampleRate
- specifies the sample rate, a positive integer which should be supported by
the hardware. For speech applications 16000 is most common.
- channels channelSpec
- Valid values are mono, stereo, and integers greater than
0. The default is 'mono'.
- encoding sampleFormat
- The sample encoding format: legal values for sampleFormat are
_Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float,
Alaw, or Mulaw. Default encoding is _Lin16_.
- skiphead noOfBytes
- Used to skip an unknown file header of size noOfBytes bytes
- byteorder endianess
- Valid values are littleEndian and bigEndian
- guessproperties boolean
- Valid values are 1, 0, or the Ruby true and false.
Specifies that Snack should try to infer
properties, such as byte order, sample encoding format, and sample rate for
a raw file by analyzing the contents of the file.
- precision precision
- Specifies the precision of the buffer data. Valid values are single
and double.
- buffersize size
- Specifies the internal buffer size used for samples obtained through a
channel.
- precision
- Returns the precision of the buffer data. Valid values are single
and double.
- buffersize size
- Returns the internal buffer size used for samples obtained through a
channel.
Note: At most one value at a time can be specified. The
value may be specified either as a single string or as a block.
- *Example 1*
- cget("precision")
- *Example 2*
- cget(){ channels; }
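The string form and the block form are two spellings of the same option name. The following is a toy sketch in plain Ruby of how a block such as `{ channels; }` could be reduced to an option name. `OptionNameCollector` and `option_name_from` are hypothetical names introduced for illustration only; this is not taken from the Snack binding's source.

```ruby
# Hypothetical helper: records the bare words used inside a block.
# It inherits from BasicObject so that names such as `load` are not
# intercepted by Kernel methods.
class OptionNameCollector < BasicObject
  def initialize
    @names = []
  end

  # Every bare word evaluated inside the block ends up here.
  def method_missing(name, *_args)
    @names << name.to_s
    nil
  end

  def __names
    @names
  end
end

# Accepts either a string/symbol or a block naming exactly one option.
def option_name_from(name = nil, &block)
  return name.to_s if name
  raise ArgumentError, "give a name or a block" unless block

  collector = OptionNameCollector.new
  collector.instance_eval(&block)
  names = collector.__names
  raise ArgumentError, "at most one option at a time" unless names.size == 1

  names.first
end

puts option_name_from("precision")
puts(option_name_from { channels; })
```

Both calls resolve to a plain option name, which a getter could then look up.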
In the rare case that Snack needs to be forcefully
notified that a sound has changed, invoke the changed method on the sound
object that has changed. Permissible values for
notificationMessage are
- More: used only when more samples are appended
- New: used for all other cases (default)
Concatenates the sample data from trailingsound to the end of self. Both
sounds must be of the same format (type). Self is now the join of both data
samples.
To configure (or reconfigure) a sound, use configure. Valid options
are
- load fileName
- reads file filename into self
- file fileName
- links on-disk file to sound
- channel channelName
- links sound to channel. Used for audio streaming or playing large files.
Audio data not loaded into memory.
- rate sampleRate
- specifies the sample rate, a positive integer which should be supported by
the hardware. For speech applications 16000 is most common.
- channels channelSpec
- Valid values are mono, stereo, and integers greater than
0. The default is 'mono'.
- encoding sampleFormat
- The sample encoding format: legal values for sampleFormat are
_Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float,
Alaw, or Mulaw. Default encoding is _Lin16_.
- skiphead noOfBytes
- Used to skip an unknown file header of size noOfBytes bytes
- byteorder endianess
- Valid values are littleEndian and bigEndian
- guessproperties boolean
- Valid values are 1, 0, or the Ruby true and false.
Specifies that Snack should try to infer
properties, such as byte order, sample encoding format, and sample rate for
a raw file by analyzing the contents of the file.
- precision precision
- Specifies the precision of the buffer data. Valid values are single
and double.
- buffersize size
- Specifies the internal buffer size used for samples obtained through a
channel.
- debug level
- Controls the level of debugging. Valid values are 0 through 5.
Copies sample data from source_sound into self, destroying
original data. Valid options are
- start startPos
- Specifies the starting position of the section to be copied.
- stop endPos
- Specifies the ending (stopping) position of the section to be copied.
Removes all sample data outside of the range [startPos..endPos]. Self now
consists of the sample data inside the range [startPos..endPos]. If endPos is
nil, the end of the sample data is used in place of endPos.
Removes all sample data inside the range [startPos..endPos]. This is the
opposite of crop.
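The relationship between crop and cut can be sketched on plain Ruby arrays. This is a toy model only: `crop_samples` and `cut_samples` are hypothetical helpers, a sound is represented as an Array of samples, and the range is assumed to be inclusive at both ends, as the [startPos..endPos] notation suggests.

```ruby
# Keep only the samples inside start_pos..end_pos (inclusive).
def crop_samples(samples, start_pos, end_pos = nil)
  end_pos ||= samples.length - 1 # nil means "up to the end of the data"
  samples[start_pos..end_pos]
end

# Remove the samples inside start_pos..end_pos (inclusive), keep the rest.
def cut_samples(samples, start_pos, end_pos)
  samples.take(start_pos) + samples.drop(end_pos + 1)
end

samples = [10, 11, 12, 13, 14, 15]
p crop_samples(samples, 2, 4)
p cut_samples(samples, 2, 4)
```

Taken together, the two results partition the original samples: whatever crop keeps, cut discards.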
Removes this sound and frees the storage associated with it.
Computes the log FFT power spectrum of the sound (at the time given by the
start option) and returns a list of dB values. Permissible options include
- start position
- Sets the starting position of the sample
- stop position
- Sets the ending (stopping) position of the sample
- fftlength length
- Sets the length of the Fast Fourier transform. It is the N in
$$S(k) = \sum_{t=0}^{N-1} s(t) \exp\left(-\frac{2\pi i k t}{N}\right)$$
Permissible values
are 8, 16, 32, 64, 128, 256, 512, 1024, 2048, or 4096. The default is 512.
- windowlength length
- Sets the length of window. See Windowing
- windowtype type
- Sets the type of window. Permissible types are Hamming,
Hanning, Bartlett, Blackman, or
Rectangle.
- skip pointsPerStep
- Specifies how many points to move the FFT window forward at each
step.
- channel channelNumber
- Selects which channel to use for multichannel sounds
- preemphasisfactor factor
- Sets the amount of preemphasis applied to the signal prior to the FFT
calculation (default 0.0).
- analysistype type
- Sets the analysis type. Valid values are either FFT or
LPC. FFT is the default.
- lpcorder order
- Sets the order, N, (number of poles) when using Linear Predictive Coding
analysis. In particular, this assumes that within a given frame the speech
signal s(t) is determined by constants b_0...b_N by
$$s(t) = \sum_{i=0}^{N} b_i \times s(t - i)$$
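The transform given for fftlength can be written out directly. The sketch below is a naive O(N^2) discrete Fourier transform in plain Ruby followed by a conversion to dB. It is illustrative only: `dft` and `db_power_spectrum` are hypothetical helper names, no window or preemphasis is applied, and Snack's actual scaling of the dB values may differ.

```ruby
# S(k) = sum over t = 0..N-1 of s(t) * exp(-2*pi*i*k*t / N).
# The input is truncated or zero padded to exactly n samples.
def dft(samples, n)
  padded = samples.first(n) + [0.0] * [n - samples.length, 0].max
  (0...n).map do |k|
    padded.each_with_index.inject(Complex(0, 0)) do |acc, (s, t)|
      acc + s * Complex.polar(1.0, -2.0 * Math::PI * k * t / n)
    end
  end
end

# Log power in dB for the first n/2 bins (up to the Nyquist frequency).
def db_power_spectrum(samples, n)
  dft(samples, n).first(n / 2).map do |s_k|
    power = [s_k.abs2, 1.0e-12].max # floor avoids log10(0)
    10.0 * Math.log10(power)
  end
end

# A cosine that completes 4 cycles in 32 samples peaks in bin 4.
tone = Array.new(32) { |t| Math.cos(2.0 * Math::PI * 4 * t / 32) }
spectrum = db_power_spectrum(tone, 32)
puts spectrum.each_with_index.max.last
```

A real FFT computes the same S(k) in O(N log N), which is why fftlength is restricted to powers of two.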
Applies givenFilter to self, i.e. filters self by givenFilter.
Note: givenFilter must be of type Filter. Valid options are
- start position
- Sets the sample starting position where we begin to apply the filter
- stop position
- Sets the sample stopping position where we cease to apply the filter
- continuedrain bool
- Specifies whether to use.. boolean
- progress boolean
- Progress callback. Acceptable values are true or false
(Also _"show"_ or _"hide"_)
Removes all audio data from self, i.e. empties self.
Estimates speech formant trajectories. Dynamic programming is used to
optimize trajectory estimates by imposing frequency continuity constraints.
The formant frequencies are selected from candidates proposed by solving
for the roots of the linear predictor polynomial computed periodically. The
local costs of all possible mappings of the complex roots to formant
frequencies are computed at each frame based on the frequencies and
bandwidths of the component formants for each mapping. The cost of
connecting each of these mappings with each of the mappings in the previous
frame is then minimized using a modified Viterbi algorithm.
Returns an array of frames; each frame consists of a pair of arrays.
The first member of the pair is an array of formant values, lowest
to highest; the second member of the pair is an array of formant
bandwidths. Thus the return value looks like
[Frame_1, Frame_2, ..., Frame_n]
where each Frame_i looks like
[[F_1, F_2, ...], [B_1, B_2, ...]]
where the F_i's are formant means and the B_i's are the bandwidths. The first
frame corresponds to a start time of half the window length. The most
recent frame is the last frame.
Valid options are
- start position
- Sets the sample starting position where we begin to apply the filter
- stop position
- Sets the sample stopping position where we cease to apply the filter
- framelength t
- Specifies the intervals between the values (default 0.01).
- numformants n
- Controls how many formants to calculate (default 4, maximum 7).
- windowlength length
- Specifies the size of the window in seconds (default 0.049).
- preemphasisfactor factor
- Specify the amount of preemphasis applied to the signal prior to windowing
(default 0.7).
- windowtype type
- Selects the windowing function. Valid values are Cos^4 (default), Hamming,
Hanning, and Rectangle.
- lpctype t
- Valid values are either 0 (autocorrelation) or 1 (stabilized covariance).
- lpcorder n
- Specifies the order of the LPC Analysis (default is 12).
- progress boolean
- Progress callback. Acceptable values are true or false
(Also _"show"_ or _"hide"_)
- ds_freq f
- Specifies the sampling rate of the data to be used in the formant frequency
analysis (default 10000).
- nom_ff_freq f
- Specifies the nominal value of the first formant frequency. This value is
used to adjust the nominal values of all other formants and of the ranges
over which the formants are permitted to exist. The default value of 500Hz
assumes that the vocal tract length is 17 cm and that the speed of sound is
34000 cm/sec. Nominal F1 values scale directly with sound velocity and
inversely with vocal-tract length.
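The 500 Hz default quoted for nom_ff_freq matches the first resonance of a uniform tube closed at one end, F_n = (2n - 1) * c / (4 * L). The sketch below works through that arithmetic in Ruby. `nominal_formants` is a hypothetical helper, and whether Snack derives the higher nominal formants from exactly this model is an assumption.

```ruby
# Quarter-wavelength resonances of a uniform tube closed at one end.
# Lengths are in cm and the speed of sound in cm/sec, as in the text.
def nominal_formants(count, tract_length_cm: 17.0, sound_speed_cm_s: 34_000.0)
  (1..count).map do |n|
    (2 * n - 1) * sound_speed_cm_s / (4.0 * tract_length_cm)
  end
end

p nominal_formants(4)                        # 17 cm tract
p nominal_formants(1, tract_length_cm: 14.0) # shorter tract, higher F1
```

With the default values the first entry is 34000 / (4 * 17) = 500 Hz, and a shorter tract raises F1, in line with the inverse scaling described for this option.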
Gets sample value at index
Returns a list with information about the sound. The entries are [length,
rate, max, min, encoding, channels, fileFormat, headerSize]
insert applies only to in-memory sounds. It is used to insert into
the original sound (self) another sound (sound) at the
given position. Moreover, a range of the samples from
sound to be inserted can be specified by specifying
startPos and stopPos.
Sets the length of the sound in number of seconds or samples (default). To set in
terms of seconds use sound.length=(n, 'seconds')
Gets the length of this sound in terms of number of seconds or number of
samples (default). To get it in terms of seconds use
len=sound.length('seconds')
Loads new sound data from a file. Synonym for "read". Valid
options are
- rate sampleRate
- specifies the sample rate, a positive integer which should be supported by
the hardware. For speech applications 16000 is most common.
- channels channelSpec
- Valid values are mono, stereo, and integers greater than
0. The default is 'mono'.
- encoding sampleFormat
- The sample encoding format: legal values for sampleFormat are
_Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float,
Alaw, or Mulaw. Default encoding is _Lin16_.
- skiphead noOfBytes
- Used to skip an unknown file header of size noOfBytes bytes
- byteorder endianess
- Valid values are littleEndian and bigEndian
- guessproperties boolean
- Valid values are 1, 0, or the Ruby true and false.
Specifies that Snack should try to infer
properties, such as byte order, sample encoding format, and sample rate for
a raw file by analyzing the contents of the file.
- start startPos
- Starting position of data to be read
- stop endPos
- Stopping position of data to be read
- fileformat RAW
- Sets the file format. Currently supported file formats
are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command
returns the file format detected.
- progress boolean
- Progress callback. Acceptable values are true or false
(Also _"show"_ or _"hide"_)
It is possible to force a file to be read as RAW using fileformat
RAW. In this case the properties of the sound data should be
specified by hand
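When a file is read as RAW, options such as skiphead, byteorder, and encoding decide how the bytes are interpreted. The sketch below shows the idea for 16-bit linear samples using only Ruby's String#unpack. `decode_lin16` is a hypothetical helper, not part of the Snack binding, and it handles a single channel only.

```ruby
# Decode signed 16-bit linear PCM from a raw byte string.
# skiphead bytes are dropped first; byteorder picks the unpack directive.
def decode_lin16(bytes, skiphead: 0, byteorder: "littleEndian")
  directive = byteorder == "bigEndian" ? "s>*" : "s<*"
  payload = bytes.byteslice(skiphead, bytes.bytesize - skiphead) || ""
  payload.unpack(directive)
end

# Build a fake raw file: a 4-byte unknown header plus big-endian samples.
raw = "HDR!".b + [1, -2, 300].pack("s>*")
p decode_lin16(raw, skiphead: 4, byteorder: "bigEndian")
```

Decoding the same bytes with the wrong byte order yields valid-looking but wrong sample values, which is why these properties must be stated explicitly for raw data.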
Returns the largest positive sample value of the sound.
Returns the largest negative sample value of the sound.
Pause current record/play operation. Next pause invocation resumes
play/record.
Returns a list of pitch values, spaced 10ms, computed using the AMDF
method. Valid options are
- start startPos
- Starting position of sample range
- stop endPos
- Stopping position of sample range
- maxpitch maxpitch
- Maximum pitch (default is 400)
- minpitch minpitch
- Minimum pitch (default is 60)
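The AMDF (average magnitude difference function) idea can be sketched as follows: for each candidate lag, average |s(t) - s(t + lag)| over the frame and pick the lag where that average is smallest. This is a minimal single-frame illustration in plain Ruby. `amdf` and `estimate_pitch` are hypothetical helpers, and Snack's implementation includes refinements (framing, voicing decisions, smoothing) that are not shown here.

```ruby
# Average magnitude difference of a frame against itself shifted by lag.
def amdf(frame, lag)
  count = frame.length - lag
  total = (0...count).inject(0.0) do |acc, t|
    acc + (frame[t] - frame[t + lag]).abs
  end
  total / count
end

# Search lags corresponding to minpitch..maxpitch (in Hz) and return the
# pitch whose lag gives the smallest AMDF value.
def estimate_pitch(frame, rate, minpitch: 60, maxpitch: 400)
  min_lag = (rate / maxpitch.to_f).ceil
  max_lag = [(rate / minpitch.to_f).floor, frame.length - 1].min
  best_lag = (min_lag..max_lag).min_by { |lag| amdf(frame, lag) }
  rate / best_lag.to_f
end

rate = 8000
frame = Array.new(400) { |t| Math.sin(2.0 * Math::PI * 100.0 * t / rate) }
puts estimate_pitch(frame, rate)
```

The minpitch and maxpitch bounds limit the lag search, which also keeps the estimator from locking onto multiples of the true period.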
Plays the sound. Valid options are
- start startPos
- Determines the starting position of segment to be played
- stop endPos
- Determines the stopping position of segment to be played
- output outputJack
- Determines the output jack
- blocking booleanValue
- False means asynchronous playback; true means the function does not
return until playback is completed.
- command callback
- The callback to be executed at the conclusion of the playback.
- device outputDevice
- Specifies audio output device.
- filter filter
- Used to filter sound during playback.
- devicerate freq
- Used to specify alternate device rate. Rarely used.
- devicechannels n
- Used to specify alternate device channel. Rarely used.
Returns a list of windowed log-power values. Valid options are
- start startPos
- Specifies the starting position of the sample to be examined.
- stop endPos
- Specifies the stopping position of the sample to be examined
- framelength length
- Specifies interval length between values
- windowtype type
- Specifies the type of windows to be used. Permissible types are Hamming,
Hanning, Bartlett, Blackman, or Rectangle.
- windowlength length
- Specifies the number of points in each window.
- preemphasisfactor factor
- Specifies the preemphasis factor applied to the signal prior to windowing.
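How the options above interact can be sketched in plain Ruby: apply first-order preemphasis, slide a window over the signal, and report the log of the mean windowed energy of each frame. This is an illustration under stated assumptions, not Snack's code: `preemphasize`, `hamming`, and `log_power_frames` are hypothetical helpers, framelength is taken to be a step in samples, and the dB scaling is a common convention that may differ from Snack's.

```ruby
# First-order preemphasis: y[t] = x[t] - factor * x[t - 1].
def preemphasize(samples, factor)
  samples.each_with_index.map do |x, t|
    t.zero? ? x : x - factor * samples[t - 1]
  end
end

# Hamming window coefficients of the given length.
def hamming(length)
  return [1.0] if length == 1

  (0...length).map do |n|
    0.54 - 0.46 * Math.cos(2.0 * Math::PI * n / (length - 1))
  end
end

# One log-power value (in dB) per frame; frames start every framelength
# samples and are windowlength samples long.
def log_power_frames(samples, windowlength:, framelength:, preemphasisfactor: 0.0)
  window = hamming(windowlength)
  emphasized = preemphasize(samples, preemphasisfactor)
  last_start = emphasized.length - windowlength
  (0..last_start).step(framelength).map do |start|
    frame = emphasized[start, windowlength]
    energy = frame.each_with_index.inject(0.0) do |acc, (x, i)|
      acc + (x * window[i])**2
    end
    10.0 * Math.log10([energy / windowlength, 1.0e-12].max)
  end
end

p log_power_frames(Array.new(64, 1.0), windowlength: 16, framelength: 16)
```

Preemphasis boosts high frequencies relative to low ones, so a constant (DC) input is almost entirely removed when the factor approaches 1.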
Reads new sound data from a file. Synonym for "load". Valid
options are
- rate sampleRate
- specifies the sample rate, a positive integer which should be supported by
the hardware. For speech applications 16000 is most common.
- channels channelSpec
- Valid values are mono, stereo, and integers greater than
0. The default is 'mono'.
- encoding sampleFormat
- The sample encoding format: legal values for sampleFormat are
_Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float,
Alaw, or Mulaw. Default encoding is _Lin16_.
- skiphead noOfBytes
- Used to skip an unknown file header of size noOfBytes bytes
- byteorder endianess
- Valid values are littleEndian and bigEndian
- guessproperties boolean
- Valid values are 1, 0, or the Ruby true and false.
Specifies that Snack should try to infer
properties, such as byte order, sample encoding format, and sample rate for
a raw file by analyzing the contents of the file.
- fileformat RAW
- Sets the file format. Currently supported file formats
are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command
returns the file format detected.
It is possible to force a file to be read as RAW using fileformat
RAW. In this case the properties of the sound data should be
specified by hand
- start startPos
- Starting position of data to be read
- stop endPos
- Stopping position of data to be read
- progress boolean
- Progress callback. Acceptable values are true or false
(Also _"show"_ or _"hide"_)
[Example] file = Snack::getOpenFile
sound.read(file){progress true}
Starts recording data from the audio device into the sound object. Valid
options are
- input jack
- Specifies the input jack
- device device
- Specifies the input device
- append boolean
- true means append to sound. Applies only to in memory recordings.
- fileformat format
- Use when writing data to a channel or possibly file.
Reverses the sound (self). Valid options are
- start startPos
- Specifies the starting position of the data sample to be reversed.
- stop endPos
- Specifies the stopping position of the data sample to be reversed.
Displays an FFT log power spectrum section of this sound on the canvas. The
first argument, canvas, specifies the canvas to which the section
of this sound will be attached. The integers x and y are
the upper left-hand coordinates of the display image on the canvas. Options
can be any of the following
- analysistype analysistype
- Permissible values are: either FFT (default) or LPC
- channel channel
- Selects the channels. For a 2-channel system: -1 means both, 0 is left, 1 is
right. Default is -1.
- stop endPos
- Specifies the ending position of the sample to be displayed
- fftlength fftlength
- Specifies the number of FFT points; must be one of the values 8, 16, 32, 64,
128, 256, 512, 1024, 2048, or 4096. Default is 512.
- fill fillColor
- Specifies the fill color
- frame boolean
- A boolean value: true means a frame is drawn about the section; false
means no frame and is the default.
- height height
- Specifies the height of section
- lpcorder order
- Specifies the lpc order when the analysis type is LPC. The default value is
20
- maxvalue max_dB
- Specifies the max dB displayed. Default is 0.0
- minvalue min_dB
- Specifies the min dB displayed. Default is -80.0
- preemphasisfactor factor
- Specifies the amount of preemphasis applied to the signal prior to the FFT
calculation (default 0.0).
- skip no_of_skip_points
- Specifies how many points to move the window forward at each step
- start startPos
- Specifies the starting position of the sample to be displayed
- stipple bitmap
- Specifies the bitmap for stipple
- tags tagList
- Specifies the tagList for canvas
- topfrequency topFreq
- Specifies the frequency at the right end of the section
- width width
- Specifies the width of section
- windowtype type
- Specifies the type of windowing function; must be one of
Hamming, Hanning, Bartlett, Blackman, or Rectangle. The default is Hamming.
- winlength size
- Specifies the size of the (Hamming) window; it is required to be no greater
than fftlength. The default is 128.
See also TkCanvas#attachSection
Sets sample value at index for left and/or right channel
Displays the spectrogram of this sound on the canvas. The first argument,
canvas, specifies the canvas to which the spectrogram of this sound
will be attached. The integers x and y are the upper
left-hand coordinates of the display image on the canvas. Options can be any
of the following
- brightness brightness
- Determines the brightness, must be chosen from the range of [-100, 100].
The default is 0.
- channel channel
- Selects the channels. For a 2-channel system: -1 means both, 0 is left, 1 is
right. Default is -1.
- colormap colormap
- Specifies a list of colors to determine the intensity. If the list is
non-empty, it must have at least 2 colors, which will be interpreted as
lowest to highest. The default is an empty list, which produces a 32-level
grey-scale display.
- contrast contrast
- Determines the contrast, must be chosen from the range of [-100, 100]. The
default is 0.
- stop endPos
- Specifies the ending position of the sample to be displayed
- fftlength no_fft_pts
- Specifies the number of FFT points; must be one of the values 8, 16, 32, 64,
128, 256, 512, 1024, 2048, or 4096. Default is 512.
- gridcolor gridcolor
- Specifies the color of the grid
- gridfspacing gridfreqspacing
- Specifies the spacing along the frequency axis (in Hz). The default is
0, i.e. no grid
- gridtspacing gridtimespacing
- Specifies the spacing along the time axis (in milliseconds). The default is
0, i.e. no grid
- height height
- Specifies the height of the spectrogram
- pixelspersecond pps
- Specifies the number of horizontal pixels representing one second of
elapsed time.
- preemphasisfactor factor
- Specifies the amount of preemphasis applied to the signal prior to the FFT
calculation (default 0.97).
- start startPos
- Specifies the starting position of the sample to be displayed
- tags tagList
- Specifies a tagList for the canvas
- topfrequency maxFreq
- Specifies the maximum frequency of the spectrogram. Default is the Nyquist
frequency.
- width width
- Specifies the width of the spectrogram. Maximum of 32767 pixels.
- windowtype type
- Specifies the type of windowing function; must be one of
Hamming, Hanning, Bartlett, Blackman, or Rectangle. The default is Hamming.
- winlength size
- Specifies the size of the (Hamming) window; it is required to be no greater
than fftlength. The default is 128.
See also TkCanvas#attachSpectrogram
Stops current play or record operation.
Displays the waveform of this sound on the canvas. The first argument,
canvas, specifies the canvas to which the waveform of this sound
will be attached. The integers x and y are the upper
left-hand coordinates of the display image on the canvas. Options can be any
of the following
- anchor anchorPos
- Specifies the anchor position on the canvas
- channel channel
- Selects the channels. For a 2-channel system: -1 means both, 0 is left, 1 is
right. Default is -1.
- stop endPos
- Specifies the ending position of the sample to be displayed
- fill fillColor
- Specifies the fill color
- frame boolean
- Determines whether the display is to be framed: true means a frame
is drawn about the section, false means no frame and is the default.
- height height
- Specifies the height of the waveform
- limit maxAmplitude
- Specifies the maximum amplitude to be displayed
- pixelspersecond pps
- Is the number of horizontal pixels representing one second of elapsed time.
- progress cmd
- Specifies a callback procedure: NOT IMPLEMENTED YET!
- shapefile fileName
- Specifies the filename of a file for storing/retrieving precomputed
waveform shape information. Used for speeding up display.
- start startPos
- Specifies the starting position of the sample to be displayed
- stipple bitmap
- Specifies bitmap for stipple.
- subsample timeStepsPerPoint
- Specifies the number of time steps between the points used for the
generation of the waveform envelope. The default is 1, which gives the most
faithful rendering, but is the slowest. Higher numbers will give quicker
results, but risk degradation of the resulting waveform.
- tags tagList
- Specifies tag list
- width width
- Specifies the width of the waveform
- zerolevel boolean
- Determines whether to draw horizontal axis. The default value is true,
which produces a horizontal line indicating 0 amplitude.
See also TkCanvas#attachWaveform
Writes sound data to a file. Valid options are
- start startPos
- Specifies the starting position of the data sample to be written.
- stop endPos
- Specifies the stopping position of the data sample to be written.
- fileformat RAW
- Specifies the format of the file to be written. Legitimate formats
currently supported are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW
binary.
- byteorder endianess
- Valid values are littleEndian and bigEndian
- progress boolean
- Progress callback. Acceptable values are true or false
(Also _"show"_ or _"hide"_)
Retrieves sample at a given index (for mono only)
- Example
- v=sound[0]
Sets a sample value at a given index (for mono only)
- Example
- sound[0]=v
Concatenates the sample data from trailingsound to the end of self. Both
sounds must be of the same format (type). Self is unchanged. Returns the
join of both data samples.
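The difference between + and concatenate is the usual Ruby distinction between a non-mutating operator and a mutating method. The toy class below illustrates that contract only; `ToySound` is a hypothetical stand-in that holds samples in an Array, and returning self from concatenate is an assumption rather than documented behaviour.

```ruby
# Hypothetical stand-in for a sound: just an Array of samples.
class ToySound
  attr_reader :samples

  def initialize(samples)
    @samples = samples.dup
  end

  # Mutating: appends the other sound's samples to self.
  def concatenate(other)
    @samples.concat(other.samples)
    self
  end

  # Non-mutating: returns a new sound and leaves self unchanged.
  def +(other)
    ToySound.new(@samples + other.samples)
  end
end

a = ToySound.new([1, 2])
b = ToySound.new([3])
joined = a + b
p joined.samples, a.samples
a.concatenate(b)
p a.samples
```

Use + when the original sound must be preserved and concatenate when it should grow in place.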