Class: Sound

Creates a new Sound object. Valid options are

load fileName: reads file filename into self
file fileName: links on-disk file to sound
channel channelName: links sound to channel. Used for audio streaming or playing large files. Audio data not loaded into memory.
rate sampleRate: specifies the sample rate, a positive integer that should be supported by the hardware. For speech applications 16000 is most common.
channels channelSpec: Valid values are mono, stereo, and integers greater than 0. The default is 'mono'.
encoding sampleFormat: The sample encoding format: legal values for sampleFormat are _Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float, Alaw, or Mulaw. Default encoding is _Lin16_.
skiphead noOfBytes: Used to skip an unknown file header of size noOfBytes bytes
byteorder endianness: Valid values are littleEndian and bigEndian
guessproperties boolean: Valid values are 1, 0, or the Ruby true and false. Specifies that Snack should try to infer properties, such as byte order, sample encoding format, and sample rate for a raw file by analyzing the contents of the file.
precision precision: Specifies the precision of the buffer data. Valid values are single and double.
buffersize size: Specifies the internal buffer size used for samples obtained through a channel.
fileformat RAW: Sets the file format. The formats currently supported are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command returns the file format detected. It is possible to force a file to be read as RAW using fileformat RAW. In this case the properties of the sound data should be specified by hand.
changecommand cmd: Specifies a procedure to be called whenever a sound property is changed
debug level: Controls the level of debugging. Valid values are 0 through 5.
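
To illustrate how skiphead, byteorder, and the _Lin16_ encoding interact when a file is treated as RAW, here is a minimal sketch in plain Ruby. It does not call Snack: `decode_lin16` is a hypothetical helper and the byte string is made-up data, so treat it as an illustration of the option semantics rather than of Snack's internals.

```ruby
# Decode 16-bit linear PCM from a raw byte string.
# decode_lin16 is a hypothetical helper, not part of Snack.
def decode_lin16(raw, skiphead: 0, byteorder: 'littleEndian')
  body = raw.byteslice(skiphead..-1) || ''              # drop the unknown header
  directive = byteorder == 'bigEndian' ? 's>*' : 's<*'  # signed 16-bit integers
  body.unpack(directive)
end

raw = "\x00\x00\x01\x00\xFF\xFF".b   # 2 header bytes followed by 2 samples
little = decode_lin16(raw, skiphead: 2)                          # => [1, -1]
big    = decode_lin16(raw, skiphead: 2, byteorder: 'bigEndian')  # => [256, -1]
```

The same four data bytes yield different sample values under the two byte orders, which is why byteorder has to be given by hand (or guessed) for headerless data.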

Retrieves the current value of a given sound option. Permissible values for the option argument are

load: returns the name of the file which was loaded into self
file: returns file linked to sound
channel: returns the channel to which the sound is linked (used for audio streaming or playing large files; audio data is not loaded into memory)
rate: returns the sample rate
channels: returns the channel specification
encoding: returns the sample encoding format, one of _Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float, Alaw, or Mulaw
skiphead: returns the number of header bytes that are skipped
byteorder: returns the byte order, either littleEndian or bigEndian
guessproperties: returns whether Snack tries to infer properties, such as byte order, sample encoding format, and sample rate, for a raw file by analyzing the contents of the file
precision: returns the precision of the buffer data, either single or double
buffersize: returns the internal buffer size used for samples obtained through a channel

Note: At most one value at a time can be specified. The value may be specified either as a single string or as a block.

*Example 1*: cget("precision")
*Example 2*: cget(){ channels; }

In the rare case that Snack needs to be forcefully notified that a sound has changed, invoke the change method on the sound object that has changed. Permissible values for notificationMessage are

More: used only when more samples are appended
New: used for all other cases (default)

Concatenates the sample data from trailingsound to the end of self. Both sounds must be of the same format (type). Afterwards, self holds the sample data of both sounds joined together.

To configure (or reconfigure) a sound, use configure. Valid options are

load fileName: reads file filename into self
file fileName: links on-disk file to sound
channel channelName: links sound to channel. Used for audio streaming or playing large files. Audio data not loaded into memory.
rate sampleRate: specifies the sample rate, a positive integer that should be supported by the hardware. For speech applications 16000 is most common.
channels channelSpec: Valid values are mono, stereo, and integers greater than 0. The default is 'mono'.
encoding sampleFormat: The sample encoding format: legal values for sampleFormat are _Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float, Alaw, or Mulaw. Default encoding is _Lin16_.
skiphead noOfBytes: Used to skip an unknown file header of size noOfBytes bytes
byteorder endianness: Valid values are littleEndian and bigEndian
guessproperties boolean: Valid values are 1, 0, or the Ruby true and false. Specifies that Snack should try to infer properties, such as byte order, sample encoding format, and sample rate for a raw file by analyzing the contents of the file.
precision precision: Specifies the precision of the buffer data. Valid values are single and double.
buffersize size: Specifies the internal buffer size used for samples obtained through a channel.
debug level: Controls the level of debugging. Valid values are 0 through 5.

Copies sample data from source_sound into self, destroying original data. Valid options are

start startPos: Specifies the starting position of the section to be copied.
stop endPos: Specifies the ending (stopping) position of the section to be copied.

Removes all sample data outside of the range [startPos..endPos], so that self consists only of the sample data inside that range. If endPos is nil, the end of the sample data is used in place of endPos.

Removes all sample data inside the range [startPos..endPos]. This is the opposite of crop.
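
The two range operations above can be pictured on a plain Ruby Array. The sketch below is an illustration only: `crop_samples` and `cut_samples` are hypothetical helpers, not Snack methods, and they assume that both startPos and endPos are inclusive, as the [startPos..endPos] notation suggests.

```ruby
# Hypothetical helpers that mimic the documented behaviour on an Array.
# Both bounds are assumed to be inclusive.
def crop_samples(samples, start_pos, end_pos = nil)
  end_pos ||= samples.length - 1        # nil means "up to the end of the data"
  samples[start_pos..end_pos] || []
end

def cut_samples(samples, start_pos, end_pos)
  samples[0...start_pos] + (samples[(end_pos + 1)..-1] || [])
end

data    = [10, 11, 12, 13, 14, 15]
cropped = crop_samples(data, 1, 3)      # => [11, 12, 13]
removed = cut_samples(data, 1, 3)       # => [10, 14, 15]
```

Cropping keeps exactly the samples that the complementary removal discards, so the two results together account for every sample of the original.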

Removes this sound and frees the storage associated with it.

Computes the log FFT power spectrum of the sound (at the time given by the start option) and returns a list of dB values. Permissible options include

start position

Sets the starting position of the sample

stop position

Sets the ending (stopping) position of the sample

fftlength length

Sets the length of the fast Fourier transform. It is the N in

$$S(k) = \sum_{t=0}^{N-1} s(t)\,\exp\left(-\frac{2\pi i k t}{N}\right)$$

Permissible values are 8, 16, 32, 64, 128, 256, 512, 1024, 2048, or 4096. The default is 512.

windowlength length

Sets the length of the window. See Windowing.

windowtype type

Sets the type of window. Permissible types are Hamming, Hanning, Bartlett, Blackman, or Rectangle.

skip pointsPerStep

Specifies how many points to move the FFT window forward at each step.

channel channelNumber

Selects which channel to use for multichannel sounds.

preemphasisfactor factor

Sets the amount of preemphasis applied to the signal prior to the FFT calculation (default 0.0).

analysistype type

Sets the analysis type. Valid values are either FFT or LPC. FFT is the default.

lpcorder order

Sets the order, N (the number of poles), used for linear predictive coding (LPC) analysis. In particular, this assumes that within a given frame the speech signal s(t) is determined by constants b₀...b_N through

$$s(t) = \sum_{i=0}^{N} b_i \times s(t - i)$$
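
For the default FFT analysis type, the options above (preemphasisfactor, windowtype, windowlength, fftlength) can be read as stages of one pipeline. The following is a minimal sketch of that pipeline in plain Ruby, assuming a first-order preemphasis filter, the textbook Hamming window, and a naive DFT. All helper names are hypothetical, and Snack's own scaling and window definitions may differ.

```ruby
# Hypothetical helpers; a sketch of the FFT analysis path, not Snack's code.

# First-order preemphasis: y(t) = x(t) - factor * x(t - 1).
def preemphasize(samples, factor)
  samples.each_with_index.map do |x, t|
    t.zero? ? x.to_f : x - factor * samples[t - 1]
  end
end

# Textbook Hamming window of the given length.
def hamming(length)
  return [1.0] if length == 1
  (0...length).map { |n| 0.54 - 0.46 * Math.cos(2.0 * Math::PI * n / (length - 1)) }
end

# Naive O(N^2) DFT: S(k) = sum over t of s(t) * exp(-2*pi*i*k*t / N).
def dft(samples)
  n = samples.length
  (0...n).map do |k|
    samples.each_with_index.sum(Complex(0.0, 0.0)) do |s, t|
      s * Complex.polar(1.0, -2.0 * Math::PI * k * t / n)
    end
  end
end

# Log power spectrum in dB for one frame (bins 0..N/2 only).
# The frame must not be longer than fftlength.
def log_power_spectrum(frame, fftlength:, preemphasisfactor: 0.0)
  shaped   = preemphasize(frame, preemphasisfactor)
  windowed = shaped.zip(hamming(shaped.length)).map { |x, w| x * w }
  padded   = windowed + [0.0] * (fftlength - windowed.length)  # zero-pad to N
  dft(padded).first(fftlength / 2 + 1).map do |bin|
    10.0 * Math.log10([bin.abs2, 1e-12].max)                   # avoid log(0)
  end
end

frame    = (0...8).map { |t| Math.cos(2.0 * Math::PI * t / 8) }  # one cycle
spectrum = log_power_spectrum(frame, fftlength: 8)
```

The naive DFT is quadratic in N and is used here only because it mirrors the formula directly; a real implementation would use an FFT, which is why fftlength is restricted to powers of two.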

Applies givenFilter to self, i.e., filters self with givenFilter. Note: givenFilter must be of type Filter. Valid options are

start position: Sets the sample position at which the filter starts being applied
stop position: Sets the sample position at which the filter stops being applied
continuedrain bool: Specifies whether to use.. boolean
progress boolean: Progress callback. Acceptable values are true or false (Also _"show"_ or _"hide"_)

Removes all audio data from self, i.e., empties self.

Estimates speech formant trajectories. Dynamic programming is used to optimize trajectory estimates by imposing frequency continuity constraints. The formant frequencies are selected from candidates proposed by solving for the roots of the linear predictor polynomial computed periodically. The local costs of all possible mappings of the complex roots to formant frequencies are computed at each frame based on the frequencies and bandwidths of the component formants for each mapping. The cost of connecting each of these mappings with each of the mappings in the previous frame is then minimized using a modified Viterbi algorithm.

Returns an array of frames. Each frame consists of a pair of arrays: the first member of the pair is an array of formant values, lowest to highest; the second member is an array of formant bandwidths. Thus the return value looks like

[Frame₁,Frame₂,...]

where each Frame_i looks like

[[F₁,F₂,...],[B₁,B₂,...]]

where the F_i's are formant means and the B_i's are the bandwidths. The first frame corresponds to a start time of half the window length. The most recent frame is the last frame.
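
As a sketch of how such a return value might be consumed, the snippet below walks a structure of the documented shape. The numbers are invented placeholders, not Snack output, and the frame times assume that frame i sits at half the window length plus i frame intervals, which is an inference from the description above and the documented defaults for windowlength and framelength.

```ruby
# Placeholder data in the documented shape; the numbers are invented for
# illustration and are not output from Snack.
frames = [
  [[510.0, 1480.0, 2520.0, 3490.0], [60.0, 90.0, 120.0, 200.0]],
  [[520.0, 1500.0, 2500.0, 3500.0], [55.0, 95.0, 110.0, 210.0]]
]

windowlength = 0.049   # seconds, documented default
framelength  = 0.01    # seconds, documented default

rows = frames.each_with_index.map do |(formants, bandwidths), i|
  time = windowlength / 2 + i * framelength   # assumed frame-time mapping
  { time: time, f1: formants[0], b1: bandwidths[0] }
end

rows.each do |r|
  puts format('%.4f s  F1=%.0f Hz  B1=%.0f Hz', r[:time], r[:f1], r[:b1])
end
```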

Valid options are

start position: Sets the sample position at which the analysis starts
stop position: Sets the sample position at which the analysis stops
framelength t: Specifies the intervals between the values (default 0.01).
numformants n: Controls how many formants to calculate (default 4, maximum 7).
windowlength length: Specifies the size of the window in seconds (default 0.049).
preemphasisfactor factor: Specifies the amount of preemphasis applied to the signal prior to windowing (default 0.7).
windowtype type: Selects the windowing function. Valid values are Cos^4 (default), Hamming, Hanning, and Rectangle.
lpctype t: Valid values are either 0 (autocorrelation) or 1 (stabilized covariance).
lpcorder n: Specifies the order of the LPC Analysis (default is 12).
progress boolean: Progress callback. Acceptable values are true or false (Also _"show"_ or _"hide"_)
ds_freq f: Specifies the sampling rate of the data to be used in the formant frequency analysis (default 10000).
nom_ff_freq f: Specifies the nominal value of the first formant frequency. This value is used to adjust the nominal values of all other formants and of the ranges over which the formants are permitted to exist. The default value of 500Hz assumes that the vocal tract length is 17 cm and that the speed of sound is 34000 cm/sec. Nominal F1 values scale directly with sound velocity and inversely with vocal-tract length.
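
The 500 Hz default for nom_ff_freq follows from the quarter-wavelength resonance of a uniform tube closed at one end, F1 = c / (4L) = 34000 / (4 × 17) = 500 Hz. The sketch below works through that arithmetic; `nominal_formants` is a hypothetical helper, and the assumption that higher nominal formants fall at odd multiples of F1 is the standard uniform-tube model rather than something this document states about Snack.

```ruby
# Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L).
# nominal_formants is a hypothetical helper, not part of Snack.
def nominal_formants(count, sound_speed_cm_s: 34_000.0, tract_length_cm: 17.0)
  (1..count).map { |n| (2 * n - 1) * sound_speed_cm_s / (4.0 * tract_length_cm) }
end

adult   = nominal_formants(4)                         # => [500.0, 1500.0, 2500.0, 3500.0]
shorter = nominal_formants(1, tract_length_cm: 13.6)  # shorter tract, higher F1
```

A shorter vocal tract raises the nominal F1 in inverse proportion to its length, which is the scaling the option description refers to.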

Gets the sample value at index.

Returns a list with information about the sound. The entries are [length, rate, max, min, encoding, channels, fileFormat, headerSize]

insert applies only to in-memory sounds. It inserts another sound (sound) into the original sound (self) at the given position. A sub-range of the samples from sound can be selected for insertion by specifying startPos and stopPos.
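
The insertion semantics can be sketched on plain Arrays. `insert_samples` is a hypothetical helper, not the Snack method; it assumes that startPos and stopPos are inclusive and that position is the index at which the first inserted sample ends up.

```ruby
# insert_samples is a hypothetical helper working on plain Arrays.
def insert_samples(target, source, position, start_pos = 0, stop_pos = nil)
  stop_pos ||= source.length - 1                 # default: through the last sample
  segment = source[start_pos..stop_pos] || []
  target[0...position] + segment + (target[position..-1] || [])
end

original = [1, 2, 3, 4]
other    = [90, 91, 92, 93]
result   = insert_samples(original, other, 2, 1, 2)   # => [1, 2, 91, 92, 3, 4]
```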

Sets the length of the sound in seconds or in samples (default). To set the length in seconds use sound.length=(n, 'seconds')

Gets the length of this sound in seconds or in samples (default). To get the length in seconds use len=sound.length('seconds')
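
The two units are related through the sample rate. As a small worked example, assuming the usual relation seconds = samples / rate (the helper names are hypothetical):

```ruby
# Hypothetical conversion helpers; rate is the sound's sample rate in Hz.
def samples_to_seconds(samples, rate)
  samples.to_f / rate
end

def seconds_to_samples(seconds, rate)
  (seconds * rate).round
end

duration = samples_to_seconds(24_000, 16_000)   # => 1.5
count    = seconds_to_samples(0.25, 16_000)     # => 4000
```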

Loads new sound data from a file. Synonym for "read". Valid options are

rate sampleRate: specifies the sample rate, a positive integer that should be supported by the hardware. For speech applications 16000 is most common.
channels channelSpec: Valid values are mono, stereo, and integers greater than 0. The default is 'mono'.
encoding sampleFormat: The sample encoding format: legal values for sampleFormat are _Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float, Alaw, or Mulaw. Default encoding is _Lin16_.
skiphead noOfBytes: Used to skip an unknown file header of size noOfBytes bytes
byteorder endianness: Valid values are littleEndian and bigEndian
guessproperties boolean: Valid values are 1, 0, or the Ruby true and false. Specifies that Snack should try to infer properties, such as byte order, sample encoding format, and sample rate for a raw file by analyzing the contents of the file.
start startPos: Starting position of data to be read
stop endPos: Stopping position of data to be read
fileformat RAW: Sets the file format. Currently supported file formats are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command returns the file format detected.
progress boolean: Progress callback. Acceptable values are true or false (Also _"show"_ or _"hide"_)

It is possible to force a file to be read as RAW using fileformat RAW. In this case the properties of the sound data should be specified by hand.

Returns the largest positive sample value of the sound.

Returns the largest negative sample value of the sound.

Pauses the current record/play operation. The next pause invocation resumes play/record.

Returns a list of pitch values, spaced 10ms, computed using the AMDF method. Valid options are

start startPos: Starting position of sample range
stop endPos: Stopping position of sample range
maxpitch maxpitch: Maximum pitch (default is 400)
minpitch minpitch: Minimum pitch (default is 60)
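
To show what the AMDF (average magnitude difference function) method does, here is a minimal single-frame sketch in plain Ruby. `amdf_pitch` is a hypothetical helper: it searches the lags permitted by minpitch and maxpitch for the smallest average absolute difference between the frame and a shifted copy of itself. Snack's own implementation adds framing every 10 ms and other refinements that are not reproduced here.

```ruby
# amdf_pitch is a hypothetical helper: a single-frame AMDF pitch estimate.
def amdf_pitch(frame, rate, minpitch: 60, maxpitch: 400)
  min_lag = [(rate / maxpitch.to_f).floor, 1].max        # shortest period allowed
  max_lag = [(rate / minpitch.to_f).ceil, frame.length - 1].min
  best_lag = (min_lag..max_lag).min_by do |lag|
    terms = frame.length - lag
    # Average magnitude difference between the frame and its shifted copy.
    (0...terms).sum { |n| (frame[n] - frame[n + lag]).abs } / terms.to_f
  end
  rate.to_f / best_lag                                   # lag in samples -> Hz
end

rate  = 8000
frame = (0...800).map { |n| Math.sin(2.0 * Math::PI * 100.0 * n / rate) }
estimate = amdf_pitch(frame, rate)   # 100 Hz sine, period of 80 samples
```

Restricting the lag search to the range implied by minpitch and maxpitch is what keeps the estimate from locking onto a multiple of the true period.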

Plays the sound. Valid options are

start startPos: Determines the starting position of segment to be played
stop endPos: Determines the stopping position of segment to be played
output outputJack: Determines the output jack
blocking booleanValue: False means asynchronous playback; true means the function does not return until playback is completed.
command callback: The callback to be executed at the conclusion of the playback.
device outputDevice: Specifies audio output device.
filter filter: Used to filter sound during playback.
devicerate freq: Used to specify alternate device rate. Rarely used.
devicechannels n: Used to specify alternate device channel. Rarely used.

Returns a list of windowed log-power values. Valid options are

start startPos: Specifies the starting position of the sample to be examined.
stop endPos: Specifies the stopping position of the sample to be examined.
framelength length: Specifies interval length between values
windowtype type: Specifies the type of windows to be used. Permissible types are Hamming, Hanning, Bartlett, Blackman, or Rectangle.
windowlength length: Specifies the number of points in each window.
preemphasisfactor factor: Specifies the preemphasisfactor applied to signal prior to windowing. This

Reads new sound data from a file. Synonym for "load". Valid options are

rate sampleRate: specifies the sample rate, a positive integer that should be supported by the hardware. For speech applications 16000 is most common.
channels channelSpec: Valid values are mono, stereo, and integers greater than 0. The default is 'mono'.
encoding sampleFormat: The sample encoding format: legal values for sampleFormat are _Lin16_, _Lin8offset_, _Lin8_, _Lin24_, _Lin32_, Float, Alaw, or Mulaw. Default encoding is _Lin16_.
skiphead noOfBytes: Used to skip an unknown file header of size noOfBytes bytes
byteorder endianness: Valid values are littleEndian and bigEndian
guessproperties boolean: Valid values are 1, 0, or the Ruby true and false. Specifies that Snack should try to infer properties, such as byte order, sample encoding format, and sample rate for a raw file by analyzing the contents of the file.
fileformat RAW: Sets the file format. Currently supported file formats are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command returns the file format detected.

It is possible to force a file to be read as RAW using fileformat RAW. In this case the properties of the sound data should be specified by hand.

start startPos: Starting position of data to be read
stop endPos: Stopping position of data to be read
progress boolean: Progress callback. Acceptable values are true or false (Also _"show"_ or _"hide"_)

[Example] file = Snack::getOpenFile

sound.read(file){progress true}

Starts recording data from the audio device into the sound object. Valid options are

input jack: Specifies the input jack
device device: Specifies the input device
append boolean: true means append to sound. Applies only to in-memory recordings.
fileformat format: Use when writing data to a channel or possibly file.

Reverses the sample data of self. Valid options are

start startPos: Specifies starting position of data sample to be reversed.
stop endPos: Specifies stopping position of data sample to be reversed.

Displays an FFT log power spectrum section of this sound on the canvas. The first argument, canvas, specifies the canvas to which the section of this sound will be attached. The integers x and y are the upper left-hand coordinates of the display image on the canvas. Options can be any of the following

analysistype analysistype: Permissible values are: either FFT (default) or LPC
channel channel: Selects the channel. For a 2-channel sound, -1 means both, 0 is left, and 1 is right. Default is -1.
stop endPos: Specifies the ending position of the sample to be displayed
fftlength fftlength: Specifies the number of FFT points, which must be one of the values 8, 16, 32, 64, 128, 256, 512, 1024, 2048, or 4096. Default is 512.
fill fillColor: Specifies the fill color
frame boolean: A boolean value, where true means a frame is drawn around the section and false (the default) means no frame.
height height: Specifies the height of section
lpcorder order: Specifies the lpc order when the analysis type is LPC. The default value is 20
maxvalue max_dB: Specifies the max dB displayed. Default is 0.0
minvalue min_dB: Specifies the min dB displayed. Default is -80.0
preemphasisfactor factor: Specifies the amount of preemphasis applied to the signal prior to the FFT calculation (default 0.0).
skip no_of_skip_points: Specifies how many points to move the window forward at each step
start startPos: Specifies the starting position of the sample to be displayed
stipple bitmap: Specifies the bitmap for stipple
tags tagList: Specifies the tagList for canvas
topfrequency topFreq: Specifies the frequency at the right end of the section
width width: Specifies the width of section
windowtype type: Specifies the type of windowing function: must be one of the choices: Hamming, Hanning, Bartlett, Blackman, or Rectangle. The default is Hamming.
winlength size: Specifies the size of the (Hamming) window, which must be no greater than fftlength. The default is 128.