HARK version 1.1.0 Document : SaveWavePCM

6.1.4 SaveWavePCM

Outline of the node

This node saves time-domain speech waveform data to files. It differs from the SaveRawPCM node only in the output file format: the wave file format includes a header. Tools such as Audacity or WaveSurfer can therefore read the output files of this node easily. If you want to read the saved waveform back with the AudioStreamFromWave node, use this node rather than the SaveRawPCM node.

Necessary files

No files are required.

Usage

When to use

The use cases are the same as for the SaveRawPCM node. Use this node to listen to and check separated sound after the Synthesize node has converted it back into waveforms, or to record sound from a microphone array by connecting this node to the AudioStreamFromMic node.

Typical connection

The usage is almost the same as for the SaveRawPCM node; the only difference is the SAMPLING_RATE parameter. You can use this node by replacing the SaveRawPCM node with the SaveWavePCM node in Figs. 6.7 and 6.8.

Input-output and property of the node

Table 6.5: Parameter list of SaveWavePCM

Parameter name	Type	Default value	Unit	Description
BASENAME	`string`	sep_		Prefix of the name of the file to be saved.
ADVANCE	`int`	160	[pt]	Shift length of the analysis frame of the speech waveform to be saved in a file.
SAMPLING_RATE	`int`	16000	[Hz]	Sampling rate. This parameter is used only to set the file header.
BITS	`string`	int16	[bit]	Quantization bit rate of the speech waveform to be saved in a file. Choose int16, int24, or float32.
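
As a rough illustration of what the default values in the table imply, the following Python sketch computes the duration of one frame shift and the amount of audio data written per frame shift. The helper names are hypothetical and are not part of HARK; the byte widths simply follow from the bit widths named in the BITS choices.

```python
# Bytes per sample implied by each BITS choice (16, 24, and 32 bits).
BYTES_PER_SAMPLE = {"int16": 2, "int24": 3, "float32": 4}


def frame_shift_ms(advance, sampling_rate):
    """Duration of one frame shift (ADVANCE samples) in milliseconds."""
    return 1000.0 * advance / sampling_rate


def bytes_per_frame_shift(advance, channels, bits):
    """Audio payload for one frame shift, excluding the file header."""
    return advance * channels * BYTES_PER_SAMPLE[bits]


# Defaults from the table: ADVANCE = 160 [pt], SAMPLING_RATE = 16000 [Hz].
print(frame_shift_ms(160, 16000))              # 10.0
print(bytes_per_frame_shift(160, 1, "int16"))  # 320
```

With the defaults, each frame shift covers 10 ms of audio, and a monaural int16 file grows by 320 bytes per frame shift.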

Input

INPUT: Map<int, ObjectRef> or Matrix<float> type. The former is a structure containing a sound source ID and waveform data (such as a separated sound) and the latter is a waveform data matrix of multiple channels.

Output

OUTPUT: Map<int, ObjectRef> or Matrix<float> type. The output data is the same as the input.

Parameter

BASENAME: string type. The default filename prefix is sep_. When a sound source ID is attached, the output filename is "BASENAME_ID.wav". For example, when BASENAME is sep_, separating a mixture of three sounds produces the files sep_0.wav, sep_1.wav, and sep_2.wav.
ADVANCE: int type. This must match the ADVANCE value of the other nodes.
SAMPLING_RATE: int type. This must match the SAMPLING_RATE value of the other nodes. The value is used only for the file header; it does not change the sampling rate of the A/D converter.
BITS: string type. Quantization bit rate of the speech waveform to be saved in a file. Select int16, int24, or float32.
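
The point that SAMPLING_RATE only affects the header can be illustrated outside HARK with Python's standard wave module. The sketch below is not HARK code: it writes the same samples twice with different header rates and shows that the sample data is identical while the duration a player would report changes. The function names are hypothetical.

```python
import io
import struct
import wave


def write_wav_bytes(samples, header_rate):
    """Write mono 16-bit samples to an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)             # 2 bytes per sample = 16 bit
        w.setframerate(header_rate)   # stored in the header only
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def describe(wav_bytes):
    """Return (header rate, duration in seconds, raw sample bytes)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        rate = r.getframerate()
        n_frames = r.getnframes()
        data = r.readframes(n_frames)
    return rate, n_frames / rate, data


samples = [0, 1000, -1000, 0] * 4000   # 16000 samples
a = describe(write_wav_bytes(samples, 16000))
b = describe(write_wav_bytes(samples, 8000))

print(a[0], a[1])    # 16000 1.0
print(b[0], b[1])    # 8000 2.0
print(a[2] == b[2])  # True: the samples themselves are unchanged
```

A mismatched SAMPLING_RATE therefore does not resample anything; it only makes readers of the file interpret the same samples at the wrong speed.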

Details of the node

Format of the files saved: The files are saved as Wave PCM sound data with header information. Therefore, users do not need to specify the sampling frequency, number of channels, or quantization bit rate when reading the files. The files written depend on the input type as follows.

Matrix<float> type: The file written is a multichannel audio file with the same number of channels as the number of rows in the input.
Map<int, ObjectRef> type: One monaural audio file is written for each ID, with the ID number placed after BASENAME in the filename.
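
The two output layouts can be mimicked with a short Python sketch that uses the standard wave module. This is not HARK's implementation: the functions save_matrix and save_map are hypothetical, only int16 is handled, the filename used for the matrix case is a placeholder (the text above does not specify it), and the per-ID naming is an assumption inferred from the sep_0.wav example.

```python
import os
import struct
import tempfile
import wave


def write_wav(path, rows, rate=16000):
    """Write rows (one list of int16 samples per channel) as one WAV file."""
    # Interleave: each frame holds one sample from every channel.
    flat = [s for frame in zip(*rows) for s in frame]
    with wave.open(path, "wb") as w:
        w.setnchannels(len(rows))
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(flat), *flat))


def save_matrix(name, matrix, out_dir):
    """Matrix-like input: one multichannel file, one channel per row."""
    path = os.path.join(out_dir, name + ".wav")  # placeholder filename
    write_wav(path, matrix)
    return path


def save_map(basename, sources, out_dir):
    """Map-like input: one monaural file per source ID."""
    paths = []
    for source_id in sorted(sources):
        # Assumed naming, based on the documented sep_0.wav example.
        path = os.path.join(out_dir, "%s%d.wav" % (basename, source_id))
        write_wav(path, [sources[source_id]])
        paths.append(path)
    return paths


def channel_count(path):
    with wave.open(path, "rb") as r:
        return r.getnchannels()


out_dir = tempfile.mkdtemp()
matrix = [[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]]  # 4 rows
sources = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]}

multi = save_matrix("mix", matrix, out_dir)
monos = save_map("sep_", sources, out_dir)

print(channel_count(multi))                  # 4
print([os.path.basename(p) for p in monos])  # ['sep_0.wav', 'sep_1.wav', 'sep_2.wav']
print([channel_count(p) for p in monos])     # [1, 1, 1]
```

Because the channel count is stored in each file's header, a reader can tell the two cases apart without any extra information.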