HARK Document Version 3.3.0. (Revision: 9509) : SourceIntervalExtender

6.2.18 SourceIntervalExtender

6.2.18.1 Outline of the node

This node is used to output source localization results earlier than they would normally be output. How much earlier is set by the user parameter PREROLL_LENGTH. For example, when PREROLL_LENGTH is 5, a localization result is output five frames earlier than it normally would be.

6.2.18.2 Necessary file

No files are required.

6.2.18.3 Usage

When to use

This node is used as preprocessing when sound source separation is performed after source localization. Because a sound source can only be localized after its sound has been input, the localization result is slightly delayed relative to the time when the sound actually starts. As a result, the beginning of the separated sound is cut off by the length of this delay. This node is used to prevent that.

Typical connection

Figure 6.45 shows a typical connection example. As shown in the figure, when separating sound based on a localization result, inserting the SourceIntervalExtender node between source localization and sound source separation avoids the delayed start of separation.

$\includegraphics[width=\textwidth ]{fig/modules/SourceIntervalExtender}$

Figure 6.45: Connection example of SourceIntervalExtender

6.2.18.4 Input-output and property of the node

Input

SOURCES: : Vector<ObjectRef> type. A vector of source localization results, each expressed as Source type. ObjectRef refers to Source type data.

Output

OUTPUT: : Vector<ObjectRef> type. The source localization results, output earlier than they arrive at the input. ObjectRef refers to Source type data.

Parameter

Table 6.35: Parameter list of SourceIntervalExtender

Parameter name	Type	Default value	Unit	Description
PREROLL_LENGTH	`int`	50	[frame]	The number of frames the localization result is output sooner by.

PREROLL_LENGTH: : int type. This value determines how many frames earlier the localization result is output. When this value is too small, the start of sound source separation is still delayed, so users should set it with the delay of the source localization method used upstream of this node in mind.
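As a rough aid for choosing this value, the sketch below converts a localization delay given in milliseconds into a number of frames. The function name `preroll_frames` is hypothetical, and the 16 kHz sampling rate and 160-sample frame shift (10 ms per frame) are assumptions for illustration only; substitute the values used in your own network.

```python
def preroll_frames(delay_ms, sample_rate_hz=16000, advance_samples=160):
    """Smallest number of frames that covers delay_ms of localization delay.

    sample_rate_hz and advance_samples are assumed example values,
    not settings read from HARK.
    """
    # One frame advances advance_samples / sample_rate_hz seconds.
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = delay_ms * sample_rate_hz
    denominator = 1000 * advance_samples
    return -(-numerator // denominator)


if __name__ == "__main__":
    print(preroll_frames(500))  # frames needed to cover a 500 ms delay
```

Under these assumed settings, the default PREROLL_LENGTH of 50 frames corresponds to 500 ms. Rounding up rather than down errs on the side of starting separation slightly early instead of cutting the beginning of the sound.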

6.2.18.5 Details of the node

When sound source separation is performed based on a localization result without SourceIntervalExtender, the beginning of the separated sound is cut off by the processing time of the source localization, as shown in Figure 6.46. For speech recognition in particular, cutting off the beginning of the sound affects recognition performance, so the localization result needs to be output ahead of time with this node. At each iteration, this node reads ahead by the length set in PREROLL_LENGTH. When a localization result is found in the read-ahead, output of that localization result starts from the point in time at which it was found (see Figure 6.47).

$\includegraphics[width=\linewidth ]{fig/modules/SourceIntervalExtender-detail}$

Figure 6.46: Necessity of outputting a source localization result sooner than usual

$\includegraphics[width=\linewidth ]{fig/modules/SourceIntervalExtender-detail2}$

Figure 6.47: Operation of SourceIntervalExtender
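
The read-ahead behavior described above can be illustrated with a simplified model. This is a sketch under stated assumptions, not the node's actual implementation: each frame is reduced to a set of source IDs (real Source objects also carry position data), the whole sequence is available at once rather than streamed, and the function name `extend_intervals` is hypothetical.

```python
def extend_intervals(frames, preroll_length):
    """Return per-frame source IDs after reading ahead preroll_length frames.

    frames[t] is the set of source IDs localized at frame t.
    The result at frame t contains every ID that appears in
    frames[t] through frames[t + preroll_length].
    """
    extended = []
    for t in range(len(frames)):
        # Current frame plus the read-ahead window (shorter near the end).
        window = frames[t : t + preroll_length + 1]
        ids = set()
        for frame_ids in window:
            ids |= frame_ids
        extended.append(ids)
    return extended


if __name__ == "__main__":
    # Source 0 is localized at frames 3 and 4 only.
    localized = [set(), set(), set(), {0}, {0}, set()]
    for t, ids in enumerate(extend_intervals(localized, 2)):
        print(t, sorted(ids))
```

With a read-ahead of 2 frames, source 0 appears in the output from frame 1 instead of frame 3, that is, two frames earlier, while the end of its interval is unchanged.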