From the multi-channel complex spectrum that is output from the MultiFFT node, generate the correlation matrix for a sound source when specified by the input flag.
None.
In what case is the node used?
The usage scenario is the same as for the CMMakerFromFFT node; refer to that node for details. The main difference is in how the correlation matrix is calculated. In the CMMakerFromFFT node, the correlation matrix is updated at a fixed period (PERIOD), whereas this node can generate a correlation matrix for a section specified by the flag value obtained from the input terminal.
Typical Examples
Figure 6.15 shows a usage example of the CMMakerFromFFTwithFlag node. The INPUT terminal is connected to the complex spectrum of the input signal calculated by a MultiFFT node; its type is Matrix<complex<float> >. ADDER_FLAG is an int or bool input that controls the events related to the correlation matrix calculation. Event control details are given in the Module details section. This node calculates and outputs the correlation matrix between channels for each frequency bin from the complex spectrum of the input signal. The output type is Matrix<complex<float> >; because the correlation matrices for all frequency bins form a three-dimensional complex array, this array is converted to a two-dimensional complex array before being output.
| Parameter | Type | Default | Unit | Description |
|---|---|---|---|---|
| DURATION_TYPE | string | FLAG_PERIOD | | Flag-based or Frame-period-based generation |
| WINDOW | int | 50 | | Number of averaged frames for a CM |
| PERIOD | int | 50 | | Frame rate for renewing the correlation matrix |
| WINDOW_TYPE | string | FUTURE | | Frame selection to normalize CM |
| MAX_SUM_COUNT | int | 100 | | Maximum number of normalized frames of a CM |
| ENABLE_ACCUM | bool | false | | Enable averaging with past correlation matrix |
| ENABLE_DEBUG | bool | false | | ON/OFF of debugging information output |
Input
INPUT : Matrix<complex<float> > type. The complex spectrum representation of the input signal, with size $M \times (NFFT / 2 + 1)$.
ADDER_FLAG : int type or bool type. Controls the events related to the correlation matrix calculation. Refer to the Module Description section for event control details.
Output
OUTPUT : Matrix<complex<float> > type. A correlation matrix for each frequency bin. One $M$-th order complex square correlation matrix is output for each of the $NFFT/2 + 1$ frequency bins. The Matrix<complex<float> > has one row per frequency bin ($NFFT/2 + 1$ rows), and each row contains the elements of the complex correlation matrix for that bin ($M \times M$ columns).
: bool type. Outputs true if the correlation matrix from OUTPUT is updated, and false otherwise. This port is invisible by default. To make it visible, see Fig. 6.25 in LocalizeMUSIC.
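The OUTPUT layout described above (one row per frequency bin, $M \times M$ columns) can be illustrated with a short sketch. It assumes the usual outer-product definition of a per-frame correlation matrix, ${\boldsymbol X}(\omega){\boldsymbol X}^H(\omega)$, and a row-major flattening order; neither is stated in this section, and the function name `correlation_rows` is hypothetical.

```python
def correlation_rows(spectrum):
    """Build one flattened correlation matrix per frequency bin.

    spectrum: list of M channels, each a list of NFFT/2+1 complex values.
    Returns NFFT/2+1 rows, each holding M*M complex entries.
    """
    n_ch = len(spectrum)
    n_bins = len(spectrum[0])
    rows = []
    for k in range(n_bins):
        # Channel vector for frequency bin k
        x = [spectrum[m][k] for m in range(n_ch)]
        # R[a][b] = x[a] * conj(x[b]); row-major flattening is an assumption
        rows.append([x[a] * x[b].conjugate()
                     for a in range(n_ch) for b in range(n_ch)])
    return rows


# Two channels (M = 2) and three frequency bins (NFFT = 4)
spectrum = [[1 + 0j, 0 + 1j, 2 + 0j],
            [0 + 1j, 1 + 0j, 1 + 1j]]
rows = correlation_rows(spectrum)
print(len(rows), len(rows[0]))  # number of rows and columns
```

With $M = 2$ and $NFFT = 4$, the result has $NFFT/2 + 1 = 3$ rows and $M \times M = 4$ columns, matching the size given for OUTPUT.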
Parameter
DURATION_TYPE : string type. Default value is FLAG_PERIOD. Selects the algorithm for renewing and averaging the correlation matrix. If DURATION_TYPE=FLAG_PERIOD, the correlation matrix is renewed and averaged based on the value of ADDER_FLAG. If DURATION_TYPE=FRAME_PERIOD, the correlation matrix is renewed and averaged at a constant frame period. Refer to the Module Description section for event control details.
WINDOW : int type. Default value is 50. This parameter is active when DURATION_TYPE=FRAME_PERIOD. Specifies the number of frames averaged when calculating the correlation matrix. The node generates a correlation matrix for each frame from the complex spectrum of the input signal and outputs a new correlation matrix by averaging over the number of frames specified by WINDOW. Between renewals, which occur every PERIOD frames, the most recently calculated correlation matrix is output. If this value is increased, the correlation matrix is stabilized but the calculation cost becomes high.
PERIOD : int type. Default value is 50. This parameter is active when DURATION_TYPE=FRAME_PERIOD. Specifies the period, in frames, at which the correlation matrix is renewed. The node generates a correlation matrix for each frame from the complex spectrum of the input signal and outputs a new correlation matrix by averaging over the number of frames specified by WINDOW. Between renewals, which occur every PERIOD frames, the most recently calculated correlation matrix is output. If this value is decreased, the time resolution of the correlation matrix improves but the calculation cost becomes high.
WINDOW_TYPE : string type. Default value is FUTURE. Selects which frames are used for smoothing when calculating the correlation matrix. Let $f$ be the current frame. If FUTURE, frames from $f$ to $f+WINDOW-1$ are used for the normalization. If MIDDLE, frames from $f-(WINDOW/2)$ to $f+(WINDOW/2)+(WINDOW\% 2)-1$ are used for the normalization. If PAST, frames from $f-WINDOW+1$ to $f$ are used for the normalization.
MAX_SUM_COUNT : int type. Default value is 100. This parameter is active when DURATION_TYPE=FLAG_PERIOD. Specifies the maximum number of frames averaged when calculating the correlation matrix. In this node, the number of averaged frames is controlled by ADDER_FLAG. Consequently, if ADDER_FLAG stays at 1, correlation matrices are only accumulated and nothing is ever output. When MAX_SUM_COUNT is set appropriately, the correlation matrix is forcibly output once the number of accumulated frames reaches this maximum. To turn this feature off, specify MAX_SUM_COUNT = 0.
ENABLE_ACCUM : bool type. Default value is false. This parameter is active when DURATION_TYPE=FLAG_PERIOD. When enabled, the current correlation matrix is averaged together with the past one.
ENABLE_DEBUG : bool type. Default value is false. When true, the frame number is printed to standard output each time a correlation matrix is generated.
The algorithm for calculating the correlation matrix is the same as in the CMMakerFromFFT node. Refer to the Module description of the CMMakerFromFFT node for details. The difference from the CMMakerFromFFT node is that the frames averaged into a correlation matrix can be controlled with the flag on the ADDER_FLAG input terminal.
In the CMMakerFromFFT node, the correlation matrix was computed with the following formula with the number of frames specified by PERIOD.
\begin{equation} \label{eq:CMMakerFromFFTwithFlag_Rn} {\boldsymbol R}'(\omega ,f) = \frac{1}{{\rm PERIOD}}\sum _{i=W_i}^{W_f}{\boldsymbol R}(\omega ,f+i) \tag{4} \end{equation}
where the frames used for the averaging can be changed by WINDOW_TYPE. If WINDOW_TYPE=FUTURE, $W_i = 0$ and $W_f = {\rm WINDOW}-1$. If WINDOW_TYPE=MIDDLE, $W_i = -{\rm WINDOW}/2$ and $W_f = {\rm WINDOW}/2+{\rm WINDOW}\% 2-1$. If WINDOW_TYPE=PAST, $W_i = -{\rm WINDOW}+1$ and $W_f = 0$.
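As a quick check of these frame ranges, the offsets $W_i$ and $W_f$ can be computed as in the following sketch. The function name `window_offsets` is hypothetical, and treating WINDOW/2 as integer division is an assumption. The MIDDLE case follows the range given in the WINDOW_TYPE parameter description, which starts at $f-(WINDOW/2)$.

```python
def window_offsets(window, window_type):
    """Return (W_i, W_f), the frame offsets relative to the current frame f."""
    if window_type == "FUTURE":
        return 0, window - 1
    if window_type == "MIDDLE":
        # Integer division is assumed for WINDOW/2
        return -(window // 2), window // 2 + window % 2 - 1
    if window_type == "PAST":
        return -window + 1, 0
    raise ValueError("unknown WINDOW_TYPE: " + window_type)


# Each choice selects exactly WINDOW frames around the current frame
for wtype in ("FUTURE", "MIDDLE", "PAST"):
    w_i, w_f = window_offsets(50, wtype)
    print(wtype, w_i, w_f, w_f - w_i + 1)
```

For every WINDOW_TYPE the range $W_i \ldots W_f$ covers exactly WINDOW frames; only its position relative to the current frame changes.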
When DURATION_TYPE=FLAG_PERIOD, this node generates a correlation matrix based on the value of the ADDER_FLAG as follows.
A) When ADDER_FLAG changes from 0 (or false) to 1 (or true)
The correlation matrix is reset to the zero matrix and the frame count PERIOD is reset to 0.
${\boldsymbol R}'(\omega ) = {\boldsymbol O}$
PERIOD = 0
where ${\boldsymbol O} \in \mathbb {C}^{(NFFT / 2 + 1) \times M \times M}$ represents the zero matrix.
B) When ADDER_FLAG is 1 (or true)
The correlation matrix of the current frame $f$ is added.
${\boldsymbol R}'(\omega ) = {\boldsymbol R}'(\omega ) + {\boldsymbol R}(\omega ,f)$
PERIOD = PERIOD + 1
C) When ADDER_FLAG changes from 1 (or true) to 0 (or false)
The accumulated correlation matrix is averaged and output to OUTPUT.
${\boldsymbol R}_{out}(\omega ,f) = \frac{1}{{\rm PERIOD}}{\boldsymbol R}'(\omega )$
D) When ADDER_FLAG is 0 (or false)
The most recently generated correlation matrix is kept.
${\boldsymbol R}_{out}(\omega ,f)$
Here, ${\boldsymbol R}_{out}(\omega ,f)$ is the correlation matrix that is output from the OUTPUT terminal. In other words, the new correlation matrix will be stored in ${\boldsymbol R}_{out}(\omega ,f)$ in phase C).
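The flag-driven behavior described above can be sketched as a small state machine. This is an illustrative sketch, not the node's implementation: the class name `FlagAccumulator` is hypothetical, each correlation matrix is represented as a flat list of complex values, the frame on which the flag rises is assumed to be both reset and accumulated, and the output is assumed to be zero until the first average is produced. MAX_SUM_COUNT and ENABLE_ACCUM are not modeled.

```python
class FlagAccumulator:
    """Accumulate per-frame correlation data while the flag is set."""

    def __init__(self, size):
        self.size = size
        self.acc = [0j] * size   # running sum R'
        self.count = 0           # number of summed frames (PERIOD in the text)
        self.prev_flag = False
        self.out = [0j] * size   # R_out; assumed zero before the first output

    def step(self, flag, r):
        """Process one frame; return (R_out, updated)."""
        flag = bool(flag)
        updated = False
        if flag and not self.prev_flag:
            # Flag rises from 0 to 1: reset the sum and the frame count
            self.acc = [0j] * self.size
            self.count = 0
        if flag:
            # Flag is 1: add the current frame (assumed to include the rising frame)
            self.acc = [a + b for a, b in zip(self.acc, r)]
            self.count += 1
        elif self.prev_flag and self.count > 0:
            # Flag falls from 1 to 0: average the sum and store it as the output
            self.out = [a / self.count for a in self.acc]
            updated = True
        # Flag stays 0: the last output is kept unchanged
        self.prev_flag = flag
        return self.out, updated


# One-element "matrices" keep the example readable
flags = [0, 1, 1, 1, 0, 0]
frames = [[1 + 0j], [2 + 0j], [4 + 0j], [6 + 0j], [8 + 0j], [10 + 0j]]
node = FlagAccumulator(size=1)
results = [node.step(f, r) for f, r in zip(flags, frames)]
print(results[4], results[5])
```

In this run the three frames with the flag set (values 2, 4, and 6) are averaged when the flag falls, so the output becomes 4 on the fifth frame and is held on the sixth.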
When DURATION_TYPE=FRAME_PERIOD, this node renews the correlation matrix based on Eq. (4), and only when ADDER_FLAG is 1 (or true).