9.3 Parameter tuning of sound source separation

Problem

How should I adjust the parameters when sound source separation is suboptimal?

Solution

This section describes the settings for GHDSS, the primary module for sound source separation.

Settings dependent on pre-measured spatial transfer function

GHDSS separates sources using spatial transfer functions that are either measured beforehand or calculated from the microphone positions, so its settings must match the transfer functions being used. Specifically, set the following parameters.

LC_CONST = FULL
When the transfer functions have been measured properly, set this to FULL. Otherwise, set it to DIAG.
Setting of TF_CONJ
TF_CONJ = DATABASE: Pre-measured transfer function mode
Designate a transfer function file in TF_CONJ_FILENAME
TF_CONJ = CALC: Calculation mode with the microphone locations
Designate microphone layout in MIC_FILENAME
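
As a rough illustration of the dependencies listed above, the following sketch checks that a parameter set is self-consistent before a network is run. The function name check_tf_settings, the dictionary layout, and the file name tf.dat are hypothetical and are not part of HARK; the transfer function file parameter is written as TF_CONJ_FILENAME in the code.

```python
def check_tf_settings(params):
    """Return a list of problems found in a GHDSS-style parameter dict.

    Hypothetical helper: HARK itself does not provide this function.
    """
    # Which file parameter each TF_CONJ mode depends on.
    required_file = {
        "DATABASE": "TF_CONJ_FILENAME",  # pre-measured transfer functions
        "CALC": "MIC_FILENAME",          # calculated from the microphone layout
    }
    problems = []

    mode = params.get("TF_CONJ")
    if mode not in required_file:
        problems.append(f"TF_CONJ must be DATABASE or CALC, got {mode!r}")
    elif not params.get(required_file[mode]):
        problems.append(f"TF_CONJ = {mode} requires {required_file[mode]}")

    if params.get("LC_CONST") not in ("FULL", "DIAG"):
        problems.append("LC_CONST must be FULL or DIAG")

    return problems


if __name__ == "__main__":
    # "tf.dat" is a placeholder file name used only for this example.
    ok = {"TF_CONJ": "DATABASE", "TF_CONJ_FILENAME": "tf.dat", "LC_CONST": "FULL"}
    bad = {"TF_CONJ": "CALC", "LC_CONST": "DIAG"}  # MIC_FILENAME is missing
    print(check_tf_settings(ok))
    print(check_tf_settings(bad))
```

An empty list means the mode and its file parameter agree; any returned message names the setting that still needs attention.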

Determination of the curvature of non-linear constraints

In GHDSS, the coefficient SS_SCAL, which corresponds to the curvature (gradient at the origin) of a sigmoidal function, determines the separation performance. Increasing this curvature brings the constraint closer to a linear one, whereas decreasing it increases the non-linearity. Since too low a setting results in sluggish adaptation, choose the value according to the target environment.
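
To make "gradient at the origin" concrete, the sketch below assumes a tanh-shaped sigmoid whose input is multiplied by a scale coefficient. This form is an assumption made for illustration only; the exact non-linear function used inside GHDSS may differ. Under that assumption, the slope of the function at the origin equals the scale coefficient.

```python
import math


def sigmoid(y, scale):
    # tanh-shaped sigmoid; "scale" plays the role of SS_SCAL (assumed form).
    return math.tanh(scale * y)


def slope_at_origin(scale, h=1e-6):
    # Central-difference estimate of the gradient at y = 0.
    return (sigmoid(h, scale) - sigmoid(-h, scale)) / (2.0 * h)


if __name__ == "__main__":
    for scale in (0.5, 1.0, 2.0):
        print(f"scale = {scale}: slope at origin = {slope_at_origin(scale):.6f}")
```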

Setting for stationary noise

Set FIXED_NOISE = true if the separation is performed in the presence of stationary noise.

Initial separation matrix

You can specify the initial separation matrix by setting INITW_FILENAME. An appropriate initial separation matrix reduces the time the separation takes to converge. If you do not specify a file name, the initial matrix is constructed from the transfer function file and the localization result from LocalizeMUSIC.

Step size

There are two step-size settings: SS_METHOD for the cost error and LC_METHOD for the linear constraints. Setting both to ADAPTIVE improves performance, unless the environment has been highly optimized. If you fix both step sizes at 0 (SS_METHOD = LC_METHOD = FIX with SS_MYU = LC_MYU = 0) and supply an initial separation matrix via INITW_FILENAME, separation can be performed as a fixed beamformer. The details of each parameter are shown below:

Step size : SS_METHOD

1. FIX
If SS_METHOD is set to FIX, SS_MYU will appear in the property window. You can input a value, e.g., 0.001, in SS_MYU. A larger value reduces convergence time but decreases stability and accuracy. In contrast, a smaller value improves convergence stability and accuracy but increases convergence time.

2. ADAPTIVE
If SS_METHOD is set to ADAPTIVE, the step size will be automatically optimized, improving the stability and accuracy of convergence.

3. LC_FIX
In this case, the step size is determined by the LC_METHOD setting.
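
The trade-off described for a fixed step size can be seen in a toy example. The sketch below runs plain gradient descent on f(w) = w**2 with several fixed step sizes. It is a generic illustration and not the GHDSS update rule; the function name descend and the thresholds are assumptions chosen for the example.

```python
def descend(step, start=1.0, tol=1e-6, max_iter=10_000):
    """Minimize f(w) = w**2 by gradient descent with a fixed step size.

    Returns (number of updates performed, final w). The loop stops early
    when |w| falls below tol (converged) or exceeds 1e6 (diverged).
    """
    w = start
    for n in range(max_iter):
        if abs(w) < tol or abs(w) > 1e6:
            return n, w
        w -= step * 2.0 * w  # the gradient of w**2 is 2*w
    return max_iter, w


if __name__ == "__main__":
    for step in (0.0, 0.001, 0.1, 1.1):
        updates, final_w = descend(step)
        print(f"step = {step}: updates = {updates}, final w = {final_w:.3e}")
```

In this toy problem, the smaller non-zero step needs many more updates than the larger one, and a step that is too large diverges. With a step of 0 the estimate never leaves its initial value, which is the analogue of keeping the initial separation matrix unchanged and using it as a fixed beamformer.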

Step size : LC_METHOD

1. FIX
If LC_METHOD is set to FIX, LC_MYU will appear in the property window. Its behavior is the same as that described for SS_MYU under SS_METHOD.

2. ADAPTIVE
If LC_METHOD is set to ADAPTIVE, the step size will be automatically optimized.

Discussion

See the Solution section above.