9.3 Parameter tuning of sound source separation

Problem

How should I adjust the parameters when sound source separation does not perform well?

Solution

This section describes the settings for GHDSS, the primary module for sound source separation.

Because GHDSS separates sources using spatial transfer functions that are either measured beforehand or calculated from the microphone positions, its settings must be consistent with those transfer functions. Concretely, they correspond to the following set values.

In GHDSS, the coefficient SS_SCAL, which corresponds to the curvature (gradient at the origin) of a sigmoidal function, has a strong effect on separation performance. Increasing this curvature brings it closer to the linear constraint, whereas decreasing it increases its non-linearity. Since too low a setting makes adaptation sluggish, choose the value to suit the target environment.
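To make "gradient at the origin" concrete, here is a minimal sketch that assumes the sigmoidal function has the form tanh(scale * x). That form and the names scaled_tanh and slope_at_origin are assumptions made for illustration; they are not taken from the GHDSS implementation. The sketch estimates the slope at x = 0 numerically for several scale values, and it suggests that under this assumption the scale factor is the slope at the origin.

```python
import math

def scaled_tanh(x, scale):
    """Sigmoidal nonlinearity, assumed here to be tanh(scale * x)."""
    return math.tanh(scale * x)

def slope_at_origin(scale, h=1e-6):
    """Central-difference estimate of the gradient at x = 0."""
    return (scaled_tanh(h, scale) - scaled_tanh(-h, scale)) / (2.0 * h)

# Print the scale, the estimated slope at the origin, and the value at x = 1.
for scale in (0.5, 1.0, 2.0):
    print(scale, round(slope_at_origin(scale), 6), round(scaled_tanh(1.0, scale), 4))
```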

You can specify the initial separation matrix by setting INITW_FILENAME. An appropriate initial separation matrix shortens the time the separation needs to converge. If you do not specify a file name, the initial matrix is constructed from the transfer function file and the localization result from LocalizeMUSIC.
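As a rough illustration of how an initial separation matrix can be built from transfer functions and source directions, the sketch below forms one delay-and-sum row per source at a single frequency. It assumes a far-field, free-field model computed from 2-D microphone positions and a sound speed of 343 m/s. The function names are hypothetical, and the result is not in the file format that INITW_FILENAME expects; this is not necessarily how GHDSS builds its own initial matrix.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def steering_vector(mic_positions, azimuth_deg, freq_hz):
    """Far-field, free-field transfer function from one direction to each microphone."""
    theta = math.radians(azimuth_deg)
    direction = (math.cos(theta), math.sin(theta))
    vec = []
    for x, y in mic_positions:
        # A microphone displaced toward the source receives the wavefront earlier.
        delay = -(x * direction[0] + y * direction[1]) / SPEED_OF_SOUND
        vec.append(cmath.exp(-2j * math.pi * freq_hz * delay))
    return vec

def initial_separation_matrix(mic_positions, azimuths_deg, freq_hz):
    """One delay-and-sum row per source: w = conj(a) / M, so that w . a = 1."""
    n_mics = len(mic_positions)
    return [
        [a.conjugate() / n_mics for a in steering_vector(mic_positions, az, freq_hz)]
        for az in azimuths_deg
    ]

# Hypothetical 4-microphone circular array (radius 0.1 m) and two source directions.
mics = [(0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
W = initial_separation_matrix(mics, [0.0, 90.0], 1000.0)

# Gain of the first row toward its own source direction (unit gain by construction).
a0 = steering_vector(mics, 0.0, 1000.0)
gain = sum(w * a for w, a in zip(W[0], a0))
print(abs(gain))
```

Starting adaptation from rows like these means each output already points at its source, which is one way to read the statement that a good initial matrix shortens convergence.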

There are two step-size settings: SS_METHOD, for the separation cost (error) term, and LC_METHOD, for the linear constraint term. Unless the environment has been highly optimized, setting both to ADAPTIVE improves performance. If you fix both step sizes at 0 (SS_METHOD and LC_METHOD set to FIX, with SS_MYU = LC_MYU = 0) and supply an initial separation matrix through INITW_FILENAME, the separation matrix is never updated, so separation is performed by a fixed beamformer. The options are described below, first for SS_METHOD and then for LC_METHOD.

1. FIX
If SS_METHOD is set to FIX, SS_MYU will appear in the property window. You can enter a value in SS_MYU, e.g., 0.001. A larger value shortens convergence time but reduces stability and accuracy. In contrast, a smaller value improves the stability and accuracy of convergence but lengthens convergence time.

2. ADAPTIVE
If SS_METHOD is set to ADAPTIVE, the step size is optimized automatically, improving the stability and accuracy of convergence.

3. LC_FIX
If SS_METHOD is set to LC_FIX, the step size is determined by the LC_METHOD setting.

1. FIX
If LC_METHOD is set to FIX, LC_MYU will appear in the property window. It behaves in the same way as SS_MYU does for SS_METHOD.

2. ADAPTIVE
If LC_METHOD is set to ADAPTIVE, the step size will be automatically optimized.
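
The trade-offs described for the step-size options can be sketched on a toy problem. The example below minimizes a one-dimensional quadratic cost by gradient descent. It is not the GHDSS update rule: the cost, the function names, and the adaptive rule (step = cost / gradient squared) are all assumptions chosen only to show the behavior. It compares a small fixed step, a larger fixed step, a fixed step that is too large, a step of zero, and the adaptive rule.

```python
TARGET = 1.0  # minimizer of the toy cost

def cost(w, curvature):
    return 0.5 * curvature * (w - TARGET) ** 2

def gradient(w, curvature):
    return curvature * (w - TARGET)

def fixed(mu):
    """Step rule that always returns the same step size (FIX-like)."""
    return lambda w, g, curvature: mu

def adaptive(w, g, curvature):
    """Assumed normalized rule: step = cost / gradient^2 (ADAPTIVE-like)."""
    return cost(w, curvature) / (g * g) if g != 0.0 else 0.0

def run(step_rule, curvature, w0=0.0, tol=1e-6, max_iter=1000):
    """Iterate w <- w - mu * gradient. Return (final w, iterations), or
    (final w, None) if w diverges or has not converged after max_iter steps."""
    w = w0
    for i in range(max_iter):
        if abs(w - TARGET) < tol:
            return w, i
        g = gradient(w, curvature)
        w -= step_rule(w, g, curvature) * g
        if abs(w) > 1e6:
            return w, None
    return w, None

for name, rule in [("fixed 0.01", fixed(0.01)), ("fixed 0.15", fixed(0.15)),
                   ("fixed 0.25", fixed(0.25)), ("fixed 0.0", fixed(0.0)),
                   ("adaptive", adaptive)]:
    print(name, run(rule, curvature=10.0))
```

In this toy setting the larger fixed step converges in fewer iterations than the small one, the oversized step diverges, and the adaptive rule needs the same number of iterations whatever the curvature. A step of zero leaves the initial value untouched, which is the sense in which a separation matrix loaded through INITW_FILENAME can act as a fixed beamformer.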

Discussion

See the description in the Solution section above.

See Also

For details of the algorithms for sound source separation, see “Technical description of HARK (sound source separation)” in the HARK training session material.