Problem
How should I adjust the separation parameters when sound source separation is suboptimal?
Solution
This section describes the settings for GHDSS, the primary module for sound source separation.
Settings dependent on pre-measured spatial transfer function
GHDSS separates sources using spatial transfer functions that are either measured beforehand or calculated from the microphone positions, so its settings must match the transfer functions in use. Concretely, this concerns the following parameters.
LC_CONST = DIAG
When the transfer functions have been measured properly, LC_CONST can be set to FULL. In most cases, however, DIAG is more appropriate.
Setting of TF_CONJ
TF_CONJ = DATABASE: Pre-measured transfer function mode
Specify the transfer function file in TF_CONJ_FILENAME
TF_CONJ = CALC: Calculation mode based on the microphone locations
Specify the microphone layout file in MIC_FILENAME
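The dependency between TF_CONJ and the file parameters can be sketched as a small consistency check. This is only an illustration in Python, not part of HARK: the helper check_tf_settings, the dictionary representation, and the file name tf.dat are hypothetical, and the key spelling TF_CONJ_FILENAME is assumed.

```python
# Hypothetical helper (not a HARK API): checks that a dictionary of
# GHDSS-style parameters is consistent with the chosen transfer function mode.
def check_tf_settings(params):
    """Return a list of problems found in the transfer function settings."""
    problems = []
    mode = params.get("TF_CONJ")
    if mode == "DATABASE":
        # Pre-measured mode needs a transfer function file.
        if not params.get("TF_CONJ_FILENAME"):
            problems.append("TF_CONJ = DATABASE requires TF_CONJ_FILENAME")
    elif mode == "CALC":
        # Calculation mode needs the microphone layout.
        if not params.get("MIC_FILENAME"):
            problems.append("TF_CONJ = CALC requires MIC_FILENAME")
    else:
        problems.append("TF_CONJ must be DATABASE or CALC")
    if params.get("LC_CONST") not in ("DIAG", "FULL"):
        problems.append("LC_CONST must be DIAG or FULL")
    return problems


# "tf.dat" is a placeholder file name used only for this example.
measured = {"TF_CONJ": "DATABASE", "TF_CONJ_FILENAME": "tf.dat", "LC_CONST": "DIAG"}
calculated = {"TF_CONJ": "CALC", "LC_CONST": "DIAG"}  # MIC_FILENAME is missing

print(check_tf_settings(measured))
print(check_tf_settings(calculated))
```

The first dictionary passes the check, while the second is reported as incomplete because CALC mode has no microphone layout to calculate from.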
Determination of the curvature of non-linear constraints
In GHDSS, the coefficient SS_SCAL, which corresponds to the curvature (the gradient at the origin) of a sigmoidal function, strongly affects performance. Increasing the curvature brings the constraint closer to a linear one, whereas decreasing it makes the constraint more non-linear. Because too low a value makes adaptation sluggish, choose the value to suit the target environment.
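To make "gradient at the origin" concrete, the following Python sketch assumes the sigmoidal function has the form tanh(scale * x), with scale playing the role of SS_SCAL. That functional form is an assumption made here for illustration; the function names are hypothetical and nothing below is HARK code.

```python
import math


def scaled_tanh(x, scale):
    """Sigmoid assumed for illustration: tanh(scale * x)."""
    return math.tanh(scale * x)


def slope_at_origin(scale, h=1e-6):
    """Central-difference estimate of the gradient at x = 0."""
    return (scaled_tanh(h, scale) - scaled_tanh(-h, scale)) / (2 * h)


for scale in (0.5, 1.0, 2.0):
    # Columns: scale, estimated slope at the origin, function value at x = 1.
    print(scale, round(slope_at_origin(scale), 4), round(scaled_tanh(1.0, scale), 4))
```

Under this assumption the slope at the origin equals the scale factor, while the output always stays within (-1, 1).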
Setting for stationary noise
Set FIXED_NOISE = true if separation is performed in the presence of stationary noise.
Initial separation matrix
You can specify the initial separation matrix by setting INITW_FILENAME. An appropriate initial separation matrix reduces the convergence time of the separation. If you do not specify a file name, the initial matrix is constructed from the transfer function file and the localization result from LocalizeMUSIC.
Step size
There are two step-size settings: SS_METHOD for the cost error and LC_METHOD for the linear constraints. Setting both to ADAPTIVE generally improves performance, unless the settings have already been highly optimized for the environment. If you set both step sizes to 0 (SS_MYU = LC_MYU = 0, with both methods set to FIX) and supply an initial separation matrix through INITW_FILENAME, the separation matrix is never updated, so separation is performed by a fixed beamformer. The details of each parameter are as follows:
Step size : SS_METHOD
1. FIX
If SS_METHOD is set to FIX, SS_MYU appears in the property window. You can enter a value, e.g., 0.001, in SS_MYU. A larger value shortens convergence time but reduces stability and accuracy. Conversely, a smaller value improves the stability and accuracy of convergence but lengthens convergence time.
2. ADAPTIVE
If SS_METHOD is set to ADAPTIVE, the step size is optimized automatically, which improves the stability and accuracy of convergence.
3. LC_FIX
In this case, the step size is determined by LC_METHOD.
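The trade-off described for a fixed step size can be seen in a toy problem. The Python sketch below runs plain gradient descent on f(w) = (w - 3)^2 with several fixed step sizes. It is not the GHDSS update, and the function name and all numbers are chosen only for illustration.

```python
def steps_to_converge(mu, target=3.0, tol=1e-3, max_steps=10000):
    """Gradient descent on f(w) = (w - target)**2 with a fixed step size mu.

    Returns the number of steps needed to get within tol of the target,
    or None if the iteration diverges or does not converge in max_steps.
    """
    w = 0.0
    for step in range(1, max_steps + 1):
        w -= mu * 2.0 * (w - target)  # gradient of (w - target)**2
        error = abs(w - target)
        if error < tol:
            return step
        if error > 1e6:  # the iteration is blowing up
            return None
    return None


for mu in (0.01, 0.4, 1.1):
    print(mu, steps_to_converge(mu))
```

In this toy setting the small step converges slowly, the moderate step converges in a few iterations, and the overly large step diverges, which mirrors the qualitative behavior described for SS_MYU.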
Step size : LC_METHOD
1. FIX
If LC_METHOD is set to FIX, LC_MYU appears in the property window. Its behavior is the same as described for SS_MYU under SS_METHOD.
2. ADAPTIVE
If LC_METHOD is set to ADAPTIVE, the step size is optimized automatically.
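Why zero step sizes amount to a fixed beamformer can be sketched with a generic two-term gradient update. The Python example below assumes an update of the form W - mu_ss * grad_ss - mu_lc * grad_lc, which is a simplification made for illustration. The matrices are small real-valued placeholders, whereas real separation matrices are complex-valued and computed per frequency bin, and the function name update is hypothetical.

```python
def update(w, grad_ss, grad_lc, mu_ss, mu_lc):
    """One generic gradient step on a matrix stored as nested lists."""
    return [
        [w_ij - mu_ss * g_ss - mu_lc * g_lc
         for w_ij, g_ss, g_lc in zip(w_row, ss_row, lc_row)]
        for w_row, ss_row, lc_row in zip(w, grad_ss, grad_lc)
    ]


# Placeholder values, not taken from any real separation problem.
w_init = [[1.0, 0.5], [0.5, 1.0]]
grad_ss = [[0.2, -0.1], [0.3, 0.4]]
grad_lc = [[-0.1, 0.2], [0.1, -0.3]]

frozen = update(w_init, grad_ss, grad_lc, mu_ss=0.0, mu_lc=0.0)
adapted = update(w_init, grad_ss, grad_lc, mu_ss=0.001, mu_lc=0.001)

print(frozen == w_init)   # zero step sizes leave the initial matrix untouched
print(adapted == w_init)  # non-zero step sizes move away from it
```

With both step sizes at zero, the matrix supplied as the initial value is applied unchanged to every frame, which is what a fixed beamformer does.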
Discussion
See the description in the Solution above.
See Also
For details of the algorithms for sound source separation, see “Technical description of HARK (sound source separation)” in the HARK training session material.