FAQ

 
Note: FAQ may be updated from time to time. Last updated on December 4, 2017
We apologize for the inconvenience, but this page is currently undergoing maintenance to change the page structure. Not all categories have been classified yet.

How to use (for users)

How to create HARK network file

How to use (for developers)

Link to documentation
How to create HARK nodes

Source Compilation

An error message was displayed in CMake

 
——– Below is information about the old environments. ——–

An error message was displayed in autotools (autoconf, automake, etc)
  • When installing HARK via source compilation, how do I fix the following error?
    autoheader: command not found
    Please try:

    $ apt-get install autoconf

    and proceed with the installation via source compilation.

Ubuntu

Install or Uninstall
  • How do I perform a clean uninstall of HARK?
    You can uninstall HARK packages using the following command:

    $ sudo apt purge PACKAGENAMEs

    e.g.) HARK 3.5.0

    $ sudo apt purge librasp-* libhark-* libharkio* hark-* harktool*

    You can find all package names that can be uninstalled from this page.
    Please don’t forget to remove remaining data in: /usr/share/hark-designer/userdata

HARK
HARK-Designer
  • How to use HARK-Designer?
    Please see the latest manual on HARK-Designer at:
    http://www.hark.jp/document/packages/hark-designer-en/ (English)
    http://www.hark.jp/document/packages/hark-designer-ja/ (Japanese)

  • How do I run HARK-Designer?
    You can run HARK-Designer in Linux by typing the following in a console/terminal:

    $ hark_designer

    or input

    $ cd /usr/lib/hark-designer
    $ sudo node app.js

    Then, open a browser and access http://localhost:3000

    ——– Below is information about the old version. ——–

  • How do I restart HARK-Designer for version 2.3.0 and above?
    Please input in console:

    $ hark_designer stop
    $ hark_designer start
    

    You can find other options about HARK-Designer by typing:

    $ hark_designer help
HARKTOOL
  • Should each channel be written to its own .flt file?
    Flt is a format for simple raw data (just a matrix or array of raw values). It can be N-ch or single-channel, but harktool accepts only single-channel flt files for TSP responses. If you have an 8-ch microphone array, you should have 8 flt files for each sound direction.
    Using “wios” is better: it plays and records a TSP sound (almost) synchronously, and the TSP responses are saved into N-ch wave files (*.wav). HARKTOOL also accepts N-ch wave files of TSP responses for generating the transfer functions HARK uses for localization and separation.
  • for HARKTOOL5
    ——– Below is information about the old version. ——–
    for HARKTOOL4

  • I’m trying to install HARK onto my computer, and it can’t find the file sndfile.h when installing the harktool4 and harktool4-gui packages. How can I fix this issue?
    Please do the following before installing them:

    sudo apt-get install libsndfile1-dev
  • for HARKTOOL3

WIOS
  • I am using wios to perform signal playback/recording measurements, but the following warning appears and the playback signal is momentarily interrupted. How can I overcome this?
    WARNING: over run

    It seems that the recording could not be processed in time. Please set the -Z option to 100 when running wios.


  • I recorded using wios, but the start of the recording is missing, so the number of samples is less than 16384. What happened?
    Data loss occurred because processing could not keep up. Please increase the buffer size to prevent this from happening.

Windows

HARK
  • Why are the results processed by HARK different between Ubuntu (Linux) and Windows? I used the same microphone array in both environments.
    It is caused by the AGC in Windows, which can be turned off in current versions of Windows. You can try again with AGC turned off, at your own risk.

  • I installed HARK, but the HARK exe file cannot be found. What should I do?
    After installing HARK in Windows, the exe files should be found at:
    C:\Program Files (x86)\HARK\Flowdesigner\bin\
    If you still can’t run HARK, it is advised to reinstall.
HARK-Designer
  • How do I run HARK-Designer with a command-line?
    You can also run HARK-Designer via command:

    > cd C:\ProgramData\HARK\hark-Designer
    > node app.js
    

    Then, open a browser and access http://localhost:3000

HARKTOOL
WIOS
  • Why are the results recorded by wios different between Ubuntu (Linux) and Windows? I used the same microphone array in both environments.
    It is caused by the AGC in Windows, which can be turned off in current versions of Windows. You can try again with AGC turned off, at your own risk.

  • When I tried to run the wios command on Windows, only the usage message was displayed. I ran:

    wios -x 0 -f 44100 -t 2 -o voice.wav

    and got the following message:

    Usage: wios [OPTION ...]

    What could be wrong?
    Only the option “-x 2” (RASP device) is supported in Windows. Please use the appropriate device.

Microphone devices
  • ASIO
    • I have created my network for recording. I am using the AudioStreamFromASIO node and the HARK-ASIO plugin is already installed. When executing the network, I encounter the following error:
      NodeNotFoundException occured, nodeName: NETWORK_CONDITION
      What should I do to fix this?
      Please make sure that the output terminal of the iterator node is set to CONDITION, not to OUTPUT.

    • In version 1.9.9 and later of HARK, there is no AudioStreamFromASIO or AudioStreamFromMicASIO node. How can I install these nodes?

      Because of the ASIO license requirements, the plugin for ASIO (AudioStreamFromASIO) is distributed separately.
      You can download it from this page: https://hark.jp/install/windows_2_x/


    • Can I use ASIO-compatible audio devices with HARK?
      If the device works with ASIO, it should work with HARK. However, there is neither guarantee nor support if it doesn’t. Please use such devices at your own risk.

Support devices (microphone arrays)

Transfer Function

 
——– Below is information about devices that are no longer supported. ——–

  • Why are previously supported devices no longer supported?
    There are several reasons: the device no longer works with the current OS, the vendor no longer provides drivers, the product has been discontinued and is no longer available, or the OS specifications have changed and there is no way to get the device to work. Since we no longer have these devices at hand and cannot check their operation, those who can solve problems on their own should use them at their own risk. Even if you ask a question on the forum, we will not be able to answer it.
  • How can I use a microphone array device (e.g. Kinect, PS-EYE, etc…) that is no longer supported in HARK?
    Please note that it is very likely not to work with HARK versions other than those that explicitly supported it, so you will have to use an older version of HARK.
    Some files we created during the supported period are available at:
    https://www.hark.jp/document/supported/
    We do not support the microphone array devices listed below, as we no longer own any of these end-of-support devices. Therefore, use them at your own risk.
Kinect
  • I would like to try to use HARK in ROS with the Xbox One Kinect. Will you be able to provide the drivers for Ubuntu and ROS?
    Unfortunately, we do not support Kinect anymore, and we supported only Windows 8 64-bit for the Xbox One Kinect using Microsoft’s SDK.

  • How do I perform simultaneous recording and playback using Kinect for Windows?
    Kinect for Windows is not supported by Linux version of HARK. Recording with wios is not supported.

    If you want to use Kinect for Windows, please download the transfer function file at:
    https://www.hark.jp/document/supported/
    We no longer support Kinect because we no longer own one. Please use Kinect and our transfer function at your own risk.


Microcone

FAQ before categorization

 

I have installed HARK 2.1 before, but now I want to use HARK 2.0 due to some reason. I uninstalled 2.1 first then downloaded HARK 2.0 for Windows from your website, and it shows that I have successfully installed it. But I cannot find the .exe file anywhere so I am not even able to run it. Do you have any suggestions for me to solve this problem?

It seems that some HARK programs failed to be uninstalled.
Please open Programs and Features in the Control Panel and find all programs whose Publisher is Kyoto University or Honda Research Institute Japan, Co., Ltd. Uninstall all of them, and then try to re-install HARK 2.0.

 

When installing HARK manually in Ubuntu following the steps in “Installation from Source Compilation,” how do I run HARK?

A HARK network file (a file with the .n extension and execute permission) is started directly, but internally a program called batchflow runs it. This is the same as python or bash scripts: entering the script file (network file) on the terminal invokes batchflow.
e.g. ./your_network.n is the same as batchflow ./your_network.n

 

How can I create my own HARK node?

Please find the instructions on how to create your own HARK node in this link: http://www.hark.jp/document/hark-cookbook-en/subsec-Advanced-001.html

 

How to set the PYTHON_PATH in Linux?

In console, please input:
export PYTHON_PATH=$PYTHON_PATH:

 

I would like to try sound localization using HARK in Ubuntu. How can I proceed with this?

Please install HARK through source code compilation as instructed on this page: https://www.hark.jp/install/linux/

 

I was able to run the test program and get the result of Figure 2.5 at http://www.hark.jp/document/hark-cookbook-en/subsec-LearningHARK-002.html
I have some questions:
1. What are the axes of X and Y?
2. What does each line (green, yellow, red) represent?
3. How do I change these results to the angle of sound source?

1. What are the axes of X and Y?
=> X-axis stands for time (sec), and Y-axis stands for angles (-180 deg to 180 deg)

2. What does each line (green, yellow, red) represent?
=> Each color indicates a different sound source; the three lines stand for three sound sources.

3. How do I change these results to the angle of sound source?
=> Y-axis is the angle of sound source.

 

When installing HARK manually in Ubuntu following the steps in “Installation from Source Compilation,” how do I confirm if installation is successful?

You can check the installation destination with the command below.
which batchflow
If the path of the executable is not displayed, the environment variable PATH may not be set correctly.
In many Linux distributions, including Ubuntu, officially distributed packages are installed in /usr, while packages you build from source are installed in /usr/local.
If the environment variable PATH already includes both /usr/bin and /usr/local/bin, you will be able to launch HARK without setting anything else.
If /usr/local is not found in the environment variable PATH, you also need to check other environment variables: LD_LIBRARY_PATH contains the library loading path, and PKG_CONFIG_PATH contains the path to pkg-config's configuration files used at build time.

[Another solution] If you want to install to /usr, the same location as the Ubuntu binary packages, set the prefix option of ./configure:
./configure --prefix=/usr .......

 

I have read the information of the nodes in the HARK Document. Are there other documents that I can read to understand the nodes?

HARK is implemented on top of the middleware called FlowDesigner and understanding it may help you understand the nodes better. You may read about FlowDesigner from the following links:
http://flowdesigner.sourceforge.net/doc/nodes.html
https://sourceforge.net/p/flowdesigner/wiki/FlowDesigner_User%27s_Guide/#Tutorials_

 

Where can I find the information on your supported microphone array and where can I buy them?

The supported hardware for HARK and the seller’s website can be found at: https://www.hark.jp/document/supported/

 

Which link contains the sample networks according to the HARK wiki?

Please access the sample networks at: https://www.hark.jp/download/samples/

 

I would like to try HARK and mfcnet for recognition. Do you have required datasets and files of speech recognition on English words for HARK?

You can download the sample from here: https://www.hark.jp/download/samples/

The directory Recognition-en contains the files for speech recognition of English words.

 

I have downloaded samples-hark-1_2_0.tar.gz. Can you explain what the files are and how to use them?

Localization_000.txt is a sample output of the sample network file. SaveSourceLocation is used to generate this.

 

I am doing the localization samples from the HARK website. When I execute the network, I get the following error:
ALSA lib pcm.c:2211:(snd_pcm_open_noupdate) Unknown PCM playhw:1,0
cannot open audio device playhw:1,0 (No such file or directory)

How can I fix this?

In the DEVICE parameter of AudioStreamFromMic, please use plughw:1,0 instead of playhw:1,0.

 

I am trying the first recording sample in the HARK Cookbook, but when I execute my network, the following error occurs:
Failed to load resource: the server responded with a status of 404 (Not Found)
How can I fix this?

Please check if OUTPUT is set in the output terminal of MAIN_LOOP node of the MAIN subnetwork.

 

I am doing speech recognition using TPIP and Radio RASP but I encounter the following errors:
Control Socket Connection Error.
Can not open Wireless System.

What should I do to fix this?

Since the radio transmission and reception happen inside RASP, communication is impossible because the TPIP connection format/protocol does not match. Although HARK is compatible with ALSA, it appears that the communication goes over wireless LAN, so it is better to write your own node for this purpose. Please see this link for helpful information: http://www.hark.jp/document/hark-cookbook-en/subsec-MicrophoneArray-006.html
Please download the HARK source code from: https://www.hark.jp/download/source-code/

 

Is Raspberry Pi supported by HARK?

HARK has no support for Raspberry Pi.

 

What are the supported architectures by HARK?

HARK officially supports the x86 and x86_64 (amd64) architectures. For the arm and arm64 architectures, you may try building from source; this has succeeded on the BeagleBone Black, Jetson TX1, etc. For the RPi3 it hasn't been tried yet, but it will most likely succeed.

 

I am trying to install HARK-Python in Ubuntu with Anaconda but error occurs. How can I fix this?

Unfortunately, HARK-Python does not support Anaconda. Please install HARK-Python with Python(x,y).

 

I have installed the HARK-Python package, but I cannot see the PyCodeExecutor node in HARK-Designer. What should I do to load the HARK-Python nodes?

In HARK-Designer, please click Preferences -> Packages. You will find that hark-python is deactivated. Just click the button on its left; the HARK-Python nodes will then be shown in the node list.

 

I created my own network using HARK but I encountered issues. How can I resolve these issues?

To avoid encountering errors, please make sure that you have used the nodes properly. You may find information on the nodes at: http://www.hark.jp/document/hark-document-en/

For HARK 2.3.1, you may access it at: http://www.hark.jp/document/2.3.1/hark-document-en/

 

I wanted to run my network from the command line/console, but my network file doesn’t have .n extension. Can it still be run?

Yes, it can still be run. Since HARK-Designer does not add the .n extension automatically, users are expected to add it manually. You can add .n or leave it off, although adding it is recommended for clarity.

 

When following the HARK Cookbook, there is a node called MAIN_LOOP but it’s not found in the node list of HARK Designer. Is this node no longer supported?

MAIN_LOOP is a dynamic node which corresponds to a user-created sheet. It is analogous to calling a user-defined function. You have to create a sheet before you can use a dynamic node.

 

How do I execute network files in command line?

Suppose you want to execute sample.n.
In a Linux environment such as Ubuntu, it can be executed with: ./sample.n
This is the same as .py and .pl script files for python and perl.
Or you can invoke batchflow explicitly, optionally with preceding parameters: batchflow ./sample.n

In Windows, you can execute through cmd:
.\sample.n
Or
batchflow sample.n

 

When I run my network the following error occurred: Parameter TYPE is 'string' not to be 'String' type of '.def' file.
How can I fix this?

It seems that the parameter type of the InputStream node of the MAIN subnetwork is String instead of string.
Please open the network file with HARK-Designer, open the InputStream property dialog, and set the parameter type to string. Alternatively, open the network file with any text editor and replace String with string.

 

I installed the Windows versions of HARK and node.js according to the instructions on your webpage. When I run HARK Designer by double-clicking its icon, a web page pops up showing “127.0.0.1:3000”; however, the address cannot be reached, and no HARK Designer GUI is shown. What could be the problem?

A typical reason is that your security software blocks access to port 3000. Please check the security configuration.

If that does not help, in a command prompt:
(1) cd to C:\ProgramData\HARK\HARK-Designer
(2) remove node_modules in that directory
(3) execute npm install in that directory
npm will then re-install all the dependent modules.
Please also check that the version of node.js is v6.9.1, as prescribed in: https://www.hark.jp/install/windows/

 

I am using HARK and my own microphone array to do sound localization. But when I execute my network, no sound can be detected. What can be the cause?

Please do the following:
(1) set THRESH of SourceTracker to a smaller value.
(2) check if all the input channels are active.
(3) check that the microphone ID of MICARY-File is the same as the channel ID of your input signal in HARK. If they don’t match, you cannot localize sound sources.

 

Where can I find the information on HARK-BINAURAL?

Please access the HARK-Binaural documentation at the following URL: http://www.hark.jp/document/2.3.0/packages/hark-binaural+/

 

How to check if HARK-BINAURAL is correctly installed in my environment?

Please check instructions at: https://www.hark.jp/packages/binaural/

 

Why does a “command not found” error occur when I try to run flowdesigner?

HARK-Designer replaces flowdesigner in later versions of HARK.

 

In the HARK Automatic Speech Recognition Pack from the Cookbook, you have to run: ./recog.n MultiSpeech.wav loc_tf.dat sep_tf.dat
But an error occurs. What could be wrong?

The file format for the transfer function has been changed from .dat to .zip.
You may create your own zip file using the instructions at: https://www.hark.jp/packages/faq/
Then, please use the following command for running the network: ./recog.n 3sp_multi.wav output.zip output.zip

 

I have an error message when trying to use HARK in raspberry pi:
Some index files failed to download. They have been ignored, or old ones used instead.
What can I do to resolve this?

The error occurs because we do not provide binary packages for Raspberry Pi (binary-armhf).
Please download the source code from our website, or run apt-get source on Ubuntu, and compile it manually.

 

I want to use the HarkDataStreamSender node in Windows to transfer data over the network, but a “can't find the host” error occurs all the time. What can I do about this?

Please check or try the following three things:
(1) Check whether the parameters of HarkDataStreamSender match your environment, especially HOST and PORT.
(2) Make sure that your data receiving server is listening on the host/port specified in (1).
(3) Check whether your security software is blocking the communication.

 

I got this error when building a simple HARK network:
cannot open audio device plughw:1,0 (No such file or directory)
How can I fix this error?

The error message indicates that the device name of your microphone array is not 1,0. Note that the device name can differ from computer to computer.
Please check the device number of your microphone array. The following URL explains how to do it: http://www.hark.jp/document/hark-cookbook-en/subsec-LearningHARK-001.html

 

When I try to upload a network in HARK-Designer running in Linux, I get this error on the console:
Uploading failed. Error: EACCES, permission denied
What should I do?

This problem occurs because you do not have the permission to write the file in the directory. Please run hark_designer as a root user, or run sudo hark_designer.

 

I have an issue with using HARK-Python. It was working until I installed Anaconda package for scientific computation using Python.
The error is: ImportError: cannot import name MAXREPEAT
How can I fix this error?

It seems that the error is in Anaconda and not in HARK-Python. Setting PYTHON_PATH might solve the problem.

 

I encountered a problem in sending HARK data over TCP. The error message is:
error:
error: Unknown Data Recv[0]
Can not open Deivce.
Base Exception
C:\...... \ AudioStreamFromMic.cc, line 339: Device can not be initialized.
How can I fix this?

The Device means a microphone array or an audio device. Therefore, the message means that HARK failed to initialize your audio device.
If you set the DEVICETYPE = NETWORK parameter of AudioStreamFromMic, the implementation (including format) depends on the device.

 

I am learning HARK from the Cookbook. But when I run my network, the following error occurs:
ALSA lib pcm.c:2211:(snd_pcm_open_noupdate) Unknown PCM plughw0,0
cannot open audio device plughw0,0
What should I do?

Please correctly set the DEVICE parameter of AudioStreamFromMic to plughw:0,0 instead of plughw0,0.

 

I am doing practice2, HARK Automatic Speech Recognition Pack. I am using TAMAGO as my microphone and my environment is Ubuntu. When I execute the network, the following error occurs:
Can not open audio device
TAMAGO

What should I do to fix this?

Please modify the DEVICE parameter of AudioStreamFromMic to the appropriate name which is in the format plughw:??.
Please see this link for more information: http://www.hark.jp/document/2.3.1/hark-cookbook-en/sec-demo-record-ubuntu.html

 

I am trying to run a network in HARK. But I encountered the following error: Network.cc line 649: no condition defined for iterator
How can I fix this?

It seems that you forgot to put the CONDITION in your iterator sheet. Open your network in HARK-Designer. In your iterator sheet, hold the CTRL key and click the NOT_EOF terminal of your AudioStreamFromWave or AudioStreamFromMic. CONDITION will then be placed.

 

I am looking for a microphone that will not be too big and heavy for our robot. Aside from the recommended hardware on the HARK website, what are the other products that you have tested with HARK?

If the microphone array products shown on the HARK web page do not suit your robot, embedding microphones in the robot is a good approach.

 

What are the supported microphones of HARK?

This page lists the supported microphones: https://www.hark.jp/document/supported/

All microphones can be used both on Linux and Windows.

 

I reached out to the dev-audio technical support (the company that made the Microcone) and they told me that there was a known issue between Ubuntu 14.04 and Microcone. Which OS should I use to run HARK with Microcone?

The Microcone works on several Ubuntu 14.04 laptops. You may use the network found in this link to test it out:
http://www.hark.jp/document/hark-cookbook-en/subsec-LearningHARK-001.html

 

I’ve tried all the examples obtainable in your webpage, and with a Kinect, they work perfectly, but with the Microcone, the Record demo records unintelligible audio, and the Separation and Localization examples don’t even recognize the input audio, and they don’t show anything. I’m using Ubuntu 14.04. What should I do?

Please do these things to check if your Ubuntu 14.04 recognizes the Microcone.

(1) Connect the Microcone to your Ubuntu 14.04 computer, and run
cat /proc/asound/cards
If the system recognizes the Microcone, you will find the Microcone in the output.

(2) Try to record a sound with the Microcone in Ubuntu 14.04 with another recording program, e.g., Audacity.

Since HARK uses the same interface, ALSA, to record sound, the cause of the problem might not be HARK but the Ubuntu configuration.
These URLs will be helpful:
https://help.ubuntu.com/community/AudioCapture
http://linuxconfig.org/how-to-test-microphone-with-audio-linux-sound-architecture-alsa

The Ubuntu or ALSA communities may also give you good advice.

 

Can we use HARK without “flowdesigner“?

Yes. In the newer versions of HARK, HARK-Designer is being used. You can find the documentation for it in this link: http://www.hark.jp/document/hark-designer-en/

The installation instructions for Windows and Linux are found at: https://www.hark.jp/document/packages/hark-designer-en/harkdesigner.html

 

How to run HARK on ARM processors?

HARK needs to be patched to compile successfully on ARM processors. Since HARK runs on its middleware, batchflow, which in the current version has a big overhead and requires a lot of memory, such a migration is not advisable.

 

When I run my network for sound localization, the plot window closes after execution. Is there a way to prevent the window from closing?

DisplayLocalization does not support keeping the window open.
A similar function exists in HARK-Python (plotSource).
You can write python code to modify the behavior of the node: http://www.hark.jp/document/2.3.0/packages/hark-python/harkpython.html

Or, save an xml file using SaveSourceLocation and write code that visualizes the locations.

 

Is there any English acoustic model available for HARK?

Unfortunately, there is no English acoustic model for HARK.

 

How can I create one wav file as output, containing the sound from the localized source, instead of four?

To achieve this, please modify your network as follows:
(1) Add a new SaveWavePCM or SaveRawPCM node.
(2) Connect the AUDIO output of AudioStreamFromMic to the INPUT of the added node.
(3) Remove the MatrixToMap before SaveWavePCM, because MatrixToMap changes matrix data into Map<int, AudioData>, where the int is a channel number.

 

For my localization network, when I tried to record two persons, it could not separate speeches properly. Why is this happening?

If the two persons are too close, the localization/separation performance degrades. Please have the two persons stand about 90 degrees apart.
You can also change the MUSIC_ALGORITHM parameter of LocalizeMUSIC to SEVD, since this is the most basic one.

 

After plotting the X, Y and Z position for each sound, I obtain that X and Y are correlated, but Z does not vary. Why is this so?

LocalizeMUSIC estimates only a direction, not a distance, so the X, Y, Z values are points on a unit sphere. This is why the values are smaller than you expect.
Z may be fixed because the transfer function file (A_MATRIX file) you use contains transfer functions from only one height.

 

I implemented the real-time network described in the localization section of the HARK cookbook using PS3 microphone but it does not process any incoming audio. The execution finishes immediately after starting. What could be wrong?

A typical reason is the following:
The LocalizeMUSIC node has a parameter TF_CHANNEL_SELECTION.
Since the PS3 mic has 4 microphones, it must be <Vector<int> 0 1 2 3>. If the number of channels does not match the network, execution aborts immediately because it tries to access a nonexistent channel.

 

We are trying to localize a sound source using HARK, but every time we run our file, we get the same error:
[E] [zip_open failed] [harkio_TransferFunction_fromFile]

Since the format of the transfer function has changed, you need to convert the file.
You can use the harktool5 CLI tool. The command is like this:
harktoolcli-conv-tf -l OLD_FORMAT_TF.dat -o NEW_FORMAT_TF.zip

 

Using HARK Designer, I did sound localization using the DisplayLocalization node to display my results. However, I would like another way (apart from plotting a diagram); specifically, I would prefer to get the actual azimuth value from the network. Is there any way to get it?

Please replace the DisplayLocalization node with SaveSourceLocation node. The node will save localization results to a file.
Please see details of this node at: http://www.hark.jp/document/hark-document-en/subsec-SaveSourceLocation.html

 

I would like to work on sound localization of our own robot. I would like to ask you whether it is possible to make our own recording system (an array of 4 mics+preamplifier+sound card) and use Hark for our purpose. If the answer is yes, I suppose we have to calculate the transfer function of the system?

Yes, you need to measure the transfer function of your system. Our youtube videos (https://www.youtube.com/user/harktv) will be helpful.
For example:
(1) How to measure the TSP response for making a transfer function: https://www.youtube.com/watch?v=9v5RUOrkyhw

(2) How to create a transfer function file: https://www.youtube.com/watch?v=_Tpn94mPtj4

 

I am trying to perform localization but when I run my network the following error occurs:
Missing Parameter: A_MATRIX
What should I do about this?

LocalizeMUSIC requires transfer functions for each direction to be localized, so you need to give it a transfer function file in the A_MATRIX parameter. Double-click the LocalizeMUSIC node and you will find the parameter A_MATRIX.
Please upload the transfer function file and input its name to the parameter.
If you are using our supported microphone array, you can just download it from: https://www.hark.jp/document/supported/
If you have your own microphone array, you need to measure and generate a transfer function file.
We provide some tutorial videos: http://www.youtube.com/user/harktv/videos

 

I am doing the HARK-ROS Tutorials. But I keep getting this error:
LocalizeMUSIC. Input channel size is not matched to the steering array. How can I fix this issue?

It seems that the TF_CHANNEL_SELECTION parameter of LocalizeMUSIC does not match the input channels. For example, if you are using a 4-channel microphone array, you have to set <Vector<int> 0 1 2 3>. Please check this parameter.

 

How do I use the HARK Tool?

You can learn how to use the HARK Tool through this video: https://www.hark.jp/document/document-en/

 

In Harktool4, what value should be set for direct sound length? The default value is 32.

We extract the direct sound part from an impulse response for sound source localization (basically, a recorded impulse response includes both the direct sound and reverberation; we remove the reverberation). Direct sound length is the number of samples we regard as the direct sound. It is recommended not to change the default value.

 

In Harktool4, when choosing measurement option, the following error occurs:
Wave length(131072) is short when sync_add_freq = 8.
status(65280) Unknown error 255

What should I do?

When you create the TSP List File Template, please set TSP Offset as 0.

 

I am doing speech recognition using HARK with RASP-ZX in Windows. When I execute the network, it outputs the following errors:
RASP is not found
Invalid number of channels
How can I fix these errors?

For the error RASP is not found, please ensure that RASP device is recognized by HARK. You can confirm this in Windows by clicking HARK->Sound Device List in the Start menu. You will be able to see the list of microphones currently recognized by the system and RASP-ZX should be listed.
For the Invalid number of channels error, please ensure that the actual number of microphone channels of the device and the value of the CHANNEL_COUNT parameter of AudioStreamFromMic match, so that the network executes properly.

 

I have tried the samples in your documentation for HARK but the BeamForming node cannot be found in the Ubuntu version of HARK. What might be the problem?

Since BeamForming is provided with another package, hark-sss, you need to install it beforehand:
apt-get install hark-sss
Then, in HARK Designer, click Preferences -> Packages and activate the hark-sss package.

 

I find that the node BeamForming needs the result of localization as input, but what if I want to do beamforming before the localization? Is it possible for HARK to do beamforming without the result of localization?

Since the BeamForming node is for separation, it requires the localization result; in other words, it needs to know which direction to form the beam in.
Please see: http://www.hark.jp/document/hark-document-en/subsec-BeamForming.html

For example, you can give a direction manually using a ConstantLocalization node instead of LocalizeMUSIC. In this way, you can enhance the sound from the specified direction.
However, if you are trying to do localization with the BeamForming node, that is not the node's function.

 

I have saved source location output in a text file. What are the meaning of these coordinates?

The coordinates are relative to the center of the microphone array. (x, y, z) indicates a point on a unit sphere (radius = 1) corresponding to the direction of the sound. To calculate the azimuth, use this equation: 180 / pi * atan2(y, x)
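
For example, a minimal Python sketch of that equation (the sample coordinates are illustrative):

import math

def azimuth_deg(x, y):
    # azimuth in degrees from a unit-sphere localization result (x, y, z)
    return 180.0 / math.pi * math.atan2(y, x)

print(azimuth_deg(1.0, 0.0))  # -> 0.0
print(azimuth_deg(0.0, 1.0))  # -> 90.0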

 

Can I build my own microphone array? Is there a documentation for this?

Yes, you can build your own microphone array. The point is that you have to measure a transfer function file for your own array.
These tutorial videos will help you:
http://www.youtube.com/watch?v=9v5RUOrkyhw
http://www.youtube.com/watch?v=_Tpn94mPtj4

 

Is there a limitation on the number of simultaneous sound source separation?

Theoretically, the total number of microphones minus 1 is the maximum number of sound sources separable per frequency bin in sound source localization by LocalizeMUSIC.
(Because the result is calculated by summing in the frequency direction, on a broadband spectrum it can happen that more sound sources than the total number of microphones are recognized.) With the GHDSS algorithm, the total number of microphones is the maximum number of separable sound sources. Either way, these are theoretical values. In practical use, in order to secure performance, it is desirable to keep the number of sound sources below half the total number of microphones.
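
As a small worked example of those bounds in Python (the 8-ch array is illustrative):

nmics = 8
music_max = nmics - 1        # LocalizeMUSIC: theoretical per-frequency-bin maximum (7)
ghdss_max = nmics            # GHDSS: theoretical maximum (8)
practical_limit = nmics / 2  # keep the source count below this in practice (4)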

For the further information, please see below:
http://www.hark.jp/document/hark-cookbook-en/subsec-MicrophoneArray-001.html
http://www.hark.jp/document/hark-cookbook-ja/subsec-MicrophoneArray-001.html

 

What are the supported file formats for the transfer function?

For HARK version 2.1 and above, the supported file format for the transfer function is .zip. For older versions, .dat is used.

 

I have generated the transfer function for my microphone and tried BeamForming using the sample network in the HARK documentation; however, it seems to have little effect on the output data. What could be wrong?

Please do the following:
(1) Verify the value in ConstantLocalization. Using the localization results of your network as the value will improve the performance, because it is the real position.

(2) The performance of a simulated transfer function is fundamentally limited. You can try a real TSP recording for the transfer function calculation.

(3) BeamForming does not completely separate the sounds (and no method can do it completely). Please try other methods using the BF_METHOD parameter of the BeamForming node, or use the GHDSS node instead.

 

I am trying to generate TF for localization but I encountered the error: Can't open all the flt data. Read sice 501
What should I do?

The problem stems from the number of samples in the flt files. It seems that you saved only 500 samples in your file. The length of the impulse response must be at least equal to the FFT length. If you use a 1024-point FFT, please save at least 1024 samples in the flt files.
Please change from: imp = IMPULSE(1:500,i); % select the RI of the microphone i
to: imp = IMPULSE(1:1024,i); % select the RI of the microphone i

 

What is the dimension of ei(ω, f)? It should be [4×1] if the number of mics is 4. Is this correct?
I have got my eigenvectors in a [4×4] matrix. If the summation in the denominator is performed along the noise subspace, i.e. from (Ns+1) to M, and if we assume that Ns = 1, then should I take the 2nd, 3rd and 4th columns of the [4×4] matrix for the matrix multiplication followed by summation?

Yes, that is correct.

 

Kindly explain if the matrix obtained from numpy.linalg.eig is the matrix that I am supposed to use in the denominator of the Power spectrum calculation.

Yes. Since the matrix is a Hermitian matrix, the right and left singular matrices are the same.

 

The matrix [e1(ω,f), e2(ω,f), …, eM(ω,f)] is the [nmics x nmics] eigen matrix. My doubt here is whether I have to rearrange the eigenvalues and eigenvectors in descending order of the eigenvalues. I did not understand this very well from the HARK document. If yes, how do I make a corresponding change in choosing the correct eigenvectors corresponding to the eigenvalues (if they are sorted in descending order of values)?

Look at all the eigenvalues {λ1(ω,f), λ2(ω,f), …, λM(ω,f)}. If λ2(ω,f) > λ1(ω,f) > λ3(ω,f), reorder [e1(ω,f), e2(ω,f), e3(ω,f)] as [e2(ω,f), e1(ω,f), e3(ω,f)]. That’s what the document means.
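
As a minimal numpy sketch of that reordering (the toy matrix is illustrative; R stands for your per-frequency spatial correlation matrix):

import numpy as np

R = np.array([[2.0, 0.3 + 0.1j], [0.3 - 0.1j, 1.0]])  # toy Hermitian matrix
w, E = np.linalg.eigh(R)      # eigh returns eigenvalues in ascending order
order = np.argsort(w)[::-1]   # indices that sort the eigenvalues descending
w, E = w[order], E[:, order]  # reorder the eigenvector columns to match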

 

I have seen that the lambda values are complex. Is this correct?

It is not correct since the correlation matrix must be Hermitian.

 

Do the very small imaginary parts of Eigen values change the localization results drastically?

Basically, you can set the imaginary parts of the eigenvalues to zero.
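
A minimal numpy sketch of that advice (the estimated matrix here is synthetic): symmetrizing the correlation matrix and using numpy.linalg.eigh yields purely real eigenvalues.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 100)) + 1j * rng.standard_normal((4, 100))
R = X @ X.conj().T / 100    # estimated spatial correlation matrix
R = 0.5 * (R + R.conj().T)  # enforce Hermitian symmetry numerically
w, E = np.linalg.eigh(R)    # eigh of a Hermitian matrix: real eigenvalues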

 

What is the meaning of “Number of elements of distance (Nr)”? Is this the distance from the source to the microphones?

Yes, it means the distance from the source to the microphone array. However, we do not support multiple-distance estimation, so Nr should be 1.

 

What does the Nd and Nr data mean? Is the Nd data an angle in degrees?
As far as I can see, the MUSIC algorithm only checks for the direction i.e 2-D search. If Nr is the distance (the 3rd dimension) from the microphones, how is it being used in the algorithm?

Nd: number of azimuths
Nr: number of distances (should be one)
Please note that we support only 2D search (azimuth and elevation). Nr is not used currently.

 

I am having trouble creating a tsp file for the Kinect v2 in the Windows installation. The sensor works, but I still need a configuration file for localization. I found the following webpage: https://www.hark.jp/document/supported/
This has files for the kinect v2. But the LocalizeMUSIC algorithm does not appear to work using any of them. I also tried using the HarkTool4 program to generate a new file from geometric calculations, but when I tried to generate a new source list file and use that for making a tsp file, the program crashes. What should I do?

If you are using HARK ver 2.1 or later, you have to use the zip format for the transfer function. However, if you are using an older version, you have to use a .dat file (an original binary format) for localization. Also, please make sure to download the appropriate transfer function from the link.

 

How can I terminate HARK externally, for instance, using a socket?

It can be done by creating a node which has a “bool” OUTPUT terminal. Its default value should be “true”, and it turns to “false” when you want to terminate HARK.
Place the node in the panel, and set CONDITION on the OUTPUT terminal. The node can be implemented in C++ or Python.

 

What is the purpose of setting “Length of one TSP [sample]” for Harktool?

It will skip N samples from the start of the recorded sound.

 

How many microphones/microphone arrays are required to use a SemiBlindICA node?

SemiBlindICA needs one channel as a reference. One channel of the microphone array should be used as this reference channel, because a network file in the current HARK can connect to only one microphone array.
For example, the RASP-24, which has an 8-ch Mic-in and a 1-ch Line-in (reference), can record 9-ch audio data synchronously using AudioStreamFromMic. However, it provides the 9-ch audio data through just one AUDIO terminal, so you need to split the 9 channels into the reference channel and the others using ChannelSelector, because SemiBlindICA has two input terminals, REFERENCE and INPUT.

 

I would like to follow the tutorial you provided for measurement of the Transfer Function and I would like to ask you whether I can use my smartphone instead of a loudspeaker. Is it necessary to use a loudspeaker or can I use a simple speaker?

Of course, you can try with a smartphone, but it is recommended to use a loudspeaker, because a smartphone has difficulty playing a loud TSP sound and its speaker’s frequency characteristics are not flat compared to a loudspeaker’s.

 

I have tried creating the transfer function using harktool4 in Ubuntu 14.04. I created my own TSP list and mic array location files. But when I proceeded to the creation of the transfer function, an error occurred. How can I fix this?

In Ubuntu, please try to:
(1) rename the files tsp and mic to tsp.xml and mic.xml
(2) repeat the same TF generation steps.

 

Which do I use in creating the transfer function file in Harktool4, Measurement or Calculation?

For creating transfer function files using Harktool4, you should use Measurement.

 

I am using Harktool4 to create the transfer function but I am getting this error:
harktool4_cui error [Localization transfer function can not create]
What might be the problem?

Please check the following before clicking create:
1. The wav file name and the name written in your xml file match.
2. The actual number of microphones and channels matches the value of NumMic.

 

Can you please explain the meaning of “M2PG version (major, minor)” in the transfer function?

M2PG is the format for the transfer function database for sound source localization.

 

How can I determine the value for Synchronous Averaging?

In synchronous averaging, a stretch of data from the beginning whose length is specified by Length of one TSP is excluded; thus, Synchronous Averaging should be determined as (total recorded data length − Length of one TSP) / length of one repetition.

For example, if you have a TSP of 16384 samples and recorded it 10 times repeatedly without any offset, you should set the value of “Synchronous Averaging” to 9 instead of 10, and the value of “Length of one TSP” to 16384.
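
As a worked example of that formula in Python (using the numbers above):

tsp_len = 16384                      # "Length of one TSP" [samples]
total = tsp_len * 10                 # TSP recorded 10 times with no offset
print((total - tsp_len) // tsp_len)  # -> 9, the "Synchronous Averaging" value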

 

In the Transfer Function for Localization in “harktool”, what does Direct Sound Length mean exactly and what is the difference between this variable and IR Length?

“Direct Sound Length” means the sample length of the direct sound we consider for sound source localization.

“IR Length” is the sample length of the TSP signal which you used for generating TF for separation.

 

Is the parameter TF_CONJ_FILENAME of the BeamForming node the same with the A_MATRIX parameter for LocalizeMUSIC?

Yes. The file name of the TF file is set for these parameters.

 

Is the provided TF for TAMAGO for 3D?

No, it is for 2D (azimuth) only. If you unzip the file and check the source.xml, all the z coordinates defined in the position tags are 72.

 

When I performed sound source localization using the transfer function I generated, the localization result stays only around 0 degrees; it is flat for every sound source. What could be wrong?

The cause of this result lies in the geometric calculation of the transfer function: it assumes that there is no change in power for each frequency.

 

Since the file format of the transfer function is changed from .dat to .zip, how can I convert my transfer functions to the zip format? Do I need to generate my transfer function again using the Harktool?

CLI tools are available in the newer versions of HARK, including a transfer function conversion tool, harktoolcli-conv-tf. Please see this link for more information:
https://www.hark.jp/packages/faq/

 

I have generated the transfer function for my microphone. How can I confirm if it is correctly generated?

One way to check whether the transfer function is correct is to look at its contents directly. Please unzip the file and open the .mat file under the localization or separation directory. Please refer to this page to see if the file conforms to the correct format: http://www.hark.jp/document/hark-document-en/sect0034.html
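
For a quick look at the archive without manually unzipping it, a small Python sketch (the file name is illustrative):

import zipfile

with zipfile.ZipFile("my_tf.zip") as z:
    # look for the entries under the localization and separation directories
    print("\n".join(z.namelist()))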

Another way is to use the transfer function for localization, separation, and recognition. If the results are strange, the transfer function might be incorrect.

 

Is the transfer function file generated by harktool recognized as a file associating microphones with impulse responses?

The transfer function file associates the position of the speaker with the impulse response. Coordinate information is included in the TSP List File in harktool4; the file that associates the coordinates with the impulse responses is the transfer function file. In addition, since a multi-channel recording is made for each measurement, one multichannel recording corresponds to one TSP recording position (= speaker position), not to a channel number.

 

How do you indicate the number of samples for the impulse response when generating the transfer function?

Please see more information on this from the link: http://www.hark.jp/document/hark-document-en/subsec-HARKTool-ImpulseResponse.html

Transfer functions are created using the samples from the start index (first input field) to the end index (second input field).

 

I am using the Harktool to generate a transfer function. What measurement signal and inverse filter will be used as input to the Harktool?

Please refer to this document on how to use Harktool: http://hark.jp/document/hark-document-en/sec-harktool.html

This video will also be helpful: https://www.youtube.com/watch?v=5ThC5zEn98M

 

What is the difference between the transfer function for localization and separation?

The difference between the two transfer functions is not explicitly written in the HARK Document. The number of impulse response samples used to generate each transfer function is different: the localization transfer function uses only the direct sound, whereas the separation transfer function uses all samples, both direct and reflected sound. To generate the localization transfer function, only [Direct Sound Length] (32 samples by default), displayed after clicking the Localization button of harktool4, is used. This is why two transfer functions are needed.

 

I would like to create a transfer function. When creating the Micarray Location File, how do I create the location information file when the microphone array is placed horizontally in a row?

Harktool only supports circular microphone arrays, but you can manually edit the contents of the <X>, <Y>, and <Z> tags of the created microphone array file. Please refer to the following page for the coordinate system:
http://hark.jp/document/hark-document-en/sect0028.html

 

Is there a tool to extract the contents of the transfer function file? What type of information processing should be done to obtain meaningful data from the transfer function file?

Unfortunately, there is no tool available to visualize the transfer function. Information on the format of the transfer function file is found on the page below, which may help in the analysis:
http://hark.jp/document/hark-document-en/sect0031.html

 

What is the transfer function calculation process and the coordinate system used?

Please see the tutorial video of the transfer function calculation for answers:
https://www.youtube.com/watch?v=5M3pS1WUHec

 

Is the transfer function for robot NAO available in HARK website?

No, the transfer function for robot NAO is not published in the HARK website.

 

I have created my own transfer function for the PS Eye. When using the official transfer function provided on the website, there are peaks at the time and direction at which the voice occurred, but when using my own transfer function, peaks exist even if there is no voice. What should I do to fix my transfer function?

It may be that the beginning of the wav file recorded for transfer function generation has a silent part because the recording start failed. You will need to redo the recording: delay the start of audio playback and increase the TSP offset by the same amount so that the silent part is ignored.

 

I am creating transfer function for PSEye. In Harktool4 under TSP, what values should I set in the Azimuth, 180 – 175 or 0 – 355?

Either one is fine. The point is whether the direction matches the path written in the created TSP file. The path is a character string in /hark_xml//File/Name. The direction is each number (Cartesian coordinate system) in /hark_xml//Position/{X|Y|Z}.

 

In creating transfer function, should the position of the receiving part of the microphone be considered?

When the transfer function is created from real recorded data such as a TSP signal, the microphone location information may be approximate (because the information on the positional relationship between the microphones and the sound source is contained in the recorded data).
When the transfer function is computed by geometric calculation using a propagation wave model, the microphone location information must be accurate.

 

When creating a transfer function, does the number of microphones affect it?

In sound source localization, the MUSIC method is used. It is recommended to use at least 4 microphones for better performance.

 

I am trying to generate a TF for my microphone array. I am using measured impulse responses: the elevation is 0 and the azimuth is from 0° to 355° in 5° steps (so 72 angles), and I am using a 16-microphone array (roughly circular). I converted the measured responses into a “float binary” format as explained in the HARK document. I have also generated the microphone array location file. But when I generate the transfer function for separation using the same impulse responses, it is different from that for localization. Why is this so?

In HARK, only the direct path of sound sources is used for sound source localization, while all paths are used for separation. That is why TF for localization and separation are prepared separately.

In your case, please check if the flt files are correctly generated. The TF for localization and separation should not be totally different.

 

Since the transfer function is a 4D array of dimensions (72 x 1 x 4 x 257), how is the multiplication being done? What is the dimension of the resultant matrix after multiplication? *Since the P(Θ, ω, f ) is calculated for each angle and frequency (and for a collection of ‘PERIOD’ number of frames), is it a [1 x nmics] matrix multiplied with a [nmics x 1] matrix resulting in a [1 x 1] number?

Yes. It is just an inner product of two complex vectors: a [1 x nmics] complex matrix times an [nmics x 1] complex matrix, resulting in [1 x 1].
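
As a minimal numpy sketch of one such evaluation (names are illustrative: a is the steering vector for one direction and frequency, E holds the eigenvectors sorted by descending eigenvalue, and the normalization follows the standard MUSIC form; check the HARK document for the exact definition):

import numpy as np

def music_power(a, E, Ns):
    noise = E[:, Ns:]                              # noise-subspace columns Ns..M-1
    denom = np.sum(np.abs(a.conj() @ noise) ** 2)  # sum of |a^H e_i|^2
    return np.abs(a.conj() @ a) / denom            # one P(theta, omega, f) value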

 

I am using HARK with the PS Eye. I am doing sound localization and it can show the positions of the speakers. But if no person is speaking, there’s a continuous track at around 0. I tried to raise the THRESH parameter of SourceTracker until this track disappeared. As soon as it disappeared I kept the threshold, but no track was found anymore when the person started speaking again (even when shouting or speaking very close to the mic array). When decreasing the threshold again, the continuous track at 0 degrees is there again (even if no one is speaking), which again disappears when there’s actually sound (which again delivers correct localization tracks), etc. What should I do?

There are 2 possible reasons for this issue.

(1) A problem with your microphone device
Please check whether your device can record the audio stream of all channels correctly. Use Audacity (or other substitute software) and check whether all 4 channels record an audio stream. If some of them are missing, you won’t see any result.

(2) Clipping of the audio stream
If you speak too loudly and too close, the input audio stream will be clipped, and then sound source localization doesn’t work at all. Please speak farther from the PS Eye at a moderately loud volume. The transfer function is created on the assumption that the speaker is about 1 meter away from the PS Eye. If you do so with a well-adjusted threshold, you should see the result.

 

We have used the Microsoft Kinect with HARK for sound localization and separation, and it outputs small files. It works well, but the Kinect is too large for embedding.
We then focused on the PlayStation Eye camera, but sound localization and separation only work well in the presence of noise; without noise, the obtained files are large and not appropriate for voice recognition.
How can we resolve this issue so that the PlayStation Eye camera outputs smaller files?

Firstly, please make sure that you are using the transfer function of PlayStationEye. Check A_MATRIX of LocalizeMUSIC, and TF_CONJ_FILENAME of GHDSS.
These files can be found in:
https://www.hark.jp/document/supported/

Next, please check the MUSIC spectrum, a kind of indicator of the existence of sound. You can check it by setting the DEBUG parameter of LocalizeMUSIC to true.

 

Is it possible to plot a TF that has already been generated, without the TSP recordings or the required xml files, by using Harktool4? I want to cross-check the TF plots of the PSEye given on the HARK website and compare them with my own recorded and generated version of the TF.

The current version of Harktool4 does not support plotting loaded TF files. You can use the Harktool CLI tools to get a text-file version of the equivalent data.
Installation:
sudo apt-get install harktool5
How to use:
[1] frequency domain
harktoolcli-zip2coord -t LOC -z amp -i microcone_rectf.zip -o out.csv
[2] time domain
harktoolcli-ifftTF -t LOC -z amp -i microcone_rectf.zip -o out.csv
For both commands:
The -t option can be LOC or SEP; it decides which TF to output, localization or separation.
The -z option can be amp, db, or phase; it decides the unit of the z axis for 3D plotting.

 

I would like to ask your help about the sound transfer function. The document says: if generating transfer functions by geometric calculation (simulation), the recorded files are not necessary, and only the location information is used, as the location of virtual sound sources.
Could you provide instructions on how to generate the transfer function by geometric calculation? And could you provide a sample TSP file (http://www.hark.jp/document/2.3.1/hark-document-en/sec-TspListFile.html) and localization transfer function (http://www.hark.jp/document/2.3.1/hark-document-en/subsec-Localization.html)?

Please find the TF files made by geometric calculation on the HARK web page.
For example: https://www.hark.jp/document/supported/
The links under By Geometrically Calculation contain the TF files that you want, and those under Location of Microphones are the microphone location files.
To create a transfer function yourself, you can generate one using harktool4.
See the following videos on how to create it:
https://www.youtube.com/watch?v=5M3pS1WUHec
https://www.youtube.com/watch?v=cru2Ah5-W-Y

 

How can I integrate HARK-Python with HARK-ROS?

For HARK-Python, you can run python code from the node. See the following url: https://www.hark.jp/document/2.4.0/packages/hark-python/harkpython.html
or in python interpreter, run:
import harkpython
help(harkpython)

for detailed explanation.

 

I am trying to generate a Transfer Function using Harktool4 but I encounter the error:
The file has a invalid type for making TF. Ignoring:
/mic_array
The file has a invalid type for making TF. Ignoring:
/mic 0
Need a mic array location file for making TF

How can I overcome this?

Since mic_array and mic 0 are XML files, please specify the file names with their extensions when you set the file paths in the Harktool4 GUI.

 

I am trying to generate a Transfer Function using Harktool4 but I encounter the error:
terminate called after throwing an instance of 'Glib::ConvertError'
Aborted (core dumped)

What should I do to fix this?

Please rebuild the TSP list file and try generating the Transfer Function again.

 

I am trying to generate a Transfer Function using Harktool4 but I encounter the error:
Wavfile is not found.
What should I do to fix this?

Please do the following:
1. Check if the indicated file path of the wav file is correct
2. Check if the number of channels of the wav file matches what is indicated in the TSP.xml

 

I would like to implement a Python script to read the transfer function. Is it right that I read first the real value with 4 bytes of float, and then the imaginary value with 4 bytes of float, and then make (real + imaginary*j) into each of the values in the 4-D matrix? Are the real and imaginary parts stored in sequence for each of the values in the transfer function?

Yes. The real and imaginary parts are stored in sequence for each value.
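
As an illustration, the following is a minimal Python sketch of such a reader, assuming little-endian float32 values and a hypothetical header offset (the function and variable names are not part of HARK; check the format document for the actual header layout):

import numpy as np

# Read nd x nr x nmic x nfreq complex values stored as interleaved
# (real, imaginary) float32 pairs, starting at header_offset bytes.
def read_tf_matrix(path, header_offset, nd, nr, nmic, nfreq):
    with open(path, "rb") as f:
        f.seek(header_offset)
        # "<c8" is a little-endian complex64: two float32 values,
        # real part first, matching the layout described above.
        data = np.fromfile(f, dtype="<c8", count=nd * nr * nmic * nfreq)
    return data.reshape(nd, nr, nmic, nfreq)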

 

I would like to ask if there is a way to increase the duration of the output matplotlib window.

The Python code is installed in the standard Python path.
In a Python interpreter, run:
import harkpython
print harkpython.__file__
Then you can find where the Python files are located.
You can find and modify plotSourceCode.py there to change anything, including the duration of the window.

 

Can I use HARK-Python and HARK-ROS simultaneously?

Yes, you can use both HARK-Python and HARK-ROS in the same network file.

 

I am doing sound localization using HARK and Kinect. Currently, human voice can be localized. But localization of tone sound (sine wave) made by specifying frequency does not go well. What can be the problem?

LocalizeMUSIC has parameters specifying the range of frequencies used for localization:
LOWER_BOUND_FREQUENCY
UPPER_BOUND_FREQUENCY
See Table 6.26 of http://hark.jp/document/hark-document-en/subsec-LocalizeMUSIC.html

By default, the frequency range 500 Hz – 2800 Hz is used, which approximately covers the human voice. Therefore, if the tone is outside this range, it is not being used for localization.

 

In the transfer function part of the M2PG chunk of the kinect_loc.dat file, I would like to know the number of float values to be read in each row. Since the transfer function is a 4D array whose dimensions are 72 x 1 x 4 x 257 (i.e., Nd x Nr x nmics x (fftlength/2 + 1)), I will have to read 257 complex values in sequence. Is this correct?

That is right. A complex<float> consists of two float values, so in total there are 72 x 1 x 4 x 257 x 2 float values.
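
As a quick sanity check, the expected data size can be computed as follows (a small Python sketch; the dimensions are the ones quoted above):

nd, nr, nmic, nfreq = 72, 1, 4, 257
n_floats = nd * nr * nmic * nfreq * 2  # two float32 per complex value
n_bytes = n_floats * 4                 # a float32 is 4 bytes
print(n_floats, n_bytes)               # 148032 floats, 592128 bytes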

 

I have been using the Kinect transfer function file to upload at the A_MATRIX parameter file in the LocalizeMUSIC node. Isn’t the transfer function matrix file dependent on the environment, such as the acoustics of the room etc.? How is it possible to use a STANDARD transfer function file (like as provided in the website)?

Your understanding is correct. Since the transfer functions are measured in a “normal” room (not anechoic, but not very reverberant), they will work in ordinary rooms that have small mismatches. These files are convenient for HARK beginners because they already work in a usual environment. However, the performance degrades from its best because the acoustics of the rooms obviously mismatch. Therefore, if you measure the transfer functions in your own room, you can improve the localization/separation performance.

 

I am using Harktool4 to create a transfer function for the Kinect. An error occurred at the step of generating the localization transfer function file, in both the case of using tsp_list1.xml and of using tsp_list2.xml. Harktool4 gives the error message:
Wave length (163840) is short when sync_add_freq = 10. status(65280) Unknown error 255
How can I overcome this error?

The cause of the error is the TSPOffset in the TSP list file. TSPOffset represents how many zero-valued samples are inserted at the beginning of the played TSP file. Since the default is 16384, harktool4 requires 16384 * 11 samples for each .wav file (10 TSP repetitions + 16384 offset samples), i.e. 180224 samples, while your file has only 163840 (= 16384 * 10).
Because your TSP file starts at the beginning of the wave file, you have to set TSPOffset to zero.
You can find it in the Setting2 of TSP List File Template Create.

 

What is the ordering of the 72 azimuth angles of the transfer function matrix in the M2PG chunk of the kinect_loc.dat file, in terms of the HARK standard coordinate system?

The order is also stored in the transfer function file; the field name is Coordinate of source position.
http://www.hark.jp/document/2.3.0/hark-document-en/subsec-ZipFormat.html

 

Could you please suggest which version of HARK should be used for real-time audio processing? We are thinking of using a Kinect or an array of microphones to get the audio input. Which modules are suitable for this purpose?

The sound source localization and separation functions will be the appropriate modules. Please see more information at this link: https://www.hark.jp/document/document-en/

 

I am doing experiments on sound source localization using HARK and a Kinect. To examine the LOWER_BOUND_FREQUENCY and UPPER_BOUND_FREQUENCY parameters of the LocalizeMUSIC node, I would like to analyze the voice data after it has passed through the filter. Is there any way to obtain this data? Connecting the SaveWavePCM node directly to the LocalizeMUSIC node resulted in an error indicating that the types are incompatible.

Since LocalizeMUSIC is a node that performs sound source localization, the input is voice, but the output is the direction of the voice, not the voice signal. Please read more about it at this link:
http://hark.jp/document/hark-document-en/sect0027.htm
Therefore, SaveWavePCM, which receives and saves an audio signal, causes an error.
Next, LOWER_BOUND_FREQUENCY and UPPER_BOUND_FREQUENCY specify the frequency band used in the internal processing that localizes the sound source. They are parameters of LocalizeMUSIC that affect the sound source localization result; only the frequency components in that range are used.
Although HARK does not provide a bandpass filter, FlowDesigner, the middleware used by HARK, has DSP nodes. Using one of these may work. Please see more information at:
http://flowdesigner.sourceforge.net/doc/nodes.html#FIR

 

I am trying to run sound localization using HARK and a Kinect in a normal working environment: the two speakers are at a radial distance of about 1.5 m and at an angular distance of about 50 degrees around the center, talking with their normal conversation voices. It turned out that your sample network (with an adjusted SourceTracker threshold) fails to locate (and separate) the speakers correctly under these circumstances. Why is this so?

When the impulse responses of the Kinect were recorded, the loudspeaker and the Kinect were placed at the same height, with the loudspeaker positioned 1 meter from the Kinect. This means that the best performance is obtained when the speaker talks at the same height as, and 1 meter away from, the Kinect. It is also better to face the Kinect when you talk.

 

Do you think that it is possible to achieve separation of two speakers using HARK and Kinect under the conditions above in a quality suitable for speech recognition?

It is possible but not easy. It depends on the speaker locations, the reverberation time of the room, the power of your background noise, etc. Simultaneous speech recognition with only 4 microphones is still challenging.

 

Where can we find the sample wav files for Kinect for creating the TF?

Unfortunately, wav files for Kinect are not available.

 

I have a question about the field of view of the LocalizeMUSIC node. I understand we can set the minimum and maximum source directions with the MIN_DEG and MAX_DEG parameters. I am using a Kinect for sound source localization. My question is: can I get sound directions behind the Kinect? I.e., is it possible to have a (-180, 180) field of view? Right now I am able to get sound directions in a (-90, 90) field of view.

Since the Kinect is a linear microphone array, sound sources in front of and behind the Kinect are geometrically symmetrical, so distinguishing between the front and back sides is difficult.

 

What is the meaning of complex<float>? Is this a 4D array? In the kinect_loc.dat file, what are the dimensions of this transfer function?

Yes, it is a 4D array of complex values.
The format of the kinect_loc.dat file is described in:
http://hark.jp/document/hark-document-en/sect0026.html

So, after the header, a 4D complex matrix with dimensions 72 x Nr x 4 x 257 is written.

 

I am doing speech recognition and I have noticed that the THRESH value should be reduced when the sound made is louder. Why is this so?

THRESH is not a threshold on the volume itself.
LocalizeMUSIC compares the correlation matrix of the input speech with the transfer function and outputs their proximity. Please see this link for more information:
http://www.hark.jp/document/hark-document-en/subsec-LocalizeMUSIC.html

The important point is that the comparison is with the transfer function, so the value does not get bigger as the volume increases. In this case, as sox forcibly increases the number of channels, it can be inferred that increasing the volume decreases the closeness.

 

I am doing speech recognition using PSEye. Even if I change the THRESH, there are times that it cannot recognize. Is there another way to change the sensitivity of recognition aside from changing the value of THRESH?

Speech recognition by HARK is done in this order:
(1) sound source localization, (2) sound source separation, (3) speech recognition.
Since results may not be appearing at the localization stage,
(1) sound source localization seems to be where the problem lies.

If adjusting THRESH does not give sufficient localization accuracy, it may be good to re-measure the transfer function.
THRESH does not actually look at just the volume; it is more a measure of how sound-like a direction is compared with the previously measured transfer function. Therefore, if the size of the room or the strength of the reverberation changes significantly, the accuracy drops.

The following video is a tutorial on the measuring method: http://www.youtube.com/watch?v=9v5RUOrkyhw

 

It seems that we could not get the source position when making a sound right above the microphone. The sound must have been recognized as ambient noise, since each microphone receives almost the same volume. Is there any way to know whether the sound comes from right above the microphone?

The localization transfer functions distributed on the HARK website are measured on a circle approximately 1.5 to 2 meters from the microphone array. The source position cannot be obtained because that direction has not been measured in the first place (i.e., the MUSIC spectrum value is small in every direction).

If you want to determine whether the source is right above or not, you can check the differences in arrival time: a source right above the array is characterized by the arrival-time differences always having (almost) the same value.
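
As an illustration, here is a minimal Python sketch of such an arrival-time check, estimating the delay between one microphone pair via cross-correlation (the function name is hypothetical); for a source right above, the estimated delay should be close to zero for every pair:

import numpy as np

# Estimate the arrival-time difference (in samples) between two
# microphone signals via cross-correlation.
def tdoa_samples(sig_a, sig_b):
    corr = np.correlate(sig_a, sig_b, mode="full")
    return int(np.argmax(corr)) - (len(sig_b) - 1)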

 

I am doing sound localization, but I am getting null output and no graph for the localization is displayed. What could be wrong?

Please check the following in your network:
1. The threshold parameter is set correctly
2. The appropriate transfer function (TF) for the microphone array is used

 

How do I calculate the threshold?

The threshold is tuned manually.

 

I am doing localization of two sources, but why am I getting a lot of noise?

If you changed the parameter NUM_SRC of LocalizeMUSIC, a node for sound source localization, the whole MUSIC spectrum increases, so the appropriate threshold changes.
Even if you did not change it, this can happen because of the tracking algorithm in SourceTracker. Once the MUSIC spectrum power becomes larger than THRESH in SourceTracker, it keeps the sound source alive for PAUSE_LENGTH even if the sound is no longer localized. This helps connect the localization results of one continuous utterance that includes short pauses; however, it also lengthens short noise sounds.

 

Is HARK capable of performing 3D sound source localization? Do I need to create a new transfer function for that purpose, or change anything in the LocalizeMUSIC parameters? Is the distance somehow mathematically related with the source power value?

Determining the distance of a sound with one microphone array is difficult for LocalizeMUSIC because it relies on a kind of time difference of arrival between the microphones. Since these differences change only slightly when the sound source moves farther away in the same direction, LocalizeMUSIC is incapable of estimating the distance of a sound.

It might be possible to localize in 3D if you create a new transfer function with different distances.

The distance is mathematically related to the source power value. The power value output by LocalizeMUSIC is the MUSIC spectrum; it indicates how likely it is that the sound comes from the place where the transfer function was measured. Therefore, as the sound source gets farther from the microphone array, the power decreases, because the sound propagation differs from the measured one.

See the following web page for the details.
http://hark.jp/document/2.3.1/hark-document-en/subsec-LocalizeMUSIC.html

 

How do I compute the azimuth angle from a vector?

The formula is:
azimuth = atan(y/x)
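
For example, a small Python sketch (using atan2 instead of atan(y/x), so that all four quadrants and x = 0 are handled correctly):

import math

# Azimuth in degrees from a direction vector (x, y).
def azimuth_deg(x, y):
    return math.degrees(math.atan2(y, x))

print(azimuth_deg(1.0, 1.0))  # 45.0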

 

Is there a robust method to get the THRESHOLD value for the SourceTracker node?

There is no robust method that determines a value for each parameter automatically. However, there are some approaches, based on technical know-how, for adjusting the threshold when the input data are known. Since these procedures require human judgement, they cannot be automated. Once the threshold is determined, the value can be kept fixed unless the environment or the microphone configuration changes.

Example #1:
Take the intermediate value between the mean spectrum of the soundless (voiceless) sections and the mean spectrum of the sound (voiced) sections (see the sketch after Example #3).

Example #2:
Take a value between *the “MINIMUM” spectrum of the sound (voiced) sections and *the “MAXIMUM” spectrum of the soundless (voiceless) sections as the threshold.
*excluding values that deviate from the typical values in the whole recording

Example #3:
When data for the soundless (voiceless) sections and the sound (voiced) sections cannot be obtained separately, lower the value little by little according to the spectrum appearing in the sound (voiced) sections in order to determine the threshold.
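
As an illustration of Example #1, a minimal Python sketch (the array names are hypothetical; the values would be MUSIC spectrum readings collected with DEBUG of LocalizeMUSIC set to true):

import numpy as np

# Midpoint between the mean MUSIC spectrum of a voiceless section
# and that of a voiced section.
def midpoint_threshold(voiceless, voiced):
    return (np.mean(voiceless) + np.mean(voiced)) / 2.0

voiceless = np.array([24.1, 23.8, 24.5])  # example readings
voiced = np.array([31.2, 33.0, 32.4])
print(midpoint_threshold(voiceless, voiced))  # ~28.17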

You may also look at the information on how to tune major parameters at this link:



 

I have followed the instructions in flowdesigner-0.9.1-hark/compile-macosx.txt. I have successfully compiled all the v2.1.0 HARK binaries on Mac (Yosemite) with the latest Homebrew. However, when I run the demo in the package HARK_recog, the program crashes in hark.tlb. What is the stable version of HARK for Mac?

There is no stable version for Mac yet. As for the crash, one possible reason is that the install path is on a case-insensitive drive.

 

How do I run a network file in command prompt or console?

In command prompt or console, please do:
1. cd <directory of network file>
2. batchflow YOURNETWORK.n

 

The batchflow compiled from source code doesn’t run. What could be the reason?

Please add /usr/local/bin to your PATH environment variable. batchflow is installed in /usr/local/bin by default when compiled from source code.

 

For the speech recognition sample files provided on the HARK website, is it okay for the DEVICE and DEVICETYPE parameters of AudioStreamFromMic to be strings? Is it okay that only the LOOP node is present in the MAIN sheet?

Yes, the DEVICE and DEVICETYPE parameters of AudioStreamFromMic are strings. In the speech recognition network, LOOP is the only node in the MAIN sheet.

 

I am doing Learning HARK 001 of the cookbook, but I encountered the following error:
Toolbox load error: libroscpp.so: cannot open shared object file: No such file or directory
How can I fix this issue?

The cause of this problem is that the environment variables are not the same as those of your account.

You have two solutions:
(1) Download the network (via the From File button) and run it from your terminal (use the batchflow command for that).
(2) Update HARK Designer.
Press the Preferences button, then click the Environment button. There you can select the ROS version and any environment variables.

 

For the recordings, does it matter if I use -e 24 or -e 16?

It does not matter which one you use.

 

Are the values displayed when you click See More in HARK-Designer after execution the standard output produced when you set the DEBUG parameter of LocalizeMUSIC to true?

Yes, it is the standard output. However, if you want to capture the output data, you may run the network via the command line.

 

Can I use HARK-Designer over the network?

Yes, HARK-Designer can be used over the network. However, because it does not perform authentication, non-local access is blocked by default.
To allow remote access, add the allowremote option at runtime: hark_designer allowremote

 

I am trying to run HARK-Designer but when I access http://localhost:3000, it redirects to another IP address and displays that it can’t connect. What should I do?

Please try to access HARK-Designer through http://127.0.0.1:3000. It seems that there is a configuration in your network that redirects localhost to another IP address.

 

I am doing sound source localization using HARK-Designer. The execution seems to be slow. Is there a way to increase the speed?

If you are using a virtual machine, please increase the number of CPUs to speed up the processing.

 

I am using HARK-Designer in Windows. Internet Explorer popped out when I run it and it cannot load the page. Why is this so?

HARK-Designer does not support Internet Explorer. Please use Chrome or Firefox.

 

I tried to run HARK-Designer but it seems there is a problem when loading. When I run “node app.js”, the following is logged:

Express server listening on port 3000
found defs: []

What should I do?

Please set FLOWDESIGNER_PATH to your environment variable. For Windows, set it to C:\Program Files (x86)\HARK\FlowDesigner. For Linux, set it to /usr/lib/flowdesigner/. After setting, please try to run HARK-Designer again.

 

I am trying to run HARK-Designer via the command line but the following error is logged:
Error: listen EADDRINUSE
How do I fix this?

The error indicates that port number 3000 is already in use on the machine. If you know the process that is using port 3000, you may kill it and try running HARK-Designer again.

You can also set the port number in app.js. For example, set it to 3001 and rerun HARK-Designer. You will now be able to access HARK-Designer through http://localhost:3001 or http://127.0.0.1:3001.

 

What are the main parameters for sound localization to be tuned?

The following are the nodes and their parameters used for tuning:
1. LocalizeMUSIC
– A_MATRIX: the path to the transfer function
– NUM_SOURCE: the number of simultaneously occurring sounds
2. SourceTracker
– THRESH: sounds whose MUSIC power is smaller than this value are ignored

 

For our network, we tried only 4 microphones (on a circular plane, 90 degrees apart) for localization, but the results are not as good as we expected. Do you think that 4 microphones are enough for localization and separation?

For single-speaker separation/localization, 4 microphones on a circular plane should work if the transfer function is correctly generated and the reverberation is small.

 

I am doing sound localization using HARK. When I execute my network it stops immediately. What could be the problem?

If you are using HARK ver. 2.1 or later, you have to use the zip format for the transfer function. Please refer to this link on how to convert to zip:
https://www.hark.jp/packages/faq/

 

When using DisplayLocalization, the sound source angle can be localized and displayed as a graph, but can it be output as a numerical value?

When SourceTracker’s parameter DEBUG is set to true, the localization result is output to standard error output.
http://hark.jp/document/hark-document-en/subsec-SourceTracker.html

 

I am planning to obtain the localization result output of LocalizeMUSIC. When I attached the OUTPUT of LocalizeMUSIC to SaveSourceLocation, the localization result looks strange. What could be the cause?

Please connect the OUTPUT of SourceTracker to the SOURCES of SaveSourceLocation to save the correct localization results.

 

Regarding the setting of the THRESHOLD parameter: if I set it to 23 or less, more than one sound source is recognized, and if I set it to 24 or more, no sound source is recognized at all. Even if I get close to the microphone and speak loudly, I do not get a response. Does this have any effect on the time lag?

The MUSIC spectrum of LocalizeMUSIC does not represent the magnitude of the sound but its direction. The appropriate threshold value depends on how much the room situation differs from that at the time of measurement, and it does not significantly affect the time lag.
Also, since the transfer function was measured at a distance of about 1 m, the value becomes smaller as you approach the microphone. Note also that the volume cannot be used as a cue, because the signal breaks (clips) if the voice is too loud.

 

In HARK, is it possible to input the position information of the plot as position information of the sound source?

Yes, please create a localization result in the following format: http://hark.jp/document/hark-document-en/subsec-LocalizeMUSIC.html

For the network, replace LocalizeMUSIC with LoadSourceLocation: http://hark.jp/document/hark-document-en/subsec-LoadSourceLocation.html

 

I performed sound source localization and sound source separation in real time under the condition that three people are about 1 m apart. However, multiple sounds were output, and when I listened to them, I could not tell whose voice each one was. How can I improve my results?

This is because the performance of the sound localization affects the performance of separation.
Please tune sound localization by referring to this link: http://hark.jp/document/hark-cookbook-en/subsec-Localization-002.html
Next, tune the separation by following instructions from this link: http://hark.jp/document/hark-cookbook-en/subsec-Separation-003.html

 

I am doing sound source localization using HARK-Designer. I have observed that only localization results from around -60 to 60 degrees appear. Why is this so?

This may indicate that only the MUSIC spectrum in that direction is high.

 

How much does the environment affect sound source localization?

It does affect sound source localization: for example, when there are surfaces (displays, walls, etc.) within 1 m that are likely to reflect sound, or when the reverberation of the room is strong.

 

Is HARK capable of measuring the distance of the sound source to the microphone array?

Please see below and follow the description in the page:
http://www.hark.jp/document/2.3.1/hark-cookbook-en/subsec-Localization-009.html
http://www.hark.jp/document/2.3.1/hark-cookbook-ja/subsec-Localization-009.html


 

How do you do sound amplification in HARK?

For amplifying the waveform, please use the MultiGain node.
http://www.hark.jp/document/hark-document-en/subsec-MultiGain.html

 

How do I do multi-channel recording using Audacity?

Please check the following URL on how to do multi-channel recording in Audacity:
http://hirorororo.blog.so-net.ne.jp/2011-02-22
http://achapi2718.blogspot.jp/2011/12/audacity-file-menu.html

 

Will updating packages recommended by Ubuntu system affect batchflow?

No, since batchflow is just the executor of a HARK network.

 

Can I use the cloud version of HARK to send data to other sound engines?

The cloud version of HARK sends an acoustic file; it can send sound data sequentially. However, it does not have a voice recognition function, so what you obtain is the separated sound.

 

Where can I find the sample files for speech recognition using HARK?

Please download the sample files for speech recognition using HARK at this page: https://www.hark.jp/download/samples/

 

We are trying to send data to HARK over TCP. There is no problem with the control messages (100-101, 200-201), and there is no problem with the network speed. However, it receives only about 1.5 Mbps, and 2 Mbps is needed. After days of testing we are almost sure that this problem is caused by HARK. Do you know of any software limitations on network speed in the HARK system?

Transferring data over TCP might limit the transfer speed. If you are using an Iterator subnetwork for receiving, note that since the Iterator subnetwork is a kind of while loop, the configuration or implementation of another node in the subnetwork can be a bottleneck for the whole loop.

 

How can I send HARK data to another process?

The HarkDataStreamSender node can be used to send HARK data to another process via a socket connection.
Please see information about this node at: http://www.hark.jp/document/hark-document-en/subsec-HarkDataStreamSender.html

 

Can I directly send the data output by HARK to other sound engines?

There are nodes that can send results sequentially to Julius/Kaldi (SpeechRecognitionClient), but they cannot communicate with other speech recognition APIs, since those use different streaming transmission methods. Our suggestion is to create a stream-sending node according to the API you want to use.

 

When I run a network for sound separation from a microphone, it creates a lot of wav files: one has an arbitrary duration (in the 2–29 second range during my tests) and the rest of the files are 1 second long. What could be the possible reason behind this?

The THRESH parameter of SourceTracker might be the reason: it may be too low. The thresholded value, the MUSIC spectrum (which can be seen by setting DEBUG of LocalizeMUSIC to true), is around 20 – 40 in normal conditions. A possible reason for such fragmented sound detections is a too-low threshold.

 

I am doing sound separation using HARK. What are the parameters to be tuned to improve my output?

Please check the following page in the HARK Cookbook for information on the parameters to be tuned for sound separation: http://www.hark.jp/document/hark-cookbook-en/subsec-Separation-003.html

 

The cookbook states “replaying signals only once” under manual operation. Is the signal to be replayed the entire wave file? Or is it just two audible pulses within the wav file?

The sound to be played is called a time-stretched pulse (TSP). Once you record the TSP response, it is converted into an impulse response by a convolution technique. With this method, we can get impulse responses with a high SNR compared to recording plain impulse signals.
Although the wave file mentioned includes only one TSP, it is better to concatenate TSP signals (say, 8 TSPs) and play the concatenated one for practical use. In this case, you get a TSP response containing several TSPs; Harktool accepts such a TSP response. The number of TSPs can be specified in the TSP list file as Num of Synchronous Averaging.
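
The averaging itself is straightforward; here is a minimal Python sketch, assuming the recorded response contains n_tsp back-to-back repetitions of tsp_len samples each (both names are hypothetical). Averaging the repetitions raises the SNR of the estimated response:

import numpy as np

# Synchronous averaging of repeated TSP responses.
def synchronous_average(recording, n_tsp, tsp_len):
    x = recording[: n_tsp * tsp_len].reshape(n_tsp, tsp_len)
    return x.mean(axis=0)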

 

Why does sound tracking in HARK not include detection of voice activity?

Voice activity detection is done by the thresholding process in SourceTracker, based on the MUSIC spectrum calculated in LocalizeMUSIC.


 

How are FlowDesigner, HARK-Designer, and batchflow related to each other?

FlowDesigner is a GUI programming environment whose execution engine is batchflow. Nowadays, HARK-Designer is used as the GUI programming environment instead of FlowDesigner, but batchflow is still used as the execution engine, the same as before. Hence, HARK currently uses only HARK-Designer and batchflow.

 

I am using Ubuntu 13.10. When installing HARK via source compilation, the following error is encountered:
/usr/lib/x86_64-linux-gnu/libboost_system.so.1.53.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

How can I fix this?

The cause of this error is the version of Boost. Please try installing Boost version 1.49.0 so that it compiles.

 

Is PyCodeExecutor available in HARK 2.2?

HARK-Python is not changed from 2.1.
So, you will be able to use PyCodeExecutor by referring to the document: http://www.hark.jp/document/2.3.0/packages/hark-python/harkpython.html

 

Can I pass the localization results to another Python program and use them there?

If you want to save the result for later use, you can use the SaveSourceLocation node. The result is saved in a text file format, so you can use it by reading the file with Python.
If you want to use it sequentially, you may use HARK-Python. Please see this page on how to install and use HARK-Python:
https://www.hark.jp/document/2.4.0/packages/hark-python/harkpython.html

 

Can I control the loop of the start/end of processing through PyCodeExecutor?

Unfortunately, HARK-Python has no loop control.

 

What is the Python path used by PyCodeExecutor?

It uses the environment variable PYTHONPATH containing the installation path of Python. It is usually set to /usr/lib/python2.7/dist-packages.

 

The HARK tool assumes fully synchronized inputs (like those of the RME Hammerfall interface boards), correct? I, however, am using a general purpose DAQ with a 1 micro-sec sequential sample rate. My audio streams are recorded at 44kHz, but the streams are each offset from each other by 1-usec. Can that be accommodated in the existing software?

Basically, fully synchronized inputs are assumed. However, a 1 microsecond delay can be regarded as synchronous enough: the effect of a 1 us delay at 1 kHz is only 0.1% of a period (1 us x 1 kHz = 0.001).

 

Can Harktool4 and Harktool5 be used over the network?

Harktool4 and Harktool5 can only be used locally.

 

I am doing speech recognition using HARK and Julius. But it seems that the recognition rate is very low because there are only a few words recognized. How can I fix this?

Please make sure that your firewall is not blocking the communication of Julius.

 

Where can I find the instructions on how to install Kaldi?

Please see the installation instructions for Kaldi at:
http://www.hark.jp/document/2.3.1/hark-document-en/subsec-KaldiDecoder.html#a0000000717

 

I started learning HARK 2.0 using a 4-microphone array and tried running offline localization based on the Kinect sample. I seem to have an issue with the transfer function calculated from the microphone layout only. It outputs vertical lines and results of this type:
MUSIC spectrum: 26,767601
0: 30,00 60,00 1,00 26,77
...
I used the following command to generate the transfer function:
harktool4_cui -m tf -o config/loc.dat --output-type m2pg mics.xml sample_tsp.xml --geometry-calculus 1
Is there an error in this way of creating a TF file, should I modify some network node parameters, or should I try recording the TSP instead?

The steps are correct, but the microphone array location file seems to differ from the original Kinect microphone locations.
Please modify the locations in mics.xml by referring to the following page: https://www.hark.jp/document/supported/

Also, to localize sound sources, source positions in several directions are needed.

 

I am using HARK with Kinect in Ubuntu 12.04, but I noticed the output given by the Kinect microphone array is very low. How can I improve it?

You can connect a MultiGain node after AudioStreamFromMic and set its parameter to a value greater than 1.

 

We’re interested in knowing the possibility of using the Hark-Kinect module to locate the origin of speech using the microphone array from the Kinect through the ManyEars module. Is that possible?

HARK-Kinect is a generic driver for the Kinect that mounts the Kinect microphone array as an ALSA device. If ManyEars supports ALSA devices, you can use it. You can also use HARK itself for sound source localization and separation instead of ManyEars.

 

I am trying to install the Kinect in combination with HARK on Ubuntu 12.04. I proceeded as stated on the HARK page and installed all the dependencies and HARK-Kinect, but the Kinect audio device does not appear when I do cat /proc/asound/cards. How can I troubleshoot this?

Please follow the instructions from this link: http://www2.hark.jp/wiki.cgi?page=HARK-KINECT+Installation+Instructions+%28as+a+USB+recording+device%29#p4

If you already tried the installation steps described in the link above and the problem persists, please remove hark-kinect with sudo apt-get --purge remove hark-kinect and try the installation again.

 

Can HARK and OpenNI communicate with Kinect at the same time?

You may find the answer at: http://answers.ros.org/question/48684/kinect-microphone-array-linux-hark-in-combination-with-openni/

Please install HARK-Ros by source compilation. For the different Ubuntu versions, the instructions can be found at: https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

I am doing sound localization using HARK and Kinect. I have downloaded the transfer function from your site and used it in my network. When I tried to execute my network, the sound source localization appears not to work properly in my setup. Even in clean environment, DisplayLocalization continuously shows many directions and does not respond to the sound source. What should I do?

Please try the following:
1. The THRESH parameter of SourceTracker increases/decreases the number of localization results. Set it around 40.0, since the Kinect has only 4 microphones.
2. Set the NUM_SOURCE parameter of LocalizeMUSIC to 1.
3. The sound source should be positioned 1 meter away and at the same height as the Kinect.

 

I examined your sample separation networks for the Kinect and found out that your (offline) localization and separation works far better than mine. I noticed that your multichannel recording kinect_20words.wav hardly contains any background noise compared to mine. So I am curious about the conditions under which your recording was made, mainly the distance of the speakers from the Kinect and the kind of room where recording was performed, e.g. silent room or recording studio etc.

As you guessed, there was no particular noise source when the wave file was recorded. It was recorded not in an anechoic room but in a normal room. The specification of the recording is as follows:
Size of the room: 7 m * 4 m
Reverberation time: 0.2 [sec]
Wall material: low-reflection material in 3 directions, and a glass wall in one direction
Noise source: none in particular, but there should be low background noise from an air conditioner and PCs
Distance of speakers from the Kinect: 1 [m] (2 people: 0 and 45 degrees in azimuth)
Location of the Kinect: on a tripod at the center of the room; the height of the Kinect from the floor was about 1.5 [m]
Volume of the speakers: normal voice

 

I would like to do speech recognition using HARK-Kinect and HARK-ROS but my sound card is not recognized. What should I do to fix this?

Please check the following:
1. p7zip-full must be installed prior to HARK-Kinect installation.
2. Restart your computer after installation.
3. Please use USB 2.0, since USB 3.0 is not yet fully supported.
4. If it still doesn’t work, try to reinstall HARK-Kinect.

 

How can I use multiple microphone arrays?

You can run a HARK process for each microphone array and combine them using ROS, socket connections, or other communication protocols.
Please refer to the following for more information: http://www.hark.jp/document/hark-cookbook-en/subsec-Advanced-003.html

 

Is it possible to use two microphone arrays in HARK?

There is a limitation in HARK that only one microphone array can be connected in one network file. But you can execute two networks at the same time and read data from different microphone arrays.

 

If the origin is the center of the microphone array, can you place a microphone at the origin?

Yes, you should be able to place one there.

 

If I convert a monaural wav file into 4 channels and use it for speech recognition, will it affect localization and separation but not the recognition results?

Four channels with exactly the same waveform mean that the four microphones are at exactly the same position. Therefore, when such data is separated with transfer functions such as those for the PS Eye or Kinect, whose microphone arrangements are different, sound source localization/separation may fail, possibly distorting the sound. This distortion may lower the speech recognition rate.

 

I am trying to execute my network, but an error occurs. What should I check to ensure that my network will run successfully?

Please ensure that:
1. The INPUT and OUTPUT of your network are configured correctly
2. The nodes are used correctly. Please refer to the following link for information about each node and how to use it:
http://www.hark.jp/document/hark-document-en/

 

How do I add my own nodes in the HARK Windows version?

Adding your own nodes is not supported in the Windows version of HARK.

 

When will it be difficult for HARK to recognize voice?

If the voice is too quiet, sound localization itself cannot be done; in that case it is better to lower the THRESH parameter. Other methods can be found at this link:
http://www.hark.jp/document/hark-cookbook-en/sec-SomethingIsWrong.html

 

Can I install HARK-OPENCV on Windows 8? I already run HARK Designer, but I can’t find instructions on how to install HARK-OPENCV on Windows 8.

Currently we do not support HARK-OpenCV on Windows. To use HARK-OpenCV, please use HARK on Linux (Ubuntu 14.04 is recommended).

 

For playing audio, can I selectively route specific channels of a multichannel wavefile to a selected channel at the output?
Example, in the case when the user wants to route channel 3 of a 4 channel wave file to the left channel of the stereo audio playback system (2 channel sound card).

Yes. You can use the ChannelSelector node for this:
http://www.hark.jp/document/hark-document-en/subsec-ChannelSelector.html
Connect it between AudioStreamFromWave and PlayAudio (INPUT_MULTI_CHANNEL), and set the parameter to <Vector<int> 3 4>.
Then, the output of ChannelSelector has two channels (3 and 4).

 

Are there any samples or instructions available on how the default routing of channels is being assigned?

The PlayAudio document below is the only information provided:
http://www.hark.jp/document/hark-document-en/subsec-PlayAudio.html
However, since libsdl is used for the implementation, its documentation may help you find the default behavior: https://www.libsdl.org/

 

Can I use PlayAudio to play the four channels simultaneously?

No. The current version of HARK supports up to 2 channels per sound card.

 

PSEye has one-dimensional microphone placement. For this case, is it appropriate to measure at 360 degrees or 180 degrees?

It should be measured at 180 degrees.

 

When trying to manually install Harktool4 on an RPi3 (Step 7: Harktool4-cui installation instructions), I get the following error:
No package 'libharkio2' found
How can I fix this error?

HARK Tool is necessary ONLY FOR CREATING transfer functions. If creating a transfer function on the RPi3 itself is not important to you, HARK will work if you create the transfer function file on a PC and copy it to the RPi3. Usually, a transfer function file is created only once per microphone array device or acoustic environment (e.g., room).

==== Case 1 ====
If there is no problem with creating a transfer function on a PC, you do not need to do Steps 7 to 9. For example, if you have an Ubuntu machine (native or a virtual machine such as VMware), you can create a transfer function file for the RPi3 on the PC by installing the tools with sudo apt-get install harktool4 harktool5 and using harktool5. Then just copy the created transfer function file to the RPi3.

Because the HARK programs were completely installed through Steps 1 to 5, you can ignore the succeeding steps: Step 6 is Julius (ASR), Steps 7 to 9 are HARK Tool, and Step 10 is KaldiDecoder (ASR).

==== Case 2 ====
If you want to create a transfer function file on the RPi3, please try the following, because the error depends on the version of Boost:
> configure: error: invalid value: boost_major_version=

1. Try to match the version of Boost included in Ubuntu’s LTS releases.
For your information:
Ubuntu 14.04 uses 1.54.0
Ubuntu 16.04 uses 1.58.0
You can get all versions of Boost from the official site:
http://www.boost.org/users/history/
2. Then try to compile libharkio2 and HARK Tool4 again.

3. If it does not improve, the following may be helpful:
https://github.com/cruppstahl/upscaledb/issues/59
Please download the boost.m4 file from this git repository and update your copy; it is located in the directory named m4. Then run autoreconf in the libharkio2 directory.
4. Then try to compile libharkio2 and HARK Tool4 again.

 

In the HARK Documentation, RASP-LC is mentioned. How much does it cost and where can I purchase one?

The following URL is the product page of RASP-LC: http://www.sifi.co.jp/system/modules/pico2/index.php?content_id=5&ml_lang=en
You can download the brochure and contact the company.

 

I am doing speech recognition using HARK with RASP-ZX. When I execute the network, it stops immediately. What should I do?

Please check if you have set the parameters of AudioStreamFromMic node:
DEVICETYPE = DS
DEVICE = RASP

 

I am using a RASP as the microphone. I set the sampling frequency for the signal measurement to 16 kHz. The transfer function generated with this setting will not be able to cope with sounds of 8 kHz or more. Is that right?

Yes, that is correct. Since the obtained information only goes up to 8 kHz (the Nyquist frequency for 16 kHz sampling), only the characteristics up to that point can be expressed.

 

How should I do speech recognition using a monaural or stereo wav file?

Please skip localization and separation, because sound localization and sound separation are implemented using multiple microphones and their transfer functions. In your network, find the input of the node to which the output of GHDSS is connected, delete that connection, and instead connect it directly to the output of the MultiFFT that is connected to AudioStreamFromWave.

 

Can I use HARK to make a single multichannel wav file from the multichannel recording set (sep_0, sep_1, etc.)?

Yes. If you connect AudioStreamFromWave and SaveWavePCM directly, the saved wave file has multiple channels.
Technically speaking, if the input type is Map<int, Vector<float>>, each saved wave file has a single channel, and its suffix (_0, _1) is the corresponding key of the map. If the input type is Matrix<float>, the saved wave file has multiple channels.

 

I have used two microphones for sound localization, positioned differently. Why is it that the sound recorded by the two microphones is identical?

The current HARK supports only one AudioStreamFromMic per network file. Since the two outputs of the nodes share memory, the two recordings can be identical.

 

I would like to use HARK with TAMAGO. How do I test if it is already connected?

Please refer to the instructions on how to record in the Cookbook to confirm that your microphone is connected.

 

Where can I get information about the composition of the Multiface equipment made by RME?

Please see the information about Multiface from the following page: http://hark.jp/document/hark-document-en/sect0047.html

 

I use a webcam (Logitech HD Webcam C910) that has a stereo microphone. I use HARK-ROS to execute networks from the terminal with batchflow. But when I run, the following error occurs:
Missing Parameter: A_MATRIX
What can I do to resolve this issue?

Since the LocalizeMUSIC node detects the direction of the sound using a transfer function, you need to give the path to that file in the A_MATRIX property of this node.

If you use one of our supported microphone arrays, you can download its transfer function from https://www.hark.jp/document/supported/

Otherwise, you can measure and create a transfer function file for your webcam. This video explains what the transfer function is and how to measure it: https://www.youtube.com/watch?v=9v5RUOrkyhw

and this video explains how to create a transfer function file: https://www.youtube.com/watch?v=_Tpn94mPtj4

 

I am trying to install HARK-ROS on Ubuntu 12.04 (precise) for ROS Hydro using the instructions on your page. When I run HARK Designer, no ROS library is visible in it. What should I do?

In HARK Designer, please click Preferences --> Packages.
If you find hark-ros.def and its color is gray, please click the circle button to activate it. Then you will find the HARK-ROS nodes in the node list.

 

I would like to set up HARK-ROS in my HARK Designer, but I didn’t find hark-ros.def in Preferences -> Packages. What should I do?

In HARK Designer, click Preferences --> Packages. The path for hark-ros needs to be added. In the dialog box:
(1) click Add New Variable
(2) set Name to FLOWDESIGNER_PATH and
Value to /usr/lib/flowdesigner/ (where your hark-ros.def exists)
(3) click Apply
Then click Packages again.

If HARK Designer finds the package, hark-ros.def will be listed.

 

I saw hark_turtlebot_follower.cpp in your samples. I am using ROS Hydro, which uses catkin packages. Does a catkin package also exist for this demo, or is there any other demo that uses HARK-ROS?

We support only rosbuild for the turtlebot stack. Since Hydro supports both catkin and rosbuild, you can use the package.

 

I have a problem with publishing and subscribing from HARK to ROS. I tried to do the first tutorial, but I always encounter the error message:
Node not found: RosNodeGenerator

A workaround is to compile from source code. Please follow the instructions at: https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

I am trying to use HARK for sound localization with ROS. I have a difficulty to use it with ROS (indigo) since the provided hark-ros-indigo package is not catkinized. I could not find any information about catkinized hark-ros packages except ‘jsk_hark_msgs’. If you already updated your packages using ‘catkin’, could you tell me where I can download it? If not, could you tell me any suggestion for ROS(indigo)+catkin users?

If the package does not work, you can try the source compilation. Please download hark-ros-stacks-indigo_2.1.0.tar.gz from the following page: https://www.hark.jp/download/source-code/

Unzip it in your HARK_ROS_STACKS directory. Then you can run catkin_init_workspace and catkin_make as usual.

 

I’m currently running Ubuntu 14.04 with the ROS indigo installed and having trouble when executing the network file that includes the ROS module. When loading the toolbox it produces the error:
Toolbox load error: /usr/lib/flowdesigner/toolbox/hark-ros/hark-ros.tlb: undefined symbol: _ZN19dynamic_reconfigure14__init_mutex__E
What should I do to fix this?

If you installed HARK via apt-get, please reinstall the related packages again. If you used hydro before, please uninstall hydro-related packages and install indigo-related packages.
If you installed via source compilation, please recompile and reinstall again. If you are now using indigo, please re-link to the indigo-related libraries.

Another thing to remember is that the recording must be done in a loop; please change the network as follows:
1. Create a new sheet whose type is iterator (named LOOP0 in this case), and put all nodes in it.
2. Put a LOOP0 node in the MAIN sheet.

This document will be helpful: http://www.hark.jp/document/hark-cookbook-en/subsec-LearningHARK-001.html

 

I want to publish a HARK message (localization result) to ROS Indigo, but I got the following error:

UINode.cc line 221: Node not found: RosHarkMsgsPublisher
UINode.cc line 232: Exception caught while creating node_RosHarkMsgsPublisher_2 (type RosHarkMsgsPublisher)
UINetwork.cc line 559: Exception caught while building network MAIN_LOOP
UINode.cc line 232: Exception caught while creating node_MAIN_LOOP_1 (type MAIN_LOOP)
UINetwork.cc line 559: Exception caught

I encountered this while building the network MAIN. Would you please help me solve this?

The error message Node not found: RosHarkMsgsPublisher means that batchflow (the program that executes HARK networks) cannot find the HARK-ROS nodes.
Please confirm that you have installed HARK-ROS successfully. You will find the directory /usr/lib/flowdesigner/toolbox/hark-ros if the installation succeeded.

 

We have been using HARK with ROS Hydro for almost a year without any problem.
Recently, we decided to upgrade all the packages to ROS Indigo. When we tried to install HARK for Indigo we faced some problems. Have you been successful in using HARK with Indigo?

There are no problems with HARK even after the update from Hydro to Indigo.
If you created your own HARK node related to ROS, please recompile and install it again so that your code links against the Indigo libraries.

 

I’m working on integration with ROS. I can already publish a HarkSource message and echo it in a terminal. Now I’m trying to grab the message in my C/C++ code. I have installed the package, but on ROS Hydro it creates a stack like those used in the past with rosbuild.
In my package.xml I have added:
<build_depend>hark_msgs</build_depend>
<run_depend>hark_msgs</run_depend>
In my CMakeLists I have added hark_msgs as usual in find_package and CATKIN_DEPENDS. But it doesn’t find the message package when I build the workspace:
CMake Error at /opt/ros/hydro/share/catkin/cmake/catkinConfig.cmake:72 (find_package):
Could not find a package configuration file provided by "hark_msgs" with any of the following names:
hark_msgsConfig.cmake
hark_msgs-config.cmake
Is there a specific way to use the stack with catkin?

Currently, hark_msgs is built by rosbuild not catkin. Therefore, CATKIN_DEPENDS cannot find the package.

 

Does HARK support ROS Hydro?

Yes. The debian packages are provided for Ubuntu 12.04 and later, or you may download the source code for compilation at:
https://www.hark.jp/download/source-code/

 

When I updated to the new version of HARK, it gives me an error when using with ROS:
[ERROR] [1389458410.955169304]: Client [/rostopic_10111_1389458405985] wants topic /HarkSource to have datatype/md5sum
[hark_msgs/HarkSource/21a049a3207fea2bd7dd556299ef6e38], but our version has [hark_msgs/HarkSource/967ad288a031a049b6e521f5e5c98ed7]. Dropping connection.
This happens even when I try to: rostopic echo /HarkSource. Somehow the msgs seem to have different md5sum.
How can I fix this error?

The version of hark-ros-stacks does not match the version of hark-ros. Please update hark-ros-stacks as well.

 

I am doing the HARK-ROS Tutorials. But I keep getting this error:
UINode.cc line 221: Node not found: RosHarkMsgsSubscriber
or
UINode.cc line 221: Node not found: RosHarkMsgsPublisher
How can I fix this issue?

Since ROS requires some environment variables that are set by setup.sh, HARK-ROS also needs these variables. The error message means that batchflow, the program that executes HARK networks, fails to find the ROS nodes because these variables are not set. Please try running the network from a terminal (where setup.sh has been sourced) to avoid the problem.

 

When and why do we get a message saying “Segmentation fault (core dumped)”? I get this message on the terminal when I use HARK with ROS and specify a low ROS_LOOP_RATE. Is this the reason?

Since the frame rate of the audio processing is 100 Hz, ROS_LOOP_RATE has to be at least 100. Normally, a larger value is better for ROS_LOOP_RATE, because it only affects the sleep time in the module. If the ROS_LOOP_RATE sleep time + the audio processing time exceeds 10 ms, you see the error. Please use the default value.

 

I am trying out the HARK-ROS Tutorial but I keep getting the following error:
UINetwork.cc line 647: No output defined for network
How can I fix this?

This error means that you didn’t set the outputs of RosHarkIntPublisher and RosHarkIntSubscriber. Please do the following:

1) Shift+LeftClick the output ports of both RosHarkIntPublisher and RosHarkIntSubscriber. Then you will see the output port names written in blue letters.
2) Go to the MAIN sheet.
3) Shift+LeftClick the two output ports of LOOP0. Then you will see the output port names written in blue letters.

 

I am trying out the HARK-ROS Tutorial but when I try my network, it crashes instantly. How can I fix this?

Please check the following:
(1) CONDITION can be added by Ctrl + Left-clicking the output port of the Sleep node. You don’t have to add a CONDITION output port by name; just Ctrl + Left-click the output port of the Sleep node, and you will see CONDITION in purple letters, not blue letters.

(2) You don’t have to add OUTPUT2 to RosHarkIntSubscriber. Just Shift + Left-click the output port of RosHarkIntSubscriber. The names of the output/input ports do not have to be exactly the same as in the example on the web; they only have to be unique. You can choose any names you prefer.

 

Is HARK compatible with ROS-Hydro?

Yes. Please see installation instructions at:
https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

Where can I find the installation instructions for HARK in ROS Groovy?

Please find the installation instructions in this link: https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

I’m trying to install the ROS stacks of HARK on my PC, but unfortunately I get errors telling me I have to install the cturtle version of ROS. As I am using ROS Electric, how can I troubleshoot this problem and install it on ROS Electric on Ubuntu 10.10?

The problem comes from the compilation environment of the debian packages: they are compiled with ROS cturtle. For other ROS versions, you need to install from source compilation.
All the details about the source compilation process are available on the following page: https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

I am doing the HARK-ROS tutorials 4 and 5. When I publish the localization result, the following error occurs:
ERROR: Can not load message class for [hark_msgs/HarkSource]
How can I fix this?

In addition to hark-ros-indigo, the definition of hark_msgs also requires the following package:
sudo apt-get install hark-ros-stacks-indigo

 

I have created a network that uses HARK-ROS to send data. But when I execute the network, the following error occurs:
Node_HarkMsgsStreamFromRos_1 constructor end ...
std::exception

What should I do?

Please check that the iterator sheet in your network (LOOP0 in the sample networks) is set as an iterator and not as a subnet.

 

I have installed HARK-ROS. But when I execute my network in HARK-Designer, the following error occurs:
Toolbox load error:
/usr/lib/flowdesigner/toolbox/hark-ros/hark-ros.tlb: undefined symbol:
_ZN19dynamic_reconfigure14__init_mutex__E

How can I fix this?

When multiple versions of ROS are installed, there are many environment variable settings, so you have to select the version you will use. In HARK-Designer, go to Preferences -> Environment; the versions of ROS found by HARK Designer will be displayed. Please select the version to be used.

 

I am using HARK-ROS in Hydro. When I run my network, the following error occurs:
Batchflow: symbol lookup error:
/usr/lib/flowdesigner/toolbox/hark-ros/hark-ros.tlb: undefined symbol

What should I do to fix this?

It seems that HARK-ROS is not installed correctly. Please install HARK-ROS by following the instructions in this link:
https://www.hark.jp/packages/ros/hark-ros-inst-inst/

 

I am trying to run my network containing HARK-ROS nodes. But it seems that HARK-ROS is not able to load the HARK-ROS packages properly. How can I fix this?

HARK-ROS does not work unless the environment variables are set. Please set them in HARK-Designer by pressing:
Preferences button -> Environment button
It should then display the version(s) of ROS installed. Please select one and click Apply.

 

HARK-Fd is required for HARK-ROS compilation. Will I still be able to use it if I replace it with HARK-Titech?

The HARK-Fd header files Source.h and Map.h are required when you compile HARK-ROS. Since HARK-Titech does not include these files, it is necessary to install HARK-Fd.

 

I’d like to install HARK-ROS with rosinstall instead of apt. Are the repositories, such as svn and git, public?

The version control system of HARK is not open to the public at the moment. If your primary purpose is installing with rosinstall, you may install locally and modify the source:
https://www.hark.jp/packages/ros/hark-ros-inst-inst/
You can use it as a local package.

 

What is the difference between HARK-ROS and HARK-ROS-stacks?

HARK-ROS contains nodes used to send to ROS the localization results, feature values, etc., calculated by HARK.
HARK-ROS-stacks is not directly part of the HARK package; it is a module group that defines the HARK-ROS messages for publishing and subscribing. Please use it when you want to handle HARK-ROS messages on the ROS side.

 

How do I set HARK to publish sound source localization messages to ROS at a certain time interval?

Set the TIMESTAMP_TYPE parameter of RosHarkMsgsPublisher to CONSTANT_INCREMENT. The frequency corresponding to the desired time interval will then be used, which you can verify with:
rostopic hz
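For example, assuming the default topic name used by RosHarkMsgsPublisher (an assumption; check the topic name settings of your node), you can check the publishing rate with:

$ rostopic hz /HarkSource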

 

Is there a way by which I can convert hark_msgs to std_msgs?

Since HARK-ROS assumes that hark_msgs is available, it is difficult to use HARK-ROS without hark_msgs.
It may be possible to break hark_msgs down into std_msgs, since hark_msgs contains std_msgs as its fields; see the sketch below.
For this topic, you may want to ask the ROS community, because this is pure ROS message handling.
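As a rough illustration of such a breakdown, the following Python node subscribes to a HarkSource message and republishes each source’s azimuth as a std_msgs/Float32. The topic name and the field names (src, azimuth) are assumptions here; check the hark_msgs definitions installed on your system.

import rospy
from std_msgs.msg import Float32
from hark_msgs.msg import HarkSource   # provided by hark-ros-stacks

# Assumed field names; verify them against your hark_msgs definition.
pub = rospy.Publisher('azimuth', Float32, queue_size=10)

def callback(msg):
    for src in msg.src:                 # one entry per localized source
        pub.publish(Float32(src.azimuth))

rospy.init_node('hark_to_std_msgs')
rospy.Subscriber('HarkSource', HarkSource, callback)
rospy.spin()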

 

Can you help me with integrating HARK with ROSMATLAB I/O? I am using MATLAB 2012b on my Ubuntu 12.04 64-bit system and I am unable to subscribe to hark_msgs. I guess it supports only std_msgs by default. Can you please help?

As for integrating HARK and ROSMATLAB, the problem is how to make ROSMATLAB subscribe to a message type that is not in std_msgs.
We therefore recommend asking the ROSMATLAB community about it.


 

When I run HARK in the Chromium browser on an ARM core, the following error occurs:
ERROR:sandbox_linux.cc(343)] InitializeSandbox() called with multiple threads in process gpu-process.
How can I work around this?

This is a bug in the Chromium browser when running on an ARM core. Please use another browser for running HARK.

 

I would like to clarify the dimensions of the correlation matrices explained in the HARK document. It is explained that the time-averaged correlation matrix:
R'(ω,f) = X(ω,f) X*(ω,f)
is calculated for each of the frequency bins. Is this true? My understanding was that a correlation matrix is a (Number_of_channels x Number_of_channels) matrix.
So, does this mean that the correlation matrix R'(ω,f) is a 3D matrix of dimensions (Number_of_channels x Number_of_channels x 512), i.e. one matrix for each frequency?
Is this the reason why the 3D array later provides the ability to do a broadband integration by ‘integrating’ from ω_min to ω_max?

Yes, you are right. We do the sound source localization in the frequency domain, so the correlation matrix is calculated in each frequency bin, resulting in a 3D complex matrix of size Number_of_channels x Number_of_channels x 257. (We only calculate the first half of the frequency bins since the latter half is symmetric; thus 512 is reduced to 257.)

We first calculate the MUSIC spectrum for each bin and then take the summation from ω_min to ω_max.
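As a minimal numpy sketch of this idea (illustrative only, not HARK’s LocalizeMUSIC implementation; X is assumed to be the multichannel STFT and H a set of steering vectors):

import numpy as np

def correlation_matrices(X):
    # X: STFT, shape (M, F, K) = (channels, frames, bins), e.g. K = 257
    M, F, K = X.shape
    R = np.zeros((M, M, K), dtype=complex)
    for k in range(K):
        Xk = X[:, :, k]
        R[:, :, k] = Xk @ Xk.conj().T / F   # time-averaged R'(w, f) per bin
    return R

def broadband_music(R, H, n_src, k_min, k_max):
    # H: steering vectors, shape (M, D, K) for D candidate directions
    M, D, K = H.shape
    P = np.zeros(D)
    for k in range(k_min, k_max):
        w, V = np.linalg.eigh(R[:, :, k])   # eigenvalues in ascending order
        En = V[:, :M - n_src]               # noise subspace
        for d in range(D):
            h = H[:, d, k]
            P[d] += abs(h.conj() @ h) / abs(h.conj() @ (En @ (En.conj().T @ h)))
    return P                                # MUSIC spectrum summed over bins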

 

I am making an offline implementation of the MUSIC-SEVD algorithm, and for this I need a transfer function file to read the H(θ,ω) for calculating the MUSIC spectrum. Is there a better or alternative way of getting a transfer function file? Is it possible for me to get the values of a standard transfer function (like kinect_loc.dat) so that I can use it directly in my Python code for offline MUSIC-SEVD?

You can extract H(θ,ω) by examining the file format and reading the file byte by byte. Currently we don’t have any Python code for this purpose.
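As a purely illustrative sketch of byte-level reading in Python, the header size and array shape below are placeholders, not the real format; determine them from the transfer-function file format description in the HARK documentation:

import numpy as np

HEADER_SIZE = 0        # placeholder: actual header length of the file format
D, M, K = 72, 8, 257   # placeholder: directions x channels x frequency bins

with open('kinect_loc.dat', 'rb') as f:
    f.seek(HEADER_SIZE)
    # little-endian complex64 values, read in one block
    H = np.fromfile(f, dtype='<c8', count=D * M * K).reshape(D, M, K)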

 

What is the unit of the outputted MUSIC spectrum (power)?

It is difficult to give physical significance to the value of the MUSIC spectrum.
Please see formulas (15) and (16) in the document: http://www.hark.jp/document/2.3.1/hark-document-en/subsec-LocalizeMUSIC.html

 

The power estimated by the MUSIC method is not the exact power of the sound arrival direction, I suppose. Does the difference between the two estimated powers (let PA be the power of direction A, PB the power of direction B, and PA - PB their difference) have any significance besides the magnitude relation?

It is difficult to assign a specific physical unit to the value of the MUSIC spectrum.
Please see formulas (15) and (16) in the document: http://www.hark.jp/document/2.3.1/hark-document-en/subsec-LocalizeMUSIC.html

Theoretically, the MUSIC spectrum estimated for a sound source direction should diverge to infinity. In reality, however, a high but not infinitely large peak is obtained due to noise. The idea of the MUSIC method is that a high peak in a sound source direction makes detecting that direction easy. Therefore, the difference between two power values has no specific meaning beyond serving as a guide to decide which peak corresponds to the sound source with higher power.
The MUSIC spectrum is correlated with the power: the larger the power, the larger the MUSIC spectrum value. If you need the actual power, it is better to use the delay-and-sum beamforming method of the BeamForming node.

 

How can I install an older version of HARK using sudo apt-get install?

You can download the source code of the older version of HARK and compile it. Please see this link for the source code:
https://www.hark.jp/download/source-code/

 

I want to use HARK with my robot. Do I need to create my own Capture Server?

It is not necessary to create one. You capture sound using HARK’s own nodes: there is an AudioStreamFromMic node, and you only have to install HARK to use it.

 

How do I specify the sound source file when using Constant and InputStream node?

You need to specify the filename of the sound source file as a parameter of the Constant node. The InputStream node will then open this file.

 

In which coordinates are the azimuth and the elevation obtained?

The azimuth and the elevation are defined with respect to the Cartesian coordinate system.
Please see Figure 4.3: HARK standard coordinate system in the following URL:
http://www.hark.jp/document/hark-document-en/sect0025.html

 

In HARK, can I convert the data types as follows?
1. Matrix<…> to Matrix
2. Map to Matrix

There is no node that performs type conversion, so you need to implement it on your own.

On Linux, you can use HARK-Python. You can download a sample of HARK-Python from this link:
https://www.hark.jp/download/samples/

 

While trying to create a noise file for suppressing noise, the real part (NOISEr.dat) is created, but the imaginary part (NOISEi.dat) is never created. Why is this so?

Since HARK 2.1.0, the file format has been changed.
Currently, CMSave/CMLoad, which save and load the noise, use only FILENAMER.
http://www.hark.jp/document/hark-document-en/subsec-.html
The format is zip:
http://www.hark.jp/document/hark-document-en/sect0032.html

 

I do not know what the format of the .flt files actually is. The documentation describes a function for reading flt files, but it is seemingly unreadable as it runs off the end of the page. Can you tell me what goes into an flt file (is it a raw reading from the microphone, or a processed result?) and what the format of the data in the file is (binary? raw text?)

“flt” is just a raw file of 32-bit real floats in little-endian byte order, with no header.
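For example, in Python with numpy:

import numpy as np

# Read an flt file: raw 32-bit little-endian floats, no header.
data = np.fromfile('input.flt', dtype='<f4')

# Write data back in the same format.
data.astype('<f4').tofile('output.flt')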

 

I have a question on .flt files. I understand that it is a file stored as 32-bit little-endian floating-point binary. How can I save in this format or convert to this format?

The .flt format is equivalent to what is called the .raw format, without a header. Please try using sox to convert to this format.
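For example, to convert a wav file to flt with sox (option details may differ slightly between sox versions):

$ sox input.wav -t raw -e floating-point -b 32 -L output.flt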

 

What is flt format and how to convert to this format?

Please see information on the flt format in the HARK Document:
http://www.hark.jp/document/2.0.0/hark-document-en/sect0029.html
http://www.hark.jp/document/2.0.0/hark-document-ja/sect0032.html
http://www.hark.jp/document/hark-document-en/sec-formats.html

 

I would like to understand the basic concepts of the interfaces between nodes. Is there a document which explains this?

For node implementation, please check the recipes on the Web:
* Creating a node: http://www.hark.jp/document/hark-cookbook-en/subsec-Advanced-001.html
* Connecting HARK to the other systems: http://www.hark.jp/document/hark-cookbook-en/subsec-Advanced-003.html

HARK is based on FlowDesigner and uses BatchFlow as middleware. The documents on FlowDesigner might help for a deeper understanding of the software.
https://sourceforge.net/p/flowdesigner/wiki/FlowDesigner_User%27s_Guide/#Tutorials_

 

I tried the sound localization sample from the HARK Cookbook. When I click “Execute” in HARK-Designer, it does not show the sound localization result window, and the following error occurs:
GHDSS: TF_CONJ_FILENAME is not set.
How can I fix this error?

Please convert sep_tf using harktool-conv-tf with the “-s” option. Then open your network file in HARK-Designer, open the property dialog of the GHDSS node, and put the name of the converted file in the TF_CONJ_FILENAME parameter.

 

My goal is to send the sample values of a wav file over a socket. To do so I am using HARK’s ‘HarkDataStreamSender’ node; the receiver is ‘RTBOS’.
For testing purposes, I created my own stream of samples with just the two possible values 1 and 0.5:
If I use a ‘char* buffer’ to read from the stream I get this representation:
sending -> receiving ( printing with 'printf( " %d", buffer[i])' )
1 -> 0 -2 -1 70
0.5 -> 0 0 -128 70

If I use a ‘float* buffer’ to read from the stream I get this representation:
sending -> receiving ( printing with 'printf( " %f", buffer[i])' )
1 -> 32767.000000
0.5 -> 16384.000000

As you already have a working connection from HARK to Julius, could you please tell me how to convert the above values to the right ones?

Since the waveform is sent as float, treating it as a float is the correct way (see http://hark.jp/document/hark-document-en/subsec-HarkDataStreamSender.html).

You can get the “correct” data by dividing the values by 32768, i.e.:
printf(" %f", buffer[i] / 32768.0f);
Note that the result is not exactly 1.0 for an input value of 1, since 1 is transmitted as 32767 (32767 / 32768 ≈ 0.99997).

 

I am using HarkDataStreamSender to send the result of the sound localization. How can I process this result using Python? The result looks like garbled characters.

Since the information sent from HARK is binary, your Python script will receive a binary string (the str type in Python 2.x) from the socket:
https://docs.python.org/2/library/socket.html
You can parse it with the struct module:
https://docs.python.org/2/library/struct.html

Since the received data is just a binary string, it must be parsed field by field into a structure.
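A minimal sketch of the receiving side is shown below. The unpacked fields are placeholders only; take the actual HD_Header layout from the HarkDataStreamSender section of the HARK document.

import socket
import struct

PORT = 5530   # set this to the PORT parameter of your HarkDataStreamSender

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', PORT))
server.listen(1)
conn, addr = server.accept()

data = conn.recv(4096)
# Placeholder: unpack three little-endian 4-byte integers from the head of
# the buffer. Replace '<3i' with the real HD_Header field layout.
f1, f2, f3 = struct.unpack('<3i', data[:12])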

 

We would like to use HARK with Raspberry Pi 2 for a communication robot. In the installation of HARK-Designer, an error occurs. How should I proceed?

Since HARK-Designer is not necessary to run a HARK network, you need not install it.


I was able to run HARK-Designer before but when I updated Ubuntu, I cannot run it anymore. What should I do?

Please reinstall HARK-Designer by following the instructions at:
https://www.hark.jp/document/packages/hark-designer-en/harkdesigner.html

 

I have installed HARK in Ubuntu 12.04 (Precise) but when I execute “hark_designer” the following error occurs:
ATTENTION: default value of option force_s3tc_enable overridden by environment.
[libprotobuf ERROR ../../third_party/protobuf/src/google/protobuf/message_lite.cc:123] Can't parse message of type "in_memory_url_index.InMemoryURLIndexCacheItem"

What should I do to proceed?

This is an error at the start of Chromium. You can ignore it and proceed by accessing this address in Google Chrome or Firefox:
http://127.0.0.1:3000 or http://localhost:3000

 

I am trying to install the HARK OpenCV package, but it says that it cannot find Map.h in ImageStreamFromRos.cc. How can I fix this issue?

Please add the --with-hark-inc option when configuring. Follow the installation instructions at this link:
https://www.hark.jp/packages/opencv/

 

We are considering inputting the result of SemiBlindICA into MIC_WAVE, one of the inputs of HarkDataStreamSender. Is there a way to connect them by combining nodes?

Unfortunately, there is no node which performs such a connection. You may use CombineSource if you want to combine sound source localization results.

You may also use HARK-Python to write code which takes two matrix_float inputs and gives one matrix_float output. More information on HARK-Python is available at these links:
https://www.hark.jp/packages/opencv/
https://www.hark.jp/document/document-en/

 

When creating the Localization Transfer Function using harktool, what does the input, “Direct Sound Length [sample]” mean?

It is the number of samples to consider as the direct sound.

An impulse response can be divided into the direct sound, which arrives straight from the loudspeaker, and reflected sounds (e.g., reflections from a wall). The time at which the direct sound reaches the microphone is mainly determined by the distance between the loudspeaker and the microphone. This information is used for sound source localization, which is why this option is only available when creating the sound source localization transfer function.

If the number of samples considered as direct sound is set too large, they may already include reflected sound. If it is too small, there is a possibility that not all channels of the microphone array have recorded the direct sound.

The default value of Direct Sound Length [sample] is 32 samples. At the common 16 kHz sampling rate, 32 samples correspond to 2 ms, in which sound travels approximately 70 cm (2 ms × 340 m/s ≈ 0.68 m). If the microphone array is smaller than 70 cm and more than 70 cm away from the nearest wall, then, in theory, only the direct sound is recorded in all channels within the considered samples.

 

My recorded sample has a reflected sound that is louder than the direct sound. How can I set the last index so as not to include the part of the sample containing the reflected sound?

When generating the impulse response list file or TSP response list file, there is an input for “Cut Index [sample]” in harktool4 or “Peak Search From/To [sample]” in harktool5.
http://hark.jp/document/hark-document-en/subsec-HARKTool-ImpulseResponse.html
The ending position of this input can be used to ignore (delete) the reflection: by setting the end index to a value smaller than the index where the reflected sound is found, the samples containing the reflected sound are excluded from the transfer function computation.
However, if this still does not work, re-recording is recommended.

 

When creating the TSP response list file, what does the input, “TSP Offset [sample]” mean?

This is the number of samples, counted from the start of the recorded wav files, to discard from processing. The default of 16384 is equal to the length of one TSP sample in the 16384.little_endian.wav file.
This default basically means “discard the first 16384 samples of the wav file (one TSP sample)” when generating the transfer function.
There are cases where all the TSP recording samples are needed, or an error will occur. In these cases, “TSP Offset [sample]” should be set to 0.

 

I have installed HARK in Windows 10. When I run HARK, the following error occurs:
Server startup failed
module.js:474
    throw err;
    ^
Error: Cannot find module 'express'

How can I fix this?

It seems that the dependency packages necessary for starting HARK-Designer are not installed. Please install them according to the instructions found at this link:
https://www.hark.jp/install/windows/

 

How do I install HARK?

The installation procedure can be found at:
https://www.hark.jp/install/linux/

Information on the supported environments is also found at that link.
Note that KaldiDecoder is not installed by the above procedure:
http://www.hark.jp/document/hark-document-en/subsec-KaldiDecoder.html

Please refer to “6.8.2.5 Installation method”. You can install it via:
sudo apt-get install kaldidecoder-hark

 

I am implementing speech recognition. I have observed that it picks up noise and does not recognize speech. What can I do to improve this?

Currently, HARK does not provide a module for discriminating between noise and voice. However, a noise rejection function is implemented in the speech recognizer Julius.

Please read more about Julius from this page:
http://julius.sourceforge.jp/index.php?q=doc/gmm.html

 

What is KaldiDecoder for?

KaldiDecoder is a speech recognition decoder developed for use with HARK. Please see more details at:
http://www.hark.jp/document/hark-document-en/subsec-KaldiDecoder.html

 

I installed HARK-Kinect from the package repository. I tried everything with both Ubuntu 10.04 LTS and Ubuntu 11.10 (both 32-bit).
First, I noted that the device doesn’t show up among the recognized ALSA devices (/proc/asound/cards). Then I tried what happens if I call kinect_upload_fw manually (as in /lib/udev/rules.d/55-kinect_audio.rules).
This gives an error:
$ sudo /usr/sbin/kinect_upload_fw
/lib/firmware/kinect/UACFirmware.C9C6E852_35A3_41DC_A57D_BDDEB43DFD04
libusb:error [op_set_configuration] failed, error -1 errno 110
About to send: 09 20 02 06 01 00 00 00 60 00 00 00 00 00 00 00 15 00
00 00 00 00 00 00
libusb:error [submit_bulk_transfer] submiturb failed error -1 errno=2
Error: res: -1 transferred: 0 (expected 24)

What should I do?

It seems that the following packages have not been installed on your machine:
sudo apt-get install libusb-dev libusb-1.0-0-dev pulseaudio

After doing this, please reinstall HARK-Kinect.

 

What is the HARK license agreement?

The licenses of HARK packages can be found at:
https://www.hark.jp/packages/package-list-and-license/
The licenses include specific terms of use.

 

Is there a way to know what the dimensions of the unit sphere are (e.g., whether x 1 y 0 means one and a half meters) instead of measuring by hand?

The transfer function is measured at around 1.5 m from the microphone array. However, since the function changes only a little along the distance axis, the existence of a localization result does not mean that the sound exists at exactly 1.5 m from the microphone array. It indicates only the direction.

 

I’m using the DisplayLocalization or PyDisplayLocalization nodes, and I have even implemented some Python code to display my results. I was wondering if there is a way to keep the localization window open at the end of the simulation, because the window disappears when the simulation ends.

For PyDisplayLocalization, you can add “pylab.savefig(FILENAME)” to the corresponding Python script to save the current plot. Please see the information on this function at:
http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure.savefig

Another solution is to use SaveSourceLocation. You can save the localization results in a text file and plot them as you like. Please see the document on this node at:
http://www.hark.jp/document/hark-document-en/subsec-SaveSourceLocation.html

 

I am using HARK with TAMAGO. When I run my network, buffer overrun occurs. Below are the relevant parameters of my network:
LENGTH: 512
SAMPLING_RATE: 16000
ALSA is used for the device driver.
The number of channels is 8.

What could be the problem?

Please check that the WINDOW_TYPE parameter of LocalizeMUSIC is set to FUTURE (the default value) to avoid this error.

 

Is HARK compatible with Mac OS X?

Unfortunately, there is no HARK package available for Mac OS X because its microphone input library (CoreAudio) is not yet supported in HARK. Although you can compile from source code, the microphone input will be a bottleneck. You may use a Linux virtual machine on the Mac to use HARK for analyzing audio files, though not for real-time processing of microphone input.

 

I’m having trouble connecting a Microcone through HARK for speech recognition purposes using PocketSphinx. Until now I have followed this thread ( http://sourceforge.net/p/cmusphinx/discussion/help/thread/b2a9d3bb/ ) for decoding the raw data, but parsing the sent data fails before passing it to PocketSphinx.
I have this error:
srcinfo_data = srcinfo.unpack(data)
struct.error: unpack requires a string argument of length 20
What should I do?

As written in the HARK documentation:
http://www.hark.jp/document/hark-document-en/subsec-HarkDataStreamSender.html
the format of the sent data depends on the network connections. Please check the input connections of HarkDataStreamSender.

For all connection patterns, the data begins with HD_Header, so please parse HD_Header first. It is recommended to dump the data to a file, open it in a binary editor, and parse it manually.

 

I tried recording in HARK-Designer using a Microcone in an Ubuntu environment, but I couldn’t hear anything in the output file. What could be wrong?

Please check the following:
1. The microphone volume is not 0.
2. The sound card is recognized.
3. The Microcone is recognized.
4. The settings of the Microcone, by looking at the tutorial at: https://www.hark.jp/install/windows/microcone/

 

I tried recording in HARK-Designer using a Microcone in an Ubuntu environment, but the recorded sound seems deteriorated. What could be the cause?

Please check if the CHANNEL_COUNT of AudioStreamFromWave is set to 7, since the Microcone has 7 channels.

 

I am using HARK in Ubuntu 12.04 with a Microcone. When I run my network, the following error occurs:
This sound device does not have any playback controls
How can I fix this?

There is a bug in some earlier versions of PulseAudio, including the one shipped with Ubuntu 12.04, that causes the microphone channel gains to be set to zero. This issue is not isolated to the Microcone device.
It is confirmed that the issue can be fixed by removing PulseAudio from your system. Please note that this means PulseAudio components, such as the sound volume icon, will no longer appear.
To do this:
1) Remove PulseAudio: sudo apt-get autoremove pulseaudio
2) Reboot
3) Install Gnome ALSA Mixer if not already present: sudo apt-get install gnome-alsamixer
4) In Gnome ALSA Mixer, go to the USB mixer panel for input devices and check the boxes to enable recording.

 

I would like to use two microphones for sound source localization. I created a network with two AudioStreamFromMic nodes and MergeChannel, but I got an error when running:
Trying to cast ObjectRef
(N2FD6MatrixIfEE) into N2FD6VectorIfEE.

Do you have any idea how to merge 8 channels from two different microphones?

Currently, HARK cannot accept two AudioStreamFromMic nodes in one network file.
Moreover, if you use two different USB interfaces, the two recordings cannot be synchronized completely sample-by-sample. Since many microphone array processing methods, including the ones used in HARK, assume this level of synchronization, the mismatch may degrade performance. (A research field called distributed microphone arrays is tackling this problem.)
For merging two microphone arrays, two options are suggested:
(1) Hardware: develop hardware that merges the two microphones into one 8-channel microphone array.
(2) Software: if you use the ALSA device driver on Linux, you may merge the two arrays by configuring ALSA appropriately (e.g., http://alsa.opensrc.org/TwoCardsAsOn).

 

I have enabled the DEBUG parameter of LocalizeMUSIC. It outputs something like this:
0: -0.87 -0.50 0.00 0.00 1: -0.50 -0.87 0.60 0.00
MUSIC spectrum: 0.000000 0.000000
What do these numbers mean?

0: -0.87 -0.50 0.00 0.00 1: -0.50 -0.87 0.60 0.00
MUSIC spectrum: 0.000000 0.000000
The first line shows the result of the peak search on the MUSIC spectrum: the directions of the top two peaks (as 3-dimensional vectors) followed by their powers. The second line is the MUSIC spectrum itself, with one value for each steering-vector direction contained in the transfer function (e.g., 72 values if the transfer function contains 72 directions).

 

I want to visualize the MUSIC spectrum using another tool (e.g. MATLAB). Do I need to individually extract the numerical values of the MUSIC spectrum displayed by my localization network? Is there a way to output the MUSIC spectrum data to a file?

There are several ways to do this:
(1) If you just want to visualize the MUSIC spectrum, use HARK-Python and its node called plotMusicSpec.
(2) Execute your localization network on the command line and redirect the output to a file:
batchflow localize.n > log.txt
Afterwards, take out only the lines starting with “MUSIC spectrum:”, remove the “MUSIC spectrum:” prefix, and convert the spaces to commas to make the file a CSV, as in the sketch below.
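A short Python sketch of that post-processing:

# Keep only the "MUSIC spectrum:" lines of log.txt, drop the prefix,
# and turn the remaining space-separated values into CSV rows.
with open('log.txt') as fin, open('music.csv', 'w') as fout:
    for line in fin:
        if line.startswith('MUSIC spectrum:'):
            values = line[len('MUSIC spectrum:'):].split()
            fout.write(','.join(values) + '\n')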

 

Is the origin of the microphone array file the center of the microphone array or the position of the sound source?

As described in Table 5.7 (microphone position tag) at the following URL, the center of the microphone array is the origin.
http://www.hark.jp/document/2.0.0/hark-document-en/subsec-MICPOSFORMAT.html

 

How can I start or stop the execution of a network from another application?

Just like HARK-Designer does, you can control starting and stopping the execution of a network through HTTP GET requests.
To execute network:
http://localhost:3000/execution/YOURNETWORK.n
To abort:
http://localhost:3000/abort
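For example, from another application or a script you can issue these GET requests with curl:

$ curl http://localhost:3000/execution/YOURNETWORK.n
$ curl http://localhost:3000/abort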

 

The format of a file output by HARK is not recognized correctly. Is this alright?

File format detection in HARK-Designer is only partially implemented. Even if a file is shown as an unknown file type, there will be no problem when the network is executed.

 

I am creating my own node. Is it necessary to compile it?

HARK nodes are implemented in C++, so it is necessary to compile them.

Alternatively, HARK has a package called HARK-Python that can run Python code in nodes. With it, compilation is unnecessary.

 

Is there a node in HARK for acquiring pitch?

Unfortunately, there is no node for pitch extraction. However, free software for pitch extraction is available.

 

I am trying to send data to a Python program using HarkDataStreamSender in Windows. But when I execute my network, the following error occurs:
Can not find the host

How can I fix this?

The “Can not find the host” error means that the HarkDataStreamSender node was not able to connect to the destination host or port.

Please do the following:
1. Check the properties of HarkDataStreamSender.
2. Check the host and port of the Python program that receives the data.
3. Make sure that no security software is blocking the port.
4. Check whether other nodes in the network file conflict with HarkDataStreamSender.

 

To improve the localization, we use the information extracted from the binaural signal AND the information about the microphone positions (the microphones can move) in a filtering strategy. So I wonder how the AudioStreamFromMic node works; I would like to get an idea of the “real time” during sound acquisition. I read in the Cookbook (p. 7):
Recording time = (LENGTH + (number of frames - 1) * ADVANCE) / SAMPLING_RATE
(I used to work with non-overlapping signals, so for me LENGTH = ADVANCE.)
But is there any difference between the current frame and real time? What happens if the calculation time of one HARK loop step is larger than LENGTH / SAMPLING_RATE? Do we lose some data? In other words, let k = 0 to number_of_frames. Can I say that for any step k, the time from the simulation start (the real time) is equal to recording_time(k)?

Firstly, the equation:
Recording time = (LENGTH + (number of frames - 1) * ADVANCE) / SAMPLING_RATE
expresses the relationship between the HARK parameters and the duration of the recorded signal; it tells you how long you can record using a particular set of parameters.

Here, real time means that the computation time is less than the input duration. Therefore, if HARK works in real time, one loop step finishes in less than LENGTH / SAMPLING_RATE (e.g., 512 / 16000 = 32 ms for LENGTH = 512 at 16 kHz). If you put a very heavy node in the loop and the computation time exceeds LENGTH / SAMPLING_RATE, you will sometimes lose data because the delay accumulates.

 

I was able to do sound source localization using HARK and saved the data using SaveSourceLocation. To confirm that the data is correct, I used LoadSourceLocation, but the network terminates immediately. How can I reproduce the sound source localization result?

The output of SaveSourceLocation is in XML format. It is better to plot the results with an appropriate library that can read XML, as in the sketch below.
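For example, with Python’s standard xml.etree module. The tag and attribute names below are placeholders; open a saved file in a text editor first and adapt them to the actual schema.

import xml.etree.ElementTree as ET

tree = ET.parse('localization.xml')
for frame in tree.getroot():
    # Placeholder attribute names; adapt to the real SaveSourceLocation output.
    for src in frame:
        print(frame.get('nFrame'), src.get('id'), src.get('azimuth'))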

 

I am trying out speech recognition using HARK in Windows, but I encountered an error saying that the socket connection to Julius was not established. What should I do?

Please check the following:
(1) On Windows, security software may block internal communication (port 10500 on localhost).
(2) Since the socket connection is established on every run, Julius (the server) needs to be started before HARK.

 

Can I connect HARK with Google’s or Azure’s speech recognition?

Unfortunately, there is no documentation on how to do this. But since HARK can save separated sounds to wav files (see http://www.hark.jp/document/hark-cookbook-en/subsec-Localization-010.html), you can use those speech engines by uploading the saved files.

 

In receiving acoustic signals over TCP/IP using AudioStreamFromMic, we have to set DEVICE_TYPE to NETWORK and DEVICE to the IP address. How do you specify the port number? Is there a format to be used?

For the port numbers, the default values are 52920 for the control signal and 52921 for data transmission.

You can directly specify the IP Address:
127.0.0.1

You can also specify the IP address, control port, and data port, in this order, separated by colons:
127.0.0.1:52920:52921

 

What should I do to receive acoustic signal over TCP/IP using AudioStreamFromMic node?

Please be sure to set up the following:
1. AudioStreamFromMic (DEVICE_TYPE = NETWORK) works as a client, and your audio sender works as a server.
2. You need to create two sockets: one for control (port number 52920) and the other for data (port number 52921).

Before sending data, your program needs to:
1. receive 100 and send 101 on the control socket to establish a connection;
2. receive 200 and send 201 to start data sending.

When sending data, you can send it the way written in the HARK document:
http://www.hark.jp/document/hark-document-en/subsec-AudioStreamFromMic.html

After sending data, your program needs to:
1. receive 300 and send 301 to stop recording;
2. receive 900 and send 901 to close the connection.
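A rough Python sketch of such a server is shown below. That the control codes are exchanged as 4-byte little-endian integers is our assumption; verify the wire format against the AudioStreamFromMic section of the HARK document.

import socket
import struct

def recv_code(conn):
    # Assumption: each control code is a 4-byte little-endian integer.
    return struct.unpack('<i', conn.recv(4))[0]

def send_code(conn, code):
    conn.sendall(struct.pack('<i', code))

ctrl_srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ctrl_srv.bind(('', 52920))
ctrl_srv.listen(1)
data_srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
data_srv.bind(('', 52921))
data_srv.listen(1)

ctrl, _ = ctrl_srv.accept()   # AudioStreamFromMic connects as the client
data, _ = data_srv.accept()

if recv_code(ctrl) == 100: send_code(ctrl, 101)   # establish connection
if recv_code(ctrl) == 200: send_code(ctrl, 201)   # start data sending
# ... send audio frames on the data socket in the documented format ...
if recv_code(ctrl) == 300: send_code(ctrl, 301)   # stop recording
if recv_code(ctrl) == 900: send_code(ctrl, 901)   # close connection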

 

I am writing to you about a problem we had with HARK. We bought some new computers with USB 3.0, and we found that it is not very well supported by Ubuntu 12.04. Since we work a lot with the Kinect, we were forced to move to 12.10, which has USB 3.0 support. Sadly, when we tried to install HARK we had this problem:
$ sudo apt-get install harkfd
[sudo] password for stephane-isir:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
harkfd : Depends: libboost-thread1.46.1 (>= 1.46.1-1) but it is not
installable
Depends: libharkio2 but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

The version of boost installed in Ubuntu 12.10 is libboost-thread1.49. Could you help me solve this issue?

This boost version problem occurs with every Ubuntu update.

A pre-compiled package for Quantal (12.10) is available at:
https://launchpad.net/ubuntu/quantal/+package/libboost-thread1.46.1
apt-get will also require other boost libraries, because the current HARK uses them; you can search for the required packages in the same way.
