HARK Cookbook
Version 2.5.0. (Revision: 9008)
Hiroshi G. Okuno
Kazuhiro Nakadai
Toru Takahashi
Ryu Takeda
Keisuke Nakamura
Takeshi Mizumoto
Takami Yoshida
Angelica Lim
Takuma Otsuka
Kohei Nagira
Tatsuhiko Itohara
Yoshiaki Bando
1 Introduction
2 Learning HARK
2.1 Learning sound recording
2.2 Learning sound localization
2.3 Learning sound separation
2.4 Learning speech recognition
3 Something is Wrong
3.1 Installation fails
3.2 Sound recording fails
3.3 Sound source localization fails
3.4 Sound source separation fails
3.5 Speech recognition fails
3.6 Making a debug node
3.7 Using debug tools
3.8 Checking a microphone connection
4 Microphone Array
4.1 Selecting the number of microphones
4.2 Selecting the layout of the microphone array
4.3 Selecting types of microphones
4.4 Installing a microphone array in a robot
4.5 Selecting a sampling rate
4.6 Using an A/D converter unsupported by HARK
5 Input Data Generation
5.1 Recording multichannel sound
5.2 Recording impulse response
5.3 Synthesizing multichannel sound from impulse response
5.4 Adding noise
6 Acoustic and language models
6.1 Creating an acoustic model
6.2 Creating a language model
7 FlowDesigner
7.1 Running the network from the command line
7.2 Copying nodes from other network files
7.3 Making an iteration using FlowDesigner
8 Sound source localization
8.1 Introduction
8.2 Tuning parameters of sound source localization
8.3 Using part of a microphone array
8.4 Localizing multiple sounds
8.5 Checking if sound source localization is successful
8.6 Too many localization results / no localization results
8.7 Localization results are fragmented / Isolated sounds are connected
8.8 The beginning of the separated sound is ignored
8.9 Localizing the height or distance of a source
8.10 Saving the localization results to a file
9 Sound Source Separation
9.1 Introduction
9.2 Saving separated sounds to files
9.3 Parameter tuning of sound source separation
9.4 Sound source separation using only the microphone array layout
9.5 Separating sounds with stationary noise
9.6 Reducing noise leakage by post processing
9.7 Separating a moving sound
10 Feature Extraction
10.1 Introduction
10.2 Selecting the threshold for Missing Feature Mask
10.3 Saving features to files
11 Speech Recognition
11.1 Making a Julius configuration file (.jconf)
12 Others
12.1 Selecting window length and shift length
12.2 Selecting the window function for MultiFFT
12.3 Using PreEmphasis
13 Advanced recipes
13.1 Creating a node
13.2 Improving the processing speed
13.3 Connecting HARK to other systems
13.4 Controlling a motor
14 Appendix
14.1 Introduction
14.1.1 Category of sample network
14.1.2 Notation of document and method for execution of sample network
14.2 Sound recording network sample
14.2.1 Ubuntu
14.2.2 Windows
14.3 Network sample of sound source localization
14.3.1 Offline sound source localization
14.3.2 Online sound source localization
14.4 Network sample of sound source separation
14.4.1 Offline sound source separation
14.4.2 Offline sound source separation (with postprocessing by HRLE)
14.4.3 Online sound source separation (with/without postprocessing by HRLE)
14.5 Network samples of feature extraction
14.5.1 Introduction
14.5.2 MSLS
14.5.3 MSLS+$\Delta$MSLS
14.5.4 MSLS+Power
14.5.5 MSLS+$\Delta$MSLS+Power+$\Delta$Power
14.5.6 MSLS+$\Delta$MSLS+$\Delta$Power
14.5.7 MSLS+$\Delta$MSLS+$\Delta$Power+Preprocessing
14.6 Speech recognition network sample
14.6.1 Running the speech recognition
14.6.2 Evaluating the speech recognition