6.6.1 SpeechRecognitionClient

6.6.1.1 Outline of the node

This node sends acoustic features to a speech recognition node via a network connection.

6.6.1.2 Necessary file

No files are required.

When to use

This node is used to send acoustic features to software outside HARK. For example, it sends them to Julius $^{(1)}$, a large vocabulary continuous speech recognition engine, to perform speech recognition.

Typical connection

\includegraphics[width=100mm]{fig/modules/SpeechRecognitionClient}
Figure 6.74: Connection example of SpeechRecognitionClient 

6.6.1.3 Input-output and property of the node

Table 6.62: Parameter list of SpeechRecognitionClient 

Parameter name   Type     Default value   Unit   Description
MFM_ENABLED      bool     true                   Select whether or not to send out missing feature masks
HOST             string   127.0.0.1              Host name / IP address of the server on which Julius/Julian is running
PORT             int      5530                   Port number for sending out to network
SOCKET_ENABLED   bool     true                   The flag that determines whether or not to output to the socket

Input

FEATURES : Map<int, ObjectRef> type. A data pair consisting of a sound source ID and a feature vector of type Vector<float>.

MASKS : Map<int, ObjectRef> type. A data pair consisting of a sound source ID and a mask vector of type Vector<float>.

SOURCES : Vector<ObjectRef> type.

Output

OUTPUT : Vector<ObjectRef> type.

Parameter

MFM_ENABLED : bool type. When true, MASKS is transmitted. When false, the MASKS input is ignored and a mask of all 1’s is transmitted instead.

HOST : string type. Host name or IP address of the host to which acoustic parameters are transmitted. Not used when SOCKET_ENABLED is set to false.

PORT : int type. Network port number to which acoustic parameters are transmitted. Not used when SOCKET_ENABLED is set to false.

SOCKET_ENABLED : bool type. When true, acoustic parameters are transmitted to the socket; when false, they are not transmitted.
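Before running a network with SOCKET_ENABLED set to true, it can be helpful to confirm that a server is actually listening at HOST and PORT. The following is a minimal Python sketch and is not part of HARK. The function name recognizer_reachable is hypothetical, the defaults are the default values listed in Table 6.62, and the sketch assumes that the recognizer accepts plain TCP connections.

```python
import socket


def recognizer_reachable(host="127.0.0.1", port=5530, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: it only checks that something is listening.
    It does not send any features or speak the recognizer's protocol.
    """
    try:
        # create_connection resolves the host name and connects with a timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host name could not be resolved
        return False


if __name__ == "__main__":
    print(recognizer_reachable())
```

If the check fails, SOCKET_ENABLED can be set to false to verify the rest of the network without the recognizer.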

6.6.1.4 Details of the node

When both MFM_ENABLED and SOCKET_ENABLED are set to true, this node sends acoustic features and mask vectors to the speech recognition module via the network port. When MFM_ENABLED is set to false, normal speech recognition that is not based on missing feature theory is performed. In practice, mask vectors are still sent, but with all elements set to 1; in other words, all acoustic features are treated as reliable. When SOCKET_ENABLED is set to false, the features are not sent to the speech recognition node. This is useful for checking a HARK network file without running the external speech recognition engine. For HOST, specify the IP address of the host on which the external program that receives the vectors runs. For PORT, specify the network port number to which the vectors are sent.
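The interaction of the two flags can be illustrated with a small Python sketch. This is not the HARK implementation: the function names select_masks and process_frame are hypothetical, plain dictionaries and lists stand in for Map<int, ObjectRef> and Vector<float>, and the send callable is a placeholder for the actual network transmission, whose wire format is not described here.

```python
def select_masks(features, masks, mfm_enabled):
    """Return the mask vector to transmit for each sound source ID."""
    chosen = {}
    for src_id, feat in features.items():
        if mfm_enabled:
            # Missing feature masks are used as given on the MASKS input
            chosen[src_id] = list(masks[src_id])
        else:
            # MASKS input is ignored; every feature is marked reliable
            chosen[src_id] = [1.0] * len(feat)
    return chosen


def process_frame(features, masks, mfm_enabled, socket_enabled, send):
    """Pass (source ID, features, mask) to send() for each source.

    `send` is a hypothetical stand-in for the socket output.
    Returns the number of sources handed to `send`.
    """
    if not socket_enabled:
        # Nothing is transmitted, so no recognizer needs to be running
        return 0
    chosen = select_masks(features, masks, mfm_enabled)
    for src_id, feat in features.items():
        send(src_id, list(feat), chosen[src_id])
    return len(features)


if __name__ == "__main__":
    features = {0: [0.5, -1.2, 3.0]}
    masks = {0: [1.0, 0.0, 1.0]}
    process_frame(features, masks, mfm_enabled=False,
                  socket_enabled=True, send=print)
```

Passing send as an argument keeps the flag logic separate from the transport, so the same sketch can be exercised without a running recognizer.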

6.6.1.5 References

(1) http://julius.sourceforge.jp/en_index.php