HARK Document Version 2.0.0. (Revision: 6357) : HarkDataStreamSender

6.1.5 HarkDataStreamSender

6.1.5.1 Outline of the node

This node sends the following acoustic signal processing results over a socket connection.

Acoustic signal
Frequency spectrum after STFT
Source information of source localization result
Acoustic feature
Missing Feature Mask

6.1.5.2 Necessary file

No files are required.

6.1.5.3 Usage

When to use

This node is used to send the above data to a system external to HARK using TCP/IP communication.

Typical connection

In the example in Figure 6.10, all input terminals are connected. It is also possible to leave input terminals open depending on the transmitted data. To learn about the relation between the connection of the input terminals and transmitted data, see “Details of the node”.

Figure 6.10: Connection example of HarkDataStreamSender

6.1.5.4 Input-output and property of the node

Table 6.6: Parameter list of HarkDataStreamSender

Parameter name	Type	Default value	Unit	Description
HOST	`string`	localhost		Host name or IP address of the server to which data is sent
PORT	`int`	8890		Port number for outbound network communication
ADVANCE	`int`	160	[pt]	Shift length of frame
BUFFER_SIZE	`int`	512		Size of allocated float-sized memory for socket communication
FRAMES_PER_SEND	`int`	1	[frm]	Frequency of socket communication in frame unit
TIMESTAMP_TYPE	`string`	GETTIMEOFDAY		Timestamp attached to the sent data
SAMPLING_RATE	`int`	16000	[Hz]	Sampling frequency
DEBUG_PRINT	`bool`	`false`		ON/OFF for outputting debugging information
SOCKET_ENABLE	`bool`	`true`		Flag to determine whether or not to perform the socket output

Input

MIC_WAVE: Matrix<float> type. Acoustic signal (a matrix of the number of channels $\times $ the STFT window length in samples)
MIC_FFT: Matrix<complex<float> > type. Frequency spectrum (a matrix of the number of channels $\times $ the spectrum of each channel)
SRC_INFO: Vector<ObjectRef> type. Source information from the source localization results for several sound sources
SRC_WAVE: Map<int, ObjectRef> type. Pairs of a sound source ID and an acoustic signal (Vector<float> type).
SRC_FFT: Map<int, ObjectRef> type. Pairs of a sound source ID and a frequency spectrum (Vector<complex<float> > type).
SRC_FEATURE: Map<int, ObjectRef> type. Pairs of a sound source ID and an acoustic feature (Vector<float> type).
SRC_RELIABILITY: Map<int, ObjectRef> type. Pairs of a sound source ID and a mask vector (Vector<float> type).

Output

OUTPUT: ObjectRef type. Outputs the same data as the input.

Parameter

HOST: string type. Host name or IP address of the host to which data is transmitted. Ignored when SOCKET_ENABLE is set to false.
PORT: int type. Port number. Ignored when SOCKET_ENABLE is set to false.
ADVANCE: int type. Shift length of a frame. It must be equal to the value set in the preceding processing.
BUFFER_SIZE: int type. Buffer size secured for socket communication.
FRAMES_PER_SEND: int type. Interval of socket communication in frames.
TIMESTAMP_TYPE: string type. Selects the timestamp attached to the sent data. If TIMESTAMP_TYPE=GETTIMEOFDAY, the time obtained by gettimeofday is used. If TIMESTAMP_TYPE=CONSTANT_INCREMENT, the timestamp is advanced every frame by the frame time calculated from SAMPLING_RATE.
SAMPLING_RATE: int type. Sampling frequency of the input signal. This is used only when TIMESTAMP_TYPE=CONSTANT_INCREMENT.
DEBUG_PRINT: bool type. ON/OFF for printing debugging information to standard output.
SOCKET_ENABLE: bool type. Data is sent to the socket when true and not sent when false.
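
As a rough illustration of the CONSTANT_INCREMENT mode, the following Python sketch advances a (tv_sec, tv_usec) pair by one frame. The text above only states that the frame time is calculated from SAMPLING_RATE; using ADVANCE / SAMPLING_RATE as the frame time, and the function name next_timestamp, are assumptions made for this example and not necessarily what the node does internally.

```python
def next_timestamp(tv_sec, tv_usec, advance=160, sampling_rate=16000):
    """Advance a (seconds, microseconds) timestamp by one frame shift.

    Assumption: one frame corresponds to advance / sampling_rate seconds.
    """
    frame_usec = advance * 1_000_000 // sampling_rate  # 160 / 16000 s = 10000 us
    total_usec = tv_usec + frame_usec
    # Carry whole seconds out of the microsecond field.
    return tv_sec + total_usec // 1_000_000, total_usec % 1_000_000


stamp = next_timestamp(100, 995_000)
print(stamp)
```

With the default values of ADVANCE and SAMPLING_RATE, each call moves the timestamp forward by 10 ms under this assumption.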

6.1.5.5 Details of the node

Description of the parameters

For HOST, specify the host name or IP address of the host running the external program that receives the data. For PORT, specify the network port number used for data transmission. ADVANCE is the shift length of a frame and must be equal to the value set in the preceding processing. BUFFER_SIZE is the size of the buffer secured for socket communication: a float array of BUFFER_SIZE * 1024 elements is allocated at initialization, and it must be larger than the data to be transmitted. FRAMES_PER_SEND is the interval of socket communication in frames. The default value of 1 sends data every frame and is sufficient in most cases; increase it to reduce the number of socket transmissions. TIMESTAMP_TYPE selects the timestamp attached to the sent data. SAMPLING_RATE is the sampling frequency of the input signal. DEBUG_PRINT selects whether debugging information is printed to standard output; when enabled, some parts of the transmitted data are printed. For more information, see the “Debug” column in Table 6.13. When SOCKET_ENABLE is set to false, data is not sent to external systems. This is used to check the operation of a HARK network without running an external program.
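
To judge whether a given BUFFER_SIZE is large enough, the allocation of BUFFER_SIZE * 1024 floats can be compared with a rough estimate of the data sent per frame. The sketch below counts only the microphone payload (the MIC_WAVE matrix plus the real and imaginary MIC_FFT matrices); headers and per-source data are ignored. The function names and the example values (8 channels, window length 512, 257 frequency bins) are hypothetical.

```python
def buffer_capacity_floats(buffer_size=512):
    """Number of float elements allocated for the send buffer."""
    return buffer_size * 1024


def mic_payload_floats(nch, wave_length=0, fft_bins=0):
    """Floats needed for MIC_WAVE plus the real and imaginary MIC_FFT matrices."""
    return nch * wave_length + 2 * nch * fft_bins


needed = mic_payload_floats(nch=8, wave_length=512, fft_bins=257)
capacity = buffer_capacity_floats(512)
print(needed, capacity, needed < capacity)
```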

Details of data transmission

(B-1) Structure for data transmission

Data is transmitted frame by frame, and each frame is divided into several parts. The structures defined for data transmission are listed below.

HD_Header
Description: Header containing basic information, placed at the head of the transmitted data
Data size: 3 * sizeof(int) + 2 * sizeof(int64)

Table 6.7: Member of HD_Header

Variable name	Type	Description
type	int	Bit flag that indicates the structure of the transmitted data.
		For relations between each bit and data to be transmitted, see Table 6.8.
advance	int	Shift length of a frame
count	int	Frame number of HARK
tv_sec	int64	Timestamp of HARK in seconds
tv_usec	int64	Timestamp of HARK in microseconds
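
On the receiving side, HD_Header could be decoded with Python's struct module along the following lines. This is a minimal sketch that assumes a little-endian layout without padding, a 4-byte int and an 8-byte int64, which matches the stated data size of 3 * sizeof(int) + 2 * sizeof(int64) = 28 bytes; the actual byte order and packing depend on the sending machine. The names HD_HEADER and parse_hd_header are hypothetical.

```python
import struct

# Assumed wire layout: little-endian, no padding, int = 4 bytes, int64 = 8 bytes.
HD_HEADER = struct.Struct("<3i2q")  # type, advance, count, tv_sec, tv_usec


def parse_hd_header(buf):
    """Decode the first HD_HEADER.size bytes of buf into a dict."""
    type_, advance, count, tv_sec, tv_usec = HD_HEADER.unpack(buf[:HD_HEADER.size])
    return {"type": type_, "advance": advance, "count": count,
            "tv_sec": tv_sec, "tv_usec": tv_usec}


# Round trip with made-up values, for illustration only.
raw = HD_HEADER.pack(0b0000101, 160, 42, 1700000000, 250000)
header = parse_hd_header(raw)
print(HD_HEADER.size, header["count"])
```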

Table 6.8: Bits of the type member of HD_Header and the corresponding transmitted data

Bit position	Related input terminal	Transmitted data
1st bit (least significant)	MIC_WAVE	Acoustic signal
2nd bit	MIC_FFT	Frequency spectrum
3rd bit	SRC_INFO	Source information of the source localization result
4th bit	SRC_INFO, SRC_WAVE	Source information of the source localization result
		+ acoustic signal for each sound source ID
5th bit	SRC_INFO, SRC_FFT	Source information of the source localization result
		+ frequency spectrum for each sound source ID
6th bit	SRC_INFO, SRC_FEATURE	Source information of the source localization result
		+ acoustic feature for each sound source ID
7th bit	SRC_INFO, SRC_RELIABILITY	Source information of the source localization result
		+ missing feature mask for each sound source ID

In HarkDataStreamSender, the transmitted data differs depending on which input terminals are connected. On the receiving end, the transmitted data can be interpreted according to the type. Examples are given below. Further details on the transmitted data are given in (B-2).

When only the MIC_FFT input terminal is connected, the type is 0000010 in binary, and only the frequency spectrum of each microphone is transmitted.
When the three input terminals MIC_WAVE, SRC_INFO and SRC_FEATURE are connected, the type is 0100101 in binary. The transmitted data are the acoustic signal of each microphone, the source information of the source localization result and the acoustic feature for each sound source ID.

The four input terminals SRC_WAVE, SRC_FFT, SRC_FEATURE and SRC_RELIABILITY carry data for each sound source ID, so the information from SRC_INFO is required. If these four input terminals are connected without connecting SRC_INFO, no data is transmitted, and the type is 0000000 in binary.
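
The relation between the connected terminals and the type value can be sketched in Python as follows. The bit order is taken from Table 6.8 and the examples above, with MIC_WAVE as the least significant bit; the names TYPE_BITS, decode_type and encode_type are hypothetical helpers, not part of HARK.

```python
# Bit order follows Table 6.8, starting at the least significant bit.
TYPE_BITS = ["MIC_WAVE", "MIC_FFT", "SRC_INFO", "SRC_WAVE",
             "SRC_FFT", "SRC_FEATURE", "SRC_RELIABILITY"]


def decode_type(type_flags):
    """Return the terminal names whose bits are set in type_flags."""
    return [name for bit, name in enumerate(TYPE_BITS) if type_flags & (1 << bit)]


def encode_type(connected):
    """Build the type value for a collection of connected terminal names."""
    connected = set(connected)
    flags = 0
    for bit, name in enumerate(TYPE_BITS):
        if name not in connected:
            continue
        # Per-source terminals (bits 3-6) only count when SRC_INFO is connected.
        if bit >= 3 and "SRC_INFO" not in connected:
            continue
        flags |= 1 << bit
    return flags


print(decode_type(0b0100101))
print(format(encode_type(["MIC_FFT"]), "07b"))
```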

HDH_MicData
Description: Structural information on the array size for sending two-dimensional arrays
Data size: 3 * sizeof(int)

Table 6.9: Member of HDH_MicData

Variable name	Type	Description
nch	int	Number of microphone channels
length	int	Data length (number of columns of the two-dimensional array to be transmitted)
data_bytes	int	Number of bytes of data to be transmitted. In the case of a float type matrix,
		`nch * length * sizeof(float)`.

HDH_SrcInfo
Description: Source information of a source localization result
Data size: 1 * sizeof(int) + 4 * sizeof(float)

Table 6.10: Member of HDH_SrcInfo

Variable name	Type	Description
src_id	int	Sound source ID
x[3]	float	Three-dimensional position of sound source
power	float	Power of the MUSIC spectrum calculated in LocalizeMUSIC

HDH_SrcData
Description: Structural information on the array size for sending one-dimensional arrays
Data size: 2 * sizeof(int)

Table 6.11: Member of HDH_SrcData

Variable name	Type	Description
length	int	Data length (number of one-dimensional array elements to be transmitted)
data_bytes	int	Number of bytes of transmitted data. In the case of a float type vector, `length * sizeof(float)`.
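
The three sub-headers can be described with struct format strings in the same way. The sketch below assumes little-endian data without padding, a 4-byte int and a 4-byte float, which gives the data sizes stated above (12, 20 and 8 bytes). The constant and function names are hypothetical.

```python
import struct

HDH_MIC_DATA = struct.Struct("<3i")   # nch, length, data_bytes
HDH_SRC_INFO = struct.Struct("<i4f")  # src_id, x[0], x[1], x[2], power
HDH_SRC_DATA = struct.Struct("<2i")   # length, data_bytes

FLOAT_SIZE = struct.calcsize("<f")


def make_mic_header(nch, length):
    """Pack an HDH_MicData header for a float matrix of nch x length."""
    return HDH_MIC_DATA.pack(nch, length, nch * length * FLOAT_SIZE)


def parse_src_info(buf):
    """Unpack HDH_SrcInfo into (src_id, (x, y, z), power)."""
    src_id, x, y, z, power = HDH_SRC_INFO.unpack(buf[:HDH_SRC_INFO.size])
    return src_id, (x, y, z), power


mic = HDH_MIC_DATA.unpack(make_mic_header(8, 512))
src = parse_src_info(HDH_SRC_INFO.pack(3, 1.0, 0.0, 0.5, 30.25))
print(mic, src)
```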

(B-2) Transmitted data

Table 6.12: Data list in order of sending and connection input terminal (The data with the $\circ $ symbol is transmitted. $\circ ^*$ indicates the data that are not transmitted when the SRC_INFO terminal is not connected)

Details of the transmitted data			Input terminal and transmitted data
	Type	Size	MIC_WAVE	MIC_FFT	SRC_INFO	SRC_WAVE	SRC_FFT	SRC_FEATURE	SRC_RELIABILITY
(a)	HD_Header	`sizeof(HD_Header)`	$\circ $	$\circ $	$\circ $	$\circ $	$\circ $	$\circ $	$\circ $
(b)	HDH_MicData	`sizeof(HDH_MicData)`	$\circ $
(c)	float[]	`HDH_MicData.data_bytes`	$\circ $
(d)	HDH_MicData	`sizeof(HDH_MicData)`		$\circ $
(e)	float[]	`HDH_MicData.data_bytes`		$\circ $
(f)	float[]	`HDH_MicData.data_bytes`		$\circ $
(g)	int	`1 * sizeof(int)`			$\circ $	$\circ ^*$	$\circ ^*$	$\circ ^*$	$\circ ^*$
(h)	HDH_SrcInfo	`sizeof(HDH_SrcInfo)`			$\circ $	$\circ ^*$	$\circ ^*$	$\circ ^*$	$\circ ^*$
(i)	HDH_SrcData	`sizeof(HDH_SrcData)`				$\circ ^*$
(j)	short int[]	`HDH_SrcData.data_bytes`				$\circ ^*$
(k)	HDH_SrcData	`sizeof(HDH_SrcData)`					$\circ ^*$
(l)	float[]	`HDH_SrcData.data_bytes`					$\circ ^*$
(m)	float[]	`HDH_SrcData.data_bytes`					$\circ ^*$
(n)	HDH_SrcData	`sizeof(HDH_SrcData)`						$\circ ^*$
(o)	float[]	`HDH_SrcData.data_bytes`						$\circ ^*$
(p)	HDH_SrcData	`sizeof(HDH_SrcData)`							$\circ ^*$
(q)	float[]	`HDH_SrcData.data_bytes`							$\circ ^*$

Table 6.13: Details of the transmitted data

	Description	Debug
(a)	Transmitted data header. See Table 6.7.	$\circ $
(b)	Structure of acoustic signals	$\circ $
	(number of microphones, frame length, byte count for transmission). See Table 6.9.
(c)	Acoustic signal (number of microphones $\times $ float type matrix of frame length)
(d)	Structure of frequency spectra	$\circ $
	(number of microphones, number of frequency bins, byte count for transmission). See Table 6.9.
(e)	Real part of frequency spectrum
	(number of microphones $\times $ float type matrix of number of frequency bins)
(f)	Imaginary part of frequency spectrum
	(number of microphones $\times $ float type matrix of number of frequency bins)
(g)	Number of sound sources detected	$\circ $
(h)	Source information of a source localization result. See Table 6.10.	$\circ $
(i)	Structure of the acoustic signal for each sound source ID	$\circ $
	(frame length, byte count for transmission). See Table 6.11.
(j)	Acoustic signal for each sound source ID (short type linear array of frame length)
(k)	Structure of the frequency spectrum for each sound source ID	$\circ $
	(number of frequency bins, byte count for transmission). See Table 6.11.
(l)	Real part of a frequency spectrum for each sound source ID
	(float type linear array of number of frequency bins)
(m)	Imaginary part of a frequency spectrum for each sound source ID
	(float type linear array of number of frequency bins)
(n)	Structure of the acoustic feature for each sound source ID	$\circ $
	(dimension number of features, byte count for transmission). See Table 6.11.
(o)	Acoustic feature for each sound source ID (float type linear array of dimension number of features)
(p)	Structure of the MFM for each sound source ID	$\circ $
	(dimension number of features, byte count for transmission). See Table 6.11.
(q)	MFM for each sound source ID (float type linear array of dimension number of features)

Transmitted data is divided for each frame as shown in (a)-(q) of Tables 6.12 and 6.13. Table 6.12 shows the relation between the transmitted data (a)-(q) and the input terminal connected, and Table 6.13 describes the transmitted data.
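
Because the real and imaginary parts of a spectrum arrive as separate float arrays ((e) and (f), or (l) and (m)), a receiver has to recombine them. The following sketch does this for a microphone spectrum; it assumes little-endian 4-byte floats and a row-major layout with one row per channel, neither of which is stated in the tables. The function name unpack_complex_matrix is hypothetical.

```python
import struct


def unpack_complex_matrix(real_bytes, imag_bytes, nch, length):
    """Rebuild an nch x length complex matrix from two raw float payloads."""
    count = nch * length
    real = struct.unpack(f"<{count}f", real_bytes)
    imag = struct.unpack(f"<{count}f", imag_bytes)
    flat = [complex(r, i) for r, i in zip(real, imag)]
    # Assumed row-major layout: one row of `length` bins per channel.
    return [flat[ch * length:(ch + 1) * length] for ch in range(nch)]


real_part = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
imag_part = struct.pack("<4f", 0.5, -0.5, 0.0, 1.0)
spectrum = unpack_complex_matrix(real_part, imag_part, nch=2, length=2)
print(spectrum)
```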

(B-3) Transmission algorithm

Part of the algorithm that runs in a loop while the HARK network file is executed is shown below. Here, (a)-(q) in the code correspond to (a)-(q) in Tables 6.12 and 6.13.

calculate{
    Send (a)
    IF MIC_WAVE is connected
      Send (b)
      Send (c)
    ENDIF
    IF MIC_FFT is connected
      Send (d)
      Send (e)
      Send (f)
    ENDIF
    IF SRC_INFO is connected
      Send (g)
      (Let src_num be the number of sound sources.)
      FOR i = 1 to src_num (This loop runs once per sound source ID.)
        Send (h)
        IF SRC_WAVE is connected
          Send (i)
          Send (j)
        ENDIF
        IF SRC_FFT is connected
          Send (k)
          Send (l)
          Send (m)
        ENDIF
        IF SRC_FEATURE is connected
          Send (n)
          Send (o)
        ENDIF
        IF SRC_RELIABILITY is connected
          Send (p)
          Send (q)
        ENDIF
      ENDFOR
    ENDIF
}
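
From the receiving side, the transmission order (a)-(q) can be mirrored by a reader such as the following Python sketch. It is not the receiver shipped with HARK: the byte order (little-endian), the absence of padding, and all names (read_frame, read_src_block and so on) are assumptions for illustration. Payloads are kept as raw bytes so that the sketch does not depend on the element type of each array.

```python
import io
import struct

HD_HEADER = struct.Struct("<3i2q")    # type, advance, count, tv_sec, tv_usec
HDH_MIC_DATA = struct.Struct("<3i")   # nch, length, data_bytes
HDH_SRC_INFO = struct.Struct("<i4f")  # src_id, x[3], power
HDH_SRC_DATA = struct.Struct("<2i")   # length, data_bytes
(MIC_WAVE, MIC_FFT, SRC_INFO, SRC_WAVE,
 SRC_FFT, SRC_FEATURE, SRC_RELIABILITY) = (1 << bit for bit in range(7))


def read_exact(stream, size):
    """Read exactly `size` bytes or raise if the stream ends early."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("stream ended in the middle of a frame")
    return data


def read_struct(stream, fmt):
    return fmt.unpack(read_exact(stream, fmt.size))


def read_src_block(stream, parts=1):
    """Read one HDH_SrcData header followed by `parts` raw payloads."""
    length, data_bytes = read_struct(stream, HDH_SRC_DATA)
    return length, [read_exact(stream, data_bytes) for _ in range(parts)]


def read_frame(stream):
    """Read one frame in the order (a)-(q)."""
    flags, advance, count, tv_sec, tv_usec = read_struct(stream, HD_HEADER)  # (a)
    frame = {"type": flags, "advance": advance, "count": count,
             "timestamp": (tv_sec, tv_usec), "sources": []}
    if flags & MIC_WAVE:
        nch, length, data_bytes = read_struct(stream, HDH_MIC_DATA)          # (b)
        frame["mic_wave"] = (nch, length, read_exact(stream, data_bytes))    # (c)
    if flags & MIC_FFT:
        nch, length, data_bytes = read_struct(stream, HDH_MIC_DATA)          # (d)
        real = read_exact(stream, data_bytes)                                # (e)
        imag = read_exact(stream, data_bytes)                                # (f)
        frame["mic_fft"] = (nch, length, real, imag)
    if flags & SRC_INFO:
        (src_num,) = struct.unpack("<i", read_exact(stream, 4))              # (g)
        for _ in range(src_num):
            src_id, x, y, z, power = read_struct(stream, HDH_SRC_INFO)       # (h)
            src = {"id": src_id, "x": (x, y, z), "power": power}
            if flags & SRC_WAVE:
                src["wave"] = read_src_block(stream)                         # (i), (j)
            if flags & SRC_FFT:
                src["fft"] = read_src_block(stream, parts=2)                 # (k)-(m)
            if flags & SRC_FEATURE:
                src["feature"] = read_src_block(stream)                      # (n), (o)
            if flags & SRC_RELIABILITY:
                src["reliability"] = read_src_block(stream)                  # (p), (q)
            frame["sources"].append(src)
    return frame


# Synthetic frame: type 0100101 (MIC_WAVE, SRC_INFO, SRC_FEATURE), one source.
wave = struct.pack("<4f", 0.0, 0.1, 0.2, 0.3)  # 2 channels x 2 samples
feature = struct.pack("<3f", 1.0, 2.0, 3.0)
payload = (HD_HEADER.pack(0b0100101, 160, 7, 0, 0)
           + HDH_MIC_DATA.pack(2, 2, len(wave)) + wave
           + struct.pack("<i", 1)
           + HDH_SRC_INFO.pack(0, 1.0, 0.0, 0.0, 30.0)
           + HDH_SRC_DATA.pack(3, len(feature)) + feature)
frame = read_frame(io.BytesIO(payload))
print(frame["count"], len(frame["sources"]))
```

With a real connection, the file-like object returned by the standard library call `socket.makefile("rb")` on the accepted socket could be passed as `stream`, and read_frame called repeatedly.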