Problem
How can I increase the computing speed of HARK?
Solution
The processing speed of HARK is determined mainly by the complexity of the nodes the user creates and of the algorithms they run. For example, the more often a network performs eigenvalue decomposition in LocalizeMUSIC, saves data with SaveFeatures, or prints with cout and cerr, the longer each count takes to process. The simplest way to increase processing speed is therefore to build simple nodes. (The algorithms in the current HARK nodes have already been improved for real-time processing.) Two other methods can also improve processing speed, although the gains are slight.
Uncomment IN_ORDER_NODE_SPEEDUP
This line sits at the end of a node class and is commented out by default. (For concrete examples of where to uncomment it, see the cc files of nodes such as LocalizeMUSIC.)
Change the optimization option for compiling.
Add the option when building the node. Optimization becomes more aggressive in the order -O, -O1, -O2, -O3 (with GCC, -O is the same as -O1), and the compiled code is generally expected to run faster accordingly.
For example, to use -O2, add the following to /src/Makefile.am:
libhark_d_la_CXXFLAGS = @GTK_CFLAGS@ -O2
CXXFLAGS = -g -O2
CFLAGS = -g -O2
FFLAGS = -g -O2
Discussion
To evaluate each method, we compared builds compiled with -O, -O1, -O2, and -O3, each with and without IN_ORDER_NODE_SPEEDUP, giving eight patterns in total. For the comparison, every node ran the same simple loop, shown below, and the processing time was measured for 100 such nodes connected in series.
int count_time = 100000000;
for (int i = 0; i < count_time; i++) n = n + i;
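The measurement described above can be approximated outside HARK with a small stand-alone harness. The following is only a sketch under stated assumptions: the names busy_loop and time_chain are hypothetical, the "nodes" are plain function calls rather than real HARK nodes, and the accumulator is declared volatile because an optimizing compiler may otherwise replace the whole loop with a closed-form result and hide the cost being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the work one node does per count:
// the summation loop from the text. `volatile` forces every
// addition to be performed, even at -O2 or -O3.
std::int64_t busy_loop(int count_time) {
    volatile std::int64_t n = 0;
    for (int i = 0; i < count_time; i++) n = n + i;
    return n;
}

// Run `nodes` loops back to back (a rough analogue of nodes
// connected in series) and return the elapsed wall-clock time
// in seconds.
double time_chain(int nodes, int count_time) {
    assert(nodes >= 0 && count_time >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::int64_t sink = 0;
    for (int k = 0; k < nodes; k++) sink += busy_loop(count_time);
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;  // the sum itself is not needed, only the time taken
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_chain(100, 100000000) mirrors the setup in the text. Compiling the same file once per optimization level and comparing the returned times gives a rough picture of what the flag alone changes, without the overhead of the HARK network itself.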
Tables 13.1 and 13.2 show the results without and with IN_ORDER_NODE_SPEEDUP, respectively. Computing times did not differ significantly: even the fastest combination of an optimization option and IN_ORDER_NODE_SPEEDUP (-O3, average 13.87) was only about 3 percent faster than the slowest pattern (-O with IN_ORDER_NODE_SPEEDUP, average 14.33).
Table 13.1: Processing times without IN_ORDER_NODE_SPEEDUP

| Option | O3 | O2 | O1 | O |
|---|---|---|---|---|
| | 14.2408 | 12.7574 | 14.0147 | 14.1765 |
| | 13.9518 | 14.0789 | 14.2417 | 14.3901 |
| | 13.912 | 14.0633 | 14.5486 | 13.7121 |
| | 14.3929 | 13.9978 | 14.2038 | 14.1017 |
| | 13.7976 | 14.3931 | 13.8478 | 14.2374 |
| | 14.0315 | 13.9962 | 14.5201 | 14.1924 |
| | 14.3108 | 14.0069 | 14.1044 | 14.1694 |
| | 14.0055 | 14.3397 | 14.2014 | 14.5729 |
| | 14.004 | 14.0419 | 14.467 | 14.1911 |
| | 14.4457 | 13.8734 | 14.1159 | 14.2177 |
| Total | 141.0926 | 139.5486 | 142.2654 | 141.9613 |
| Average | 14.10926 | 13.95486 | 14.22654 | 14.19613 |

Table 13.2: Processing times with IN_ORDER_NODE_SPEEDUP

| Option | O3 + speedup | O2 + speedup | O1 + speedup | O + speedup |
|---|---|---|---|---|
| | 14.0007 | 13.8055 | 14.3469 | 14.4444 |
| | 14.3702 | 13.5448 | 13.9894 | 14.1628 |
| | 14.0753 | 14.371 | 14.4229 | 13.8679 |
| | 12.9333 | 13.8942 | 14.1801 | 14.5209 |
| | 14.398 | 13.8926 | 13.7115 | 14.0369 |
| | 13.6696 | 14.1745 | 14.5278 | 14.7882 |
| | 14.0837 | 14.0613 | 13.9905 | 14.5343 |
| | 14.4443 | 14.018 | 14.0915 | 14.1182 |
| | 13.0798 | 14.4962 | 14.4936 | 14.5952 |
| | 13.6339 | 14.1081 | 14.1904 | 14.2751 |
| Total | 138.6888 | 140.3662 | 141.9446 | 143.3439 |
| Average | 13.86888 | 14.03662 | 14.19446 | 14.33439 |
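
The size of the difference between the eight patterns can be checked directly from the totals in the two tables. The sketch below is only an illustration of that arithmetic: the names kTotals, average, and spread_percent are hypothetical, and the numbers are copied from the Total rows of Tables 13.1 and 13.2.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Totals over 10 runs, copied from Tables 13.1 and 13.2.
// Order: O3, O2, O1, O, then the same four with IN_ORDER_NODE_SPEEDUP.
const std::array<double, 8> kTotals = {
    141.0926, 139.5486, 142.2654, 141.9613,
    138.6888, 140.3662, 141.9446, 143.3439};

// Average time of one run, given a total over `runs` runs.
double average(double total, int runs) {
    assert(runs > 0);
    return total / runs;
}

// Percentage by which the fastest pattern is quicker than the slowest.
double spread_percent() {
    const auto mm = std::minmax_element(kTotals.begin(), kTotals.end());
    const double fastest = *mm.first;
    const double slowest = *mm.second;
    return (slowest - fastest) / slowest * 100.0;
}
```

With these totals, spread_percent() comes out a little above 3, which matches the roughly 3 percent difference between the fastest and slowest patterns. Individual runs within a single column vary by a similar amount, so the gain is close to the measurement noise.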
See Also
None