|Authors||Hryvniak M., Shcherbyna Yu.|
|Name of the article||An approach to dynamic additional training of decision trees|
|Abstract||Most algorithms for constructing a decision tree do not support additional (incremental) training. Once the model has been constructed, it cannot be improved on the basis of additional data; it can only be rebuilt from scratch. The ability to improve a decision tree using additional data without rebuilding it could give significant benefits in some cases, such as reduced memory consumption on disk or increased performance when loading a new portion of data. And since decision trees are used to solve real problems, their dynamic improvement would open up the possibility of dynamic (iterative) optimization of processes and solutions.
An attempt was made to find a generalized approach to dynamic additional training of decision tree ML models. A web application was developed to build, save, additionally train, test, and display decision trees in a convenient way, using algorithms from the Accord .NET Framework. The proposed approach was tested with this web application, and conclusions were drawn about its expediency and area of applicability. Testing on two data sets showed that the approach can be effective. However, it is applicable to a narrow class of problems, with conditions imposed on the amount and content of the data in each portion passed to the algorithm. In other words, a sufficiently informative model must be built first, and later portions only refine it. The first portions should be representative of the whole set and contain the whole variety of outputs; otherwise, the proposed approach leads to a dramatic loss of information.
The following conditions are required for the algorithm to be effective:
1. Each portion of data should contain tuples with all possible outputs.
2. The size of each portion should be large enough (depending on the data set).
|PDF format||Hryvniak M., Shcherbyna Yu. |
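The portion-wise refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' Accord.NET implementation; it is a hypothetical pure-Python example in which a tree, built from a first representative portion, keeps class counts in its leaves, so later portions refine leaf predictions without rebuilding the structure. All names (`Leaf`, `Node`, `route`, `additional_training`) are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' Accord.NET implementation):
# a tiny fixed decision tree whose leaves keep class counts, so new data
# portions can refine leaf predictions without rebuilding the tree.
from collections import Counter

class Leaf:
    def __init__(self):
        self.counts = Counter()          # class-label frequencies seen so far
    def update(self, label):
        self.counts[label] += 1
    def predict(self):
        return self.counts.most_common(1)[0][0]   # majority vote

class Node:
    def __init__(self, feature, threshold, left, right):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right

def route(node, x):
    # Walk the tree until a leaf is reached.
    while isinstance(node, Node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node

def additional_training(root, portion):
    # Route each new tuple to its leaf and update the class counts there;
    # the tree's structure stays fixed, only leaf statistics change.
    for x, y in portion:
        route(root, x).update(y)

# Hypothetical tree built from a first, representative portion:
root = Node(feature=0, threshold=0.5, left=Leaf(), right=Leaf())
additional_training(root, [((0.2,), "A"), ((0.9,), "B")])   # first portion
additional_training(root, [((0.1,), "A"), ((0.8,), "B")])   # new portion refines leaves
print(route(root, (0.3,)).predict())   # "A"
```

This also makes the two conditions above concrete: if a portion is missing some output class entirely, the corresponding leaves never accumulate counts for it, which matches the "dramatic loss of information" the abstract warns about.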