Continual learning (CL) is the ability of an intelligent system to acquire and retain knowledge from a stream of data with minimal computational overhead. Several families of approaches, including regularization, replay, architecture-based methods, and parameter isolation, have been introduced to achieve this goal. Parameter isolation uses sparse networks to allocate different parts of a neural network to different tasks while sharing parameters between similar tasks. Dynamic Sparse Training (DST) is commonly used to find and isolate these sparse networks for each task. This study fills a research gap by empirically investigating the effect of different DST components under the CL paradigm, with the aim of identifying the optimal configuration for CL. It conducts a comprehensive evaluation of various DST components on the CIFAR100 and miniImageNet benchmarks in a task-incremental CL setup. The focus is on evaluating the performance of different DST criteria rather than the process of mask selection. The study reveals that at low sparsity levels, Erdős-Rényi Kernel (ERK) initialization uses the backbone efficiently and facilitates effective learning of task increments. At high sparsity levels, however, uniform initialization demonstrates more reliable and robust performance. The performance of the growth strategy depends on the initialization strategy and the extent of sparsity. Lastly, incorporating adaptivity within DST components shows promise for enhancing continual learners.
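
To make the contrast between ERK and uniform initialization concrete, the following is a minimal sketch. It assumes the commonly used ERK rule, in which a layer's density is proportional to the sum of its dimensions divided by their product and layers that would exceed full density are kept dense. The function names, layer shapes, and the 0.2 overall density are illustrative assumptions, not the configuration used in the study.

```python
import math


def uniform_densities(shapes, density):
    """Uniform initialization: every layer keeps the same fraction of weights."""
    return {name: density for name in shapes}


def erk_densities(shapes, density):
    """ERK-style allocation (assumed rule): layer density is proportional to
    sum(shape) / prod(shape), rescaled so the overall density matches `density`.
    Layers whose scaled density would exceed 1 are kept fully dense."""
    if not 0.0 < density < 1.0:
        raise ValueError("density must be strictly between 0 and 1")
    sizes = {name: math.prod(shape) for name, shape in shapes.items()}
    budget = density * sum(sizes.values())  # total number of weights to keep
    dense = set()
    while True:
        # Raw ERK score for every layer that is not already forced dense.
        raw = {n: sum(s) / sizes[n] for n, s in shapes.items() if n not in dense}
        remaining = budget - sum(sizes[n] for n in dense)
        eps = remaining / sum(raw[n] * sizes[n] for n in raw)
        worst = max(raw, key=raw.get)
        if raw[worst] * eps <= 1.0:
            break
        dense.add(worst)  # this layer cannot be sparsified at this budget
    result = {n: 1.0 for n in dense}
    result.update({n: raw[n] * eps for n in raw})
    return result


# Hypothetical layer shapes: (out, in, kernel_h, kernel_w) or (out, in).
shapes = {"conv1": (64, 3, 3, 3), "conv2": (128, 64, 3, 3), "fc": (100, 128)}
for name, d in erk_densities(shapes, density=0.2).items():
    print(f"{name}: ERK density {d:.3f}, uniform density 0.200")
```

Under this rule, small layers such as the first convolution stay dense and the larger layers absorb most of the sparsity, whereas uniform allocation prunes every layer by the same fraction.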