Sparse-matrix arithmetic operations in computer clusters: A text feature selection application

Abstract

Arithmetic operations on matrices are used frequently in scientific computing, where their high computational complexity often makes them a performance bottleneck. In this context, the parallel processing of matrix operations in distributed environments has become an important field of study. This work presents several strategies for distributing sparse-matrix arithmetic operations on computer clusters, focusing on the intrinsic characteristics of the operations and of the matrices involved. The proposed strategies for determining the number of parallel tasks to execute on the cluster were evaluated using a high-dimensional feature selection approach. Additionally, the performance of two alternatives for efficiently representing large-scale sparse matrices was tested. Experimental results showed that the proposed strategies significantly reduce the computing time of matrix operations, outperforming serial and multi-threaded implementations.