High-Performance Computing (HPC) systems have grown increasingly complex and have a significant impact on the economy and society. Their high energy consumption, however, is a critical issue in the face of environmental and energy crises. It is therefore crucial to develop strategies that optimize the management of HPC systems, delivering both top-tier performance and improved energy efficiency. One such strategy is to predict job failures before the jobs execute on the system, so that resource allocation and scheduling can be adjusted accordingly. This paper focuses on job failure prediction at submit-time using machine learning algorithms, combined with Natural Language Processing (NLP) tools to represent jobs. The approach is also designed to operate online on a real system. The study uses a dataset from an HPC center in Italy, and the experimental results are promising.