Introduction

An overview of the HPC Dashboard for SLURM clusters

Welcome to the HPC Dashboard! This powerful Next.js application provides comprehensive monitoring and management tools for your SLURM-based High-Performance Computing cluster. With real-time visualization, detailed job tracking, and AI-powered assistance, the HPC Dashboard transforms how you interact with your computing resources.

Open Source Philosophy

HPC Dashboard is proudly open-source and designed for the HPC community! 🎉 We believe in creating accessible, powerful tools that make cluster management more intuitive.

Contribute

Interested in helping improve the HPC Dashboard? Check out our GitHub repository to get started! From feature suggestions to bug fixes, all contributions are welcome.

Project Overview

HPC Dashboard is a comprehensive toolkit designed for modern cluster administrators and users. Key features include:

  • Real-time Monitoring: Track cluster utilization, job status, and node health as they change
  • Interactive Visualizations: Intuitive, graphical representation of complex cluster data
  • AI-powered Chat: Get answers about SLURM commands, job details, and best practices
  • Embeddings Support: Search documentation and receive contextually relevant answers

Key Features

| Feature | Description |
| --- | --- |
| Node Monitoring | Real-time visualization of CPU, memory, and GPU utilization |
| Job Tracking | Detailed job information with status updates and history |
| SLURM Integration | Direct interfacing with SLURM's REST API |
| OpenAI Chat Assistant | AI-powered support for SLURM queries and cluster information |
| Prometheus Metrics | Integration with Prometheus for advanced performance metrics |
| Documentation Search | Embeddings-based search for finding relevant information |
| Module System Integration | Display and search available LMOD/Environment modules |
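
To give a feel for what interfacing with SLURM's REST API can involve, here is a minimal sketch of building a node query for slurmrestd and reducing the response to a utilization summary. It is not the dashboard's actual code. The `SlurmNode` shape, the helper names `buildNodesRequest` and `summarizeNodes`, and the API version string passed in are assumptions for illustration; the fields available on a node record differ between slurmrestd versions, so check the OpenAPI spec for your Slurm release.

```typescript
// Hypothetical subset of a slurmrestd node record (assumed fields; they vary by API version).
interface SlurmNode {
  name: string;
  state: string[]; // assumed to be a list of state flags, e.g. ["IDLE"] or ["MIXED", "DRAIN"]
  cpus: number;
  alloc_cpus: number;
}

interface ClusterSummary {
  totalCpus: number;
  allocatedCpus: number;
  utilization: number; // fraction between 0 and 1
  nodesByState: Record<string, number>;
}

// Build the URL and JWT auth headers for a "list nodes" call.
// apiVersion (for example "v0.0.40") is an assumption and depends on the Slurm release.
function buildNodesRequest(
  baseUrl: string,
  apiVersion: string,
  user: string,
  token: string
): { url: string; headers: Record<string, string> } {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/slurm/${apiVersion}/nodes`,
    headers: {
      "X-SLURM-USER-NAME": user,
      "X-SLURM-USER-TOKEN": token,
    },
  };
}

// Reduce a list of node records to CPU utilization and a count of nodes per primary state.
function summarizeNodes(nodes: SlurmNode[]): ClusterSummary {
  const nodesByState: Record<string, number> = {};
  let totalCpus = 0;
  let allocatedCpus = 0;

  for (const node of nodes) {
    totalCpus += node.cpus;
    allocatedCpus += node.alloc_cpus;
    const primary = node.state[0] ?? "UNKNOWN";
    nodesByState[primary] = (nodesByState[primary] ?? 0) + 1;
  }

  const utilization = totalCpus === 0 ? 0 : allocatedCpus / totalCpus;
  return { totalCpus, allocatedCpus, utilization, nodesByState };
}
```

In a server-side route, the result of `buildNodesRequest` could be passed to `fetch(url, { headers })` and the node list from the JSON response handed to `summarizeNodes`. Keeping the summary logic a pure function makes it easy to test without a live cluster.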

Technology Stack

HPC Dashboard is built on the following technologies:

  • Next.js - Server components and app router architecture for dynamic content
  • TypeScript - Type-safe coding for reliability and maintainability
  • Tailwind CSS - Utility-first styling for responsive, clean interfaces
  • Shadcn/UI - Elegant, accessible components built on Radix UI
  • AI SDK - OpenAI integration for intelligent assistant capabilities
  • Drizzle ORM - Type-safe database operations for embedding storage
  • PostgreSQL Vector DB - Efficient storage and retrieval of document embeddings
  • Recharts - Interactive data visualization components
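
The embeddings-based documentation search rests on one idea: documents and queries are turned into vectors, and the closest vectors are treated as the most relevant. The sketch below illustrates that ranking step in plain TypeScript using cosine similarity. It is an in-memory illustration only. The `DocChunk` type and the `cosineSimilarity` and `topMatches` helpers are hypothetical names, and in the dashboard itself this comparison would presumably run inside PostgreSQL rather than in application code.

```typescript
// Hypothetical shape of a stored documentation chunk and its embedding vector.
interface DocChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction, so treat it as unrelated
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
function topMatches(query: number[], chunks: DocChunk[], k: number): DocChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The matched chunks can then be supplied to the chat assistant as context, which is how a question about a SLURM command can be answered from the indexed documentation.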

Directory Structure

actions/
app/
  admin/
  api/
  login/
  modules/
  network/
  rewind/
  favicon.ico
  globals.css
components/
data/
docs/
  slurm-overview.mdx
images/
lib/
public/
types/
utils/
.env

Deployment

The HPC Dashboard is best deployed locally using PM2.

For detailed deployment instructions, check out our Deployment Guide.

Support and Community

We're dedicated to making HPC management more accessible and intuitive through this dashboard, and we welcome feedback and contributions from the community!

