Overview
CosmicAC provides managed compute for machine learning workloads. Infrastructure setup can delay execution and divert attention from model development. CosmicAC abstracts this setup, allowing jobs to run immediately and scale as needed without manual server reconfiguration.
Job Types
CosmicAC supports several job types for different ML workflows.
GPU Container
High-performance containers with direct GPU access for training, experimentation, and development.
GPU containers let you:
- Run on-demand GPU compute without managing infrastructure.
- Access GPU hardware directly through secure device plugins.
- Work in VM-level isolated environments for secure, dedicated compute.
- Maintain full control over your environment — install packages, run scripts, and configure as needed.
npx cosmicac jobs init
npx cosmicac jobs create
npx cosmicac jobs list
npx cosmicac jobs shell <jobId> <containerId>

See Getting Started: Creating a GPU Container Job to create your first container.
Managed Inference
Run inference on open-source models like Qwen through a managed API.
Managed Inference lets you:
- Access open-source models without deploying or managing serving infrastructure.
npx cosmicac inference init
npx cosmicac inference list-models
npx cosmicac inference chat --message "Explain quantum computing."

See Getting Started: Creating a Managed Inference Job to deploy your first model.
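Under the hood, the `chat` command sends an authenticated request to the Managed Inference API. As a rough sketch only (the endpoint URL, payload shape, model name, and `COSMICAC_API_KEY` variable are illustrative assumptions, not documented API details), an equivalent HTTP call might be built like this:

```python
import json
import os
import urllib.request

# Hypothetical endpoint -- consult the Managed Inference reference
# for the real API contract.
API_URL = "https://inference.cosmicac.example/v1/chat"

def build_chat_request(model: str, message: str) -> dict:
    """Build a chat request payload for the (hypothetical) inference API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }

def prepare_post(payload: dict) -> urllib.request.Request:
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('COSMICAC_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

payload = build_chat_request("qwen", "Explain quantum computing.")
request = prepare_post(payload)
```

The CLI handles API key management and request construction for you; a sketch like this is only useful when integrating the inference API directly into an application.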
Continued Pre-training
Extend base models on your own data for domain-specific tasks.
Continued Pre-training lets you:
- Train on your own datasets.
- Save checkpoints at intervals during training.
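To illustrate interval-based checkpointing (the interval and step counts below are illustrative, not CosmicAC defaults), a scheduler that saves at a fixed step interval and always persists the final model behaves like this:

```python
def checkpoint_steps(total_steps: int, interval: int) -> list[int]:
    """Return the training steps at which a checkpoint is saved,
    including the final step so the finished model is always persisted."""
    steps = list(range(interval, total_steps + 1, interval))
    if total_steps not in steps:
        steps.append(total_steps)
    return steps

# A 1000-step run with checkpoints every 300 steps:
print(checkpoint_steps(1000, 300))  # → [300, 600, 900, 1000]
```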
Why CosmicAC?
Minimal setup — Submit jobs via the CLI or web interface. CosmicAC provisions GPU resources and schedules your workload automatically, with no manual server requests or environment configuration.
Secure, isolated environments — Each workload runs inside a KubeVirt virtual machine, providing VM-level isolation while maintaining direct GPU access.
Fast provisioning — Start workloads in minutes, not days. CosmicAC replaces manual SLURM-based workflows with automated provisioning and scheduling.
Built-in inference serving — Deploy models instantly via the Managed Inference API. CosmicAC handles API key authentication, load balancing, and service discovery.
Real-time notifications — Receive email and push notifications when costs exceed thresholds or errors occur.
Who is CosmicAC for?
| Role | Use Case |
|---|---|
| ML Engineers | Train models, run experiments |
| Data Scientists | Deploy inference pipelines |
| Software Engineers | Integrate inference API into applications |
| DevOps Teams | Manage GPU infrastructure at scale |
Core Architecture
CosmicAC uses Kubernetes for orchestration and KubeVirt for secure workload isolation. Kubernetes schedules containers, allocates GPU resources, and manages job lifecycle. KubeVirt runs each workload in an isolated virtual machine without requiring privileged containers, applying standard Kubernetes security controls (RBAC, SELinux, network policies) while exposing GPU devices through secure device plugins.
Kubernetes Implementation
CosmicAC uses Kubernetes as its core orchestration layer, replacing manual SLURM-based workflows with automated provisioning and scheduling.
| Before (SLURM) | After (Kubernetes) |
|---|---|
| Request servers manually | Submit jobs via CosmicAC |
| Configure SLURM manually | Infrastructure provisioned automatically |
| Set up the environment yourself | Containers scheduled automatically |
| Wait days for setup | Start workloads in minutes |
See System Components for detailed documentation of the architecture.
What's next?
- Getting Started: Installation — Install and configure the CosmicAC CLI.
- Getting Started: GPU Container Job — Create your first container.
- Getting Started: Managed Inference Job — Deploy a model and run inference.