Overview
CosmicAC provides managed compute for machine learning workloads. Infrastructure setup can delay execution and divert attention from model development. CosmicAC abstracts this setup, allowing jobs to run immediately and scale as needed without manual server reconfiguration.
Job Types
CosmicAC supports several job types for different ML workflows.
GPU Container
High-performance containers with direct GPU access for training, experimentation, and development.
GPU containers let you:
- Run on-demand GPU compute without managing infrastructure.
- Access GPU hardware directly through secure device plugins.
- Work in VM-level isolated environments for secure, dedicated compute.
- Maintain full control over your environment — install packages, run scripts, and configure as needed.
npx cosmicac jobs init
npx cosmicac jobs create
npx cosmicac jobs list
npx cosmicac jobs shell <jobId> <containerId>

See Getting Started: Creating a GPU Container Job to create your first container.
Managed Inference
Run inference on open-source models like Qwen through a managed API.
Managed Inference lets you:
- Access open-source models without deploying or managing serving infrastructure.
npx cosmicac inference init
npx cosmicac inference list-models
npx cosmicac inference chat --message "Explain quantum computing."

See Getting Started: Creating a Managed Inference Job to deploy your first model.
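Under the hood, the `chat` command sends an authenticated request to the Managed Inference API. As a rough sketch only (the endpoint URL, payload shape, model name, and `COSMICAC_API_KEY` variable are illustrative assumptions, not documented API details), an equivalent HTTP call might be built like this:

```python
import json
import os
import urllib.request

# Hypothetical endpoint -- consult the Managed Inference reference
# for the real API contract.
API_URL = "https://inference.cosmicac.example/v1/chat"

def build_chat_request(model: str, message: str) -> dict:
    """Build a chat request payload for the (hypothetical) inference API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }

def prepare_post(payload: dict) -> urllib.request.Request:
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('COSMICAC_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

payload = build_chat_request("qwen", "Explain quantum computing.")
request = prepare_post(payload)
```

The CLI handles API key management and request construction for you; a sketch like this is only useful when integrating the inference API directly into an application.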
Continued Pre-training
Extend base models on your own data for domain-specific tasks.
Continued Pre-training lets you:
- Train on your own datasets.
- Save checkpoints at intervals during training.
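To illustrate interval-based checkpointing (the interval and step counts below are illustrative, not CosmicAC defaults), a scheduler that saves at a fixed step interval and always persists the final model behaves like this:

```python
def checkpoint_steps(total_steps: int, interval: int) -> list[int]:
    """Return the training steps at which a checkpoint is saved,
    including the final step so the finished model is always persisted."""
    steps = list(range(interval, total_steps + 1, interval))
    if total_steps not in steps:
        steps.append(total_steps)
    return steps

# A 1000-step run with checkpoints every 300 steps:
print(checkpoint_steps(1000, 300))  # → [300, 600, 900, 1000]
```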
Why CosmicAC?
Minimal setup — Submit jobs via the CLI or web interface. CosmicAC provisions GPU resources and schedules your workload automatically, with no manual server requests or environment configuration.
Secure, isolated environments — Each workload runs inside a KubeVirt virtual machine, providing VM-level isolation while maintaining direct GPU access.
Fast provisioning — Start workloads in minutes, not days. CosmicAC replaces manual SLURM-based workflows with automated provisioning and scheduling.
Built-in inference serving — Deploy models instantly via the Managed Inference API. CosmicAC handles API key authentication, load balancing, and service discovery.
Real-time notifications — Receive email and push notifications when costs exceed thresholds or errors occur.
Who is CosmicAC for?
| Role | Use Case |
|---|---|
| ML Engineers | Train models, run experiments |
| Data Scientists | Deploy inference pipelines |
| Software Engineers | Integrate inference API into applications |
| DevOps Teams | Manage GPU infrastructure at scale |
Core Architecture
CosmicAC uses Kubernetes for orchestration and KubeVirt for secure workload isolation. Kubernetes schedules containers, allocates GPU resources, and manages job lifecycle. KubeVirt runs each workload in an isolated virtual machine without requiring privileged containers, applying standard Kubernetes security controls (RBAC, SELinux, network policies) while exposing GPU devices through secure device plugins.
Kubernetes Implementation
CosmicAC uses Kubernetes as its core orchestration layer, replacing manual SLURM-based workflows with automated provisioning and scheduling.
| Before (SLURM) | After (Kubernetes) |
|---|---|
| Request servers manually | Submit jobs via CosmicAC |
| Configure SLURM manually | Infrastructure provisioned automatically |
| Set up the environment yourself | Containers scheduled automatically |
| Wait days for setup | Start workloads in minutes |
See System Components for detailed documentation of the architecture.
What's next?
- Getting Started: Installation — Install and configure the CosmicAC CLI.
- Getting Started: GPU Container Job — Create your first container.
- Getting Started: Managed Inference Job — Deploy a model and run inference.