GPU Compute Built for Serious AI.
RTX 3090 nodes with 24 GB GDDR6X memory, 10,496 CUDA cores, and sub-60-second provisioning. Pre-baked PyTorch and TF images. Priced for the experiment that runs at midnight.
- 24 GB GDDR6X VRAM
- $0.15 per GPU hour
- < 60 s provisioning time
- 10,496 CUDA cores
Built for these workloads
Architecture
24 GB VRAM. Serious compute. Hourly pricing.
RTX 3090 with Ampere architecture, NVLink support, and GDDR6X memory bandwidth tuned for the throughput patterns of large-model inference and fine-tuning at 8-bit.
- 24 GB GDDR6X · 936 GB/s memory bandwidth
- 10,496 CUDA cores · 328 tensor cores (3rd gen)
- NVIDIA NVLink for multi-GPU jobs
- PCIe 4.0 · NVMe-backed persistent volumes
- Pre-baked images: PyTorch 2.x, TF 2.x, CUDA 12.x
# stream logs from running job
$ bhk gpu logs gpu-node-01 --follow
→ Epoch 3/50 · loss=2.089 · 87% GPU
How It Works
From key to running in seconds.
Set an API key, launch a node, and push your first job in under two minutes.
Get an API Key
Log in at ai.bhkcloud.com/dashboard and generate a BHK_API_KEY. One key controls GPU, storage, and billing.
Launch a Node
bhk gpu launch --type rtx3090 provisions a node with your chosen image. SSH access in under 60 seconds.
Run Your Job
SSH in, mount your BHK S3 datasets, and run. Billing starts at launch and stops at termination, metered to the second.
Terminate & Pay
bhk gpu terminate gpu-node-01. Billing stops. No idle charges. No minimums. Invoice at month-end.
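
The four steps above can be sketched as a small script that assembles the CLI calls for one launch, logs, and terminate cycle. This is an illustration only: it uses just the commands shown on this page, the helper name `lifecycle_commands` is hypothetical, `gpu-node-01` is the example node name used here, and combining `--image` with `--type` in a single call is an assumption.

```python
import shlex


def lifecycle_commands(node, gpu_type="rtx3090", image=None):
    """Return the argv lists for one launch -> logs -> terminate cycle.

    Nothing is executed here; the lists can be passed to subprocess.run
    on a machine where the bhk CLI is installed and BHK_API_KEY is set.
    """
    launch = ["bhk", "gpu", "launch", "--type", gpu_type]
    if image is not None:
        # Assumption: --image may be combined with --type in one call.
        launch += ["--image", image]
    return [
        launch,
        ["bhk", "gpu", "logs", node, "--follow"],
        ["bhk", "gpu", "terminate", node],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for argv in lifecycle_commands("gpu-node-01"):
        print(shlex.join(argv))
```

Returning plain argument lists keeps the sketch testable without network access or a real account.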
Cluster Profiles
Right-size for your workload.
From single-card experiments to multi-node training clusters, starting at $0.15/hr for a single GPU.
| Profile | GPUs | VRAM | CUDA Cores | Ideal For | Price / hr |
|---|---|---|---|---|---|
| Single | 1× RTX 3090 | 24 GB | 10,496 | Inference, fine-tuning, experiments | $0.15 |
| Dual | 2× RTX 3090 | 48 GB | 20,992 | Larger models, parallel inference | $0.20 |
| Quad (Popular) | 4× RTX 3090 | 96 GB | 41,984 | 30B model training, distributed inference | $0.40 |
| Octa | 8× RTX 3090 | 192 GB | 83,968 | 70B+ model training, production inference | $0.80 |
| Custom | 16–256× | 384 GB+ | Scalable | Dedicated clusters, reserved capacity | Custom |
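
As a rough aid for reading the table, the sketch below picks the smallest listed profile whose total VRAM covers a given requirement. The figures are copied from the table above; the function name `smallest_profile` is hypothetical, and the total VRAM of a multi-GPU profile is only usable if the job is actually sharded across the cards.

```python
# (GPU count, total VRAM in GB, price per hour in USD), copied from the table.
# Ordered from smallest to largest profile.
PROFILES = {
    "Single": (1, 24, 0.15),
    "Dual": (2, 48, 0.20),
    "Quad": (4, 96, 0.40),
    "Octa": (8, 192, 0.80),
}


def smallest_profile(required_vram_gb):
    """Return the name of the smallest profile with enough total VRAM.

    Returns None when nothing fits, which corresponds to the Custom row.
    Note: total VRAM is spread over several 24 GB cards, so a job larger
    than 24 GB must be split across GPUs to use it.
    """
    for name, (_gpus, vram_gb, _price) in PROFILES.items():
        if vram_gb >= required_vram_gb:
            return name
    return None


if __name__ == "__main__":
    for need in (20, 40, 100, 300):
        print(need, "GB ->", smallest_profile(need) or "Custom")
```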
Platform
Infrastructure without the overhead.
Everything a machine learning engineer needs, nothing a procurement team invented.
Pre-baked ML images
PyTorch 2.x, TensorFlow 2.x, CUDA 12.x, and cuDNN 9 ready to pull. Custom Docker images via --image flag.
Persistent NVMe volumes
Attach SSD-backed volumes that survive node restarts. Snapshot and clone between regions in one API call.
Co-located S3 storage
Stream training data directly from BHK S3 at 2–4 GB/s without paying egress between compute and storage.
SSH & REST access
Shell in directly or drive everything through the REST API. Both are first-class citizens, not afterthoughts.
Orchestration-ready
Terraform provider, Pulumi SDK, and Kubernetes CSI driver. Bring your existing infra-as-code workflow.
Encrypted at rest & in transit
AES-256 at rest, TLS 1.3 in transit. VRAM is wiped on node termination before the hardware returns to pool.
FAQ
GPU cloud, answered.
Questions we get asked before the first bhk gpu launch.
What workloads run well on the RTX 3090?
The RTX 3090's 24 GB of VRAM suits LLM inference (models up to ~30B parameters with 4-bit quantization, or roughly 20B at 8-bit), image generation (Stable Diffusion XL, ComfyUI, Flux), model fine-tuning with LoRA/QLoRA, batch rendering, and video encoding. Anything that fits in 24 GB runs cleanly on a single GPU.
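
A quick way to sanity-check whether a model fits is to estimate the memory its weights need: parameters × bits per weight ÷ 8. The sketch below is a back-of-the-envelope estimate, not a guarantee. It counts weights only, so the KV cache, activations, and framework overhead come on top, and the function names are hypothetical.

```python
def weights_gb(num_params, bits_per_weight):
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def weights_fit(num_params, bits_per_weight, vram_gb=24.0):
    """True if the weights alone fit in vram_gb. Runtime overhead is NOT included."""
    return weights_gb(num_params, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    for params, bits in ((7e9, 16), (13e9, 8), (30e9, 4)):
        size = weights_gb(params, bits)
        print(f"{params / 1e9:.0f}B @ {bits}-bit: {size:.1f} GB, fits={weights_fit(params, bits)}")
```

Leave a few gigabytes of headroom beyond the weight estimate for context length and batch size.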
How does hourly GPU billing work?
You are billed at an hourly rate for the time your GPU node is running, pro-rated to the second. There are no minimum commitments. Spin up for a single experiment and terminate when done. Billing stops the moment you call bhk gpu terminate or terminate via the dashboard.
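
Per-second pro-rating can be illustrated with a short calculation. This is a sketch of the arithmetic only, assuming the $0.15 hourly rate listed for a single GPU; how the invoice rounds the final figure is not stated on this page, and the function name is hypothetical.

```python
def prorated_cost(seconds_running, hourly_rate=0.15):
    """Cost in USD for a node billed hourly but pro-rated to the second."""
    if seconds_running < 0:
        raise ValueError("seconds_running must be non-negative")
    return hourly_rate * seconds_running / 3600


if __name__ == "__main__":
    # A 90-minute experiment (5,400 seconds) on one GPU.
    print(f"${prorated_cost(5400):.4f}")  # prints $0.2250
```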
How fast is GPU provisioning?
Most nodes are ready within 60 seconds. Pre-baked base images for PyTorch 2.x, TensorFlow 2.x, and CUDA 12.x eliminate environment setup so your training job can start immediately after SSH access is available.
Can I use my own Docker image?
Yes. BHK GPU nodes support custom Docker images via bhk gpu launch --image docker.io/yourrepo/yourimage:tag. The image is pulled and cached at the region edge. Private registries are supported with credential injection through the API.
Do you support multi-GPU jobs?
Yes. Nodes are available in 1×, 2×, 4×, and 8× RTX 3090 configurations. For distributed training across multiple GPUs, use the Quad or Octa profiles with NVLink-bridged inter-GPU bandwidth. Larger clusters are available via the enterprise plan.
How is storage connected to GPU nodes?
GPU nodes and BHK S3 buckets are co-located on the same internal network. Direct intra-cluster transfers run at 2–4 GB/s with no egress fees. You can also attach persistent NVMe volumes for checkpoint storage that survive node restarts.
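
To put the 2–4 GB/s figure in context, the sketch below estimates how long a dataset takes to stream at each end of that range. It assumes the quoted throughput is sustained for the whole transfer, which real workloads may not achieve, and the function name is hypothetical.

```python
def transfer_window(dataset_gb, low_gb_per_s=2.0, high_gb_per_s=4.0):
    """Return (best_case_seconds, worst_case_seconds) for streaming dataset_gb."""
    if dataset_gb < 0 or low_gb_per_s <= 0 or high_gb_per_s <= 0:
        raise ValueError("sizes must be non-negative and rates positive")
    return dataset_gb / high_gb_per_s, dataset_gb / low_gb_per_s


if __name__ == "__main__":
    best, worst = transfer_window(500)  # a 500 GB dataset
    print(f"{best:.0f} s to {worst:.0f} s")  # prints 125 s to 250 s
```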
Ready to launch your first node?
Set your API key, pick a cluster profile, and be running in under 60 seconds. No commitments.