In this guide, you'll learn how to get your own GPU machine in the cloud using NVIDIA Brev to build, run, and test your own models.
NVIDIA Brev is a cloud service that offers GPU machines for rent on demand. They come preconfigured with Docker and NVIDIA drivers, which makes them a great fit for working with Cog.
Brev connects to multiple cloud providers like GCP, AWS, Lambda Labs, and others to find the right type of GPU for the best price possible.
Brev is a developer-friendly tool with a command line interface that makes it easy to create and manage your GPU machines.
You can build and push most Cog models from machines that don't have GPUs, like your laptop, or a GitHub Actions runner.
This guide helps when you need to iterate on your model and run it as you make code changes, before you push it to Replicate. For that, you'll usually need a GPU development environment.
To get started, go to console.brev.dev and sign up for an account.
Creating an account is free, and Brev has simple pricing: you only pay for what you use.
Install the Brev CLI using Homebrew with the following command:
brew install brevdev/homebrew-brev/brev
You can create and manage instances using the Brev website, but this guide uses the CLI for the sake of "brev"ity. (Sorry. Couldn't resist.)
Use the CLI to log in to Brev:
brev login
This will prompt you to open a link in your browser to authenticate with Brev.
Now that you've logged in, create an instance and give it a name:
brev create my-dev-box \
--gpu fancy-gpu
You should see output like the following:
$ brev create my-dev-box
Creating instance my-dev-box in org bapj27zsc
name my-dev-box
GPU instance n1-highmem-4:nvidia-tesla-t4:1
Cloud GCP
⡿ Creating your instance. Hang tight 🤙
⣟ Instance is deploying
Your instance is ready!
Connect to the instance:
brev open my-dev-box # brev open <NAME> -> open instance in VS Code
brev shell my-dev-box # brev shell <NAME> -> ssh into instance (shortcut)
Now that you've created your instance, open a shell on it:
brev shell my-dev-box
This opens a new terminal session connected to your GPU instance over SSH.
Cog is Replicate's open-source tool that makes it easy to put a machine learning model in a Docker container.
Install the Cog CLI on your Brev instance:
sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
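The backticked uname -s and uname -m calls in the download URL expand to the operating system name and CPU architecture of the instance. As a rough illustration, the Python sketch below rebuilds the same URL with the standard library. The helper name cog_download_url is hypothetical, and the sketch assumes platform.system() and platform.machine() report the same values that uname prints, which is typically the case on Linux.

```python
import platform

# Base URL copied from the install command above.
RELEASE_BASE = "https://github.com/replicate/cog/releases/latest/download"


def cog_download_url(system=None, machine=None):
    """Hypothetical helper: build the release URL that the curl command resolves to.

    Falls back to the current machine's values when arguments are omitted.
    """
    system = system or platform.system()      # stands in for `uname -s`
    machine = machine or platform.machine()   # stands in for `uname -m`
    return f"{RELEASE_BASE}/cog_{system}_{machine}"


if __name__ == "__main__":
    # Print the URL for whatever machine this runs on.
    print(cog_download_url())
```

Printing the URL before downloading can be a quick way to confirm that a release asset exists for your instance's architecture.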
To verify that your new instance is working properly, you can run a prediction on an existing model on Replicate.
Run the following command in the terminal to download the fofr/sdxl-emoji model and run it locally on your new instance to generate an emoji of the shaka symbol, AKA the "call me hand": 🤙
cog predict r8.im/fofr/sdxl-emoji@sha256:dee76b5afde21b0f01ed7925f0665b7e879c50ee718c5f78a9d38e04d523cc5e \
-i 'width=1024' \
-i 'height=1024' \
-i 'prompt="A TOK emoji of a hand doing the shaka symbol, AKA call me hand"'
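Each -i flag in the command above passes one model input as a name=value pair. If you later want to script predictions, a minimal sketch along these lines could assemble the same argument list from a dictionary. The helper build_predict_args is hypothetical and not part of Cog; it only builds and prints the command, and does not run it.

```python
import shlex

# Model reference copied from the command above.
MODEL = (
    "r8.im/fofr/sdxl-emoji@sha256:"
    "dee76b5afde21b0f01ed7925f0665b7e879c50ee718c5f78a9d38e04d523cc5e"
)


def build_predict_args(model, inputs):
    """Hypothetical helper: turn a dict of inputs into a `cog predict` argument list."""
    args = ["cog", "predict", model]
    for name, value in inputs.items():
        # One `-i name=value` pair per input.
        args += ["-i", f"{name}={value}"]
    return args


args = build_predict_args(MODEL, {"width": 1024, "height": 1024})

# shlex.join renders the list as a single shell-safe command line for inspection.
print(shlex.join(args))
```

Because the result is a list, it could be handed to subprocess.run without going through a shell, which avoids the nested quoting needed on the command line.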
If you see successful output from this task, you've made great progress!
Now that you know Cog is working, create a new model from scratch:
mkdir my-model && cd my-model
cog init
This will output something like the following:
Setting up the current directory for use with Cog...
✅ Created my-model/cog.yaml
✅ Created my-model/predict.py
✅ Created my-model/.dockerignore
✅ Created my-model/.github/workflows/push.yaml
If you're new to creating models on Replicate, check out the guide to push your first Cog model.
Brev can open your cloud instance in VS Code Remote, Cursor, or your favorite IDE. This lets you search and edit all the files on your Brev instance as if they were on your local machine:
Use brev open to open your instance in your editor:
brev open my-dev-box
In your editor, overwrite the predict.py file with the following code and save it:
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        self.prefix = "hello"

    def predict(self, text: str = Input(description="Text to prefix with 'hello '")) -> str:
        return self.prefix + " " + text
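Cog calls setup() once when the model starts and predict() once per prediction, so anything expensive belongs in setup(). To see that call order without installing Cog, here is a minimal plain-Python stand-in. The class name PlainPredictor is hypothetical and is not part of the Cog API; it simply mirrors the two methods from predict.py.

```python
class PlainPredictor:
    """Hypothetical stand-in that mirrors predict.py without importing cog."""

    def setup(self):
        # Runs once: prepare any state shared across predictions.
        self.prefix = "hello"

    def predict(self, text):
        # Runs for every prediction: reuse the state prepared in setup().
        return self.prefix + " " + text


predictor = PlainPredictor()
predictor.setup()  # one-time initialization

# Several predictions reuse the same prepared state.
outputs = [predictor.predict(text) for text in ["world", "Brev"]]
print(outputs)
```

This kind of stand-in is only useful for checking your own logic quickly; running cog predict remains the real test, since it exercises the container and input validation.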
Now that you've made some changes to your model code, run it with Cog using your existing Brev shell session, or the built-in terminal if you're using VS Code or Cursor:
cd my-model
cog predict -i text="world"
You should see output like the following:
hello world
You've just built an AI model on a GPU instance in the cloud!
You also have a fast and flexible cloud environment for iterating on your model and testing it before pushing it to Replicate.
Happy hacking! ✨🤙✨
Use brev ls to list your instances. This is handy if you forget what you named your new instance!
Use brev stop to stop your instance when you're not using it to avoid incurring charges.