Official models are always on and have predictable pricing. We maintain them in collaboration with their authors to make sure they're high quality.
A number of models have worked this way for a while; we're now giving them a name to make it clear which ones work this way.
The way you call these models is a little different. If you're using a client library, you don't need to specify a version. For black-forest-labs/flux-1.1-pro in Node.js, for example:
import Replicate from "replicate";
const replicate = new Replicate();

const output = await replicate.run(
  "black-forest-labs/flux-1.1-pro",
  { input: { prompt: "A t-rex on a skateboard looking cool" } }
);
If you're using the HTTP API, you use the POST /models/<owner>/<name>/predictions endpoint, and you don't need to specify a version. For example:
curl https://api.replicate.com/v1/models/black-forest-labs/flux-1.1-pro/predictions \
  --request POST \
  --header "Authorization: Bearer $REPLICATE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --header "Prefer: wait" \
  --data @- <<'EOM'
{
  "input": {
    "prompt": "A t-rex on a skateboard looking cool"
  }
}
EOM
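The Prefer: wait header asks the API to hold the request open until the prediction finishes (up to a timeout), so the response includes the output directly. Without it, the response returns a prediction you can poll until it completes.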
Nothing has changed about how you run other models. The best way to find out how to run any model is the API documentation on its model page.
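For comparison, here's a rough sketch of how a non-official model is typically run with the Node.js client, pinned to an explicit version. The model name and version hash below are hypothetical placeholders:

// Hypothetical model name and version hash, shown only to illustrate the
// owner/name:version form used when running other models.
const output = await replicate.run(
  "some-owner/some-model:0000000000000000000000000000000000000000000000000000000000000000",
  { input: { prompt: "A t-rex on a skateboard looking cool" } }
);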
Instead of being charged for the amount of time a model runs, you're charged by output. For example, black-forest-labs/flux-1.1-pro is priced per image it generates; for other models the unit might be the number of tokens or the length of a video.
You can find each model's pricing in the pricing section of its model page.
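As a rough illustration of output-based billing, assuming a hypothetical price of $0.04 per image (check the model's pricing section for the real number):

// Hypothetical per-image price: the cost depends on how many images you
// generate, not on how long each generation takes.
const pricePerImage = 0.04;
const imagesGenerated = 100;
console.log(`Estimated cost: $${(pricePerImage * imagesGenerated).toFixed(2)}`); // "$4.00"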
Here are some of the models that are now official:
Take a look at the official models collection for the full list.