Using the Chat API

The Chat API endpoint is used to generate text with Cohere LLMs. This endpoint facilitates a conversational interface, allowing users to send messages to the model and receive text responses.

In Python:

import cohere
co = cohere.Client(api_key="<YOUR API KEY>")

response = co.chat(
  model="command-r-plus",
  message="Write a title for a blog post about API design. Only output the title text."
)

print(response.text) # "The Art of API Design: Crafting Elegant and Powerful Interfaces"
The same request in Java:

import com.cohere.api.Cohere;
import com.cohere.api.requests.ChatRequest;
import com.cohere.api.types.NonStreamedChatResponse;

public class ChatPost {
  public static void main(String[] args) {
    Cohere cohere = Cohere.builder().token("<YOUR API KEY>").build();

    NonStreamedChatResponse response = cohere.chat(
        ChatRequest.builder()
            .message("Write a title for a blog post about API design. Only output the title text.")
            .model("command-r-plus")
            .build());

    System.out.println(response.getText()); // "The Art of API Design: Crafting Elegant and Powerful Interfaces"
  }
}
And the equivalent in JavaScript:

const { CohereClient } = require('cohere-ai');

const cohere = new CohereClient({
  token: '<YOUR API KEY>',
});

(async () => {
  const response = await cohere.chat({
    model: 'command-r-plus',
    message: 'Write a title for a blog post about API design. Only output the title text.',
  });

  console.log(response.text);
})();

Response Structure

Below is a sample response from the Chat API:

{
    "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
    "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
    "chat_history": [
        {
            "role": "USER",
            "message": "Write a title for a blog post about API design. Only output the title text."
        },
        {
            "role": "CHATBOT",
            "message": "The Art of API Design: Crafting Elegant and Powerful Interfaces"
        }
    ],
    "finish_reason": "COMPLETE",
    "meta": {
        "api_version": {
            "version": "1"
        },
        "billed_units": {
            "input_tokens": 17,
            "output_tokens": 12
        },
        "tokens": {
            "input_tokens": 83,
            "output_tokens": 12
        }
    }
}

Every response contains the following fields:

  • text: the generated message from the model.
  • generation_id: the ID corresponding to this response. It can be used together with the Feedback API endpoint to promote great responses and flag bad ones.
  • chat_history: the conversation presented in a chat log format.
  • finish_reason: can be one of the following:
    • COMPLETE: the model successfully finished generating the message.
    • MAX_TOKENS: the model's context limit was reached before the generation could be completed.
  • meta: contains information such as token counts and billing.
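Because the raw payload is plain JSON, these fields can be unpacked directly. The dictionary below simply copies the values from the sample response above to show where each field lives:

```python
# Sample payload copied from the response shown above.
sample_response = {
    "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
    "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
    "chat_history": [
        {"role": "USER", "message": "Write a title for a blog post about API design. Only output the title text."},
        {"role": "CHATBOT", "message": "The Art of API Design: Crafting Elegant and Powerful Interfaces"},
    ],
    "finish_reason": "COMPLETE",
    "meta": {
        "api_version": {"version": "1"},
        "billed_units": {"input_tokens": 17, "output_tokens": 12},
        "tokens": {"input_tokens": 83, "output_tokens": 12},
    },
}

# The generated text and why generation stopped.
print(sample_response["text"])
print(sample_response["finish_reason"])  # COMPLETE

# Billed token counts live under meta.billed_units.
billed = sample_response["meta"]["billed_units"]
print(billed["input_tokens"], billed["output_tokens"])  # 17 12
```

In the Python SDK these same fields are available as attributes on the response object (e.g. response.text, response.finish_reason).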

Multi-turn conversations

The user message in the Chat request can be sent together with a chat_history to provide the model with conversational context:

import cohere
co = cohere.Client(api_key="<YOUR API KEY>")

message = "Can you tell me about LLMs?"

response = co.chat(
  model="command-r-plus",
  chat_history=[
    {"role": "USER", "text": "Hey, my name is Michael!"},
    {"role": "CHATBOT", "text": "Hey Michael! How can I help you today?"},
  ],
  message=message
)

print(response.text) # "Sure thing Michael, LLMs are ..."

Instead of hardcoding the chat_history, we can build it up dynamically as the conversation progresses.

chat_history = []
max_turns = 10

for _ in range(max_turns):
  # Get the user input
  message = input("Send the model a message: ")

  # Generate a response with the current chat history
  response = co.chat(
    message=message,
    temperature=0.3,
    chat_history=chat_history
  )
  answer = response.text

  print(answer)

  # Add the message and the answer to the chat history
  user_message = {"role": "USER", "text": message}
  bot_message = {"role": "CHATBOT", "text": answer}

  chat_history.append(user_message)
  chat_history.append(bot_message)
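The append logic above can be checked without calling the API. The small helper below is purely illustrative (it is not part of the Cohere SDK); it packages one user/chatbot exchange in the format the chat_history parameter expects:

```python
def append_turn(chat_history, user_text, bot_text):
    """Append one USER/CHATBOT exchange in the chat_history format.

    Illustrative helper, not part of the Cohere SDK.
    """
    chat_history.append({"role": "USER", "text": user_text})
    chat_history.append({"role": "CHATBOT", "text": bot_text})
    return chat_history

history = []
append_turn(history, "Hey, my name is Michael!", "Hey Michael! How can I help you today?")
append_turn(history, "Can you tell me about LLMs?", "Sure thing Michael, LLMs are ...")

print(len(history))        # 4: two turns, each a USER plus a CHATBOT message
print(history[0]["role"])  # USER
```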

Using conversation_id to Save Chat History

Providing the model with the conversation history is one way to have a multi-turn conversation. Cohere offers another option for users who do not wish to manage the conversation history themselves: a user-defined conversation_id, which tells the endpoint to store and reuse the history on Cohere's side.

import cohere
co = cohere.Client("<YOUR API KEY>")

response = co.chat(
  model="command-r-plus",
  message="The secret word is 'fish', remember that.",
  conversation_id='user_defined_id_1',
)

answer = response.text

Then, if you wanted to continue the conversation, you could do so like this (keeping the id consistent):

response2 = co.chat(
  model="command-r-plus",
  message="What is the secret word?",
  conversation_id='user_defined_id_1'
)

print(response2.text) # "The secret word is 'fish'"

Note that conversation_id should not be used in conjunction with chat_history; the two options are mutually exclusive.
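Since the two mechanisms are mutually exclusive, application code can guard against passing both. The wrapper below is a hypothetical convenience, not an SDK feature; chat_fn stands in for co.chat from the examples above, and the stub is only there to show the guard in action:

```python
def safe_chat(chat_fn, message, chat_history=None, conversation_id=None, **kwargs):
    """Call a chat function, refusing the mutually exclusive arguments.

    Hypothetical guard, not part of the Cohere SDK; chat_fn would be co.chat.
    """
    if chat_history is not None and conversation_id is not None:
        raise ValueError("Pass either chat_history or conversation_id, not both.")
    if chat_history is not None:
        kwargs["chat_history"] = chat_history
    if conversation_id is not None:
        kwargs["conversation_id"] = conversation_id
    return chat_fn(message=message, **kwargs)

# Stub standing in for co.chat, just to exercise the guard locally.
def fake_chat(message, **kwargs):
    return {"message": message, **kwargs}

print(safe_chat(fake_chat, "Hi", conversation_id="user_defined_id_1"))

try:
    safe_chat(fake_chat, "Hi", chat_history=[], conversation_id="user_defined_id_1")
except ValueError as e:
    print(e)  # Pass either chat_history or conversation_id, not both.
```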