Updating responsible use. (#193)
* Updating responsible use.

* Updating responsible use.

* Updating responsible use.

* Updating responsible use.

* Adding filepaths.

* Fixing filepath.

* Adding images.

* Adding images.

* Additional changes.

* Removing the second graph, per Diane's request.

* Misc updates.

* Adding links, minor edits.

* Making final edits.

* Last changes.

* Final edits.

---------

Co-authored-by: Trent Fowler <[email protected]>
trentfowlercohere and Trent Fowler authored Nov 4, 2024
1 parent 23a1589 commit 2311745
Showing 5 changed files with 66 additions and 15 deletions.
Binary file added fern/assets/images/responsible_use_1.png
Binary file added fern/assets/images/responsible_use_2.png
73 changes: 60 additions & 13 deletions fern/pages/responsible-use/responsible-use.mdx
@@ -2,27 +2,74 @@
title: "Overview"
slug: "docs/responsible-use"

hidden: true
description: "The Responsible Use documentation provides guidelines for developers to use language models ethically and constructively, including model cards to communicate strengths and weaknesses, a data statement, and measures for harm prevention such as a dedicated safety team and external advisory council."
hidden: false
description: This doc provides guidelines for using Cohere language models ethically and constructively.
image: "../../assets/images/5d25315-cohere_docs_preview_image_1200x630_copy.jpg"
keywords: "AI safety, AI risk, responsible AI"

createdAt: "Thu Sep 01 2022 19:22:12 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Fri Mar 15 2024 04:47:51 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Fri Oct 25 2024 10:51:00 GMT+0000 (Coordinated Universal Time)"
---
The Responsible Use documentation aims to guide developers in using language models constructively and ethically. Toward this end, we've published [guidelines](/docs/usage-guidelines) for using our API safely, as well as our processes around [harm prevention](#harm-prevention). We provide model cards to communicate the strengths and weaknesses of our models and to encourage responsible use (motivated by [Mitchell, 2019](https://arxiv.org/pdf/1810.03993.pdf)). We also provide a [data statement](/data-statement) describing our pre-training datasets (motivated by [Bender and Friedman, 2018](https://www.aclweb.org/anthology/Q18-1041/)).
This documentation aims to guide developers in using language models constructively and ethically. To this end, we've included information below on how our Command R and Command R+ models perform on important safety benchmarks, on their intended (and unintended) use cases, on toxicity, and on other technical specifications.

**Model Cards:**
[NOTE: This page was updated on October 31st, 2024.]

- [Generation](/docs/generation-benchmarks)
- [Representation](/docs/representation-benchmarks)
## Safety Benchmarks

If you have feedback or questions, please feel free to [let us know](mailto:[email protected]) — we are here to help.
The safety of our Command R and Command R+ models has been evaluated on the BOLD (Biases in Open-ended Language Generation) dataset (Dhamala et al., 2021), which contains nearly 24,000 prompts that test for biases based on profession, gender, race, religion, and political ideology.
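
As a rough illustration of what a BOLD-style probe can look like in practice, here is a minimal sketch. The Hugging Face dataset id (`AlexaAI/bold`), the open-source `detoxify` scorer, and the sample size are assumptions made for the example; this is not the exact harness behind the numbers reported here.

```python
# Hypothetical BOLD-style toxicity probe (not our exact evaluation harness).
import cohere
from datasets import load_dataset   # pip install datasets
from detoxify import Detoxify       # pip install detoxify

co = cohere.Client("YOUR_API_KEY")  # placeholder key
scorer = Detoxify("original")       # stand-in toxicity scorer, for illustration

bold = load_dataset("AlexaAI/bold", split="train")  # assumed dataset id
scores = []
for row in bold.select(range(50)):  # small sample, for illustration only
    prompt = row["prompts"][0]      # each BOLD row carries a list of prompts
    reply = co.chat(model="command-r-plus", message=prompt)
    scores.append(scorer.predict(reply.text)["toxicity"])

print(f"Mean toxicity over the sample: {sum(scores) / len(scores):.4f}")
```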

## Harm Prevention
Overall, both models show little bias, and their generations are very rarely toxic. That said, some differences in bias remain between the two, as measured by sentiment and regard scores on the "Gender" and "Religion" categories. Command R+, the more powerful model, tends to display slightly less bias than Command R.

We aim to mitigate adverse use of our models with the following:
Below, we report differences in sentiment and regard between privileged and minoritized groups for gender, race, and religion.

- **Responsible AI Research:** We’ve established a dedicated safety team which conducts [research](https://arxiv.org/abs/2108.07790) and development to build safer language models, and we’re investing in technical (e.g., usage monitoring) and non-technical (e.g., a dedicated team reviewing use cases) measures to mitigate potential harms.
- **Cohere Responsibility Council:** We’ve established an external advisory council made up of experts who work with us to ensure that the technology we’re building is deployed safely for everyone.
- **No online learning:** To safeguard model integrity and prevent underlying models from [being poisoned](https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist) with harmful content by adversarial actors, user input goes through curation and enrichment prior to integration with training.
![Differences between privileged and minoritized groups in sentiment and regard, across gender, race, and religion](../../assets/images/responsible_use_1.png)

## Intended Use Cases
Command R models are trained for sophisticated text generation (including natural text, summarization, code, and markdown) and for complex [Retrieval Augmented Generation](https://docs.cohere.com/docs/retrieval-augmented-generation-rag) (RAG) and [tool-use](https://docs.cohere.com/docs/tool-use) tasks.
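
As a brief sketch of the RAG pattern these models are trained for, the snippet below grounds a Chat call in caller-supplied documents; the documents and query are invented for this example.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# The documents and question below are invented for illustration.
response = co.chat(
    model="command-r-plus",
    message="What is the refund window for annual plans?",
    documents=[
        {"title": "Refund policy", "snippet": "Annual plans may be refunded within 30 days of purchase."},
        {"title": "Billing FAQ", "snippet": "Monthly plans renew automatically and are non-refundable."},
    ],
)

print(response.text)       # answer grounded in the documents
print(response.citations)  # spans tying the answer back to the documents
```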

Command R models support 23 languages, including 10 that are key to global business (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Chinese, and Arabic). While they perform strongly in these 10 languages, the other 13 are lower-resource and less rigorously evaluated.

## Unintended and Prohibited Use Cases
We do not recommend using the Command R models on their own for decisions that could have a significant impact on individuals, including those related to access to financial services, employment, and housing.

Cohere’s [Usage Guidelines](https://cohere.com/responsibility) and customer agreements contain details about prohibited use cases, like social scoring, inciting violence or harm, and misinformation or other political manipulation.

## Usage Notes
For general guidance on how to responsibly leverage the Cohere platform, we recommend you consult our [Usage Guidelines](https://docs.cohere.com/docs/usage-guidelines) page.

In the next few sections, we offer some model-specific usage notes.

### Model Toxicity and Bias
Language models learn the statistical relationships present in their training data, which may include toxic language and historical biases along dimensions of race, gender, sexual orientation, ability, language, and culture, as well as their intersections. We recommend that developers be especially attuned to the risks of toxic degeneration and the reinforcement of historical social biases.

#### Toxic Degeneration
Models have been trained on a wide variety of text from many sources, some of which contain toxic content (see Luccioni and Viviano, 2021). As a result, models may generate toxic text. This may include obscenities, sexually explicit content, and messages that mischaracterize or stereotype groups of people based on problematic historical biases perpetuated by internet communities (see Gehman et al., 2020 for more on toxic language model degeneration).

We have put safeguards in place to avoid generating harmful text, and while they are effective (see the "Safety Benchmarks" section above), it is still possible to encounter toxicity, especially over long conversations with multiple turns.

#### Reinforcing Historical Social Biases
Language models capture problematic associations and stereotypes that are prominent on the internet and in society at large. They should not be used to make decisions about individuals or the groups they belong to. For example, it can be dangerous to use model outputs in CV-ranking systems because of known biases (Nadeem et al., 2020).

## Technical Notes
Below, we discuss some details of the underlying models that should be kept in mind.

### Language Limitations
These models are designed to excel at English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Chinese, and Arabic, and to generate well in 13 other languages. They will sometimes respond in languages beyond these, but those generations are unlikely to be reliable.

### Sampling Parameters
A model's generation quality is highly dependent on its sampling parameters. Please consult [the documentation](https://docs.cohere.com/docs/advanced-generation-hyperparameters) for details about each parameter, and tune the values for your application. Parameters may require re-tuning when a new model is released.
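
For instance, sampling parameters can be passed directly to the Chat endpoint; the values below are illustrative starting points, not recommendations:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

response = co.chat(
    model="command-r",
    message="Summarize the key risks of deploying language models without human review.",
    temperature=0.3,  # lower values make output more deterministic
    p=0.9,            # nucleus (top-p) sampling cutoff
    k=0,              # 0 disables top-k filtering
)
print(response.text)
```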

### Prompt Engineering
Performance on generation tasks may improve when examples are provided as part of the system prompt. See [the documentation](https://docs.cohere.com/docs/crafting-effective-prompts) for examples of how to do this.
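
As a minimal sketch, examples can be embedded in the preamble (system prompt) of a Chat call; the classification task and examples here are invented for illustration.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Few-shot examples in the preamble; the task is invented for this example.
preamble = """You classify support emails into one of: Shipping, Billing, Account.

Examples:
Email: "My order never arrived." -> Shipping
Email: "I was charged twice this month." -> Billing
"""

response = co.chat(
    model="command-r",
    preamble=preamble,
    message='Email: "How do I reset my password?" ->',
)
print(response.text)  # expected: Account
```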

### Potential for Misuse
Here we describe potential concerns around misuse of the Command R models, drawing on the NAACL Ethics Review Questions. By documenting adverse use cases, we aim to empower customers to prevent adversarial actors from leveraging their applications for the following malicious ends.

The examples in this section are not comprehensive; they are more model-specific and tangible than those in the Usage Guidelines, and serve only to illustrate our understanding of potential harms. Each of these malicious use cases violates our Usage Guidelines and Terms of Use, and Cohere reserves the right to restrict API access at any time.

- **Astroturfing:** Generated text used to provide the illusion of discourse or expression of opinion by members of the public, on social media or any other channel.
- **Generation of misinformation and other harmful content:** The generation of news or other articles that manipulate public opinion, or of any content that aims to incite hate or mischaracterize a group of people.
- **Human-outside-the-loop:** The generation of text that could be used to make important decisions about people, without a human-in-the-loop.
4 changes: 3 additions & 1 deletion fern/v1.yml
@@ -290,7 +290,9 @@ navigation:
- link: Security
href: https://cohere.ai/security
- page: Usage Guidelines
path: pages/responsible-use/responsible-use/usage-guidelines.mdx
path: pages/responsible-use/responsible-use/usage-guidelines.mdx
- page: Responsibly Using Cohere Models
path: pages/responsible-use/responsible-use.mdx
- section: Cohere for AI
contents:
- page: Cohere For AI Acceptable Use Policy
4 changes: 3 additions & 1 deletion fern/v2.yml
@@ -286,7 +286,9 @@ navigation:
- link: Security
href: https://cohere.ai/security
- page: Usage Guidelines
path: pages/responsible-use/responsible-use/usage-guidelines.mdx
path: pages/responsible-use/responsible-use/usage-guidelines.mdx
- page: Responsibly Using Cohere Models
path: pages/responsible-use/responsible-use.mdx
- section: Cohere for AI
contents:
- page: Cohere For AI Acceptable Use Policy
