Cape's Confidential LLMs

The Cape API provides the option to invoke a Llama 2 7B Chat LLM that is securely hosted within an enclave. This secure enclave offers a significant advantage: it ensures the confidentiality of both the prompt and the model response during computation.

In this context, "confidential" means that the data being processed is not visible to anyone, including system administrators and cloud providers. As a result, with the Cape LLM you don't need to redact Personally Identifiable Information (PII) before invoking the model.

Next, let's look at how to call a confidential LLM.

Get a Cape API Key

This tutorial assumes you have an environment variable CAPE_API_KEY that contains an API key. You will need to sign up for a Cape account, after which you can get an API key here.
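As a quick sanity check before making any requests, the sketch below reads the key from the environment and fails early with a clear message if it is missing. The helper name load_cape_api_key is hypothetical and not part of any Cape library; it accepts an optional mapping so it can be tried without touching the real environment.

```python
import os
from typing import Mapping, Optional


def load_cape_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of CAPE_API_KEY, or raise if it is unset or empty.

    Hypothetical helper for this tutorial; not part of a Cape SDK.
    """
    env = os.environ if env is None else env
    key = env.get("CAPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CAPE_API_KEY is not set; create an API key and export it first."
        )
    return key
```

Calling load_cape_api_key() with no arguments reads from the process environment, so a missing key surfaces before the request is sent rather than as an authorization error afterwards.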

Using the Cape API to call Cape's Confidential LLMs

Here's how to use Python to make a streaming request to the Cape API, targeting Llama 2 7B Chat hosted in an enclave. Please see the API docs for more details. To learn more about how to prompt Llama 2, you can read this blog post from Hugging Face.

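import os

import requests

url = "https://api.capeprivacy.com/v1/cape/completions"

# Request body: the model to call, the Llama 2 formatted prompt, and sampling options.
payload = {
    "model": "llama2-7b-chat",
    "prompt": "<s>[INST] <<SYS>>You are a helpful Assistant.<</SYS>>\n\nWhat is the capital of France? [/INST]",
    "max_tokens": 50,
    "temperature": 0.8,
    "stream": True,
}

# The API key is read from the CAPE_API_KEY environment variable.
headers = {
    "content-type": "application/json",
    "Authorization": f"Bearer {os.getenv('CAPE_API_KEY')}",
}

# stream=True lets requests yield lines as they arrive instead of buffering the whole body.
response = requests.post(url, json=payload, headers=headers, stream=True)
for line in response.iter_lines():
    print(str(line, 'utf-8'))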
Output
{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " ", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " Ah", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " an", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " excellent", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " question", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "!", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " The", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " capital", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " of", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " France", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " is", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " none", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " other", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " than", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " Paris", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": ".", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": " ", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

data: {"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

data: {"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": null}]}

{"id": "cmpl-82ff486e-ada4-4a42-ab2b-68676d84e31e", "object": "text_completion", "created": 1692141838, "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin", "choices": [{"text": "", "index": 0, "logprobs": null, "finish_reason": "stop"}]}

[DONE]
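
Each streamed line carries one small piece of the reply in choices[0].text, so the full answer has to be stitched together on the client. The following is a minimal sketch of that step, assuming the line shapes shown in the output above (JSON objects, some with a "data: " prefix, blank separator lines, and a final [DONE] marker). The function name iter_completion_text is hypothetical, and the sample lines are shortened for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_completion_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the "text" field of each streamed chunk, skipping non-JSON lines."""
    for raw in lines:
        line = raw.strip()
        # Some lines in the output above carry a "data:" prefix; drop it if present.
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        # Blank separator lines and the final [DONE] marker carry no text.
        if not line or line == "[DONE]":
            continue
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            yield choice.get("text", "")


# Shortened sample lines in the same shape as the streamed output.
sample = [
    '{"choices": [{"text": " The", "index": 0, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"text": " capital", "index": 0, "finish_reason": null}]}',
    "[DONE]",
]
full_text = "".join(iter_completion_text(sample))
```

With a live request, the same function could be fed the decoded lines, for example (str(line, 'utf-8') for line in response.iter_lines()).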

If you would rather receive the whole response in a single chunk instead of streaming it, pass "stream": False.

Output
{
    "id": "cmpl-8892ef77-7acb-4fe9-bb9c-c823914bf912",
    "object": "text_completion",
    "created": 1692142217,
    "model": "/app/model/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin",
    "choices": [
        {
            "text": " Ah, an excellent question! The capital of France is... (drumroll please) Paris! 🇫🇷 Yes, the City of Light, the City of Love, the City of Art, and the City",
            "index": 0,
            "logprobs": null,
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 34,
        "completion_tokens": 50,
        "total_tokens": 84
    }
}
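
To make the non-streaming case easier to reuse, here is a rough sketch that builds the request body and reads the result. The helper names (build_prompt, build_payload, extract_completion) are hypothetical and not part of the Cape API; the prompt template simply mirrors the one used in the request earlier. The code also assumes that a finish_reason of "length" means the reply was cut off at max_tokens, which is consistent with completion_tokens equalling 50 in the output above.

```python
from typing import Any, Dict, Tuple


def build_prompt(system: str, user: str) -> str:
    """Wrap a system and user message in the same template as the tutorial prompt."""
    return f"<s>[INST] <<SYS>>{system}<</SYS>>\n\n{user} [/INST]"


def build_payload(system: str, user: str, max_tokens: int = 50) -> Dict[str, Any]:
    """Build a non-streaming request body with the same fields as the tutorial payload."""
    return {
        "model": "llama2-7b-chat",
        "prompt": build_prompt(system, user),
        "max_tokens": max_tokens,
        "temperature": 0.8,
        "stream": False,
    }


def extract_completion(response_json: Dict[str, Any]) -> Tuple[str, bool]:
    """Return (text, truncated) for the first choice of a completion response.

    Assumption: finish_reason == "length" indicates the reply hit max_tokens.
    """
    choice = response_json["choices"][0]
    return choice["text"], choice.get("finish_reason") == "length"
```

The payload can be sent with requests.post(url, json=build_payload(...), headers=headers), and the parsed body from response.json() passed to extract_completion. If the second value is True, raising max_tokens is one way to get a complete answer.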

Voilà! You have invoked a confidential Llama 2 7B Chat using the Cape API.