Every ChappieClient maintains a private transcript and a thread ID. The transcript grows with each call to send, stream, or response. This accumulated context lets the model remember earlier turns and refer back to them, but it also consumes tokens. Chappie exposes a set of async methods on ChappieClient that let you inspect, measure, reset, restore, and summarize that context at any point, so you control what the model sees without managing message arrays yourself.

Snapshot the current context

Call context() to get a point-in-time copy of the client’s internal state. The returned ChappieClientContext is a plain Codable value that holds no reference to the live client, so you can store it, archive it to disk, or pass it to another part of your app.
let snapshot = await client.context()

print("Session:  \(snapshot.sessionID)")
print("Thread:   \(snapshot.threadID)")
print("Model:    \(snapshot.model)")
print("Messages: \(snapshot.messageCount)")
print("~Tokens:  \(snapshot.estimatedTokens)")

sessionID (String): A UUID string identifying this client instance. Stays the same across context clears unless you resume a different context.
threadID (String): A UUID string used as the prompt cache key for the current conversation thread. Resets when you call clearContext().
model (String): The model selected on the client at the time of the snapshot.
messages ([ChappieInputMessage]): The full message history accumulated so far in the current thread.
compactedSummary (String?): A compaction summary written either by you or by the model. When non-nil, Chappie prepends it to the next request as a system message so the model has continuity even after messages has been cleared.
createdAt (Date): The timestamp when this context (or the last clear/resume) was started.
updatedAt (Date): The timestamp of the most recent mutation: turn recorded, model change, compaction, or resume.
messageCount (Int): A computed shorthand for messages.count.
estimatedTokens (Int): A local token estimate based on character counts, useful for deciding when to compact. Not a substitute for server-reported usage.
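
Because the snapshot is described as a plain Codable value, archiving it to disk can be sketched with Foundation's JSONEncoder and JSONDecoder. The StoredContext struct below is a hypothetical stand-in that mirrors a few of the documented fields so the example runs without the library. In an app you would presumably encode the ChappieClientContext itself; the helper names archive and restore, the file name, and the model string are all illustrative assumptions.

```swift
import Foundation

// Hypothetical stand-in mirroring a subset of the documented snapshot fields.
// It is NOT the library's type; it exists only so this sketch compiles on its own.
struct StoredContext: Codable, Equatable {
    var sessionID: String
    var threadID: String
    var model: String
    var compactedSummary: String?
    var createdAt: Date
    var updatedAt: Date
}

// Encode any Encodable value as JSON and write it to disk atomically.
func archive<T: Encodable>(_ value: T, to url: URL) throws {
    let data = try JSONEncoder().encode(value)
    try data.write(to: url, options: .atomic)
}

// Read a previously archived value back from disk.
func restore<T: Decodable>(_ type: T.Type, from url: URL) throws -> T {
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode(type, from: data)
}

let original = StoredContext(
    sessionID: UUID().uuidString,
    threadID: UUID().uuidString,
    model: "example-model", // placeholder model name
    compactedSummary: nil,
    createdAt: Date(timeIntervalSince1970: 1_700_000_000),
    updatedAt: Date(timeIntervalSince1970: 1_700_000_600)
)

// Illustrative location; a real app would likely pick a persistent directory.
let fileURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("context-\(original.sessionID).json")

try archive(original, to: fileURL)
let reloaded = try restore(StoredContext.self, from: fileURL)
```

Keeping the archive and restore helpers generic means the same two functions should work unchanged for any Codable value, including the real snapshot type.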

Measure context size

Call contextSize(contextWindow:) to get a ChappieContextSize snapshot that pairs the local estimate with a known context window size. Pass the model’s actual context window — for example, 272_000 for a 272k-token model — to get remaining-token and percentage-used values.
let size = await client.contextSize(contextWindow: 272_000)

print("Messages:          \(size.messageCount)")
print("Estimated tokens:  \(size.estimatedTokens)")
print("Remaining tokens:  \(size.remainingTokens ?? 0)")
if let used = size.usedPercent {
    print("Used:              \(String(format: "%.1f", used))%")
}

messageCount (Int): The number of messages in the current transcript.
estimatedTokens (Int): A heuristic token count derived from visible character counts in the stored messages.
contextWindow (Int?): The context window size you passed in, or nil if you called contextSize() without one.
remainingTokens (Int?): contextWindow - estimatedTokens, floored at zero. nil when contextWindow is not set.
usedPercent (Double?): estimatedTokens / contextWindow × 100, capped at 100. nil when contextWindow is not set or is zero.
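
The two derived values follow directly from the formulas above, and a small worked example makes the flooring and capping behavior concrete. SizeEstimate is a hypothetical stand-in written for this sketch, not the library's ChappieContextSize; treating a negative window the same as a zero window is an assumption of the sketch.

```swift
// Hypothetical stand-in for the documented size fields; not the library's type.
struct SizeEstimate {
    let estimatedTokens: Int
    let contextWindow: Int?

    // contextWindow - estimatedTokens, floored at zero; nil without a window.
    var remainingTokens: Int? {
        guard let window = contextWindow else { return nil }
        return max(0, window - estimatedTokens)
    }

    // estimatedTokens / contextWindow × 100, capped at 100.
    // nil without a window or when the window is zero (or negative, by assumption).
    var usedPercent: Double? {
        guard let window = contextWindow, window > 0 else { return nil }
        return min(100, Double(estimatedTokens) / Double(window) * 100)
    }
}

// 68,000 of 272,000 tokens used: 204,000 remaining, 25% used.
let quarterFull = SizeEstimate(estimatedTokens: 68_000, contextWindow: 272_000)

// The estimate exceeds the window: remaining floors at 0, percentage caps at 100.
let overflowing = SizeEstimate(estimatedTokens: 300_000, contextWindow: 272_000)

// No window supplied: both derived values are nil.
let unknownWindow = SizeEstimate(estimatedTokens: 68_000, contextWindow: nil)
```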

Clear context

Call clearContext() to wipe the transcript and start a fresh thread. The client assigns a new threadID, clears messages and compactedSummary, and resets createdAt. The next send, stream, or response call starts from a blank slate.
await client.clearContext()
let reply = try await client.send("Start over — what can you help me with?")

Resume a saved context

Pass a previously snapshotted ChappieClientContext back to the client to restore it:
// Snapshot before navigating away
let saved = await client.context()

// Later — restore it
await client.resumeContext(saved)
let reply = try await client.send("Where were we?")
You can also resume a context from its individual components when you’ve stored them separately, for example after rehydrating them from persistent storage:
await client.resumeContext(
    threadID: storedThreadID,
    sessionID: storedSessionID,
    model: storedModel,
    messages: storedMessages
)

Compact context

Compaction replaces the full messages array with a summary, reducing the tokens the model needs to process on the next turn while preserving conversational continuity.

Manual compaction — you write the summary

Call compactContext(summary:) with a string you compose yourself. Chappie stores it as compactedSummary, clears messages, and prepends the summary as a system message on the next request:
await client.compactContext(
    summary: "The user is comparing inventory alternatives for SKU-123. " +
             "They want to prioritize domestic suppliers with lead times under 5 days."
)

Automatic compaction — the model writes the summary

Call the throwing overload compactContext() with no arguments. Chappie sends the current transcript to the model with a built-in compaction prompt, waits for the model’s summary, then applies it:
do {
    let compacted = try await client.compactContext()
    print("Compacted to \(compacted.estimatedTokens) estimated tokens.")
} catch {
    print("Compaction failed: \(error)")
}
You can provide a custom prompt if the default compaction instruction doesn’t match your use case:
let compacted = try await client.compactContext(
    prompt: "Summarize the key decisions made so far, formatted as a bullet list."
)
Compact proactively — don’t wait until the context window is nearly full. Calling compactContext() while there’s still headroom gives the model enough context to write a thorough summary. A good trigger is when usedPercent exceeds 60–70%.
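
The trigger described above can be expressed as a small pure function that is easy to unit test. The name shouldCompact and the 65% default threshold are illustrative choices made for this sketch, not part of the library.

```swift
// Hypothetical helper: decides whether to compact, given a usedPercent value
// such as the one reported by contextSize(contextWindow:).
func shouldCompact(usedPercent: Double?, threshold: Double = 65) -> Bool {
    // Without a known context window there is no percentage to compare,
    // so this sketch declines to compact rather than guess.
    guard let used = usedPercent else { return false }
    return used >= threshold
}

let belowThreshold = shouldCompact(usedPercent: 40)
let aboveThreshold = shouldCompact(usedPercent: 70)
let noWindow = shouldCompact(usedPercent: nil)
```

One way to use it would be to check the usedPercent value after each turn and call compactContext() whenever the function returns true.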

Parallel conversations need separate clients

ChappieClient serializes turns through an internal coordinator, so calling send or stream concurrently on the same instance queues the calls rather than running them in parallel. More importantly, all turns share the same transcript — interleaving unrelated conversations on one client corrupts the history.
Create a separate ChappieClient for each independent conversation thread. Sharing a single client across parallel topics will mix those topics’ messages into the same transcript and produce incoherent model responses.
// ✅ Independent clients for independent conversations
let supportClient  = Chappie.client(harness: .default)
let inventoryClient = Chappie.client(harness: ChappieHarnessSamples.inventory)

async let supportReply   = supportClient.send("How do I reset my password?")
async let inventoryReply = inventoryClient.send("Check stock for SKU-789.")

let (support, inventory) = try await (supportReply, inventoryReply)
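
When conversations are created dynamically, one way to keep a separate client per conversation is a small registry keyed by a conversation ID. The sketch below is generic over the client type so it compiles without the library; ConversationClients, PlaceholderClient, and the factory closure are all hypothetical names, and in an app the closure would presumably build a ChappieClient.

```swift
import Foundation

// Hypothetical registry that hands out exactly one client per conversation ID.
final class ConversationClients<Client: AnyObject> {
    private var clients: [String: Client] = [:]
    private let lock = NSLock()
    private let makeClient: (String) -> Client

    init(makeClient: @escaping (String) -> Client) {
        self.makeClient = makeClient
    }

    // Returns the existing client for this conversation, or creates one.
    func client(for conversationID: String) -> Client {
        lock.lock()
        defer { lock.unlock() }
        if let existing = clients[conversationID] {
            return existing
        }
        let created = makeClient(conversationID)
        clients[conversationID] = created
        return created
    }

    // Drops the client once a conversation ends so its transcript can be released.
    func remove(_ conversationID: String) {
        lock.lock()
        defer { lock.unlock() }
        clients[conversationID] = nil
    }
}

// Placeholder client type used only to exercise the registry.
final class PlaceholderClient {
    let label: String
    init(label: String) { self.label = label }
}

let registry = ConversationClients<PlaceholderClient>(makeClient: { id in
    PlaceholderClient(label: id)
})

let supportConversation = registry.client(for: "support")
let inventoryConversation = registry.client(for: "inventory")
```

Repeated lookups with the same ID return the same instance, so each conversation keeps its own transcript, while different IDs never share one.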