This module provides a Python client for the SAP Agent Memory service (v1 API). It lets agents store, retrieve, and semantically search persistent memories, and record conversation messages grouped into logical message groups. The service handles vector embeddings automatically for memories — you store plain text, and the service makes it searchable by meaning.
Note
Memory extraction is the caller's responsibility. This client stores whatever text you pass
as content; it does not extract or summarize memories from conversation text on its own.
See the SAP Cloud SDK for Python installation guide for setup instructions. The agent memory module is included automatically.
You can import specific classes:
from sap_cloud_sdk.agent_memory import (
    create_client,
    AgentMemoryConfig,
    FilterDefinition,
    Memory,
    Message,
    MessageRole,
    RetentionConfig,
    SearchResult,
)
Or use a star import for convenience:
from sap_cloud_sdk.agent_memory import *
Use create_client() to get a client with automatic credential detection:
from sap_cloud_sdk.agent_memory import create_client
client = create_client()
memories = client.list_memories(agent_id="my-agent", invoker_id="user-123")
print(f"Found {len(memories)} memories")
create_client() reads credentials from the CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_*
environment variables (or a mounted volume on BTP). See the
Configuration section for the full variable table.
To specify credentials directly, pass a custom configuration:
from sap_cloud_sdk.agent_memory import create_client, AgentMemoryConfig
config = AgentMemoryConfig(
    base_url="https://<service-host>",
    token_url="https://<tenant>.authentication.<region>/oauth/token",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
client = create_client(config=config)
The context manager is optional, but it is the easiest way to ensure the client is closed even if an exception is raised:
with create_client() as client:
    memories = client.list_memories(agent_id="my-agent", invoker_id="user-123")
To close the client manually, call client.close().
close() is only for local cleanup. It does not commit, flush, or roll back data.
Each API call is independent and final once accepted by the service.
Calling methods after close() is supported.
agent_id: A stable identifier for the agent that owns the data, for example "hr-assistant" or
"support-bot". Chosen by the implementer; typically the name or ID of the AI agent.
invoker_id: Identifies the user or caller associated with the data, for example a user ID from
the application's auth system. Memories and messages are scoped to the combination of
agent_id and invoker_id.
Neither value is validated by the service — they are free-form strings. Consistent use across create, read, and search calls is the implementer's responsibility.
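Because the service does not validate these identifiers, it can help to build the pair once and reuse it for every call. The following is a minimal sketch of that idea; `MemoryScope` and `as_kwargs` are hypothetical names and are not part of the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryScope:
    """Hypothetical helper (not part of the SDK): one agent_id + invoker_id pair."""

    agent_id: str
    invoker_id: str

    def __post_init__(self):
        # The service accepts free-form strings, so reject obviously bad values locally.
        for name in ("agent_id", "invoker_id"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")

    def as_kwargs(self) -> dict:
        """Return keyword arguments to spread into client calls."""
        return {"agent_id": self.agent_id, "invoker_id": self.invoker_id}


scope = MemoryScope(agent_id="hr-assistant", invoker_id="user-123")
# With a real client, the same scope can be reused for every call, for example:
#   client.add_memory(**scope.as_kwargs(), content="...")
#   client.search_memories(**scope.as_kwargs(), query="...")
```

Passing one frozen object around makes it harder for a create call and a later search call to drift apart through a typo in either identifier.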
Texts with different words — or even different languages — can have the same meaning. "How to make pizza dough?" and "Italian flatbread preparation steps" are semantically similar despite sharing no words. To search a large corpus by meaning rather than exact keywords, the service uses vector embeddings.
An embedding model translates a text into a high-dimensional numeric vector. Texts with similar meaning produce vectors that point in a similar direction. The cosine similarity between two vectors measures that directional closeness: a value near 1.0 means the texts are semantically similar.
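The cosine similarity described above can be computed in a few lines. This sketch uses made-up three-dimensional vectors purely for illustration; real embeddings come from the service's embedding model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up vectors, not real embeddings.
query = [0.9, 0.1, 0.0]
close = [0.8, 0.2, 0.1]  # points in nearly the same direction as the query
far = [0.0, 0.1, 0.9]    # points in a very different direction

print(cosine_similarity(query, close) > cosine_similarity(query, far))  # True
```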
Example corpus:
- "Trains cross bridges"
- "Clouds block sunlight"
- "Rivers carve valleys"
- "Wolves hunt deer"
- "Engines power ships"
A search for "Sky illumination" returns "Clouds block sunlight" — closest in meaning, with the highest cosine similarity — even though the query shares no words with the result.
search_memories() uses this mechanism: you pass a natural-language query and a similarity
threshold, and the service returns the most semantically relevant stored memories.
Memories are persistent knowledge entries scoped to an agent_id + invoker_id pair.
The service generates a vector embedding for each memory automatically, enabling semantic search.
memory = client.add_memory(
    agent_id="my-agent",
    invoker_id="user-123",
    content="The user prefers dark mode and metric units.",
    metadata={"source": "preferences"},
)
print(memory.id)
# "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
Required fields:
- agent_id: Identifier of the agent that owns this memory.
- invoker_id: Identifier of the user or caller associated with this memory.
- content: The memory text (plain string).
Optional fields:
- metadata: Arbitrary key-value dict stored alongside the memory.
memory = client.get_memory(memory_id="<uuid>")
print(memory.content)
# "The user prefers dark mode and metric units."
update_memory performs a partial update;
omitted fields remain untouched.
Note
content and metadata are the only editable fields; memory_id identifies which memory to update and cannot be modified.
client.update_memory(
    memory_id="<uuid>",
    content="user prefers dark mode, metric units, and large font.",
    metadata={"source": "preferences", "version": 2},
)
client.delete_memory(memory_id="<uuid>")
memories = client.list_memories(
    agent_id="my-agent",
    invoker_id="user-123",
    limit=20,
)
for m in memories:
    print(f" [{m.id}] {m.content[:80]}")
# [a1b2c3d4-...] The user prefers dark mode and metric units.
# [b2c3d4e5-...] The user's timezone is Europe/Berlin.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent_id | str \| None | None | Filter by agent identifier. |
| invoker_id | str \| None | None | Filter by invoker/user identifier. |
| filters | list[FilterDefinition] \| None | None | Substring filters on "content" or "metadata". |
| limit | int | 50 | Maximum number of memories to return. |
| offset | int | 0 | Number of memories to skip (pagination). |
Returns: list[Memory]
Use FilterDefinition to narrow results by substring. Import it alongside create_client:
from sap_cloud_sdk.agent_memory import create_client, FilterDefinition
# Memories whose content contains "dark mode"
memories = client.list_memories(
    agent_id="my-agent",
    invoker_id="user-123",
    filters=[FilterDefinition(target="content", contains="dark mode")],
)
# Combined: content AND metadata must both match
memories = client.list_memories(
    agent_id="my-agent",
    invoker_id="user-123",
    filters=[
        FilterDefinition(target="content", contains="dark mode"),
        FilterDefinition(target="metadata", contains="preferences"),
    ],
)
target must be "content" or "metadata". Multiple clauses are combined with AND.
Warning
Defining two clauses with the same target produces an AND predicate that requires both substrings to be present in the same field simultaneously. This is rarely intentional — for example:
filters=[
    FilterDefinition(target="content", contains="user prefers"),
    FilterDefinition(target="content", contains="user doesn't prefer"),
]
Only memories whose content contains both substrings will be returned, which is typically an empty result set. OR combining across clauses is not yet supported.
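Until OR is supported, one possible workaround is to issue one list call per substring and merge the results on the client. The sketch below assumes only that returned items expose an `id` attribute, as the models in this guide do; `list_any` and `list_fn` are hypothetical names, not SDK functions.

```python
def list_any(list_fn, substrings):
    """Union of the results of one list call per substring, deduplicated by id.

    list_fn is a caller-supplied function that takes one substring and
    returns the matching items, each with an `id` attribute.
    """
    seen = set()
    merged = []
    for substring in substrings:
        for item in list_fn(substring):
            if item.id not in seen:  # keep the first occurrence only
                seen.add(item.id)
                merged.append(item)
    return merged


# With a real client, list_fn could be written as:
#   lambda s: client.list_memories(
#       agent_id="my-agent",
#       invoker_id="user-123",
#       filters=[FilterDefinition(target="content", contains=s)],
#   )
```

Each underlying call is still subject to its own limit, so a large result set may need pagination per substring before merging.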
Note
Metadata is stored as a JSON string. Filtering on "metadata" performs a free-text
substring match on the raw JSON — for example contains="preferences" matches any
metadata whose serialized form includes that word. Structured key-value filtering
(e.g. metadata.source == "preferences") is not supported.
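To make the AND and raw-JSON semantics concrete, the following sketch emulates them locally. It is an approximation under two assumptions: that the stored metadata string resembles the output of `json.dumps`, and that matching is case-sensitive. Neither is confirmed here, and `matches_filters` is a hypothetical name.

```python
import json


def matches_filters(record, filters):
    """Return True if every (target, substring) clause matches the record.

    record  -- dict with "content" (str) and "metadata" (dict)
    filters -- list of (target, substring) tuples; target is "content" or "metadata"
    """
    for target, substring in filters:
        if target == "content":
            haystack = record["content"]
        elif target == "metadata":
            # Assumption: the stored JSON string looks like json.dumps output.
            haystack = json.dumps(record["metadata"])
        else:
            raise ValueError(f"unsupported target: {target!r}")
        if substring not in haystack:
            return False  # AND semantics: one failed clause rejects the record
    return True


record = {
    "content": "The user prefers dark mode and metric units.",
    "metadata": {"source": "preferences"},
}
print(matches_filters(record, [("content", "dark mode"), ("metadata", "preferences")]))  # True
print(matches_filters(record, [("metadata", "source")]))  # True: key names match as well
```

The second call shows why a metadata filter can match more than intended: the substring is compared against keys and values alike.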
Count memories without fetching their content, at near-zero cost.
total = client.count_memories(agent_id="my-agent", invoker_id="user-123")
print(f"Total memories: {total}")
# Total memories: 42
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent_id | str \| None | None | Filter by agent identifier. |
| invoker_id | str \| None | None | Filter by invoker/user identifier. |
Returns: int
Search for memories whose meaning is similar to a natural-language query. The service returns results ordered by relevance (highest similarity first).
results = client.search_memories(
    agent_id="my-agent",
    invoker_id="user-123",
    query="What are the user's display preferences?",
    threshold=0.6,
    limit=5,
)
for r in results:
    print(f"[similarity={r.similarity:.2f}] {r.content}")
# [similarity=0.92] The user prefers dark mode and metric units.
# [similarity=0.81] User last asked about display settings on 2025-01-10.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent_id | str | (required) | Agent identifier to scope the search. |
| invoker_id | str | (required) | Invoker/user identifier to scope the search. |
| query | str | (required) | Natural-language search query (5–5000 characters). |
| threshold | float | 0.6 | Minimum cosine similarity score (0.0–1.0). |
| limit | int | 10 | Maximum number of results (1–50). |
Returns: list[SearchResult] — each result extends Memory with a similarity (cosine score) field.
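The interaction between threshold and limit can be illustrated locally. This sketch applies both to a list of already-scored items; the scores are invented, and treating the threshold as inclusive is an assumption rather than documented service behavior. `top_matches` is a hypothetical name.

```python
def top_matches(scored, threshold=0.6, limit=10):
    """Keep items scoring at least `threshold`, best first, at most `limit` of them.

    scored -- iterable of (text, similarity) pairs
    """
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Invented scores, for illustration only.
scored = [
    ("The user prefers dark mode and metric units.", 0.92),
    ("The user's timezone is Europe/Berlin.", 0.41),
    ("User last asked about display settings on 2025-01-10.", 0.81),
]
for text, similarity in top_matches(scored, threshold=0.6, limit=5):
    print(f"[similarity={similarity:.2f}] {text}")
```

Lowering the threshold widens the candidate set, while the limit only caps how many of the best candidates are returned.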
Messages represent individual turns in a conversation. Messages sharing the same message_group
form a logical message group. The service does not enforce a session concept — grouping is done
entirely via the message_group value you choose.
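Because grouping is purely a matter of the message_group value, messages fetched without a message_group filter can be bucketed on the client. The sketch below uses stand-in objects so that it runs without the SDK; `group_by_message_group` is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_message_group(messages):
    """Bucket message objects by their message_group attribute, keeping input order."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg.message_group].append(msg)
    return dict(groups)


# Stand-in objects carrying only the attributes this sketch needs.
messages = [
    SimpleNamespace(message_group="conv-001", role="USER", content="Hi"),
    SimpleNamespace(message_group="conv-002", role="USER", content="Status?"),
    SimpleNamespace(message_group="conv-001", role="ASSISTANT", content="Hello!"),
]
groups = group_by_message_group(messages)
print({group: len(items) for group, items in groups.items()})
# {'conv-001': 2, 'conv-002': 1}
```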
from sap_cloud_sdk.agent_memory import MessageRole
message = client.add_message(
    agent_id="my-agent",
    invoker_id="user-123",
    message_group="conv-001",
    role=MessageRole.USER,
    content="What is the weather like today?",
)
print(message.id)
# "c3d4e5f6-a1b2-..."
Required fields:
- agent_id: Identifier of the agent.
- invoker_id: Identifier of the user or caller.
- message_group: Message group identifier (any string; use a consistent value per conversation).
- role: Author role; use the MessageRole enum: USER, ASSISTANT, SYSTEM, TOOL.
- content: The message text.
Optional fields:
- metadata: Arbitrary key-value dict stored alongside the message.
message = client.get_message(message_id="<uuid>")
print(f"[{message.role}] {message.content}")
# [USER] What is the weather like today?
client.delete_message(message_id="<uuid>")
messages = client.list_messages(
    agent_id="my-agent",
    invoker_id="user-123",
    message_group="conv-001",
    limit=50,
)
for msg in messages:
    print(f" [{msg.role}] {msg.content[:80]}")
# [USER] What is the weather like today?
# [ASSISTANT] It's sunny and 22°C in Berlin.
Filter by role to retrieve only a specific author's turns:
user_messages = client.list_messages(
    agent_id="my-agent",
    invoker_id="user-123",
    message_group="conv-001",
    role=MessageRole.USER,
)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent_id | str \| None | None | Filter by agent identifier. |
| invoker_id | str \| None | None | Filter by invoker/user identifier. |
| message_group | str \| None | None | Filter by conversation group. |
| role | str \| None | None | Filter by author role (USER, ASSISTANT, …). |
| filters | list[FilterDefinition] \| None | None | Substring filters on "content" or "metadata". |
| limit | int | 50 | Maximum number of messages to return. |
| offset | int | 0 | Number of messages to skip (pagination). |
Returns: list[Message]
The same FilterDefinition syntax applies to messages:
from sap_cloud_sdk.agent_memory import create_client, FilterDefinition
# Messages whose metadata contains a specific tag
messages = client.list_messages(
    agent_id="my-agent",
    invoker_id="user-123",
    message_group="conversation-001",
    filters=[FilterDefinition(target="metadata", contains="escalated")],
)
# Messages whose content mentions a keyword
messages = client.list_messages(
    agent_id="my-agent",
    invoker_id="user-123",
    filters=[FilterDefinition(target="content", contains="invoice")],
)
See the Content and metadata filtering note under List Memories for details on metadata free-text limitations.
| Model | Description |
|---|---|
| Memory | A persistent memory entry (id, agent_id, invoker_id, content, metadata, timestamps) |
| SearchResult | Extends Memory with a similarity field (cosine score, 0–1) |
| Message | A message (id, agent_id, invoker_id, message_group, role, content, metadata, timestamp) |
| RetentionConfig | Data retention policy (message_days, memory_days, usage_log_days, timestamps) |

| Enum | Values |
|---|---|
| MessageRole | USER, ASSISTANT, SYSTEM, TOOL |
All models expose a to_dict() method that returns a plain dict for logging or forwarding.
memory = client.get_memory(memory_id="a1b2c3d4-...")
print(memory.to_dict())
# {
# "id": "a1b2c3d4-...",
# "agent_id": "my-agent",
# "invoker_id": "user-123",
# "content": "The user prefers dark mode and metric units.",
# "metadata": {},
# "created_at": "2025-01-10T12:00:00Z",
# "updated_at": "2025-01-10T12:00:00Z",
# }
The module defines a structured exception hierarchy so you can catch errors at the appropriate level of specificity:
AgentMemoryError
├── AgentMemoryConfigError       # bad or missing configuration
├── AgentMemoryValidationError   # invalid inputs caught before any network call
└── AgentMemoryHttpError         # HTTP-level error (status_code, response_text)
    └── AgentMemoryNotFoundError # 404 Not Found
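Because AgentMemoryNotFoundError is a subclass of AgentMemoryHttpError, the order of except clauses matters: the most specific class has to come first. The sketch below demonstrates this with stand-in classes that mirror part of the hierarchy so that it runs without the SDK. The constructor signature shown is an assumption, and `describe` is a hypothetical helper.

```python
# Stand-ins mirroring part of the documented hierarchy; not the SDK classes.
class AgentMemoryError(Exception):
    pass


class AgentMemoryHttpError(AgentMemoryError):
    def __init__(self, message, status_code, response_text=""):
        # Assumed constructor; the guide only documents the two attributes.
        super().__init__(message)
        self.status_code = status_code
        self.response_text = response_text


class AgentMemoryNotFoundError(AgentMemoryHttpError):
    pass


def describe(call):
    """Run `call` and classify the outcome, most specific exception first."""
    try:
        call()
    except AgentMemoryNotFoundError:
        return "not found"
    except AgentMemoryHttpError as e:
        return f"http {e.status_code}"
    except AgentMemoryError:
        return "other agent memory error"
    return "ok"
```

If the AgentMemoryHttpError clause came first, it would also catch 404 errors and the AgentMemoryNotFoundError clause would never run.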
from sap_cloud_sdk.agent_memory.exceptions import (
AgentMemoryError,
AgentMemoryConfigError,
AgentMemoryValidationError,
AgentMemoryHttpError,
AgentMemoryNotFoundError,
)
# Catch invalid inputs before they reach the network
try:
    client.add_memory(agent_id="", invoker_id="user-123", content="hello")
except AgentMemoryValidationError as e:
    print(f"Bad input: {e}")
    # Bad input: Required field(s) must be non-empty: 'agent_id'
# Catch a specific 404
try:
    memory = client.get_memory(memory_id="non-existent-id")
except AgentMemoryNotFoundError:
    print("Memory not found")
# Inspect the HTTP status code and response body
try:
    memories = client.list_memories(agent_id="my-agent")
except AgentMemoryHttpError as e:
    print(f"HTTP {e.status_code}: {e.response_text}")
# Catch all Agent Memory errors
try:
    client = create_client()
    memories = client.list_memories(agent_id="my-agent")
except AgentMemoryError as e:
    print(f"Agent Memory error: {e}")
The retention configuration controls automatic data cleanup. It is a singleton: one config per tenant.
rc = client.get_retention_config()
print(f"Messages: {rc.message_days} days")
print(f"Memories: {rc.memory_days} days")
print(f"Usage logs: {rc.usage_log_days} days")
update_retention_config performs a partial update: only the provided fields are
changed; omitted fields remain unchanged.
client.update_retention_config(
    message_days=30,
    memory_days=90,
    usage_log_days=180,
)
Set a field to 0 to mark all data in that category for deletion at the next nightly scheduled cleanup. The server also accepts null to disable
automatic cleanup for that category.
When changes take effect
The service runs nightly data cleanup procedures that delete records based on creation timestamp. Changes to retention configuration apply to all future retention sweeps. The new retention window is calculated from each record's original creation timestamp, not from the time of the config change.
Increasing retention — records that were approaching expiry get more time. For example,
if message_days is raised from 90 to 120, a message created 89 days ago will now be
retained until it reaches 120 days old rather than being cleaned up after 90 days.
Decreasing retention — records outside the new window become eligible for removal. For
example, if message_days is reduced from 90 to 30, messages older than 30 days will be
removed at the next retention sweep, even if they fell within the original 90-day limit
when they were created.
Warning
Decreasing a retention period is a destructive, irreversible operation. Records outside the new window are permanently deleted at the next cleanup sweep.
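The retention rules above can be checked with a small date calculation. This sketch models eligibility from the creation timestamp only. The exact boundary (whether a record exactly N days old is removed) and the time of the nightly sweep are assumptions, and `is_eligible_for_cleanup` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def is_eligible_for_cleanup(created_at, retention_days, now):
    """Return True if a record is outside the retention window at time `now`.

    retention_days=None models disabled cleanup; 0 marks everything for deletion.
    """
    if retention_days is None:
        return False
    # Assumption: a record exactly `retention_days` old is already eligible.
    return now - created_at >= timedelta(days=retention_days)


now = datetime(2025, 4, 10, tzinfo=timezone.utc)
created = now - timedelta(days=89)  # a message created 89 days ago

print(is_eligible_for_cleanup(created, 90, now))    # False: still inside 90 days
print(is_eligible_for_cleanup(created, 120, now))   # False: window was extended
print(is_eligible_for_cleanup(created, 30, now))    # True: outside the new 30-day window
print(is_eligible_for_cleanup(created, 0, now))     # True: 0 marks all data for deletion
print(is_eligible_for_cleanup(created, None, now))  # False: cleanup disabled
```

Running a check like this against a sample of record ages before lowering a retention value gives a rough idea of how much data the next sweep would remove.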
Retrieve the most semantically relevant past memories before calling the language model:
def build_context(client, agent_id, invoker_id, user_query):
    results = client.search_memories(
        agent_id=agent_id,
        invoker_id=invoker_id,
        query=user_query,
        threshold=0.65,
        limit=5,
    )
    if not results:
        return ""
    lines = [f"- {r.content}" for r in results]
    return "Relevant context from memory:\n" + "\n".join(lines)
Store each user and assistant message so the full conversation history is available:
def record_turn(client, agent_id, invoker_id, group_id, user_text, assistant_text):
    client.add_message(
        agent_id=agent_id,
        invoker_id=invoker_id,
        message_group=group_id,
        role=MessageRole.USER,
        content=user_text,
    )
    client.add_message(
        agent_id=agent_id,
        invoker_id=invoker_id,
        message_group=group_id,
        role=MessageRole.ASSISTANT,
        content=assistant_text,
    )
def get_conversation(client, agent_id, invoker_id, group_id):
    return client.list_messages(
        agent_id=agent_id,
        invoker_id=invoker_id,
        message_group=group_id,
        limit=100,
    )
list_memories returns at most limit results per call. Use offset to page through large
sets, or use count_memories first to decide whether pagination is even necessary:
PAGE_SIZE = 100
def iter_all_memories(client, agent_id, invoker_id, page_size=PAGE_SIZE):
    # Yield every memory, fetching one page at a time
    offset = 0
    while True:
        page = client.list_memories(
            agent_id=agent_id,
            invoker_id=invoker_id,
            limit=page_size,
            offset=offset,
        )
        yield from page
        if len(page) < page_size:
            break  # a short page means there are no more results
        offset += page_size
total = client.count_memories(agent_id="my-agent", invoker_id="user-123")
if total == 0:
    memories = []
elif total <= PAGE_SIZE:
    memories = client.list_memories(
        agent_id="my-agent", invoker_id="user-123", limit=total
    )
else:
    memories = list(iter_all_memories(client, "my-agent", "user-123"))
def iter_all_messages(client, agent_id, invoker_id, message_group, page_size=100):
    # Same pattern for the messages of one message group
    offset = 0
    while True:
        page = client.list_messages(
            agent_id=agent_id,
            invoker_id=invoker_id,
            message_group=message_group,
            limit=page_size,
            offset=offset,
        )
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
AgentMemoryConfigError: Failed to load configuration: ...
Credentials could not be found. Check that either:
- The BTP service binding is mounted at /etc/secrets/appfnd/hana-agent-memory/default/
- Or the environment variables are set (see Configuration)
The default limit is 50. Increase it or paginate:
memories = client.list_memories(agent_id="my-agent", invoker_id="user-123", limit=200)
Also verify agent_id and invoker_id exactly match the values used when the memories were created.
The default threshold of 0.6 may be too strict for your data. Try a lower value:
results = client.search_memories(
    agent_id="my-agent", invoker_id="user-123",
    query="user display preferences",
    threshold=0.3,
)
The resource was deleted, the ID is incorrect, or the agent_id/invoker_id passed to a
list or search operation does not match the values used when the resource was created.
The OAuth2 token has expired and automatic refresh failed, or the configured credentials
(client_id, client_secret, token_url) are incorrect. Verify the credentials in your
environment variables or service binding.
- Mount path: $SERVICE_BINDING_ROOT/hana-agent-memory/default/ (defaults to /etc/secrets/appfnd/hana-agent-memory/default/)
- Required keys: application_url (Agent Memory service URL), uaa (JSON string with XSUAA credentials)
- Env var fallback: CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_{FIELD} (uppercased)
Note: SERVICE_BINDING_ROOT defaults to /etc/secrets/appfnd when not set. See the Secret Resolver guide for details.
$SERVICE_BINDING_ROOT/hana-agent-memory/default/
├── application_url
└── uaa
export CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_APPLICATION_URL="https://agent-memory.example.com"
export CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_UAA='{"clientid":"...","clientsecret":"...","url":"https://..."}'
The uaa key must contain a JSON string with the XSUAA credentials:
{
  "clientid": "sb-xxx",
  "clientsecret": "xxx",
  "url": "https://subdomain.authentication.region.hana.ondemand.com"
}
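When create_client() fails with a configuration error, it can be useful to check the environment variables independently of the SDK. The following is a minimal diagnostic sketch, assuming only the two variable names and the three uaa keys shown above. `read_env_credentials` is a hypothetical helper and does not reflect how the SDK loads its configuration internally.

```python
import json
import os

ENV_PREFIX = "CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_"


def read_env_credentials(environ=os.environ):
    """Read APPLICATION_URL and UAA from `environ` and parse the UAA JSON string."""
    try:
        application_url = environ[ENV_PREFIX + "APPLICATION_URL"]
        uaa_raw = environ[ENV_PREFIX + "UAA"]
    except KeyError as exc:
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from exc

    uaa = json.loads(uaa_raw)  # raises json.JSONDecodeError on malformed JSON
    missing = [key for key in ("clientid", "clientsecret", "url") if key not in uaa]
    if missing:
        raise RuntimeError(f"uaa JSON is missing keys: {', '.join(missing)}")
    return {"application_url": application_url, **uaa}


# Demonstration with an explicit mapping instead of the real environment.
sample_env = {
    ENV_PREFIX + "APPLICATION_URL": "https://agent-memory.example.com",
    ENV_PREFIX + "UAA": '{"clientid": "sb-xxx", "clientsecret": "xxx", "url": "https://example.invalid"}',
}
creds = read_env_credentials(sample_env)
print(sorted(creds))
# ['application_url', 'clientid', 'clientsecret', 'url']
```

Calling read_env_credentials() without arguments checks the real process environment; avoid printing the returned values, since they include the client secret.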