+The AI Assessment Catalogue showcases the evaluation tools, testing frameworks, and assessment solutions available across the Citcom.ai TEF network.
+It is regularly updated as new methodologies and tools become available at each TEF site.
+If you would like to request an assessment or learn more about a tool, please contact the relevant TEF sites.
+
+| Solution Name | Provider | Licensing Type | Project Phase / TRL | Domain of Application | AI Risk Category | Ethical Dimensions | Security & Data Protection | Resources | Example of Use Case |
+| ------------- | -------- | -------------- | ------------------- | --------------------- | ---------------- | ------------------ | --------------------------------- | --------- | ------------------- |
+| **FAIRGAME** | LIST | Open-source | TRL 6–8 | LLM bias testing, AI agent behavioural testing, jailbreaking testing | General Purpose AI | Fairness, Robustness | Depends on the use case (whether the chatbot/AI agent has access to sensitive data) | GitHub: , Paper: "FAIRGAME: A Framework for AI Agents Bias Recognition Using Game Theory", Frontiers in Artificial Intelligence and Applications, Vol. 413: ECAI 2025 | A city aims to test its citizen-facing chatbot before launch. FAIRGAME enables the creation of simulated users with diverse identities, personalities, and requests using LLMs, allowing evaluation in dynamic, real-world-like conversations. |
+| **MLA-BiTe** | LIST | To be open sourced | TRL 6–8 | LLM bias testing | General Purpose AI | Fairness, Robustness | No data privacy requirements | — | A city plans to evaluate fairness in its citizen-facing chatbot. MLA-BiTe allows non-technical staff to create local scenario-based prompts to uncover discriminatory behaviour across sensitive categories, supporting multiple languages and augmentations. |
+| **Legal KG-RAG** | LIST | Proprietary | TRL 5–7 | LLM factuality accuracy testing | General Purpose AI | Transparency, Explainability, Robustness | Depends on whether the RAG is performed on sensitive data | — | A city using a standard RAG pipeline obtains irrelevant results. Legal KG-RAG rebuilds the legal corpus as a Neo4j knowledge graph, enabling direct comparison between traditional and KG-enhanced retrieval. |
+| **MLA-Reject** | LIST | To be open sourced | TRL 6–8 | LLM robustness to jailbreaking | General Purpose AI | Robustness | Depends on whether the system has access to sensitive data | — | A public administration operates a multilingual assistant for internal queries. They want to test robustness against unsafe or misleading prompts. MLA-Reject generates difficult negative prompts to test refusal behaviour and safety guardrails, revealing weaknesses and improving configurations. |
+
+
diff --git a/docs/documentation/ai_assessment/index.md b/docs/documentation/ai_assessment/index.md
index 5312b5b9..b6c543f3 100644
--- a/docs/documentation/ai_assessment/index.md
+++ b/docs/documentation/ai_assessment/index.md
@@ -1,3 +1,56 @@
---
title: AI Assessment
---
+
+# Citcom Label
+
+The Citcom Label is an initiative currently under development within Citcom.ai. Its goal is to create a trusted, recognisable signal that helps AI providers demonstrate responsible practices and gives buyers—especially public-sector actors such as smart cities—a clearer basis for evaluating and procuring AI solutions.
+
+
+## What will the Citcom Label be?
+
+The label is envisioned as a **system of digital badges**, each representing a specific dimension of trustworthiness assessed during the evaluation process.
+These badges would include a **watermark**, ensuring authenticity and preventing misuse. Each badge would be **verifiable through the Citcom Hub**, allowing external stakeholders to confirm its origin, evaluation status, and associated criteria.
+
+The Citcom badges are **not intended to function as legally binding conformity certificates under the AI Act**. Instead, they serve as **smart-city–oriented quality marks**, helping cities and other public authorities gain confidence in the AI solutions they consider adopting.
+
+For AI innovators, the Citcom badge system provides **independent third-party validation**, helping them promote their solutions and demonstrate that they meet recognised standards of trustworthiness. For cities and public buyers, the badges offer **clear, evidence-based guidance** to support more informed and transparent procurement decisions.
+
+## On what basis will the Citcom badges be awarded?
+
+The detailed criteria are still being developed with Citcom partners, but several guiding principles are emerging:
+
+### Completion of an evaluation
+A badge is expected to be awarded only once a solution completes a structured assessment aligned with shared guidelines for the relevant dimension of trustworthiness.
+
+### Common methodology
+Work is ongoing to define a coherent framework that determines how systems are qualified, how requirements translate into test cases, and how results are interpreted across different trust dimensions.
+
+### Success thresholds
+Initial discussions point toward setting minimum quantitative and qualitative thresholds that vary by product type, maturity level, and the specific dimension being assessed.
+
+### Real-world validation
+Evaluations are expected to rely on practical or pilot scenarios using the actual product, ensuring that results reflect real-world behaviour.
+
+
+## Who will conduct the assessment and with which methodologies?
+
+The assessment behind each Citcom badge will be carried out by the participating TEF sites. Each site brings its own specialised methodologies, tools, and testing infrastructures, reflecting the diversity of technical expertise across the Citcom network.
+
+These assessment solutions cover different dimensions of trustworthiness and can be consulted through the **AI Assessment Catalogue**, available at the following link:
+
+[AI Assessment Catalogue](ai_assessment_catalogue.md)
+
+The catalogue provides an overview of the available evaluation tools, test suites, and methodologies, enabling innovators to understand which capabilities are applied to their systems and helping cities see how specific trust dimensions are assessed.
+
+### Can an AI provider receive assessments across multiple TEF sites?
+
+Yes. If a solution would benefit from complementary expertise available across several TEF sites, an AI provider can undergo assessments in multiple locations. In such cases, the **first-contact TEF site** will coordinate the overall process.
+
+The coordinating TEF site will:
+- liaise with the additional TEF sites, which carry out their assessments independently,
+- ensure that each participating site manages its own contractual and operational responsibilities,
+- consolidate the evaluation results into a unified report,
+- and oversee the issuance of the Citcom badges corresponding to the dimensions assessed across all sites.
+
+This ensures a seamless experience for AI innovators while leveraging the full breadth of expertise across the TEF network.
diff --git a/mkdocs.yml b/mkdocs.yml
index 87d4ba3f..47d79952 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -9,6 +9,7 @@ extra_css:
- stylesheets/extra.css
- stylesheets/neoteroi-mkdocs.css
- assets/css/data_catalog.css
+ - assets/css/ai_assess_catalog.css
extra_javascript:
- assets/js/data_catalog.js
theme:
@@ -66,6 +67,7 @@ nav:
- documentation/local_digital_twins/index.md
- AI Assessment:
- documentation/ai_assessment/index.md
+ - documentation/ai_assessment/ai_assessment_catalogue.md
- AI services:
- services/index.md
- Minimal Interoperable AI Service: services/waste_collection.md
@@ -186,4 +188,4 @@ extra:
data_broker: 'Data Broker'
data_api: 'Data API'
data_idm_auth: 'IdM & Auth'
- data_publication: 'Data Publication'
+ data_publication: 'Data Publication'
\ No newline at end of file