diff --git a/_approaches/arcotl.md b/_approaches/arcotl.md new file mode 100644 index 00000000..7384ad9f --- /dev/null +++ b/_approaches/arcotl.md @@ -0,0 +1,21 @@ +--- +title: ArCoTL +description: ArCoTL – TLR between Software Architecture Models and Code. +permalink: /approaches/arcotl/ +importance: 2 +layout: page +--- + +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +ArCoTL (Architecture–Code Trace Links) focuses on linking a given architecture model (SAM) to the source code. +It assumes you have a formal model of the system's components and interfaces, and aims to find the corresponding code. +ArCoTL transforms both the architecture model and the code into intermediate representations (e.g. simplified graphs) and then applies various heuristics to match elements. +These heuristics include standalone rules and dependent rules (which consider relationships) plus filters to refine the links. + +- How it works: Starting from a SAM and the codebase, ArCoTL builds simplified model and code representations. It then uses text similarity, naming conventions, and dependency heuristics to propose links between each model component and code artifact. +- Effectiveness: ArCoTL turned out to be very effective on its own. In experiments, the model-to-code step (ArCoTL) achieved an average F1 of ~0.98. + +## References + +- [ICSE 2024 publication page](/c/icse24) diff --git a/_approaches/ardocode.md b/_approaches/ardocode.md new file mode 100644 index 00000000..dca09637 --- /dev/null +++ b/_approaches/ardocode.md @@ -0,0 +1,18 @@ +--- +title: ArDoCode +description: ArDoCode – TLR between Software Architecture Documentation and Code. +permalink: /approaches/ardocode/ +importance: 5 +layout: page +--- + +{:width="100%" style="border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +ArDoCode is a simpler variant of trace recovery that treats source code itself as the "model". 
+Instead of first building a formal model, ArDoCode directly matches architecture document content with code elements using the same heuristics designed for linking docs to models. +In practice, it extracts key terms from the documentation and tries to align them with names in the code (e.g. class or module names) as if the code were the model. + +- Key idea: Apply the SWATTR approach without an explicit SAM by interpreting the codebase as a model. For example, if the doc mentions a component "WebUI" and there is a WebUI package in code, ArDoCode will link them. +- Effectiveness: Because it skips the formal modeling step, ArDoCode is easier to apply but less precise. In evaluations, ArDoCode achieved a weighted F1 of only ~0.62, substantially lower than the full TransArC method. It serves mainly as a baseline and demonstrates that without structured models, the TLR performance drops. + +See our [ICSE 2024 publication page](/c/icse24) for details, links, and resources. diff --git a/_approaches/inconsistency-detection.md b/_approaches/inconsistency-detection.md new file mode 100644 index 00000000..b1886f11 --- /dev/null +++ b/_approaches/inconsistency-detection.md @@ -0,0 +1,22 @@ +--- +title: Inconsistency Detection +description: Documentation-Model-Inconsistency-Analysis pipeline. +permalink: /approaches/inconsistency-detection/ +importance: 8 +layout: page +--- + +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +The ArDoCo inconsistency detection approach uses trace link recovery to detect inconsistencies between natural-language architecture documentation and formal models. +It identifies two kinds of issues: + +(a) Unmentioned Model Elements (UMEs): components or interfaces that appear in the model but are never described in the documentation; +(b) Missing Model Elements (MMEs): elements mentioned in the text that do not exist in the model. 
+ +The method runs a TLR procedure (namely SWATTR) and then flags any model element with no corresponding text link (a UME) or any sentence that refers to a non-modeled item (an MME). + +- Detection strategy: Use the TLR results as a bridge. After linking as many sentences to model elements as possible, any "orphan" model nodes or text mentions indicate a consistency gap. For example, if the model has a "Cache" component with no sentence linked, that is a UME; if the doc talks about "Common" but the model lacks it, that is an MME. +- Results: The approach achieved an excellent F1 (0.81) for the underlying trace recovery. For inconsistency detection, it attained ~93% accuracy in identifying UMEs and ~75% for MMEs, significantly better than naive baselines. These results suggest that using trace links is a promising way to find documentation-model mismatches. + +See our [ICSA 2023 publication page](/c/icsa23) for details, links, and resources. diff --git a/_approaches/lissa.md b/_approaches/lissa.md new file mode 100644 index 00000000..ae9cd408 --- /dev/null +++ b/_approaches/lissa.md @@ -0,0 +1,19 @@ +--- +title: LiSSA +description: LiSSA – LLM/RAG-based TLR. +permalink: /approaches/lissa/ +importance: 6 +layout: page +--- + +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +LiSSA (Linking Software System Artifacts) is a retrieval-augmented, LLM-based approach that aims to be generic across artifact types. +The key idea is to use a Large Language Model (LLM) together with information retrieval (IR) to find trace links. +For a given source artifact (e.g. a requirement or a sentence in documentation), LiSSA first uses IR techniques to retrieve a small set of potentially relevant target artifacts (code files, model elements, etc.). +It then queries the LLM with the retrieved context to generate or suggest the most likely trace link. 
+ +- Scope: LiSSA was tested on multiple tasks including requirements→code, documentation→code, and architecture-docs→models. The same RAG process is applied in each case, making it a one-size-fits-many solution. +- Effectiveness: In experiments, LiSSA significantly outperformed state-of-the-art tools on the code-centric tasks. For example, it showed much higher accuracy when linking requirements to code than prior methods. + +LiSSA is primarily associated with our [ICSE 2025 publication page](/c/icse25), but is also related to our [REFSQ 2025 publication page](/c/refsq25). See these pages for details, links, and resources. diff --git a/_approaches/secdragon.md b/_approaches/secdragon.md new file mode 100644 index 00000000..09b1b4e0 --- /dev/null +++ b/_approaches/secdragon.md @@ -0,0 +1,9 @@ +--- +title: SecDragon +description: SecDragon – TLR for Security Requirements. +permalink: /approaches/secdragon/ +importance: 7 +layout: page +--- + +🚧 This approach is not available yet. diff --git a/_approaches/swattr.md b/_approaches/swattr.md new file mode 100644 index 00000000..e72528a9 --- /dev/null +++ b/_approaches/swattr.md @@ -0,0 +1,20 @@ +--- +title: SWATTR +description: SWATTR – TLR between Software Architecture Documentation and Software Architecture Models. +permalink: /approaches/swattr/ +importance: 1 +layout: page +--- + +{:width="100%" style="border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +SWATTR (SoftWare Architecture TexT TRace link recovery) is an agent-based framework for linking textual architecture documentation (SAD) and formal models (SAM). +Rather than focusing on a single algorithm, SWATTR defines a pipeline with multiple stages where different "agents" can operate. +First it extracts and preprocesses text from the SAD and components from the architecture model. +Next, it uses NLP and heuristics to identify architecture elements (like component names) mentioned in the text. 
+Finally, it connects these identified text elements to model elements to form trace links. + +- Pipeline stages: The framework is extendable, meaning you can plug in different strategies at each step. For example, one agent might use term matching to find components in sentences, while another uses more advanced similarity measures. All results are aggregated to produce the final links. +- Results: SWATTR was evaluated on three case studies and achieved a weighted average F1-score of about 0.72 for trace recovery. This was a strong performance (outperforming simple baselines by ~0.24 F1) and demonstrated the benefit of the multi-stage approach. + +See our [ECSA 2021 publication page](/c/ecsa21) for details, links, and resources. diff --git a/_approaches/transarc.md b/_approaches/transarc.md new file mode 100644 index 00000000..e61cf193 --- /dev/null +++ b/_approaches/transarc.md @@ -0,0 +1,19 @@ +--- +title: TransArC +description: TransArC – TLR between Software Architecture Documentation, Models, and Code. +permalink: /approaches/transarc/ +importance: 3 +layout: page +--- + +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +TransArC is a transitive trace link recovery approach that connects architecture documents to code via an intermediate architecture model. +It first uses an existing method (SWATTR) to connect the textual architecture documentation and component-based architecture model (SAM), then applies a new method (ArCoTL) to link the model elements to code. +In other words, TransArC builds a bridge: document ⟶ model ⟶ code. +This two-step strategy helps bridge the semantic gap between informal text and code. + +- How it works: TransArC combines the two sets of trace links, namely those produced by SWATTR and ArCoTL, to derive trace links transitively from documentation to code. 
+- Results: In experiments on five systems, TransArC achieved a high average F1 score (~0.82) for recovering documentation-to-code links, significantly outperforming baseline methods. This shows that combining the two specialized steps yields much more accurate links than simpler approaches. + +See our [ICSE 2024 publication page](/c/icse24) for details, links, and resources. diff --git a/_approaches/transarcai.md b/_approaches/transarcai.md new file mode 100644 index 00000000..4a5aba16 --- /dev/null +++ b/_approaches/transarcai.md @@ -0,0 +1,20 @@ +--- +title: "TransArC-AI" +description: "TransArC-AI – LLM-based TLR between Software Architecture Documentation, Models, and Code." +permalink: /approaches/transarcai/ +importance: 4 +layout: page +--- + +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + +TransArC-AI extends the TransArC idea by using an LLM to generate a simple architecture model (SAM). +In this approach, instead of requiring a hand-made SAM, a large language model (such as GPT-4) is prompted to extract or invent the main component names from the SAD (and optionally from code). +These names serve as a minimal architecture model (i.e. a list of components). +Then, as in TransArC, these LLM-derived components are matched to code. +The goal is to bridge the SAD–code gap without manual modeling. + +- How it works: Given the software architecture text and the codebase, the system asks the LLM to list likely component names. That list of names forms a "Simple Software Architecture Model" (SSAM). Finally, code elements with matching names or descriptions are linked to the documentation. This pipeline avoids needing an explicit UML model. +- Effectiveness: TransArC-AI achieved very competitive results. Using GPT-4o, it obtained a weighted F1 of about 0.86, nearly as good as the original TransArC with a hand-made model (F1 0.87). 
It also substantially outperformed the ArDoCode baseline (which scored ~0.62). This shows that LLMs can automatically infer the key architectural components. + +See our [ICSA 2025 publication page](/c/icsa25) for details, links, and resources. diff --git a/_approaches/tv.md b/_approaches/tv.md new file mode 100644 index 00000000..b53ad056 --- /dev/null +++ b/_approaches/tv.md @@ -0,0 +1,11 @@ +--- +title: ArDoCo-TV +description: "Trace View: a viewer for trace links." +permalink: /approaches/tv/ +importance: 9 +layout: page +--- + +ArDoCo-TV is a tool for visualizing trace links between software artifacts, supporting the analysis and understanding of traceability in software projects. + +See our [ArDoCo TV](https://ardoco.de/TraceView) for more information. diff --git a/_pages/conferences/aire25.md b/_conferences/aire25.md similarity index 93% rename from _pages/conferences/aire25.md rename to _conferences/aire25.md index eb8b3d55..7c7a4574 100644 --- a/_pages/conferences/aire25.md +++ b/_conferences/aire25.md @@ -13,7 +13,7 @@ authors: To be published at the [33rd International Requirements Engineering Conference Workshops (REW)](https://aire-ws.github.io/aire25/). 
-{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} +{:width="100%" style="background-color: white; border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} ## Abstract diff --git a/_pages/conferences/ecsa21.md b/_conferences/ecsa21.md similarity index 92% rename from _pages/conferences/ecsa21.md rename to _conferences/ecsa21.md index d1f9b7fb..18e9b907 100644 --- a/_pages/conferences/ecsa21.md +++ b/_conferences/ecsa21.md @@ -15,6 +15,8 @@ authors: Published at the [15th European Conference on Software Architecture (ECSA 2021), September 13-17 2021](https://conf.researchr.org/home/ecsa-2021) +{:width="100%" style="border-radius: 8px; padding: 10px; display: block; margin: 0 auto;"} + ## Abstract Software Architecture Documentation often consists of different artifacts. diff --git a/_pages/conferences/fg-arch24.md b/_conferences/fg-arch24.md similarity index 87% rename from _pages/conferences/fg-arch24.md rename to _conferences/fg-arch24.md index bcbe7760..70c65dbe 100644 --- a/_pages/conferences/fg-arch24.md +++ b/_conferences/fg-arch24.md @@ -8,9 +8,7 @@ authors: - tobias_hey --- -
-
-
-
-