Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
---
title: Build RAG applications with LlamaIndex on Google Cloud C4A Axion VM

description: Set up LlamaIndex on Google Cloud C4A Axion Arm VMs running SUSE Linux to build browser-based Retrieval-Augmented Generation (RAG) applications using local LLMs, vector databases, and FastAPI.

minutes_to_complete: 30

who_is_this_for: This is an introductory topic for DevOps engineers, AI engineers, ML engineers, and software developers who want to build Retrieval-Augmented Generation (RAG) applications using LlamaIndex on SUSE Linux Enterprise Server (SLES) Arm64, integrate vector databases, and query custom documents using local LLMs.

learning_objectives:
- Install and configure LlamaIndex on a Google Cloud C4A Axion Arm64 VM
- Build indexing and retrieval pipelines using LlamaIndex
- Integrate ChromaDB vector databases with local LLMs using Ollama
- Build and test a browser-based RAG application using FastAPI

prerequisites:
- A [Google Cloud Platform (GCP)](https://cloud.google.com/free) account with billing enabled
- Basic familiarity with Python and AI/LLM concepts

author: Pareena Verma

##### Tags
skilllevels: Introductory
subjects: ML
cloud_service_providers:
- Google Cloud

armips:
- Neoverse

tools_software_languages:
- LlamaIndex
- Python
- ChromaDB
- Ollama
- FastAPI

operatingsystems:
- Linux

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================

further_reading:
    - resource:
        title: LlamaIndex official documentation
        link: https://docs.llamaindex.ai/en/stable/
        type: documentation

    - resource:
        title: LlamaIndex GitHub repository
        link: https://github.com/run-llama/llama_index
        type: documentation

    - resource:
        title: Ollama documentation
        link: https://ollama.com/library
        type: documentation

weight: 1
layout: "learningpathall"
learning_path_main_page: yes
---
@@ -0,0 +1,8 @@
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
@@ -0,0 +1,38 @@
---
title: Learn about LlamaIndex and Google Axion C4A for RAG applications
weight: 2

layout: "learningpathall"
---

## Google Axion C4A Arm instances for AI and RAG workloads

Google Axion C4A is a family of Arm-based virtual machines built on Google’s custom Axion CPU, which is based on Arm Neoverse V2 cores. Designed for high-performance and energy-efficient computing, these virtual machines offer strong performance for modern cloud workloads such as AI applications, vector databases, Retrieval-Augmented Generation (RAG) pipelines, and scalable inference services.

The C4A series provides a cost-effective alternative to x86 virtual machines while offering the scalability and performance benefits of the Arm architecture in Google Cloud.

To learn more, see the Google blog [Introducing Google Axion Processors, our new Arm-based CPUs](https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu).

## LlamaIndex for RAG and context-aware AI applications on Arm

LlamaIndex is an open-source framework for building context-aware AI applications with Large Language Models (LLMs). It's widely used for Retrieval-Augmented Generation (RAG), document indexing, vector search, semantic retrieval, and integrating custom data sources with LLMs.

LlamaIndex provides a unified framework with components such as:

* Document loaders for ingesting custom data
* Indexing pipelines for structured retrieval workflows
* Query engines for context-aware question answering
* Vector store integrations for scalable embedding search
* LLM integrations for generating grounded responses
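
To make the roles of these components concrete, the following is a minimal, dependency-free Python sketch of the load, index, retrieve, and prompt flow. It does not use the LlamaIndex API. The names `ToyVectorStore`, `embed`, and `build_prompt` are hypothetical, the bag-of-words "embedding" stands in for a real embedding model, and the sample documents are illustrative data. In a real pipeline, LlamaIndex, a vector store such as ChromaDB, and a local LLM would take over these roles.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector store integration."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        # Indexing step: store each document alongside its vector.
        self.entries.append((text, embed(text)))

    def query(self, question, top_k=1):
        # Retrieval step: rank stored documents by similarity to the question.
        q_vec = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Grounding step: place retrieved text in the prompt sent to an LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Document loading step: illustrative sample data, not from a real corpus.
documents = [
    "Axion is Google's custom Arm-based CPU built on Neoverse V2 cores.",
    "ChromaDB stores embeddings and supports similarity search.",
    "FastAPI is a Python framework for building web APIs.",
]

store = ToyVectorStore()
for doc in documents:
    store.add(doc)

question = "Which cores is the Axion CPU built on?"
retrieved = store.query(question, top_k=1)
prompt = build_prompt(question, retrieved)
print(prompt)
```

The sketch stops at building the prompt. Generating the final answer would require passing that prompt to an LLM, which is the role of the LLM integration in the list above.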

Running LlamaIndex on Google Axion C4A Arm-based infrastructure lets AI and RAG workloads take advantage of multi-core Arm CPUs and the memory performance of the C4A platform. Depending on the workload, this can improve performance per watt, reduce infrastructure costs, and help browser-based AI applications and local inference pipelines scale.

Common use cases include browser-based AI assistants, document search applications, semantic retrieval systems, vector database integrations, enterprise knowledge bases, and context-aware chatbot applications.

To learn more, see the [LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/) and the [LlamaIndex GitHub repository](https://github.com/run-llama/llama_index).

## What you've learned and what's next

You've now learned about Google Axion C4A Arm-based virtual machines and their performance advantages for AI and RAG workloads. You were also introduced to core LlamaIndex components including document ingestion, indexing pipelines, query engines, vector stores, and LLM integrations.

Next, you'll create a firewall rule in Google Cloud Console to enable remote access to the browser-based LlamaIndex RAG application used in this Learning Path.