Agentic AI in 5G RAN
- Venkateshu Kamarthi
- Dec 19, 2025
- 10 min read
1. Introduction
Modern 5G radio access networks have reached a level of complexity where traditional automation approaches are no longer sufficient. While machine learning has been widely adopted in telecom analytics, most deployed solutions still behave as passive systems: they ingest data, run inference, and output a score, label, or alert.
In practice, experienced RAN engineers do not work this way. They observe symptoms, form hypotheses, validate those hypotheses by checking additional signals, discard weak explanations, and iterate until they reach a conclusion they can defend. Agentic AI attempts to encode this troubleshooting behaviour into software. Rather than producing a single-shot prediction, an agent actively reasons over data, decides what information to inspect next, and converges toward a root cause with supporting evidence.
For example, a single user's throughput issue may involve interactions between PHY-layer radio conditions, MAC scheduling behaviour, mobility procedures, transport congestion, and configuration changes made hours earlier.
What Makes AI “Agentic”?
Agentic AI is not defined by a specific algorithm or model. Instead, it is defined by behaviour. An agent is an entity that operates in an environment with a goal, maintains internal state, and can take actions that influence its future observations. In the context of software systems, those actions usually involve querying data sources, invoking analytical tools, or triggering downstream workflows.
Agentic systems decompose intelligence into a cognitive loop: Observe (perceive via sensors/logs), Orient (contextualize with memory/knowledge), Decide (reason/plan via LLM), Act (execute tools/APIs), Reflect (evaluate/critique outcomes). Unlike traditional ML (supervised classification), agents handle open-ended tasks through ReAct prompting ("Reason + Act") or hierarchical planning (sub-agents for subtasks).

The critical distinction between an agent and a conventional ML model is control flow. In a traditional ML pipeline, the sequence of operations is fixed by the developer. In an agentic system, the AI itself decides the sequence. For example, when analysing a throughput drop, an agent might first look at SINR trends. If radio conditions appear healthy, it may then choose to inspect MAC retransmissions or scheduler fairness. If those signals are inconclusive, it might expand the scope to neighbouring cells or recent configuration changes.
This ability to decide “what to check next” is what gives agentic systems their power in complex domains like telecom.
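A minimal sketch of this agent-directed control flow, assuming hypothetical threshold values and stubbed signal lookups in place of real KPI queries:

```python
# Hypothetical signal values; in practice these come from PM-counter queries.
signals = {"sinr_db": 19.0, "bler_pct": 27.0, "neighbor_degraded": False}

def next_check(state):
    """The agent decides what to inspect next based on what it has already seen."""
    if "sinr" not in state:
        return "sinr"
    if state["sinr"] == "healthy" and "mac" not in state:
        return "mac"          # radio looks fine -> inspect MAC retransmissions
    if state.get("mac") == "elevated" and "neighbors" not in state:
        return "neighbors"    # widen the scope: is the problem local?
    return None               # enough evidence gathered

state = {}
while (check := next_check(state)) is not None:
    if check == "sinr":
        state["sinr"] = "healthy" if signals["sinr_db"] > 10 else "degraded"
    elif check == "mac":
        state["mac"] = "elevated" if signals["bler_pct"] > 10 else "normal"
    elif check == "neighbors":
        state["neighbors"] = "degraded" if signals["neighbor_degraded"] else "normal"
# `state` now records what was checked, in an order the agent chose at run time.
```

The developer supplies the checks; the agent chooses their order, which is exactly the distinction from a fixed ML pipeline.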
Traditional machine learning systems are fundamentally reactive. They take an input vector, apply a learned function, and produce an output. Once the output is produced, the system is done. Any further investigation or decision-making must be orchestrated externally by humans or additional pipelines.
Agentic AI systems, by contrast, are interactive and goal-driven. They do not stop after producing a single output. Instead, they observe the environment, reason about what they are seeing, decide what information they need next, and continue operating until they reach a satisfactory conclusion.
In other words, traditional ML answers a question.
Agentic AI figures out what the right question is and then answers it.
| Dimension | Traditional ML | Agentic AI |
| --- | --- | --- |
| Execution style | One-shot inference | Iterative reasoning loop |
| Control flow | Developer-defined | Agent-defined |
| Adaptability | Low | High |
| Memory | None or implicit (model weights) | Explicit short- and long-term memory |
| Tool usage | Hard-coded | Dynamic, agent-selected |
| Explainability | Post-hoc, limited | Native, reasoning-based |
| Human trust | Often low | Significantly higher |
| Best suited for | Well-defined tasks | Complex, ambiguous problems |
2. Core Architecture of an Agentic AI System
At a high level, an agentic AI system for 5G RAN sits between the network and the operations layer. It continuously observes telemetry from the network, reasons over that data using domain knowledge and learned patterns, and produces structured conclusions.

The first component is the perception layer. This layer ingests raw telemetry such as gNB logs, performance management counters, event notifications, and traces. Its role is not to make decisions but to normalize and contextualize data. A log line indicating repeated UL HARQ failures is transformed into a structured signal associated with a specific UE, cell, time window, and protocol layer.
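As an illustration, a perception-layer normalizer might turn such a log line into a structured signal. The log format and `Signal` fields below are invented for the sketch; real gNB log formats are vendor-specific:

```python
import re
from dataclasses import dataclass

@dataclass
class Signal:
    """Normalized telemetry signal produced by the perception layer."""
    ue_id: str
    cell_id: str
    layer: str
    event: str
    count: int

# Hypothetical raw gNB log line (format invented for this example).
raw = "2025-12-19T10:02:11Z cell=C17 ue=UE-0042 MAC UL_HARQ_FAIL count=12"

def normalize(line: str) -> Signal:
    # Extract UE, cell, protocol layer, event, and count from the line.
    m = re.search(r"cell=(\S+) ue=(\S+) (\w+) (\w+) count=(\d+)", line)
    return Signal(ue_id=m.group(2), cell_id=m.group(1),
                  layer=m.group(3), event=m.group(4), count=int(m.group(5)))

sig = normalize(raw)
```

Downstream, the agent core reasons over `Signal` objects rather than raw text, which keeps vendor quirks out of the reasoning layer.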
Once data is structured, it becomes input to the agent core. The agent core maintains the current investigative context: what problem is being analysed, what evidence has already been gathered, and which hypotheses are still plausible. This context persists across multiple reasoning steps, which is essential for complex RCA.
The agent then interacts with tools. In a telecom environment, tools might include KPI query services, topology graphs, historical data stores, or even simulators. Importantly, the agent decides when and how to use these tools. This is a major departure from static analytics pipelines.
Finally, the agent produces an output that includes both a conclusion and a narrative explanation. The explanation is not an afterthought; it is a first-class output that determines whether the system will be trusted by network engineers.
3. Reasoning and Memory in Telecom Agents
Memory plays a crucial role in agentic AI. Without memory, an agent cannot build on previous observations or learn from past incidents. In RAN analytics, memory typically takes three forms.
Short-term memory holds the current investigative state. For example, it may store that SINR has already been checked and found to be stable, so there is no need to revisit that hypothesis.
Long-term memory stores known failure patterns, such as the observation that high BLER combined with high SINR often points to scheduling issues rather than radio problems.
Episodic memory captures past incidents and their confirmed root causes, allowing the agent to recognize recurring patterns.
Reasoning operates over this memory using a combination of rule-based logic and learned representations. Domain rules are particularly important in telecom because they constrain the solution space. For instance, if downlink BLER is high but uplink BLER is normal, certain RF issues can be ruled out early in the investigation.
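That pruning rule can be expressed directly in code. The KPI names, thresholds, and hypothesis labels below are illustrative, not taken from any standard:

```python
# Hypothetical KPI snapshot; values would come from PM counters in practice.
kpis = {"dl_bler_pct": 28.0, "ul_bler_pct": 1.2, "sinr_db": 19.0}

hypotheses = {"uplink_interference", "scheduler_issue", "dl_rf_fault"}

def prune(hypotheses, kpis):
    """Apply domain rules to rule out hypotheses early in the investigation."""
    h = set(hypotheses)
    # High DL BLER with normal UL BLER: uplink-side causes are unlikely.
    if kpis["dl_bler_pct"] > 10 and kpis["ul_bler_pct"] < 5:
        h.discard("uplink_interference")
    return h

remaining = prune(hypotheses, kpis)
```

Each discarded hypothesis is one fewer branch the (more expensive) learned reasoning has to explore.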
4. Applying Agentic AI to 5G RAN Root Cause Analysis
To make these ideas concrete, consider a common operational problem: a sudden drop in throughput affecting a subset of users in a cell.
The agent begins by observing the symptom through PM counters. It notices that throughput has dropped by 40% compared to the baseline. Rather than immediately labelling the issue, the agent formulates initial hypotheses. These might include poor radio conditions, uplink interference, scheduler misconfiguration, or transport congestion.
The agent then evaluates these hypotheses sequentially. It checks SINR and RSRP trends and finds them to be stable. This weakens the radio degradation hypothesis. Next, it inspects BLER and HARQ retransmission counts, both of which are elevated. At this point, the agent reasons that if SINR is good but BLER is high, the problem is unlikely to be purely physical-layer interference.
The agent then queries MAC-layer statistics and discovers that scheduler queue lengths increased sharply after a recent configuration change. It correlates the timing of the throughput drop with this change and checks whether neighbouring cells exhibit similar behavior. They do not, which further strengthens the hypothesis that the issue is local and configuration related.
At this stage, the agent has enough evidence to converge on a root cause: a scheduler parameter change leading to excessive retransmissions and reduced effective throughput. The final output includes not only this conclusion but also the reasoning chain and supporting metrics.
Problem Statement
Given gNB logs and KPIs, automatically identify the root cause of a throughput drop.
Step 1: Data Ingestion
Sources:
gNB log stream
PM counters (15-min / 5-min)
UE traces
Pipeline:
Kafka → Stream Processor → Feature Normalizer
Step 2: Define Agent Goal
Goal: Identify root cause of RAN performance degradation
Step 3: Define Hypothesis Space
Examples:
Interference
Scheduler overload
Hardware fault
Backhaul congestion
Mobility misconfiguration
Step 4: Reasoning Logic
Pseudo-code:
```
if SINR > threshold and BLER high:
    suspect MAC or scheduler
elif SINR low and RSRP fluctuating:
    suspect interference
elif handover failures high:
    suspect mobility config
```
Step 5: Evidence Collection
Agent queries:
Neighbor cell KPIs
Scheduler stats
UE distribution
Time correlation
Step 6: Root Cause Identification
Example output:
```
Root Cause:
  Scheduler misconfiguration causing excessive HARQ retransmissions
Evidence:
  - SINR stable at 18–20 dB
  - BLER > 25%
  - Retransmission count > 3x baseline
  - Issue started after config change
```
Step 7: Explanation Generation
Agent produces engineer-readable RCA, not just a label.
5. Building an Agentic AI Application
Building a minimal agentic system for 5G RAN does not require exotic infrastructure. The starting point is a streaming data pipeline that ingests logs and counters from the gNB. This pipeline can be built using standard components such as Kafka and a stream processing framework.
On top of this pipeline sits the agent. The agent’s goal is explicitly defined: identify the most likely root cause of observed performance degradation. A small hypothesis library is created, capturing common RAN failure modes. Each hypothesis is associated with signals that support or contradict it.
The agent operates in a loop. It observes incoming data, updates its internal state, decides which hypothesis to evaluate next, and queries the necessary data. This continues until one hypothesis is sufficiently supported or all hypotheses are exhausted.
Even a relatively simple reasoning loop can outperform static ML models because it adapts its investigation based on intermediate findings rather than treating all cases identically.
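The loop described above can be sketched as a hypothesis table plus an investigation loop. The hypothesis predicates and KPI names are illustrative stand-ins for real KPI and log queries:

```python
# Each hypothesis maps to a predicate over the evidence gathered so far.
HYPOTHESES = {
    "radio_degradation":    lambda d: d["sinr_db"] < 10,
    "scheduler_issue":      lambda d: d["sinr_db"] >= 10 and d["bler_pct"] > 10,
    "transport_congestion": lambda d: d["backhaul_util_pct"] > 90,
}

def investigate(data):
    """Evaluate hypotheses until one is supported or all are exhausted."""
    state = {"checked": [], "root_cause": None}
    for name, supported in HYPOTHESES.items():   # agent picks the next hypothesis
        state["checked"].append(name)
        if supported(data):
            state["root_cause"] = name
            break                                # sufficiently supported: stop early
    return state

result = investigate({"sinr_db": 19.0, "bler_pct": 27.0, "backhaul_util_pct": 40.0})
```

Note that the loop stops as soon as one hypothesis is supported, so cheap-to-check hypotheses can be ordered first; a fuller agent would also re-rank hypotheses as evidence accumulates.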
Agentic AI application for 5G RAN
Building an Agentic AI application for 5G RAN follows a structured 12-step procedure, evolving from prototype to production RIC xApp deployment. Each step addresses specific challenges in telecom environments (low latency, multi-vendor integration, carrier-grade reliability) with precise infrastructure sizing and validation criteria.
Phase 1: Foundation
Step 1: Requirements Definition
Objective: Define agent scope (e.g., "HO failure RCA <30s") and success metrics.
1. Document use case: Input=gNB logs → Output=Root cause + remediation
2. Define KPIs: Accuracy>90%, Latency<5s, MTTR reduction>50%
3. Guardrails: No config changes without human approval (confidence<85%)
4. Data sources: QXDM logs, E2SM-KPM, PM counters, MDT traces
Infra Needs: Local dev machine (16GB RAM, Python 3.11)
Step 2: Framework Selection
Compare & Choose:
LangGraph: Stateful graphs, telecom-grade persistence
CrewAI: Multi-agent orchestration, simpler API
AutoGen: Microsoft-backed, E2 integration focus
LlamaIndex: RAG-focused (logs → knowledge base)
Code:
```shell
pip install langgraph langchain-openai chromadb fastapi uvicorn
```
Phase 2: Data Pipeline
Step 3: Log Ingestion & Parsing
Challenge: QXDM unstructured logs → structured events
```python
import pandas as pd

def parse_ran_logs(file_path: str) -> dict:
    """Extracts RRC/MAC events from a QXDM CSV export."""
    df = pd.read_csv(file_path)
    events = {
        'ho_failures': len(df[df['msg_type'] == 'RRCReconfigFail']),
        'rsrp_drops': len(df[df['L1_RSRP'] < -100]),
        'pci_list': df['PCI'].unique().tolist(),
    }
    return events
```
Infra: ELK stack (Elasticsearch, Logstash, Kibana; 8GB Elasticsearch heap) for log retention
Step 4: Vector Database Setup
Purpose: Store historical RCA episodes for few-shot learning
```shell
docker run -d -p 8000:8000 --name chroma chromadb/chroma
```
Schema: {timestamp, cell_id, symptoms, diagnosis, remediation, outcome}
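To illustrate the retrieval step without assuming a running ChromaDB instance, here is a stdlib-only stand-in: the episodes follow the schema above (entries are invented for the sketch), and `difflib` approximates the semantic search a vector DB would perform:

```python
import difflib

# Episodic store following the schema above; entries are illustrative, not real data.
episodes = [
    {"cell_id": "C17", "symptoms": "high BLER stable SINR queue growth",
     "diagnosis": "scheduler misconfiguration", "remediation": "revert config"},
    {"cell_id": "C09", "symptoms": "low RSRP fluctuating SINR handover failures",
     "diagnosis": "PCI collision", "remediation": "XnAP PCI remap"},
]

def most_similar(symptoms, store):
    """Naive text similarity as a stand-in for a vector-DB semantic query."""
    return max(store, key=lambda e: difflib.SequenceMatcher(
        None, symptoms, e["symptoms"]).ratio())

hit = most_similar("high BLER with stable SINR", episodes)
```

In production, the embedding model does this matching in a learned vector space; the retrieved episode is then injected into the agent's prompt as a few-shot example.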
Step 5: Tool Development
Critical Tools for RAN:
1. E2_KPM_Query: RIC PM counters (PRB, HO rates)
2. 3GPP_KB: TS 38.331 parameters lookup
3. O1_Config: Antenna tilt/beam TCI changes
4. Xn_PCI: Neighbor optimization
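One way to expose such tools to the agent is a simple registry it can select from by name at run time. The tool implementations below are stubs, not real RIC/O1 clients:

```python
# Hypothetical tool registry; real tools would wrap RIC/O1/Xn APIs.
TOOLS = {}

def tool(name):
    """Decorator that registers a function as an agent-callable tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("E2_KPM_Query")
def e2_kpm_query(cell_id: str) -> dict:
    # Stub: would subscribe to E2SM-KPM reports for this cell.
    return {"cell_id": cell_id, "prb_util_pct": 78.0, "ho_success_rate": 91.0}

@tool("3GPP_KB")
def ts38331_lookup(parameter: str) -> str:
    # Stub: would query a knowledge base built from TS 38.331.
    return f"TS 38.331 entry for {parameter} (stub)"

# The agent selects tools by name at run time rather than via a fixed pipeline.
kpis = TOOLS["E2_KPM_Query"]("C17")
```

Frameworks like LangGraph provide their own tool-binding mechanisms; the registry here just makes the "agent-selected, not hard-coded" idea concrete.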
Phase 3: Agent Architecture
Step 6: Single Agent Prototype
```python
from typing import List, TypedDict
from langgraph.graph import StateGraph

class RANState(TypedDict):
    symptoms: str
    diagnosis: str
    confidence: float
    actions: List[str]

# Supervisor agent routes each case to a specialist sub-agent
def supervisor_agent(state: RANState):
    return llm.invoke(
        f"Symptoms: {state['symptoms']}. "
        "Route to: LogAgent/KPIAgent/RemediationAgent"
    )
```
Step 7: Multi-Agent Hierarchy
Key Design Principles:
Single Responsibility: Each agent focuses on one competency (parsing, KPIs, RCA, etc.)
State Persistence: Shared RANState object passes data between agents
Confidence Gating: Routes to human if diagnosis confidence <85%
Reflection Loop: Post-action critique enables learning
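Confidence gating and the reflection loop can be sketched in a few lines; the 0.85 threshold mirrors the guardrail above, and the KPI fields are illustrative:

```python
CONFIDENCE_THRESHOLD = 0.85  # below this, escalate to a human operator

def route(diagnosis: str, confidence: float) -> str:
    """Confidence gate: auto-remediate only when the agent is sure enough."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return "RemediationAgent"
    return "HumanReview"

def reflect(outcome: dict) -> str:
    """Post-action critique: did the remediation improve the target KPI?"""
    improved = outcome["kpi_after"] > outcome["kpi_before"]
    return "store_as_positive_episode" if improved else "flag_for_review"
```

Both functions sit outside the LLM: the gate and the critique are deterministic code, which keeps the safety behavior auditable even when the reasoning inside the agents is not.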

Infra: GPU pod (A10G 24GB VRAM) for parallel inference
Step 8: Memory Integration
Short-term: Conversation buffer (past 10 interactions)
Long-term: ChromaDB semantic search
Episodic: Past RCA episodes (vectorized)
Prompt Template:
"Similar past incident: Cell123 HO failure → PCI collision → fixed by XnAP remap"
Phase 4: RIC Integration
Step 9: E2/xApp Interface
O-RAN RIC Deployment:
```yaml
# Helm values for Near-RT RIC
ric_platform:
  e2term: enabled
  xapp: ran-agentic-ai
  E2:
    functions: ["KPMv2", "RC"]
    period: 1s
```
E2SM-KPM Report Handler:
```python
@ric_xapp_handler
def kpm_report(msg: E2SmKpmReport):
    ho_rate = msg.metrics['ho_success_rate']
    if ho_rate < 80:
        trigger_agent_analysis(cell_id=msg.cell_id)
```
Step 10: A1 Policy Enforcement
Policy Example:
```json
{
  "name": "ho_failure_policy",
  "logic": "if ho_failure_rate > 20% then activate_agent",
  "targets": ["gNB1", "gNB2"]
}
```
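A hypothetical local evaluator for such a policy makes the activation logic concrete; a real consumer would receive policies over the A1 interface rather than as local dicts:

```python
# Hypothetical in-process representation of the A1 policy above.
policy = {
    "name": "ho_failure_policy",
    "threshold_pct": 20.0,
    "targets": ["gNB1", "gNB2"],
}

def should_activate(policy: dict, gnb_id: str, ho_failure_rate_pct: float) -> bool:
    """Activate the agent only for targeted gNBs whose failure rate breaches the policy."""
    return (gnb_id in policy["targets"]
            and ho_failure_rate_pct > policy["threshold_pct"])

trigger = should_activate(policy, "gNB1", 27.5)
```

Keeping the activation condition in declarative policy (rather than inside the agent) lets the non-RT RIC tune thresholds without redeploying the xApp.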
Phase 5: Productionization
Step 11: Containerization & Orchestration
Dockerfile:
```dockerfile
FROM nvidia/cuda:12.1-runtime-ubuntu22.04
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 8080
CMD ["uvicorn", "agent:app", "--host", "0.0.0.0"]
```
K8s Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: ran-agent
        resources:
          limits:
            nvidia.com/gpu: 1   # A10G/T4
            memory: "16Gi"
          requests:
            cpu: "4"
            memory: "12Gi"
---
# HorizontalPodAutoscaler (abridged)
spec:
  scaleTargetRef: deployment/ran-agent
  targetCPUUtilizationPercentage: 70
```
Step 12: Monitoring & Guardrails
An observability stack takes Agentic AI from experiment to production, giving operators real-time proof of value (MTTR dashboards), debuggability (reasoning traces), and the safety mechanisms (human governance) required to scale across deployments of 10,000+ cells.

Infrastructure Requirements
| Component | Dev | Test | Production |
| --- | --- | --- | --- |
| Compute | 16C/64GB | 4x A10G (96GB) | 8x A100/H100 (320GB total) |
| Storage | 500GB SSD | 2TB NVMe (ChromaDB) | 10TB Ceph (logs + embeddings) |
| Network | 1Gbps | 10Gbps RIC fabric | 25Gbps E2/O1 interfaces |
| RIC platform | gNB vendor SW | RIC platform | O-RAN SC Near-RT RIC |
Production Scale (1000 cells):
- Pods: 12 (4/node x 3 nodes)
- GPU Memory: 288GB total (Llama3.1 70B quantized)
- Throughput: 50 RCA/minute
- Latency: p99 <3s
Validation MOP
Lab Validation
1. Generate synthetic anomalies:
iperf3 → traffic → force HO failures
2. Agent execution:
curl -X POST ran-agent/analyze -d '{"logs":"ho_fail.qxdm"}'
3. Verify:
- Diagnosis accuracy: 92% vs manual
- Remediation: PCI change → HO success +25%
Field Trial Checklist
E2 subscription established (KPMv2 1s period)
A1 policy activated (thresholds validated)
Multi-vendor test: Vendor1+Vendor2 gNBs
Failover: Agent pod restart <30s
Rollback: Human override mechanism
6. Real-Time Deployment in 5G RAN Infrastructure
Deploying agentic AI in a live 5G network requires careful consideration of latency, safety, and integration points. The most natural deployment location is the Near-Real-Time RIC in an O-RAN architecture. The Near-RT RIC already aggregates near-real-time telemetry via the E2 interface and hosts xApps designed for control and optimization.
An agentic RCA xApp running in the Near-RT RIC can analyze data at sub-second to second-level timescales without interfering with critical control-plane functions. In this setup, the agent operates primarily in an advisory mode, producing RCA reports and recommendations rather than directly enforcing changes.
For longer-term analysis and learning, the same agent logic can be deployed in the non-RT RIC, where it has access to richer historical data and policy frameworks. In some cases, lightweight agent components may even be embedded within the gNB for ultra-fast detection, though this is typically limited to narrow use cases due to resource constraints.
7. Operational Challenges and Mitigations
Agentic AI introduces new challenges alongside its benefits. One concern is hallucination or overconfident reasoning based on incomplete data. In telecom systems, this risk is mitigated by grounding the agent’s reasoning in hard metrics and domain constraints. Another challenge is scalability. A national network may contain tens of thousands of cells, requiring careful design of agent orchestration and load management.
Explainability remains both a challenge and a strength. While agents can produce detailed reasoning chains, those explanations must be carefully structured to be useful to human operators. Free-form narratives are less effective than explanations that clearly link symptoms to evidence and conclusions.
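One way to enforce that structure is to make the report a typed object whose renderer always links symptom, evidence, and conclusion; the field values below echo the Step 6 example output:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RCAReport:
    """Structured explanation linking symptom -> evidence -> conclusion."""
    symptom: str
    evidence: List[str] = field(default_factory=list)
    conclusion: str = ""

    def render(self) -> str:
        # Fixed layout: symptom first, then evidence bullets, then conclusion.
        lines = [f"Symptom: {self.symptom}"]
        lines += [f"  - {e}" for e in self.evidence]
        lines.append(f"Conclusion: {self.conclusion}")
        return "\n".join(lines)

report = RCAReport(
    symptom="DL throughput down 40% vs baseline",
    evidence=["SINR stable at 18-20 dB", "BLER > 25%",
              "Retransmission count 3x baseline after config change"],
    conclusion="Scheduler parameter change causing excessive HARQ retransmissions",
)
text = report.render()
```

Because the schema is fixed, an engineer can scan any report the same way, and downstream tooling (ticketing, dashboards) can parse it without free-text heuristics.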
8. Conclusion
Agentic AI moves 5G RAN operations from reactive monitoring toward proactive intelligence, and O-RAN's disaggregated architecture offers a natural platform for scalable deployment. Native AI hooks anticipated in 3GPP Release 19 may eventually push agent components even closer to the gNB itself.
As networks evolve toward 6G and beyond, the trend toward autonomy will accelerate. Agentic AI will not replace human engineers, but it will increasingly act as a first-line analyst, handling routine investigations and surfacing well-reasoned conclusions for review.