Feature Deep Dive: Observo Query and the Future of Searchable Telemetry Data

Faced with exploding telemetry volumes and soaring tool costs, security and DevOps teams have had to make a tough tradeoff: reduce what you ingest and risk blind spots and slower incident detection, or keep everything and drown in data. Observo AI’s data pipeline helps solve that by giving teams powerful ways to filter, enrich, and route logs in real time, optimizing what goes into platforms like SIEM, log analytics, and APM tools.
But what about the data you don’t send?
Whether you're holding data for compliance, need to run an investigation on older events, or want to analyze trends across time, Observo Query gives you direct access to archived telemetry—without rehydrating petabytes just to find the single event you need.
This blog explores how Observo Query bridges cost-efficient data reduction with deep historical search—and where we’re taking the product next. UPDATE: This blog has been updated with a section on Observo Orion, our Agentic AI Data Engineer, which was released earlier this year.

Rethinking Retention: Archive First, Search When You Need It
Many teams look to filter telemetry before forwarding it to tools like Splunk, Elasticsearch, or Datadog to control costs. But without a reliable way to archive that data, they risk falling short of data retention policies and industry standards—or losing access to valuable context for future investigations.
At the same time, keeping everything in your SIEM index for extended periods degrades performance and significantly drives up storage and compute costs. We've seen customers lose hours diagnosing pipeline stalls, only to find the real issue was index saturation in their search backend.
With Observo Query, that tradeoff goes away.
The product starts by creating an immutable archive of your full-fidelity telemetry. Data is stored in Parquet format in object storage like Amazon S3 or Azure Blob—before any reduction happens. This archive becomes your long-term, cost-efficient foundation. And when you need that data later, you can rehydrate it and stream it directly into your analytics tools—or query it in place.
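The ordering here is the key idea: the archive write happens before any reduction, so tuning your filters later never changes what you can retrieve. A minimal Python sketch of that archive-first flow (sink and filter names are hypothetical, an illustration rather than Observo's actual API):

```python
# Archive-first sketch (hypothetical names, not Observo's API):
# every event is written to the archive before any reduction, so the
# filtered SIEM stream never limits what is retained.

def archive_then_reduce(events, archive_sink, siem_sink, keep):
    """Write full-fidelity events to the archive, then forward only
    the events that pass the reduction filter to the SIEM."""
    for event in events:
        archive_sink.append(event)      # full fidelity, pre-reduction
        if keep(event):
            siem_sink.append(event)     # reduced stream for analytics

# Example: archive everything, forward only warnings and errors.
events = [
    {"level": "info", "msg": "health check"},
    {"level": "error", "msg": "auth failure"},
]
archive, siem = [], []
archive_then_reduce(events, archive, siem,
                    keep=lambda e: e["level"] in ("warn", "error"))
```

Because the archive sink sees every event, reduction decisions stay reversible: what you drop from the SIEM stream is still waiting in object storage.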
Rehydration: Bringing Archived Data Back to Life
Rehydration in Observo Query lets teams select a time window and repopulate logs into destinations like Elasticsearch. This is useful for breach investigations, audits, compliance lookbacks, and one-off deep dives. But the raw data doesn't always match the live schemas or dashboards of the tools you use for analysis.
Luckily, the Observo AI Data Pipeline makes this much more flexible.
You can run archived data through any pipeline you choose, allowing rehydrated logs to be fully transformed and enriched—just like your real-time streaming data. This means data shows up in your tools with the same structure, tags, and formats you already use. You can choose to process full-fidelity data (no sampling), customize transforms, and even run dry tests before committing changes.
Whether you’re rerunning detection logic or analyzing past events with familiar tools, rehydration now feels like a natural extension of your existing pipeline—not a separate forensic workflow.
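As a rough illustration of that idea, here is a minimal Python sketch in which the same transform function serves both the live path and the rehydration path (all names are hypothetical, not Observo's actual pipeline API):

```python
# Rehydration reusing the live pipeline's transforms (illustrative
# only). The same enrichment runs on archived records, so rehydrated
# logs land in your tools with the tags and structure you already use.

def enrich(record):
    """Example transform shared by the live and rehydration paths:
    tag the record's environment from its hostname."""
    out = dict(record)
    out["env"] = "prod" if record["host"].startswith("prod-") else "dev"
    return out

def rehydrate(archived_records, transform):
    """Replay archived records through a pipeline transform."""
    return [transform(r) for r in archived_records]

archived = [{"host": "prod-api-01", "status": 500}]
rehydrated = rehydrate(archived, enrich)
# Each rehydrated record now carries the same "env" tag as live data.
```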
Smart Search Without Rehydration
Of course, not every question needs a full rehydration.
With our search capability, teams can explore archived data directly in place—no ingestion required. Built on Apache Iceberg, Observo Query supports SQL-like querying on top of your S3-based archives. This enables fast, column-aware filtering and the ability to define partitions that match your most frequent queries.
For example, if your team frequently investigates traffic from specific IP ranges or geographies, you can partition by those attributes to drastically speed up search. Need to pinpoint when a malicious peer IP first contacted your firewall? Search lets you find the answer in seconds—without moving terabytes of data into an analytics tool and incurring the costs of daily limit overages.
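The effect of partitioning can be sketched with a toy example: when the archive is grouped by an attribute you query often, a search scans only the matching group instead of the whole archive. The following Python sketch is a conceptual illustration of partition pruning, not how Apache Iceberg is implemented:

```python
# Toy illustration of partition pruning (concept only): group events
# by a frequently-queried attribute, then scan just one partition.

from collections import defaultdict

def build_partitions(events, key):
    """Group events into partitions by the given attribute."""
    parts = defaultdict(list)
    for e in events:
        parts[e[key]].append(e)
    return parts

def pruned_search(partitions, key_value, predicate):
    """Scan only the partition for key_value; all others are skipped."""
    return [e for e in partitions.get(key_value, []) if predicate(e)]

events = [
    {"geo": "us-east", "src_ip": "203.0.113.7", "action": "deny"},
    {"geo": "eu-west", "src_ip": "198.51.100.2", "action": "allow"},
]
parts = build_partitions(events, "geo")
hits = pruned_search(parts, "us-east", lambda e: e["action"] == "deny")
```

The less data a query has to touch, the faster it returns, which is why aligning partitions with your most common investigation patterns pays off.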
Natural language capabilities also allow teams to ask questions like:
“What were the slowest API calls during last quarter’s product launch?”
Or:
“Show me failed logins by service account over the past 30 days.”
The goal: abstract away SQL and pipeline configs for teams that want answers, not complexity.
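As a hedged illustration, the second question above might compile to a query along these lines (the table and column names are hypothetical, not Observo's actual schema):

```python
# One plausible shape of the SQL a natural-language question could
# translate into (hypothetical schema, shown for illustration only).

question = "Show me failed logins by service account over the past 30 days."

generated_sql = """
SELECT service_account, COUNT(*) AS failed_logins
FROM auth_logs
WHERE outcome = 'failure'
  AND event_time >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY service_account
ORDER BY failed_logins DESC
"""
```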

Observo Orion: Powering Query in Context
Observo Query makes it easy to search archived telemetry using natural language or SQL-like syntax—but effective investigation often requires more than just pulling data. Security and DevOps teams need answers in the context of what they’re trying to accomplish: triage an incident, validate a policy, verify retention, or identify performance regressions. That’s where Observo Orion comes in.
Observo Orion is Observo AI’s agentic assistant, built to help users move from data retrieval to decision-making. It works alongside Observo Query to interpret intent, clarify scope, and recommend next steps. With Observo Orion, users can:
- Ask natural-language questions and receive not just matching log events, but summaries and insights
- Put search results into operational context—for example, identifying which services were impacted or which user accounts were involved
- Automatically generate pipeline filters, enrichments, or transformations based on what the user is trying to query or analyze
- Highlight anomalies or patterns in historical logs, even when rehydration isn’t required
- Recommend more efficient query strategies or pipeline changes to reduce false positives or speed up investigation
By combining flexible telemetry search with intelligent, goal-aware guidance, Observo Orion helps teams go beyond finding data to understanding what it means and what to do next.

Use Cases: Why (and When) It Matters
Security Incident Investigation
When incidents surface, it’s rarely in real time. A breach that starts with a compromised credential in February might not be discovered until May (or in many cases even 1-2 years later). With Observo Query, teams can quickly rehydrate or search archived logs to trace the full timeline of compromise—using the same analytics tools, dashboards, and formats they trust.
Regulatory Compliance and Retention
Many organizations are required to retain telemetry for 3–7 years. But keeping that data in an active SIEM is cost-prohibitive. Observo Query helps meet retention mandates by archiving full-fidelity logs in low-cost cloud storage—and retrieving them only when needed, without sacrificing fidelity.
Capacity Planning and Engineering Readiness
Whether it’s a tax deadline, Black Friday, or a flash sale, high-traffic events create stress scenarios you want to model and prepare for. With Query, DevOps teams can isolate past events, rehydrate just that window, and analyze resource utilization, latency spikes, and scaling patterns—down to the pod or service level.
Fraud and Law Enforcement Requests
In sectors like e-commerce and financial services, law enforcement may request historical data tied to specific user sessions or transactions. Observo Query allows teams to search archived data by key identifiers (e.g., user ID, transaction ID, IP range), locate the relevant logs, and export them—all without disrupting daily operations.
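A simplified sketch of that kind of identifier-based lookup, using Python's standard ipaddress module to express an IP range (the event fields and helper function are illustrative, not Observo's API):

```python
# Match archived events against a user ID or an IP range, then
# collect them for export (illustrative sketch only).

import ipaddress

def match_request(event, user_ids=(), ip_network=None):
    """Return True if the event matches a requested user ID or falls
    inside the requested IP range."""
    if event.get("user_id") in user_ids:
        return True
    if ip_network and "src_ip" in event:
        return ipaddress.ip_address(event["src_ip"]) in ip_network
    return False

archived = [
    {"user_id": "u-1001", "src_ip": "192.0.2.10"},
    {"user_id": "u-2002", "src_ip": "198.51.100.5"},
]
net = ipaddress.ip_network("198.51.100.0/24")
export = [e for e in archived
          if match_request(e, user_ids={"u-1001"}, ip_network=net)]
```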
From Archive to Insight
Observo Query bridges a critical gap between cost control and long-term visibility.
By combining smart archival, flexible rehydration, and powerful search into one platform, we’re helping security and DevOps teams get more value from their telemetry—without overloading their tools or budgets. It’s no longer a tradeoff between fast answers and full fidelity. With Observo Query, you get both.
And as natural language capabilities, usage-driven optimization, and federated search evolve, Observo Query will continue to become not just a safety net, but a strategic asset.
Want to learn how leading CISOs are rethinking their data strategy with AI-powered pipelines?
Download the CISO Field Guide to Security Data Pipelines