Agentic AI Is a Systems Design Problem

This is Part 3 of a three-part series:

AI & Machine Learning
AI Agents & Chatbots

Most agent demos look the same: one capable LLM, a handful of tools, and a happy-path workflow that works once, maybe even a few times. Then it gets shown to stakeholders, and the question becomes: can we ship this?

This is where most teams get stuck.

The gap between a compelling agent demo and a production-ready agentic system isn’t about prompt quality or model choice. It’s about systems design. In production, an agent is not a clever script; it’s a distributed, stateful system that must be repeatable, observable, secure, and cost-bounded while operating under uncertainty.

The prototype mindset optimizes for capability: can the agent do the thing?

The production mindset optimizes for reliability: can the system do the thing a thousand times a day, across teams, without surprises?

Nearly every production decision for agentic AI falls into three buckets:

Design Patterns: How Your Agentic System Actually Operates

Choosing an agent design pattern is a structural decision, guided by the business objective, that determines your system’s failure modes, debuggability, and operational burden. Throughout 2025, the community shifted away from “monolithic” prompts toward specialized multi-agent coordination. By breaking complex tasks into discrete roles, your system can achieve higher reliability and easier maintenance.

Below, we explore four common patterns of agentic architecture, moving from rigid structured workflows to highly decentralized systems, before examining the specific role (and risks) of the single-agent model.

The Orchestrator Pattern

The supervisor agent plans tasks, delegates to tools or sub-agents, and aggregates results.

Best for: workflows that must be predictable, auditable, and easy to reason about

Tradeoffs:

This pattern maps cleanly to traditional workflow engines and is often the easiest to productionize.

Real-World Examples

Customer Support System

A triage agent receives a support ticket, classifies the issue type (network, billing, access), and routes it to the appropriate specialist agent. The triage agent doesn’t handle the ticket itself; it decides which expert to consult. This mirrors human support centers, where a receptionist routes calls to technical support, billing teams, and account management.
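The triage flow described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `classify_ticket` and the specialist handlers are stand-ins for what would be LLM calls, and all names here are hypothetical.

```python
# Hypothetical sketch of the triage/orchestrator pattern: the triage agent
# only routes; specialist agents do the actual work.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Ticket:
    ticket_id: str
    text: str

def classify_ticket(ticket: Ticket) -> str:
    # In a real system this would be an LLM classification call;
    # a keyword stub keeps the sketch runnable.
    text = ticket.text.lower()
    if "invoice" in text or "charge" in text:
        return "billing"
    if "password" in text or "login" in text:
        return "access"
    return "network"

SPECIALISTS: Dict[str, Callable[[Ticket], str]] = {
    "network": lambda t: f"[network] diagnosing: {t.text}",
    "billing": lambda t: f"[billing] reviewing: {t.text}",
    "access":  lambda t: f"[access] resetting: {t.text}",
}

def triage(ticket: Ticket) -> str:
    """The triage agent routes; it never handles the ticket itself."""
    issue_type = classify_ticket(ticket)
    return SPECIALISTS[issue_type](ticket)

print(triage(Ticket("T-1", "I was double charged on my invoice")))
```

The routing table makes the system auditable: every ticket’s path is a single, loggable dictionary lookup.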

Multi-Agent Investment Analysis

A coordinator agent receives a ticker symbol and dispatches it simultaneously to four specialist agents: fundamental analysis (financial statements), technical analysis (price patterns), sentiment analysis (news and social signals), and ESG evaluation. The coordinator aggregates their independent insights into a single investment recommendation.

The Specialized Pipeline Pattern

A fixed sequence of specialized agents (e.g., retrieve → analyze → decide → act).

Best for: well-defined business processes like document extraction, intake triage, or approval flows.

Tradeoffs:

This pattern shines when the business process already exists and the agent augments it, rather than replaces it.
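The fixed retrieve → analyze → decide → act sequence can be expressed as a chain of independently testable stages. The sketch below is illustrative only; every stage is a stub for what would be an LLM or tool call, and all function names are assumptions.

```python
# Minimal sketch of a fixed agent pipeline: retrieve -> analyze -> decide -> act.
# Each stage takes and returns a state dict, so stages can be tested,
# traced, and swapped independently.

from typing import Callable, List

def retrieve(doc_id: str) -> dict:
    # Stand-in for document retrieval.
    return {"doc_id": doc_id, "text": "(fetched document text)"}

def analyze(state: dict) -> dict:
    # Stand-in for an LLM analysis step.
    return {**state, "risk": "low"}

def decide(state: dict) -> dict:
    return {**state, "approved": state["risk"] == "low"}

def act(state: dict) -> dict:
    state["action"] = "filed" if state["approved"] else "escalated"
    return state

PIPELINE: List[Callable[[dict], dict]] = [analyze, decide, act]

def run_pipeline(doc_id: str) -> dict:
    state = retrieve(doc_id)
    for stage in PIPELINE:
        state = stage(state)  # each hop is a natural logging/tracing point
    return state

print(run_pipeline("contract-42"))
```

Because the sequence is fixed, failures localize to a single stage, which is exactly why this pattern is easy to debug and augment.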

Real-World Examples

Contract Draft Pipeline

A law firm’s document system chains four agents in sequence:

Asynchronous Event-Driven Agents

Agents subscribe to events and react asynchronously (ticket created, invoice posted, alert fired).

Best for: high-volume operational environments where agents augment existing systems

Tradeoffs:

This pattern treats agents as probabilistic microservices: powerful, but operationally demanding.

Real-World Examples

Self-Healing IT Infrastructure

When system logs and metrics arrive via Kafka topics, multiple AI agents respond in parallel without explicit orchestration:

Each agent subscribes independently to the same event stream. No centralized coordinator decides which agent should act; each reacts based on its own internal logic.
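The decentralized subscription model can be sketched with a toy in-process event bus standing in for a Kafka topic. This is an assumption-laden illustration: topic names and agent logic are invented, and real deployments would use a proper broker client with concurrent consumers.

```python
# Toy event bus standing in for a Kafka topic: each agent registers its
# own handler, and every published event fans out to all subscribers
# with no central coordinator.

from collections import defaultdict
from typing import Callable, Dict, List

subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

def subscribe(topic: str):
    def register(handler):
        subscribers[topic].append(handler)
        return handler
    return register

def publish(topic: str, event: dict) -> None:
    for handler in subscribers[topic]:
        handler(event)  # in production these would run concurrently

handled: List[str] = []

@subscribe("system.metrics")
def anomaly_agent(event: dict) -> None:
    # Reacts only when its own threshold logic fires.
    if event.get("cpu", 0) > 0.9:
        handled.append("anomaly: scale out")

@subscribe("system.metrics")
def logging_agent(event: dict) -> None:
    handled.append(f"logged metrics for {event['host']}")

publish("system.metrics", {"host": "web-1", "cpu": 0.95})
print(handled)
```

Both agents saw the same event and acted on their own logic, which is the essence of the pattern and also the source of its operational demands (duplicate work, ordering, retries).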

Lead Intelligence for Sales Teams

When new lead emails arrive in the inbox, multiple agents react asynchronously to enrich the opportunity:

All execute concurrently without blocking the email system. By the time a sales rep opens the lead, they have a complete intelligence package (prospect research, competitive context, recommended quote positioning, and talking points), dramatically shortening sales cycles.

The Single “Generalist” Agent

One agent with many tools and an ever-growing prompt.

Best for: proofs-of-concept, demos, and narrow use-cases

Reality: while often the starting point for developers, this pattern faces a fundamental scaling wall as workloads grow in complexity.

Real-World Examples

Coding Assistants and Platform Copilots

The single-agent pattern works well in constrained domains like basic coding assistants or platform copilots: they operate synchronously, reason through tasks, and draw on a defined set of tools and actions. However, once behavior becomes complex, debugging turns into archaeology. Single agents struggle with:

It is important to note that the single agent is not a “bad” architecture; applied to the right problem, it is incredibly powerful. Agentic coding tools like Claude Code are immensely successful at executing complex, iterative technical tasks. Just don’t expect one to help with qualifying inbound leads.

Choosing a pattern is choosing how your system fails, and how easily humans can understand why.

The decision tree is straightforward:

The best systems often blend patterns: a mostly deterministic pipeline with one orchestrator step for routing, or an event-driven system with internal pipeline stages within each agent. The key is matching the pattern to workload characteristics, not to architectural preference.

Build vs. Buy Is an Architectural Decision

As with any emerging technology, early adopters built everything from scratch because they had to. By early 2026, the build-vs-buy debate had shifted to a question of architectural ownership. When you choose a path, you aren’t just choosing a vendor; you are choosing where complexity, and long-term maintenance, will live.

When Buying Makes Sense

Buying a platform is often the right call when you need:

Benefits: Faster time-to-value, fewer sharp edges, and fewer homegrown control planes

When Building is Justified

Building in-house makes sense when:

Building means owning the failure modes, intentionally.

The Hybrid Reality

Most successful teams land here: buy the orchestration, monitoring, and compliance layer; build domain-specific agents, tools, and logic on top.

Build-vs-buy is about choosing where you want to innovate, and where you want leverage.

Observability, Traceability, and Cost: The Non-Negotiables

If you add observability after your agent is “working,” you aren’t building a production system; you’re running an extended prototype. In a world of non-deterministic outputs, logs are no longer enough; you need traces.

Opening the Black Box

Production agents operate in a “black box” of reasoning. To debug them, you need per-request traces that act as a forensic transcript. You must be able to reconstruct:

Without this, you cannot answer “why did it do that?” or debug failures.

Metrics That Actually Matter

Once the system is live, move past trivial metrics like “total tokens” and focus on operational health:

Cost as a Design Constraint

In agentic systems, cost is not an afterthought; it is a limiting reagent. Multi-agent loops can increase token consumption exponentially if left unchecked.

To stay profitable, production systems must implement:
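As one concrete illustration, a per-request budget guard can abort a run before it overshoots. This is a hedged sketch: the pricing rate and class names are invented, and real systems would meter actual provider billing rather than a flat per-token rate.

```python
# Illustrative cost guard: every model call draws down a per-request
# budget, and the run aborts before it can exceed the cap.
# The $/token rate here is made up for the example.

class BudgetExceeded(RuntimeError):
    pass

class CostGuard:
    def __init__(self, max_usd: float, usd_per_1k_tokens: float = 0.01):
        self.max_usd = max_usd
        self.rate = usd_per_1k_tokens
        self.spent = 0.0

    def charge(self, tokens: int) -> None:
        self.spent += tokens / 1000 * self.rate
        if self.spent > self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent:.4f} > cap ${self.max_usd}"
            )

guard = CostGuard(max_usd=0.05)
guard.charge(2000)        # fine: $0.02 so far
try:
    guard.charge(4000)    # would push spend to $0.06, over the cap
except BudgetExceeded as e:
    print("aborted run:", e)
```

Failing closed like this turns a runaway loop into a bounded, explainable incident instead of a surprise invoice.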

The Bottom Line: if you cannot explain exactly why an agent took an action or why a specific request cost $2.00, you aren’t ready for production.

A Simple Production Readiness Checklist

Before calling an agentic system “production”:

If any of these are unclear, the system may work, but it won’t scale safely.

Written by

Francisco González, Director of AI/ML

Published

March 18, 2026

Cortex Code: Build Faster with Snowflake’s AI Coding Agent

AI & Machine Learning
Data & App Engineering
AI Agents & Chatbots
Snowflake

Snowflake has officially unveiled Cortex Code, with its CLI version now in general availability. As a Cortex Code release partner, OneSix has spent the last month working hands-on with the platform, exploring how it fits into real-world development workflows and what it unlocks for modern data and AI teams.

This launch represents more than just a new developer tool. Cortex Code signals a meaningful shift in how organizations will build, deploy, and scale data products inside Snowflake.

At its core, Cortex Code is built to automate and accelerate end-to-end data and AI development, making DataOps more attainable, even for lean teams with limited engineering bandwidth. Unlike generic coding assistants, Cortex Code is grounded in the environment where your data already lives. It understands your schemas, governance model, and Snowflake architecture, enabling developers to move faster without compromising enterprise security or control.

The impact is clear: organizations can translate architectural designs and business requirements into scalable solutions faster than ever before.

The Snowflake Cortex Ecosystem

Cortex Code joins Snowflake Intelligence as a part of the Snowflake Cortex AI product suite. Snowflake Intelligence and its underlying components (Cortex Analyst and Cortex Search) have been delivering natural language querying (NLQ) and document-aware insights directly to end users.

Cortex Code extends this intelligence upstream. Rather than focusing on end-user consumption, Cortex Code brings AI directly into the developer workflow, helping teams build the pipelines, applications, and AI tools that power the business.

Cortex Code Capabilities

Integrates Seamlessly Into Development Workflows

Cortex Code is available through:

The CLI brings secure, Snowflake-aware assistance into local workflows, integrating naturally with tools like VS Code, terminals, Bash commands, and Python scripts.

Teams can build and test queries, notebooks, and pipelines locally, then deploy confidently into Snowflake environments with full platform alignment.

Deep Data and Governance Awareness

Cortex Code doesn’t just generate SQL; it understands your Snowflake environment:

When making recommendations, Cortex Code evaluates what already exists and builds in alignment with your enterprise standards.

Snowflake-Native Architectural Intelligence

Because Cortex Code is purpose-built for Snowflake, it understands the full ecosystem of tools available, including:

It can suggest optimized approaches, enforce best practices, and even help implement layered warehouse architectures such as star schemas, grounded in your design requirements.

Enterprise-Grade Security by Design

Cortex Code operates entirely within Snowflake’s existing RBAC framework. Organizations can:

Your data and metadata never leave Snowflake, and Cortex Code leverages the same governed model infrastructure behind AI SQL and Snowflake Intelligence.

This makes adoption far more practical for enterprises with strict compliance and security requirements.

Cortex Code in Practice: The Building Loop

In our experience, Cortex Code follows a predictable human-in-the-loop framework.

1. Prompt in Natural Language

From simple requests like:

To complex workflows such as:

Cortex Code begins by evaluating what it has access to and building a plan grounded in your environment.

2. Clarify Intent and Architecture

For more complex tasks, Cortex Code will confirm details such as naming conventions, architecture assumptions, or execution steps:

This step ensures alignment before execution.

3. Step-by-Step Developer Approval

Cortex Code maintains a true human-in-the-loop model:

It enhances developer productivity while keeping governance and oversight firmly in human hands.

4. Rapid Iteration Around Roadblocks

When roadblocks arise, Cortex Code adapts quickly:

For example, if a Bash command fails or a required function is unavailable, Cortex Code can pivot to a Python-based approach and install the necessary packages. If existing roles lack sufficient privileges to set permissions on a new schema, it will generate the required code and clearly indicate what administrative approvals are needed to proceed.

Rather than halting progress, Cortex Code keeps development moving, reducing friction, minimizing context switching, and accelerating overall delivery velocity.

5. Validation, Cleanup, and Next-Step Recommendations

At the end of larger workflows, Cortex Code summarizes what it built and suggests improvements such as:

Getting Started: Start Simple, Scale Intelligently

For organizations adopting Cortex Code, we recommend a phased approach:

1. Establish a Safe Sandbox

Start with a limited-access schema or dedicated development database to experiment safely.

2. Begin With High-Confidence Tasks

Use Cortex Code for:

3. Expand Into Advanced Development

Once confidence builds, scale into:

4. Secure, Benchmark, and Operationalize

Before production rollout:

The Strategic Impact of Cortex Code

Cortex Code represents a major evolution in how data pipelines and AI applications will be built inside Snowflake. With an intelligent coding agent built directly into the platform and grounded in your unique data environment, organizations can dramatically reduce the time from idea to prototype to deployment. Here are the essential takeaways:

As you begin exploring Cortex Code, start small, build trust in a sandbox, and scale thoughtfully into higher-impact engineering and AI workflows. The future of Snowflake-native development is collaborative, and Cortex Code is an important step forward.

Start Building Smarter

Cortex Code is a meaningful step forward in Snowflake-native development. If you’re looking to move faster, reduce engineering bottlenecks, or understand where Cortex Code fits into your roadmap, let’s connect.

Written by

Chris Hammer, Lead Consultant

Published

February 20, 2026

Microsoft Copilot + Snowflake Cortex Agents: The Best of Both Worlds

AI & Machine Learning
AI Agents & Chatbots
Snowflake

As enterprises move from AI experimentation to real, production-grade use cases, one challenge keeps surfacing: how do you balance ease of adoption with depth of control? Microsoft Copilot and Snowflake Cortex Agents each solve part of that puzzle, but together, they unlock something far more powerful.

This post explores how Microsoft Copilot and Snowflake Cortex Agents complement each other, when each platform shines on its own, and why combining them creates a pragmatic path to enterprise AI at scale.

Understanding the Copilot Ecosystem

“Copilot” isn’t a single product. It’s an ecosystem.

For enterprises, the center of gravity is Microsoft 365 Copilot, which brings generative AI directly into the tools people already use every day: Word, Outlook, PowerPoint, Excel, Teams, and Copilot Chat. The experience is grounded in:

Where Copilot really becomes interesting, however, is with Copilot Agents.

Copilot Agents: From Chat to Action

Copilot Agents allow teams to package:

Using Copilot Studio, organizations can create task-focused agents that guide users through predictable, repeatable processes. Everything from proposal generation to document analysis and operational workflows.

The result: fast time-to-value, low barriers for adoption, and AI that shows up where work already happens.

Where Snowflake Cortex Agents Excel

Snowflake Cortex Agents approach the problem from the opposite direction. Instead of starting with the user interface, Cortex starts with data. Built directly into the Snowflake platform, Cortex Agents are designed for:

Cortex shines when:

In short, Cortex Agents are powerful; and if you’ve read this far, you’re prepared to handle their more demanding technical aspects.

Copilot vs. Cortex: Not a Competition

It’s tempting to frame this as a head-to-head comparison. In reality, Copilot and Cortex solve different problems.

Copilot Strengths

Cortex Strengths

Each platform has unique strengths. And that’s exactly why combining them works so well.

The Best of Both Worlds: Copilot + Cortex

Snowflake Cortex Agents can now be registered as apps inside a Microsoft tenant and accessed directly from Copilot via OAuth authentication. This means:

Additional capabilities include:

What This Architecture Unlocks

With this integration, organizations no longer have to choose between:

Instead, they can:

A Real-World Pattern: Start Simple, Scale Intelligently

One of the most effective adoption strategies we’ve seen is:

This phased approach reduces risk, avoids overengineering early, and ensures AI adoption is driven by real business outcomes, not novelty.

The Takeaway

Microsoft Copilot and Snowflake Cortex Agents are complements, not rivals. Together, they offer:

For organizations serious about operationalizing AI, this combination represents one of the most compelling enterprise patterns available today.

If you’re exploring how to design, build, or integrate Copilot and Cortex Agents into your data and AI strategy, OneSix can help you move from experimentation to impact fast.

Put Copilot and Cortex to Work

If you’re exploring how to design, build, or integrate Copilot and Cortex Agents into your data and AI strategy, OneSix can help you move from experimentation to impact fast.

Written by

Jonathan Kolar, Sr. Lead Consultant

Published

January 29, 2026

Agentic AI: Using Tool Calling to Go Beyond RAG

This is Part 2 of a three-part series:

AI & Machine Learning
AI Agents & Chatbots

“Agentic AI” is one of today’s most popular AI terms. But what does it actually mean?

At its core, Agentic AI describes systems that can make decisions and perform tasks on their own. They don’t just respond—they take action. That autonomy comes from two capabilities:

Humans can easily act on their decisions. Large language models (LLMs) can’t: an LLM can reason about what should happen, but it can’t take action unless we give it a way to interact with the world.

That bridge is tool calling: the mechanism that turns a passive LLM into an active, task-performing agent.

Standard LLM Workflow

Agentic LLM Workflow With Tool Calling

What Is Tool Calling?

If Agentic AI is about helping LLMs act, tool calling is how they do it. Tool calling allows an LLM to use external functions or APIs (tools) to perform real-world tasks based on its reasoning. Instead of returning only text, the LLM can decide which tool to use, call it with specific inputs, and then continue reasoning using the results.

How tool calling works

Putting it into practice

Consider the question: “What’s the weather today?”

An LLM doesn’t know the current weather. But if you give it a “getWeather” tool:
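The resulting loop can be sketched end to end. Everything here is a stand-in: `fake_llm` simulates a model that first requests the tool and then answers, and no real LLM API or weather service is called.

```python
# Sketch of the getWeather flow: the model asks for a tool, we execute
# it, feed the result back, and the model answers from the data.

import json

def get_weather(city: str) -> dict:
    # Stand-in for a real weather API call.
    return {"city": city, "temp_f": 68, "conditions": "sunny"}

TOOLS = {"getWeather": get_weather}

def fake_llm(messages: list) -> dict:
    """Stand-in for a model: requests the tool once, then answers."""
    tool_msgs = [m for m in messages if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool_call": {"name": "getWeather", "args": {"city": "Chicago"}}}
    result = json.loads(tool_msgs[-1]["content"])
    return {"content": f"It is {result['temp_f']}F and "
                       f"{result['conditions']} in {result['city']}."}

messages = [{"role": "user", "content": "What's the weather today?"}]
reply = fake_llm(messages)
while "tool_call" in reply:                        # model wants a tool
    call = reply["tool_call"]
    result = TOOLS[call["name"]](**call["args"])   # we execute it
    messages.append({"role": "tool", "content": json.dumps(result)})
    reply = fake_llm(messages)                     # model continues with result

print(reply["content"])
```

The `while` loop is the important part: it is the same loop whether the agent makes one tool call or twenty.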

Taking tool calling a step further

Agents build on this by using recursive tool calling to achieve their results:

Tips for Better Tool Calling

Tool calling isn’t just about giving an LLM access to functions. You can also shape the LLM’s behavior:

Add structured arguments

For example, a "reasoning" field will make the LLM explain why it wants to use a tool, even for non-reasoning models.
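A sketch of what such a tool specification might look like, in the JSON-schema style common across LLM APIs. The schema shape and field names are illustrative assumptions, not any particular vendor’s exact format.

```python
# Hypothetical tool spec with an extra required "reasoning" argument:
# the model must justify each call, and the call site can log it.

get_weather_tool = {
    "name": "getWeather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City to look up"},
            "reasoning": {
                "type": "string",
                "description": "One sentence on why this tool call is needed",
            },
        },
        "required": ["city", "reasoning"],
    },
}

def execute(call_args: dict) -> dict:
    # Log the model's stated rationale, then run the real function.
    print("model's rationale:", call_args.pop("reasoning"))
    return {"city": call_args["city"], "temp_f": 68}

result = execute({"city": "Chicago",
                  "reasoning": "User asked about today's weather."})
```

Because `reasoning` is a required argument, even a non-reasoning model has to produce a justification before the call executes, and that justification lands in your traces for free.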

Use parallel calls

If the LLM can’t do parallel execution natively, create a “wrapper tool” that gathers inputs for multiple functions at once and handles the parallelization inside the function.
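A minimal wrapper-tool sketch, assuming two hypothetical underlying functions; in practice the model would emit the `requests` list as the wrapper tool’s single argument.

```python
# Sketch of a "wrapper tool": one tool call from the model fans out to
# several underlying functions concurrently via a thread pool.

from concurrent.futures import ThreadPoolExecutor

def get_weather(city: str) -> dict:
    return {"weather": f"sunny in {city}"}     # stand-in for an API call

def get_news(city: str) -> dict:
    return {"news": f"headlines for {city}"}   # stand-in for an API call

FUNCS = {"get_weather": get_weather, "get_news": get_news}

def batch_lookup(requests: list) -> list:
    """Wrapper tool: requests = [{'func': name, 'args': {...}}, ...]."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(FUNCS[r["func"]], **r["args"]) for r in requests]
        return [f.result() for f in futures]   # results in request order

results = batch_lookup([
    {"func": "get_weather", "args": {"city": "Chicago"}},
    {"func": "get_news", "args": {"city": "Chicago"}},
])
print(results)
```

One model turn now covers several slow I/O calls, which cuts both latency and the token overhead of extra round trips.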

These techniques can improve both clarity and speed.

Very Cool Tool Calling Example

(Say that 3 times fast!)

Automating dependency mapping

One of our clients needed to extract many different values from documents. Each value had its own instructions, often like this:

Value A can be found in Section X. If Value A is found, set Value B to ‘yes.’

This instruction means B depends on A.

Mapping a few dependencies by hand is simple. Mapping hundreds—written in natural language and full of cross-references—is not.

Instead of doing it manually, we used tool calling to:

The result: a fully automated process for untangling human-written rules, and outputting a dependency tree that helped parallelize extractions.

Tool Calling vs. Structured Output

These two ideas work together, but they solve different problems:

In practice, you often use tool calls to collect what you need for the agent—and structured output to produce the final answer in a format fit for use outside of the agent.

Using Agents vs. Hard-Coded Tool Calls

Tool calling can be thought of as an expanded form of RAG. Instead of retrieving only documents, you’re giving the LLM access to any external capability. There are two ways to design this:

Approach 1: Pre-defined tool calls

You decide exactly which tools to call before sending your prompt to the LLM. You execute these tools as needed, and enrich the prompt with context from the pre-determined calls.

Pros:

Cons:

An example of this approach is basic RAG: you use a specific query to gather relevant documents, enrich your context with this data, and send it to the LLM.

Approach 2: Let the LLM decide

You give the LLM a list of tools and their specifications, and let it choose.

Pros:

Cons:

Agents shine in complex, open-ended workflows, but they must be managed carefully. An example of this approach is to give the LLM a “RAG tool” and let it decide when to use the tool and with what inputs.

Planning Tool Calls

LLMs don’t naturally reveal their thought process unless you use a dedicated reasoning model. But tool calling gives us a workaround. You can:

Good agents aren’t just powerful—they’re monitored and constrained.

Keeping Costs Under Control

Left unchecked, an agent can call tools endlessly to verify its own logic. To prevent runaway behavior:
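One of the simplest controls is a hard iteration cap on the agent loop. The sketch below is generic; `step_fn` is a hypothetical stand-in for one LLM-plus-tool turn.

```python
# Guardrail sketch: cap the number of tool-calling iterations so an agent
# cannot loop forever re-verifying its own logic.

MAX_ITERATIONS = 8

def run_agent(step_fn, max_iterations: int = MAX_ITERATIONS):
    """step_fn(i) returns (done, result); it stands in for one agent turn."""
    for i in range(max_iterations):
        done, result = step_fn(i)
        if done:
            return result
    # Fail closed with an explicit sentinel, not an open-ended loop.
    return "stopped: iteration budget exhausted"

# An agent that would loop forever gets cut off:
print(run_agent(lambda i: (False, None)))
# An agent that finishes early returns normally:
print(run_agent(lambda i: (i == 2, "answer")))
```

The same shape generalizes to token budgets and wall-clock timeouts: check the limit every turn, and stop with a clear reason rather than silently spinning.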

The goal is to keep the agent smart, efficient, and affordable.

The Bottom Line

Tool calling unlocks the “agentic” in Agentic AI. It lets LLMs make decisions, take action, and operate beyond their training data. But with that power comes the need for clear controls, thoughtful design, and cost-aware engineering.

Used well, tool calling turns LLMs from passive responders into active problem-solvers—capable of navigating complex tasks, coordinating multiple steps, and producing reliable, actionable outcomes.

If you’re building anything more ambitious than a single-prompt chatbot, tool calling is the key to taking your system beyond RAG and into true Agentic AI.

Let’s talk about how we can turn your AI ideas into measurable results.

At OneSix, we design and deploy AI systems built for the real-world. We engineer context, optimize retrieval, and integrate AI into your workflows—so your models deliver accurate, reliable, measurable results.

Written by

Matt Altberg, Lead ML Engineer

Published

November 19, 2025

Snowflake Intelligence in Action: Predicting Customer Churn

AI & Machine Learning
AI Agents & Chatbots
Forecasting & Prediction
Snowflake

Detect early churn signals hidden in structured and unstructured data.

Powered by Snowflake Intelligence, the Customer Churn Predictor uncovers early warning signs hidden in your data, so customer success teams can take action before revenue loss. Watch the demo to see how it works and why sales and customer success teams love its proactive approach to churn prevention.

The Hidden Problem

Churn Signals You Can’t See

Customer success teams are sitting on a goldmine of insights, but much of it is locked away in unstructured data. Support tickets, meeting notes, and emails often contain subtle red flags that point to dissatisfaction or disengagement.

The challenge? These clues are nearly impossible to spot at scale. Without a clear signal, teams are forced to react after problems surface. By then, it’s too late. Renewals are lost, and valuable customers quietly churn without warning.

The Solution

Spotting Churn Before It Happens

The Customer Churn Predictor, built on Snowflake Intelligence, Cortex Analyst, and Cortex Agents, transforms churn prevention from reactive to predictive.

Together, these capabilities automatically surface non-obvious churn signals, generate risk briefings, and empower customer success teams to act decisively—long before a renewal is at risk.

The Impact

From Firefighting to Foresight

With the Customer Churn Predictor, teams gain a 360° AI-driven view of customer health. Instead of reacting to churn, they can predict and prevent it. Churn prevention stops being a fire drill and becomes a strategic advantage.

Written by

Cody Dirks, Sr. AI & ML Lead

Published

November 12, 2025

Snowflake Intelligence Brings Agentic AI Power to the Enterprise

Finally, a way to talk to your data with natural language.

AI & Machine Learning
Data Analytics
Data & App Engineering
AI Agents & Chatbots
Snowflake

Accessing insights shouldn’t be the hardest part of working with data. Yet for many organizations, the process is still slow, fragmented, and dependent on overworked analytics teams. Snowflake Intelligence transforms that experience, enabling users across the business to securely interact with data in natural language and get immediate answers they can trust.

As a Snowflake Intelligence Launch Partner, OneSix helps organizations bring this new capability to life, accelerating decisions, improving collaboration, and driving measurable business results.

How It Works

Snowflake Intelligence makes it possible for anyone in your organization to talk to your data naturally and get instant, trusted answers.

Built on the Snowflake Data Cloud, it brings together conversational AI, intelligent automation, and enterprise-grade security to help teams move from insight to action without waiting on a dashboard.

Ask a question in plain language, like “Which products drove the most revenue growth last quarter?”, and Snowflake Intelligence gets to work behind the scenes:

How Teams Are Using It

Snowflake Intelligence is helping teams across functions and industries make faster, smarter decisions.

Every team has questions, and now they can get answers instantly. Whether it’s sales looking for performance trends or finance evaluating results, Snowflake Intelligence makes data accessible to everyone. Explore the examples below to see how it’s transforming work across functions and industries.

Use Cases by Function
FunctionUse CaseExample Question
SalesAnalyze performance across regions and products to uncover growth opportunities and forecast revenue more accurately.“What were my top product sales in the West last quarter, and why did product X outperform product Y?”
MarketingMeasure campaign performance and ROI instantly to optimize spend and strategy.“Which campaign delivered the highest ROI this month?”
Customer SuccessDetect early signs of churn by connecting CRM, usage, and sentiment data to surface risk and guide proactive outreach.“Which accounts show early signs of churn, and what’s causing it?”
FinanceCombine internal and external data to improve forecasting, risk modeling, and investment decisions.“How did portfolio performance compare to the broader market last quarter?”
Research & ProductIntegrate feedback, experimentation, and market data to reveal trends and inform product strategy.“Which feature requests are mentioned most frequently by our top customers?”
Operations & Supply ChainMonitor performance, identify inefficiencies, and predict issues before they cause delays or downtime.“Where are our biggest delays, and what’s causing them?”
Human ResourcesAnalyze workforce trends, engagement, and retention to improve employee experience and optimize hiring decisions.“Which departments have the highest turnover risk, and what factors are driving it?”
Use Cases by Industry
IndustryUse CaseExample Question
RetailAnalyze cross-channel performance and customer behavior to optimize promotions and drive loyalty.“Which products or promotions are most effective at driving repeat purchases?”
Travel & HospitalityTrack booking trends, campaign conversions, and occupancy rates to forecast demand and improve targeting.“Which destinations are seeing the fastest booking growth this season?”
TechnologyCombine product analytics and customer sentiment to enhance product roadmaps and user experiences.“What product features are driving the highest customer satisfaction scores?”
Financial ServicesStreamline insurance underwriting, claims analysis, and risk modeling by connecting policy, customer, and market data.“What claim patterns indicate emerging risk trends across regions or policy types?”
Private EquityAccelerate deal sourcing and due diligence by connecting internal deal data with market insights, financials, and news to identify high-potential targets and assess risks faster.“Which companies fit our investment criteria based on recent financial performance and market activity?”
Healthcare & Life SciencesAnalyze clinical trial data alongside published research to accelerate innovation and improve outcomes.“What patterns in recent trial data correlate with improved treatment results?”
ManufacturingPredict maintenance needs, monitor production metrics, and reduce downtime through early detection of issues.“Which machines or lines are most likely to require maintenance this month?”
Logistics & TransportationTrack routes, delivery times, and fleet performance to optimize efficiency and reduce operational costs.“Where are our biggest delivery bottlenecks, and how can we reduce delays?”

Use Case Spotlight

Customer Churn Prediction

Detect early churn signals in structured and unstructured data to help customer success teams act faster and prevent revenue loss.

OneSix’s Customer Churn Predictor, powered by Snowflake Intelligence, Cortex Analyst, and Cortex Agents, helps customer success teams detect early signs of churn hidden in structured and unstructured data like support tickets, meeting notes, and emails.

By consolidating data such as product usage, NPS scores, and sentiment directly in Snowflake, teams gain a unified, AI-driven view of customer health. The system surfaces subtle churn signals, generates automated risk briefings, and recommends next-best actions through an LLM-powered advisor.

The result: customer success teams can act earlier, retain more high-value accounts, and shift from reactive firefighting to proactive retention.

How OneSix Helps You Get Value

As a Snowflake Intelligence Launch Partner, OneSix helps organizations move from exploration to impact, turning new AI capabilities into measurable business results.

Ready to Bring Your Data to Life?

At OneSix, we help you make the leap from dashboards to intelligent experiences. As a Snowflake Intelligence Launch Partner, we’ll guide you from pilot to production and help your teams unlock measurable business value every step of the way.

Ready to see what’s possible when you can talk to your data?

Written by

Jason Drucker, Vice President of Data Practice

Published

November 4, 2025

Snowflake Cortex Search vs. Custom RAG

Choosing the Right Approach for Enterprise AI

AI & Machine Learning
Data & AI Strategy
AI Agents & Chatbots
Forecasting & Prediction
Snowflake

Enterprise adoption of AI is moving quickly, but leaders face a critical question: how do we ground large language models (LLMs) in enterprise data while keeping solutions scalable, accurate, and cost-effective?

Cortex Search, Snowflake’s Retrieval-Augmented Generation (RAG) and enterprise search solution, and custom RAG pipelines represent two different approaches to solving that challenge. Understanding how they work, and when each is the right fit, is essential for any organization investing in enterprise-ready AI.

The debate begins with a shared starting point: RAG enables LLMs to leverage enterprise data effectively.

What is RAG?

Retrieval-Augmented Generation (RAG) is the process of giving an LLM access to relevant, external information so it can answer queries more accurately.

The typical RAG workflow looks like this:

1. A user submits a query.
2. The system retrieves the most relevant documents from an indexed knowledge base.
3. The retrieved content is added to the model’s prompt as context.
4. The LLM generates an answer grounded in that context.

The value of RAG is that it allows standard, off-the-shelf models to deliver high-quality, context-aware answers based directly on your data—whether it’s the latest company policy, current product details, or niche industry knowledge.
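To make the workflow concrete, here is a minimal sketch of a RAG pipeline in plain Python. The bag-of-words `embed` function and the in-memory index are toy stand-ins for a real embedding model and vector store, and `answer` stops at building the augmented prompt rather than calling an LLM:

```python
import math

# Toy "embedding": a bag-of-words vector. A real system would call an
# embedding model (e.g. via a managed service) instead.
def embed(text):
    words = [w.strip(".,?!").lower() for w in text.split()]
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Index: embed each document chunk (an in-memory list stands in
#    for a vector database).
docs = [
    "Refunds are processed within 14 days of a return.",
    "The capital of the Netherlands is Amsterdam.",
]
index = [(d, embed(d)) for d in docs]

# 2. Retrieve: rank chunks by similarity to the query.
def retrieve(query, k=1):
    q = embed(query)
    return [d for d, v in sorted(index, key=lambda p: -cosine(q, p[1]))[:k]]

# 3. Augment: build a prompt that grounds the model in retrieved context.
def answer(query):
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
    # 4. Generate: a real system would now send this prompt to an LLM.

print(answer("What is the capital of the Netherlands?"))
```

A production pipeline swaps each stand-in for a real component (an embedding model, a vector database, an LLM call) without changing the overall shape.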

Why RAG Matters Now

Enterprise AI adoption is accelerating, but models alone are not enough. RAG has become essential because it:

In other words, RAG is the bridge between broad LLM capability and business-specific intelligence.

The Rise of Enterprise-Ready AI 

Snowflake Cortex arrives at a moment when enterprise AI adoption is shifting from experimentation to scale. According to Gartner’s Emerging Tech Impact Radar: Generative AI (2025), one trend stands out that will fundamentally change how enterprises adopt and operationalize AI: AI marketplaces will reshape how enterprises buy AI.

By 2028, 40% of enterprise purchases of AI assets—models, training data, and tools—will be made through AI marketplaces, up from less than 5% in 2024. This shift will make AI assets more accessible, but it also introduces new questions: 

As enterprises weigh these decisions, the opportunity is clear: managed services like Cortex Search make it faster than ever to get started, but selecting the right approach—and understanding the tradeoffs—remains critical to long-term success.

The Big Question

With both Cortex Search and custom RAG pipelines available, leaders face a critical decision: When should you use Cortex Search, and when does a custom RAG pipeline make more sense?

The Case for Cortex Search

For many enterprises already running on Snowflake, Cortex Search offers the fastest path to RAG. It delivers a “batteries included” experience with embedding, chunking, document parsing, and auto-updates handled natively inside the Snowflake Data Cloud.

Strengths

Best Fit For

Cortex Search is best suited for teams that:

Key Considerations

When evaluating Cortex Search, organizations should consider:

These considerations highlight the importance of aligning the right approach to the right use case, helping organizations get the most out of Cortex Search today and in the future.

The Case for Custom RAG

While Cortex Search is designed to cover a wide range of enterprise use cases, some organizations encounter requirements that go beyond its current scope. In those cases, a custom RAG architecture may be the right fit.

Strengths

Best Fit For

Custom RAG is best suited for enterprises that:

Tradeoffs

The tradeoff for flexibility is complexity. Custom RAG requires:

Choosing the Right Approach

Enterprises evaluating RAG often face a decision between Cortex Search, Snowflake’s managed turnkey option, and a Custom RAG architecture built for flexibility and control. 

The right choice depends on the end goal: aligning the approach to the use case ensures organizations get the most value from their investment.

This comparison can be viewed from two angles: (1) feature differences such as setup, scaling, and control, and (2) evaluation criteria that guide leaders in choosing the best fit for their priorities.

Cortex Search vs. Custom RAG

Setup, scale, and control at a glance.

Feature        | Cortex Search                                        | Custom RAG
Setup          | Turnkey (SQL functions, auto-updates)                | Complex (vector DB, pipelines, orchestration)
Supported Data | Text (PDF, DOCX, JPG, PNG → text only), 512 tokens   | Multi-modal (text, image, audio, video)
Scaling        | 100M chunks max                                      | Unlimited, depends on infra
Control        | Limited, black-box                                   | Full flexibility
Best Use       | Rapid POCs, Snowflake-native apps                    | Custom enterprise AI, domain-specific

A Hybrid Approach

For many enterprises, the best path is not either/or, but both. Cortex Search provides a fast, Snowflake-native way to launch retrieval-augmented applications with minimal setup. As needs grow — more data types, domain-specific performance, or advanced retrieval strategies — a custom RAG architecture can extend those foundations without starting over.

Aligning Approach to Use Case

The key is alignment: matching the approach to the use case. Whether the priority is speed, scale, or specialization, organizations can maximize value by choosing the right starting point and planning for future flexibility.

The Path Forward

Every enterprise’s journey with AI looks different. Whether you start with Cortex Search, scale with Custom RAG, or combine both, the key is choosing an approach that aligns to your business goals. That’s where OneSix comes in.

Written by

Osman Shawkat, Senior ML Scientist

Published

September 22, 2025

Expert Guidance,
Real-World Implementation

We help enterprises choose the right AI approach and make it real. Our team of senior engineers and PhD-trained scientists designs, deploys, and scales solutions that deliver impact fast.

Contact Us

Beyond the Prompt: Why Your RAG System May Be Underperforming

This is Part 1 of a three-part series:

AI & Machine Learning
Data & AI Strategy
AI Agents & Chatbots
Forecasting & Prediction

Faced with the question “What is the capital of the Netherlands?” you have a few possible responses:

1. Answer confidently, if you know it.
2. Look it up, if uncertain.
3. Take a guess, which might be wrong.

Large Language Models (LLMs) face the same challenge. They excel when a question falls inside their training data, but when it doesn’t, they may “hallucinate,” producing an answer that sounds plausible but is wrong. 

The key difference is that LLMs don’t have direct access to your enterprise data or knowledge bases without additional retrieval methods. That’s where Retrieval-Augmented Generation (RAG) comes in.

RAG in a Nutshell

RAG is the process of giving an LLM access to relevant, external information so it can answer queries more accurately. The typical RAG workflow looks like this:

1. A user submits a query.
2. Relevant documents are retrieved from a knowledge base.
3. The retrieved content is appended to the model’s prompt.
4. The LLM generates an answer grounded in that retrieved context.

The value of RAG is that it allows models of any size to deliver high-quality, context-aware answers, whether it’s the latest company policy, current product details, or niche industry knowledge. But RAG doesn’t operate in isolation. For RAG to deliver consistently, it needs to be part of a well-designed information environment, also known as context engineering.

 

The Shift from Prompt to Context Engineering

In the early days, “prompt engineering” was the art of crafting the right wording to get the right answer. But as AI systems have grown more complex, the industry has realized that the quality of the context matters more than the cleverness of the prompt.

Context engineering builds the full information environment around the LLM, not just the immediate instruction, but also system settings, past conversation history, retrieved documents, tools, and output formats.


Prompt Engineering: shaping single-turn prompts for answers.

Context Engineering: shaping context for multi-step tasks.


RAG is a critical part of context engineering, ensuring that the model’s “world” includes the exact information needed for the task.

It’s Not Your RAG, It’s Your Context

In real-world deployments, many RAG systems disappoint, and the issue is almost never the model. It’s bad context engineering. Common pitfalls include:

Imagine an AI system reviewing legal contracts that confidently reports a key clause is missing. In reality, the clause exists, but the retrieval process never pulled it into the model’s context. This kind of gap shows why careful retrieval design is essential.

Engineering Retrieval for Success

Preventing these failures starts with designing retrieval around the business use case:

Done well, RAG produces grounded, fresh, scalable, and personalized AI outputs. But in many real-world environments, not all the information you need is text. From images and videos to audio clips and charts, handling different content formats introduces new retrieval challenges — and that’s where multi-modal context comes in.

Handling Multi-Modal Context

Most embedding models are optimized for a single type of data, and text models usually outperform others. Multi-modal embeddings (for example, image plus text models) often underdeliver in production.

A surprisingly effective solution is to convert all content to text before retrieval.

For example, images can be converted to captions or descriptions, audio can be transcribed, and charts or tables can be summarized as text.

By indexing text representations, retrieval accuracy for non-text content improves dramatically.

RAG in the Real World

OneSix built an AI-powered chatbot for a higher education client to help students get answers faster.


By applying RAG, the chatbot summarized thousands of unstructured documents, giving students accurate answers instantly and helping the university better serve its community.

Real-world RAG success comes from context engineering, feeding models the right information to deliver accurate, reliable, business-ready answers.

Ready to unlock the full potential of RAG?

At OneSix, we design and deploy Retrieval-Augmented Generation systems built for the real world. We engineer context, optimize retrieval, and integrate AI into your workflows—so your models deliver accurate, reliable, measurable results.


Let’s talk about how we can turn your AI ideas into measurable results.

Contact Us
Co-written by

Matt Altberg, Lead ML Engineer
Francisco Gonzalez, Sr. Architect

Published

August 19, 2025

AI’s Next Big Shift: What Business Leaders Need to Know

Written by

James Townend & Nina Singer, Lead ML Scientists

Published

March 19, 2025

AI & Machine Learning
AI Agents & Chatbots

Artificial Intelligence continues to transform the tech landscape at breakneck speed. AI is driving innovation in every sector from how we process queries to the tools we use for automation. Below are five key trends shaping AI’s evolution in 2025—and why they matter.

1. Inference-Time Compute

"AI designers have a new control lever – spend more compute per query for higher accuracy and better reliability."
James Townend
Lead ML Scientist

Traditionally, AI performance scaled primarily with training-time compute: We spent more resources to train bigger models on more data. Now, inference-time compute—the compute spent when a trained model answers a query—has become a major new control lever.

Why It Matters

The Bigger Picture

As models shift more reasoning to real-time computation, the hardware and infrastructure for user-facing AI will need to scale to support these heavier inference workloads. This also opens opportunities for edge inference, which involves moving some computation onto devices like phones, robots, and IoT systems.

2. Enterprise Search Is Good Now

"LLMs have dramatically improved search through RAG, unlocking value from previously challenging document stores."
James Townend
Lead ML Scientist

Enterprise search was an afterthought for years, plagued by siloed data sources, poorly structured documents, and lack of meaningful relevance signals. Modern vector embeddings have changed everything, making Retrieval-Augmented Generation (RAG) the new standard.

Why It Matters

The Bigger Picture

With vector search and RAG, enterprise search resembles a true domain-expert assistant. Organizations finally have the tools to leverage vast document stores efficiently. It’s akin to what Google did for the early public internet—now applied to private, internal data.

3. AI Agents

"AI agents transform software interaction by automating multi-step workflows."
James Townend
Lead ML Scientist

The next revolution in AI-driven automation is the rise of AI Agents: task-oriented, often autonomous systems that can robustly interact with software and data.

Why It Matters

Important Considerations

Agents remain unpredictable at times, owing to LLMs’ black-box nature. For critical systems:

The Bigger Picture

We’ll see agents increasingly embedded in customer support, “low-code” software platforms, and legacy system integrations. However, organizations must weigh the potential for cost overruns (since agents call models often) against the productivity gains they deliver.

4. The Future of Openness

"As competition intensifies, we see an uptick of LLMs embracing open weights. Distilled models emerge to close the gap."
Nina Singer
Sr. Lead ML Scientist

Competition among large language models is intensifying, and with it comes a surge in open-weight models. Alongside these publicly accessible models, distilled versions—trained to mimic larger “teacher” models—are emerging as credible, cost-effective alternatives.

Why It Matters

The Bigger Picture

Open-source foundational models empower companies and researchers worldwide to build specialized solutions without huge licensing fees. This explosion in open models not only accelerates AI adoption but also raises questions about responsible use, governance, and the sustainability of massive training runs.

5. Capability Overhang

"As AI advances, new questions emerge: How else can we harness its potential? Who else can contribute to its development? How do we control its impact?"
Nina Singer
Sr. Lead ML Scientist

“Capability overhang” describes a scenario in which technology’s potential outstrips its immediate adoption and integration. We’re already seeing this with LLMs, where industrial and societal constraints—such as regulatory hurdles, skills shortages, and legacy system inertia—lag behind the AI’s actual abilities.

Why It Matters

The Bigger Picture

As AI’s capacity grows, the conversation shifts from “can we do it?” to “how should we do it responsibly?” The real power of LLMs will come from well-regulated, well-structured integrations that extend beyond flashy demos into meaningful, society-wide improvements.

Shaping the AI-Driven Future

From inference-time compute revolutionizing AI economics to enterprise search finally delivering on its promise, these five trends highlight a pivotal moment in AI’s evolution. Agents will streamline workflows, open-source models will democratize access, and the looming capability overhang challenges everyone—from entrepreneurs to regulators—to adapt responsibly.

As the AI frontier broadens, it’s up to us—innovators, policymakers, and everyday users—to steer its tremendous potential toward positive, inclusive progress. The question is no longer if AI can do something, but rather how we’ll harness its power to create lasting impact.

Get Started

Integrate these insights into your business strategy and make the most of AI’s potential. OneSix can help you apply emerging AI trends and has first-hand experience of the impact they can have on your business.

Contact Us

Making AI More Human: The Power of Agentic Systems

Written by

Jack Teitel, Sr. AI/ML Scientist

Published

December 13, 2024

AI & Machine Learning
AI Agents & Chatbots
Snowflake

As AI advances, large language models (LLMs) like GPT-4 have amazed us with their ability to generate human-like responses. But what happens when a task requires more than just straightforward answers? For complex, multi-step workflows, agentic systems represent a promising frontier, offering LLMs the ability to mimic human problem-solving processes more effectively. Let’s explore what agentic systems are, how they work, and why they matter.

What are Agentic Systems?

Agentic systems go beyond traditional one-shot prompting — where you input a single prompt and receive a single response — by introducing structured, multi-step workflows. These systems break down tasks into smaller components, use external tools, and even reflect on their outputs to iteratively improve performance. The goal? Higher-quality responses that can tackle complex tasks more effectively.

Why Traditional LLMs Fall Short

In a basic one-shot prompt scenario, an LLM generates a response token by token, from start to finish. This works well for simple tasks but struggles with:

For example, if you ask a standard LLM to write an essay or debug a piece of code, it might produce a flawed output without recognizing or correcting its mistakes.

One method of correcting these limitations is to use multi-shot prompting, where the user interacts with the LLM, sending multiple prompts. By having a conversation with the LLM, a user can point out mistakes and prompt the LLM to provide better and more refined output. However, this still requires the user to analyze the output, suggest corrections, and interact with the LLM more than just the original prompt, which can be rather time-consuming.

One-Shot Prompting

Multi-Shot Prompting

Categories of Agentic Systems

Agentic systems address these limitations by employing four key strategies:

1. Reflection

Reflection enables an LLM to critique its own output and iteratively improve it. For instance, after generating code, a reflection step allows the model to check for bugs and propose fixes automatically.

Example Workflow:
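A minimal sketch of such a reflection loop, with `generate` and `critique` as scripted stand-ins for real LLM calls:

```python
def generate(task, feedback=None):
    # Stand-in for an LLM call; a real model would incorporate the critique.
    # Scripted here: the first draft has a bug, the revision fixes it.
    if feedback is None:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"

def critique(draft):
    # Stand-in for a second LLM pass (or a test run) that inspects the draft.
    return None if "a + b" in draft else "Bug: add() subtracts instead of adding."

def reflect(task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        feedback = critique(draft)
        if feedback is None:  # the critique found no issues: accept the draft
            return draft
    return draft  # give up after max_rounds and return the best effort

print(reflect("Write an add function"))
```

The same loop works whether the critic is a second model call, a linter, or a unit-test run.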

2. Tool Use

Tool use allows LLMs to call external APIs or perform actions beyond simple token generation (the only action available to a traditional LLM). This is essential for tasks requiring access to real-time information via web search, or for specialized functions such as running unit tests or querying up-to-date pricing.

Example Workflow:
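A tool-use loop can be sketched as below; `fake_llm` is a scripted stand-in for a model that emits tool calls, and the registry holds two toy tools:

```python
import datetime

# Hypothetical tool registry: each tool is a plain function the model may call.
def get_current_date(_):
    return datetime.date(2025, 1, 15).isoformat()  # fixed date for the sketch

def calculator(expr):
    return str(eval(expr, {"__builtins__": {}}))   # toy arithmetic tool

TOOLS = {"get_current_date": get_current_date, "calculator": calculator}

def fake_llm(prompt):
    # Stand-in for the model: a real LLM would emit a structured tool call
    # (e.g. JSON); here it uses a simple "CALL <tool> <arg>" convention.
    if "TOOL_RESULT" in prompt:
        return "Final answer: " + prompt.split("TOOL_RESULT:")[-1].strip()
    return "CALL calculator 17 * 3"

def run_agent(question):
    prompt = question
    reply = fake_llm(prompt)
    while reply.startswith("CALL"):
        _, name, arg = reply.split(" ", 2)        # parse the tool call
        result = TOOLS[name](arg)                 # execute the tool
        prompt = f"{question}\nTOOL_RESULT: {result}"
        reply = fake_llm(prompt)                  # feed the result back
    return reply

print(run_agent("What is 17 * 3?"))
```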

3. Planning

Planning helps LLMs tackle complex tasks by breaking them into smaller, manageable steps before execution. This mirrors how humans approach large problems, such as developing an outline before writing an essay.

Example Workflow:
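A plan-then-execute sketch, with `plan` and `execute` as scripted stand-ins for planner and worker LLM calls:

```python
def plan(task):
    # Stand-in for a planner LLM that decomposes the task before execution.
    return ["Outline the essay", "Draft each section", "Edit for clarity"]

def execute(step):
    # Stand-in for a worker LLM that carries out one step at a time.
    return f"done: {step}"

def run(task):
    steps = plan(task)                  # 1. plan first
    return [execute(s) for s in steps]  # 2. then execute step by step

for result in run("Write an essay on RAG"):
    print(result)
```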

4. Multi-Agent Systems

Multi-agent systems distribute tasks among specialized agents, each with a defined role (e.g., planner, coder, reviewer). These specialized agents are often different instances of an LLM with varying system prompts to guide their behavior. You can also utilize specialized agents that have been specifically trained to perform different tasks. This approach mirrors teamwork in human organizations and allows each agent to focus on its strengths.

Example Workflow:
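A sketch of the planner/coder/reviewer split described above; `llm` is a single scripted stand-in whose behavior varies only with its system prompt, mirroring how multi-agent systems often reuse one model with different instructions:

```python
def llm(system_prompt, message):
    # Hypothetical single LLM call; responses are scripted for the sketch.
    if "planner" in system_prompt:
        return "Plan: write the function, then review it"
    if "coder" in system_prompt:
        return "def square(x): return x * x"
    return "APPROVED"  # reviewer

AGENTS = {
    "planner":  "You are a planner. Break the task into steps.",
    "coder":    "You are a coder. Implement the plan.",
    "reviewer": "You are a reviewer. Approve or reject the work.",
}

def run_team(task):
    plan = llm(AGENTS["planner"], task)       # planner decomposes the task
    code = llm(AGENTS["coder"], plan)         # coder implements the plan
    verdict = llm(AGENTS["reviewer"], code)   # reviewer checks the output
    return code, verdict

code, verdict = run_team("Write a square function")
print(code, verdict)
```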

Why Agentic Systems Matter

Agentic systems offer several advantages:

Practical Applications of Agentic Systems

Coding Assistance​

In software development, agentic systems can write code, test it, and debug autonomously. For example:

Business and Healthcare

In domains where decision-making requires transparency and reliability, agentic systems excel. By providing clear reasoning and detailed workflows, they can:

Real-Time Information Analysis

Many industries, such as finance, stock trading and analysis, e-commerce and retail, and social media and marketing, rely on real-time information as a vital component of their decision-making. For these applications, agentic systems are necessary to extend the knowledge base of stock LLMs beyond their original training data.

Creative Collaboration

From generating marketing campaigns to designing product prototypes, multi-agent systems can simulate entire teams, each agent offering specialized input, such as technical accuracy, customer focus, or business strategy.

Implementing Agentic Systems

Building agentic workflows may sound complex, but tools like LangGraph simplify the process. LangGraph, developed by the creators of LangChain, lets you define modular agent workflows as graphs of nodes and edges, making it easier to manage interactions between agents. Any code or LLM call can act as a node (or agent) in LangGraph.

For example, if working in Snowflake, LangGraph can be combined with Snowflake Cortex to create an agentic workflow leveraging native Snowflake LLMs, RAG systems, and SQL generation, allowing you to build complex agentic workflows in the same ecosystem as more traditional data analytics and management systems while ensuring strict data privacy and security.
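LangGraph’s real API (state graphs compiled from nodes and edges) is richer than this, but the core idea can be sketched in plain Python: each node transforms shared state, and edges determine execution order. The `retrieve` and `generate` nodes below are hypothetical stand-ins for a Cortex Search retrieval step and an LLM call:

```python
# A minimal node-graph runner in plain Python. LangGraph's actual API differs,
# but the shape is the same: nodes transform a shared state dict, and edges
# determine which node runs next.
def retrieve(state):
    # Hypothetical stand-in for a retrieval node (e.g. Cortex Search).
    state["docs"] = ["travel_policy.pdf"]
    return state

def generate(state):
    # Hypothetical stand-in for an LLM node that answers from retrieved docs.
    state["answer"] = f"Answer grounded in {state['docs']}"
    return state

# Each entry maps a node name to (next_node, function); None ends the graph.
GRAPH = {"retrieve": ("generate", retrieve), "generate": (None, generate)}

def run(entry, state):
    node = entry
    while node is not None:
        next_node, fn = GRAPH[node]
        state = fn(state)
        node = next_node
    return state

print(run("retrieve", {"query": "What is the travel policy?"}))
```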

For simpler use cases, platforms like LlamaIndex also support agentic capabilities, particularly when integrating data-focused workflows.

The Future of Agentic Systems

As research evolves, agentic systems are expected to remain relevant, even as base LLMs improve. The flexibility of agentic workflows ensures they can be tailored to specific domains, making them a valuable tool for automating complex, real-world tasks. And as base LLMs improve, you can keep the same agentic workflow in place and simply swap in the improved models as individual agents, lifting overall system performance with little rework. In this way, agentic systems not only improve the accuracy of traditional LLMs but also scale and adapt to a rapidly changing LLM ecosystem.

In the words of AI pioneer Andrew Ng, agentic systems represent “the next big thing” in AI. They offer a glimpse into a future where AI doesn’t just respond — it reasons, plans, and iterates like a true digital assistant.

Get Started

Ready to harness the power of Agentic AI? We’ll help you get started with tailored solutions that deliver real results. Contact us today to accelerate your AI journey.

Contact Us