AgentKit Demo: Building and Embedding an AI Agent in Minutes

🚀 Introduction: Why I built Ask Froge for Dev Day

I’m Christina Huang, and I recently demonstrated how quickly and confidently I can build an intelligent assistant using AgentKit on the OpenAI platform. In the demo, I put together a practical, production-ready agent called “Ask Froge” and embedded it directly into the Dev Day website in under eight minutes. The goal was simple: show how developers can design, test, and deploy agents that answer questions about sessions, speakers, logistics, and personalized agendas—without needing to write dozens of pages of orchestration code.

This article is a detailed, step-by-step report of that process. I’ll walk you through the conceptual design, the visual workflow builder, the tools and widgets I used, the guardrails I put in place, and how I embedded the agent into a live site using Chat Kit. I’ll also share practical tips, design rationale, and best practices so you can replicate what I did—or build something even more advanced for your own events and use cases.

🧭 Executive Summary

In short, I built a multi-agent workflow that:

Classifies incoming queries to route them to specialized agents.
Uses a Sessions Agent to pull schedule details from an attached document and render a visually engaging widget.
Uses a DevDay Agent to answer general event questions in the friendly "Froge" persona.
Applies guardrails to detect and block sensitive personal data (PII) and to moderate responses.
Can be previewed, tested, and published, then embedded into the Dev Day site via Chat Kit using a workflow ID, with no extensive custom backend code required.

The result was a live, user-facing assistant embedded in the site that recommends sessions, answers logistics questions, and maintains a branded persona—all built visually in the AgentKit workflow builder.

📋 The Problem I Solved

Event websites are traditionally static: schedules, speaker bios, venue maps. That’s great for browsing, but not ideal if you want personalized help or fast answers tailored to a user's interests. Attendees often want:

Personalized recommendations on which sessions match their interests.
Quick answers about speakers, start times, and locations.
Clear guidance on logistics (meals, badge scanning, Wi‑Fi, accessibility).
A consistent brand experience and friendly voice.

I wanted to show how those needs can be solved by a small, targeted agent that indexes the event information and routes intent correctly. Making it visual, testable, and safe—within minutes—was an important part of the demonstration.

🔧 Why AgentKit: The Design Philosophy

AgentKit is built around a few guiding principles that I followed during the demo:

Visual orchestration: Instead of starting from code, I used the workflow builder to wire nodes and model control flow. That made complex logic approachable and fast to iterate on.
Composable tools and agents: You can combine specialized agents (e.g., a sessions agent and a general information agent) and attach tools (file search, widgets) so each agent has focused responsibilities.
Safety first: Pre-built guardrails help block PII, handle moderation, and reduce hallucination risk. That allowed me to trust the agent in a live event setting.
Fast integration: Once published, the workflow ID can be embedded via Chat Kit or exported as code for custom environments. That makes deployment flexible.

🛠️ Step-by-step Build: From Blank Page to Published Agent

Below I unpack the exact sequence I used during the demo and explain why each choice matters. I started the clock with a strict eight-minute limit to show how fast a modern workflow-driven environment can be.

1) Planning and architecture

Before I touched any nodes, I sketched a simple architecture in my head:

An intent classifier to decide whether the user is asking about a specific session or general Dev Day information.
A Sessions Agent dedicated to schedule queries, with a document source that contains session details.
A DevDay Agent for broader questions—logistics, speakers, venue info—also backed by a document source.
A Widget for session results so responses are visually engaging and not just plain text.
Guardrails to detect and block PII and moderate unsafe content.

2) Wiring the classifier and router

I started in the workflow builder and added a categorizing agent node. This node's role was straightforward: examine the incoming message and categorize intent into at least two buckets—session-specific or general. Then I added an if-else node to route behavior based on that classifier output.

Why this matters: Intent classification early in the pipeline reduces ambiguity. It ensures the sessions agent isn’t wasting effort searching the schedule for a question like “Where’s lunch?” Conversely, the DevDay agent avoids returning a session card when the user asks about Wi‑Fi credentials.
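As a rough sketch of the classify-then-route idea, the snippet below stands in for the two nodes. It assumes a simple keyword check in place of the LLM-based categorizing agent used in the demo; the keyword list and function names are hypothetical and exist only for illustration.

```python
# Hypothetical keyword list; the demo used an LLM-based classifier node instead.
SESSION_KEYWORDS = ("session", "talk", "schedule", "workshop")


def classify_intent(message: str) -> str:
    """Return 'session' for schedule-style questions, otherwise 'general'."""
    text = message.lower()
    return "session" if any(word in text for word in SESSION_KEYWORDS) else "general"


def sessions_agent(message: str) -> str:
    # Placeholder for the Sessions Agent node.
    return f"[Sessions Agent] looking up the schedule for: {message}"


def devday_agent(message: str) -> str:
    # Placeholder for the DevDay Agent node.
    return f"[DevDay Agent] answering a general question: {message}"


def route(message: str) -> str:
    """Mirror the if-else node: pick an agent based on the classifier output."""
    if classify_intent(message) == "session":
        return sessions_agent(message)
    return devday_agent(message)


if __name__ == "__main__":
    print(route("What session should I attend to learn about building agents?"))
    print(route("Where's lunch?"))
```

Keeping the classifier separate from the agents means either side can be swapped out, for example replacing the keyword check with a model call, without touching the routing branch.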

3) Building the Sessions Agent

The sessions agent is the most specialized part of the workflow. I dragged an agent node onto the canvas, named it “Sessions Agent,” and gave it contextual instructions: pull schedule information, find sessions, and format responses with a widget when appropriate.

Then I attached a document that contained all session details—titles, descriptions, speakers, times, and locations. In AgentKit, attaching a document creates a searchable tool (think file search) that the agent can use to ground its answers in the exact content supplied.
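To make the grounding idea concrete, here is a minimal sketch of searching a small set of session records. It assumes plain keyword overlap as a stand-in for the platform's file search, which is not described in detail here. The `Session` fields follow the list above; the second sample entry, the descriptions, and the "TBD" locations are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    title: str
    description: str
    speakers: str
    time: str
    location: str


def search_sessions(sessions: List[Session], query: str) -> List[Session]:
    """Rank sessions by how many query words appear in their title or description."""
    words = {w.strip("?.,!").lower() for w in query.split()}
    words = {w for w in words if len(w) > 3}  # drop short filler words
    scored = []
    for session in sessions:
        haystack = f"{session.title} {session.description}".lower()
        score = sum(1 for w in words if w in haystack)
        if score > 0:
            scored.append((score, session))
    # Sort on the score only, so Session objects are never compared directly.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [session for _, session in scored]


# Sample data: the first title, time, and speakers come from the demo;
# everything else is a hypothetical placeholder.
SAMPLE_SESSIONS = [
    Session("Orchestrating Agents at Scale",
            "Patterns for coordinating multiple agents.",
            "James and Rohan", "11:15", "TBD"),
    Session("Placeholder Widget Talk",
            "A made-up entry used only to show filtering.",
            "TBD", "TBD", "TBD"),
]

if __name__ == "__main__":
    query = "What session should I attend to learn about building agents?"
    for match in search_sessions(SAMPLE_SESSIONS, query):
        print(match.title, match.time)
```

Because answers are drawn only from the attached records, a query with no match returns an empty list rather than an invented session.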

Next, to avoid a wall of text in replies, I opted to render session search results using a dedicated widget. Instead of inventing a design from scratch, I opened the widget builder and browsed the gallery. I had previously prepared an onboarding session widget for “Froge” (our playful Dev Day mascot), which showed the session’s title, time, location, and a short summary in a compact card layout. I downloaded that widget and attached it as an output format for the Sessions Agent.

By doing that, whenever the Sessions Agent returned results, the system could render them using the widget consistently and attractively.

4) Building the DevDay Agent

Next I added a second agent node: “DevDay Agent.” This agent’s job was to answer general event questions. I populated its context with instructions on the agent’s responsibilities, and I told it to speak in the Froge persona to maintain a consistent user experience across both agents.

Then I attached a document containing all general Dev Day information: logistics, venue maps, badge instructions, a schedule summary, and any other relevant details. With that file attached, the DevDay Agent had a knowledge source it could query to answer user questions accurately.

5) Adding guardrails for safety and trust

One of the most important steps when building agents is adding guardrails. Agents can be enormously helpful, but they must also be trustworthy. I chose to enable a pre-built PII guardrail and configured it to recognize and block certain sensitive fields (for the demo, I included “name” as a field to test behavior).

I placed the guardrail check at the beginning of the workflow so any incoming message is first inspected. If a message contained PII, it was routed to a dedicated agent that explains the limitation and refuses to assist with the sensitive request—while still speaking in the Froge style. This design keeps the user experience consistent and avoids accidentally exposing or storing personal information.
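A minimal sketch of a guardrail check that runs before anything else, assuming a few regular expressions in place of the pre-built PII guardrail. The patterns, labels, and refusal wording are hypothetical; a production guardrail covers far more cases than this.

```python
import re
from typing import List

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "name": re.compile(r"\bmy name is\s+\w+", re.IGNORECASE),
}


def detect_pii(message: str) -> List[str]:
    """Return the labels of every PII pattern found in the message."""
    return [label for label, pattern in PII_PATTERNS.items() if pattern.search(message)]


def guard(message: str) -> str:
    """Refuse in the Froge voice if PII is present, otherwise let the message pass."""
    found = detect_pii(message)
    if found:
        return ("Ribbit! I can't help with requests that include personal details ("
                + ", ".join(found) + ").")
    return "PASS"


if __name__ == "__main__":
    print(guard("My name is Ada, reach me at ada@example.com"))
    print(guard("Where's lunch?"))
```

Running the check first means a flagged message never reaches the classifier or either agent.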

6) Previewing the workflow

AgentKit makes it simple to preview the entire workflow end-to-end. I used the preview feature to simulate a user asking, “What session should I attend to learn about building agents?” The message traversed the workflow: it passed the guardrail, got categorized, was routed to the Sessions Agent, searched the session document, selected an appropriate match, and rendered the result via the onboarding session widget.

Because I set the DevDay Agent and the Sessions Agent to speak as Froge, the responses included the playful “ribbit” references (a branded touch). The preview console also shows each step—useful for debugging and confirming that tools and data sources are being used as intended.
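The step-by-step view in the preview console can be imitated with a small trace recorder. This is only a sketch: the step functions below are simplified placeholders for the workflow nodes, and every name in it is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Step = Tuple[str, Callable[[State], State]]


def guardrail(state: State) -> State:
    # Placeholder: always passes in this sketch.
    return {**state, "guardrail": "passed"}


def classify(state: State) -> State:
    intent = "session" if "session" in state["message"].lower() else "general"
    return {**state, "intent": intent}


def pick_agent(state: State) -> State:
    agent = "Sessions Agent" if state["intent"] == "session" else "DevDay Agent"
    return {**state, "agent": agent}


PIPELINE: List[Step] = [
    ("guardrail", guardrail),
    ("classify", classify),
    ("route", pick_agent),
]


def run_with_trace(message: str, steps: List[Step]) -> Tuple[State, List[str]]:
    """Run each step in order and record the names of the steps that executed."""
    state: State = {"message": message}
    trace: List[str] = []
    for name, step in steps:
        state = step(state)
        trace.append(name)
    return state, trace


if __name__ == "__main__":
    final_state, trace = run_with_trace("What session should I attend?", PIPELINE)
    print(trace, final_state["agent"])
```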

7) Evaluations and testing

Because of the eight-minute limit, I didn’t run an extensive suite of automated evals live during the demo. AgentKit does, however, support running evaluation tests directly from the builder: you can create test prompts, expected outputs, and criteria to measure correctness, safety, and style adherence. This capability is crucial for production deployments where consistent behavior matters across many users and edge cases.

8) Publishing the workflow

When I was confident the workflow behaved correctly, I hit Publish and named the agent “Ask Froge.” Publishing created a workflow ID that represents the deployed pipeline. The UI also offered a code export so I could run the same workflow from my own servers if necessary. For the demo I chose to use the workflow ID directly to avoid the overhead of running custom orchestration code.

💻 Embedding Ask Froge into the Dev Day site

Publishing is the first step to production. The next is making the agent accessible to users. In the demo, I embedded Ask Froge as a chat experience into the Dev Day website using Chat Kit.

1) Creating a Chat Kit session

In my site code, I created a Chat Kit session and supplied the published workflow ID. This tells Chat Kit which workflow to invoke when a user sends a message. Behind the scenes, Chat Kit routes messages to the workflow endpoint and renders the responses returned by the workflow or widget outputs.
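A minimal server-side sketch of that exchange, under stated assumptions: the request body shape (`workflow.id`, `user`) and the `client_secret` response field are assumed for illustration, `wf_placeholder` is not a real ID, and `fake_post` stands in for the HTTP call to the platform. Check the Chat Kit documentation for the actual endpoint and field names before relying on any of this.

```python
from typing import Any, Callable, Dict

# Placeholder: substitute the workflow ID shown after publishing.
WORKFLOW_ID = "wf_placeholder"

PostFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def build_session_request(workflow_id: str, user_id: str) -> Dict[str, Any]:
    """Assumed request body: which workflow to run and who the session is for."""
    return {"workflow": {"id": workflow_id}, "user": user_id}


def create_client_secret(post: PostFn, user_id: str) -> str:
    """Send the session request via `post` and return only the client secret."""
    response = post(build_session_request(WORKFLOW_ID, user_id))
    # Only the short-lived secret goes to the browser, never the API key.
    return response["client_secret"]


def fake_post(body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the real HTTP call so the sketch runs offline."""
    return {"client_secret": "secret-for-" + body["user"], "workflow": body["workflow"]}


if __name__ == "__main__":
    print(create_client_secret(fake_post, "attendee-1"))
```

Passing the HTTP call in as a function keeps the API key on the server and makes the handler easy to test without network access.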

2) Client-side integration and customization

I added the Chat Kit React component to the page and connected it to a server-generated client secret so it could authenticate and create secure sessions. Then I customized the visual appearance to be Froge-themed:

Set the placeholder text to continue the “ribbit” persona.
Adjusted colors and fonts to match the Dev Day branding.
Added starter prompts to help attendees ask useful questions quickly.

3) Placement on the page

To make the assistant easy to find, I placed it in a bottom sheet that slides up from the bottom of the screen. I also added a prominent “Ask Froge” link to the top of the site so attendees could launch the chat quickly. Within seconds, the site hosted a fully operational and branded assistant accessible to visitors.

4) Live usage example

A typical interaction looked like this:

User: “What session should I attend to learn about building agents?”
Agent: Checks guardrails → categorizes intent → queries session document → returns a session card via the widget that lists “Orchestrating Agents at Scale” at 11:15 with James and Rohan, and signs off in Froge’s voice with a ribbit.

That answer is both actionable (session name, time, speakers) and engaging (branded persona and visual card), which helps attendees quickly decide what to do next.

🛡️ Guardrails, Safety, and Trust

When you build agents that interact with users in public settings, safety is paramount. I used AgentKit's built-in guardrails to protect user data and keep the experience safe and trustworthy.

What guardrails I enabled

PII detection: The workflow checks incoming messages for personal data (e.g., names, email addresses, phone numbers). If detected, the message is redirected to a response that explains the agent cannot assist with sensitive requests.
Moderation: Optional content moderation filters can block or flag abusive or inappropriate content.
Context dropping: For cases where sensitive details appear, I removed or redacted that context before the agent continued processing.

Why guardrails matter

Agent outputs are only as good as the data and controls you put in place. Guardrails reduce the risk of accidental disclosure and ensure regulatory or organizational policies are honored. Moreover, they enhance user trust—visitors are more likely to rely on an assistant if it behaves predictably and respects privacy.

🎨 Widgets and UX: Why visuals matter

One of the highlights of my demo was demonstrating how widgets transformed plain text into succinct, clickable, and visually appealing session cards. Widgets matter because they:

Improve readability and scannability. A short card with time, location, and a short blurb is easier to act on than a multi-paragraph answer.
Promote consistency. Using a template ensures results render the same way across multiple queries and agents.
Bridge to actions. Widgets can contain links or buttons that deep-link to registration pages, maps, or calendar invites.

Design choices for the session card widget

For the onboarding session widget I used the following fields:

Session title
Time and duration
Location (e.g., Golden Gate Park)
Speaker(s)
Short description
Optional action buttons (Add to calendar, Open map)

These fields address the typical attendee’s immediate needs: who, when, where, and why it matters.
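One way to picture the data behind such a card is a small typed record that is serialized before rendering. This is a sketch only: the field names mirror the list above, but the actual widget schema is not shown in this article, so the class and its payload shape are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SessionCard:
    title: str
    time: str
    location: str
    speakers: List[str]
    description: str
    duration_minutes: Optional[int] = None            # unknown unless the schedule says so
    actions: List[str] = field(default_factory=list)  # optional buttons


def to_payload(card: SessionCard) -> Dict[str, Any]:
    """Convert the card into a plain dict a front end could render."""
    return asdict(card)


# Title, time, and speakers come from the demo; the rest are placeholders.
EXAMPLE_CARD = SessionCard(
    title="Orchestrating Agents at Scale",
    time="11:15",
    location="TBD",
    speakers=["James", "Rohan"],
    description="Placeholder summary.",
    actions=["Add to calendar", "Open map"],
)

if __name__ == "__main__":
    print(to_payload(EXAMPLE_CARD))
```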

📊 Testing and Evaluation Strategies

Even though my demo included a quick preview, a complete production rollout benefits from deliberate evaluation. Here are the testing stages I recommend and how AgentKit supports them:

1) Unit testing agent capabilities

Create a set of prompt-response pairs that exercise the agent’s core responsibilities. For the Sessions Agent, these include:

Direct session lookups by session title.
Queries like “show me talks about agents” or “what’s at 11:15?”
Edge cases: misspellings, synonyms, and time zone references.

2) Safety testing

Run prompts that intentionally include PII or request disallowed actions and verify the guardrails behave as expected. Check that the workflow redirects these to a refusal or redaction path.

3) Integration testing

Simulate the end-to-end user experience in the Chat Kit front end. Confirm that widgets render correctly, links work, and the branded persona remains consistent.

4) Automated evals

AgentKit allows you to create automated evals that score correctness, style, and safety. These can run on a schedule or as part of a deployment pipeline so you don’t introduce regressions when you update documents or agent behavior.
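The shape of such an eval can be sketched in a few lines. This version assumes a simple substring check as the scoring rule and a canned `stub_agent` in place of the real workflow; both are hypothetical, and the built-in evals are richer than this.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # substring the reply is expected to include


def run_evals(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose reply contains the expected substring."""
    if not cases:
        raise ValueError("at least one eval case is required")
    passed = sum(1 for c in cases if c.must_contain.lower() in agent(c.prompt).lower())
    return passed / len(cases)


def stub_agent(prompt: str) -> str:
    """Canned replies standing in for the published workflow."""
    if "@" in prompt:
        return "Ribbit! I can't help with personal details."
    if "agents" in prompt.lower():
        return "Try Orchestrating Agents at Scale at 11:15. Ribbit!"
    return "Ribbit! I'm not sure."


CASES = [
    EvalCase("show me talks about agents", "Orchestrating Agents at Scale"),
    EvalCase("email me at a@example.com", "can't help"),
    EvalCase("what's at 11:15?", "Orchestrating Agents at Scale"),
]

if __name__ == "__main__":
    print(f"pass rate: {run_evals(stub_agent, CASES):.2f}")
```

The third case fails against the stub on purpose: a time-based lookup is exactly the kind of regression a scheduled eval run is meant to surface.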

🧩 Design Patterns and Best Practices

Over the course of the demo and through my experience building agent workflows, several design patterns emerged that I rely on for reliable, maintainable agents.

1) Specialize agents by task

Rather than having one gigantic agent that must handle every question, split responsibilities. I used a Sessions Agent for schedule lookups and a DevDay Agent for broader event questions. Advantages:

Each agent can be optimized and tuned separately.
Faster, more precise retrieval from targeted document sources.
Clearer testing and isolation of failure modes.

2) Put intent classification upfront

An early classifier reduces ambiguity. Route user messages to the most appropriate agent so each agent spends resources only on queries it is meant to handle.

3) Attach small, authoritative documents

Attach concise, curated documents rather than dumping massive amounts of unstructured data. Short, authoritative files are cheaper to search and reduce hallucination risk because the agent has fewer irrelevant facts to choose from.

4) Use widgets where appropriate

Widgets provide clarity and reduce back-and-forth. For actions like session recommendations, a widget card gives users the core information they need in one glance.

5) Human-in-the-loop for sensitive actions

For actions that modify state (e.g., registration, payment, or sending emails), include a human approval step or explicit confirmation. Agents should tell users what they can and cannot do and request human intervention for high-risk tasks.
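A confirmation gate of this kind can be sketched as follows. The action names and the `confirm` callback are hypothetical; in a real deployment the callback would be an approval step or a user prompt rather than a lambda.

```python
from typing import Callable

# Hypothetical names for state-changing actions.
HIGH_RISK_ACTIONS = {"register", "refund", "send_email"}


def execute_action(action: str, confirm: Callable[[str], bool]) -> str:
    """Run low-risk actions directly; require approval for high-risk ones."""
    if action in HIGH_RISK_ACTIONS and not confirm(action):
        return f"'{action}' needs approval, so nothing was changed."
    return f"'{action}' completed."


if __name__ == "__main__":
    print(execute_action("lookup", lambda a: False))   # low risk, runs without approval
    print(execute_action("refund", lambda a: False))   # blocked until someone approves
    print(execute_action("refund", lambda a: True))    # approved, runs
```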

🔁 Iteration, Maintenance, and Continuous Improvement

Once your agent is live, iteration is crucial. The initial deployment is just the starting point. I typically iterate across three dimensions:

1) Content updates

Keep documents up to date—session changes, speaker swaps, room reassignments. With AgentKit, updating an attached document can immediately affect agent answers without changing the workflow logic.

2) Behavioral tuning

If users report that answers are off-tone or too verbose, I tweak the agent's system instructions and run evals to check for adherence to the persona and response length guidelines.

3) New tools and capabilities

Gradually add tools and widgets to expand functionality. For instance, after the event you might:

Hook into a ticketing system for registration or refunds.
Enable calendar side effects, such as adding sessions to attendees’ personal calendars.
Attach location-aware maps to help people find rooms in real time.

⚠️ Common Pitfalls and How I Avoided Them

Building agents quickly can lead to mistakes. Here are common pitfalls I watch for and the mitigations I used during the demo.

Pitfall: Over-broad knowledge sources

If you attach massive or poorly structured documents, the agent may return inaccurate or vague answers. I mitigate this by attaching small, curated files and using clear agent instructions to limit the search scope.

Pitfall: No guardrails

Without guardrails, agents can inadvertently provide disallowed content or expose sensitive data. I enabled PII detection and added a refusal path for sensitive requests so the assistant refuses gracefully.

Pitfall: Unclear persona or style drift

When multiple agents respond in different tones, the user experience feels disjointed. I set both agents to speak in the Froge persona to ensure a consistent brand voice across all interactions.

Pitfall: Relying solely on free-form LLM responses

Purely text responses can be less usable. Adding widgets reduces the cognitive load on users and provides structured results that are easier to act upon.

📈 Real-world Use Cases Beyond Dev Day

Although I built Ask Froge specifically for Dev Day, the same architecture and patterns apply broadly. Here are potential real-world applications:

Conferences and trade shows: personalized agendas, booth navigation, and speaker Q&As.
Customer support portals: triage user issues, fetch knowledge base articles, and escalate complex cases to humans.
Onboarding tools: guide new employees through HR tasks, policy documents, and training schedules.
Public-facing product docs: answer technical questions, surface relevant API references, and link to tutorials.

🔍 Measured Outcomes and Expectations

When you deploy a targeted assistant, you can expect concrete benefits if you measure the right metrics. I typically look at:

Engagement: number of unique users interacting with the assistant and average conversation length.
Resolution rate: percent of queries answered without human escalation.
Time-to-answer: average time from question to useful response.
User satisfaction: simple thumbs-up/down feedback or short surveys after an interaction.
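The metrics listed above can be computed from a simple interaction log. The sketch below assumes a hypothetical log record with one row per question; the field names and sample values are invented for illustration and are not measurements from the demo.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Interaction:
    user_id: str
    seconds_to_answer: float
    escalated: bool   # True if a human had to step in
    thumbs_up: bool


def summarize(log: List[Interaction]) -> Dict[str, float]:
    """Compute engagement, resolution rate, time-to-answer, and satisfaction."""
    if not log:
        raise ValueError("the interaction log is empty")
    total = len(log)
    return {
        "unique_users": len({i.user_id for i in log}),
        "resolution_rate": sum(not i.escalated for i in log) / total,
        "avg_seconds_to_answer": sum(i.seconds_to_answer for i in log) / total,
        "satisfaction": sum(i.thumbs_up for i in log) / total,
    }


# Made-up sample rows, not real measurements.
SAMPLE_LOG = [
    Interaction("a", 2.0, False, True),
    Interaction("a", 4.0, False, True),
    Interaction("b", 3.0, True, False),
    Interaction("c", 3.0, False, True),
]

if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```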

With Ask Froge, the quick wins include faster session discovery for attendees and fewer support interruptions for event staff, both of which translate to a better attendee experience and a lower operational load.

🧠 Behind the Scenes: What Makes This Fast?

Three platform features made the eight-minute build possible:

Visual workflow builder: I didn’t need to hand-code orchestrations. Dragging nodes and wiring them up removed a lot of friction.
Composable tools and widgets: Pre-built tools like file search and reusable widget templates accelerated development.
Publish and embed flow: Publishing yields a workflow ID that integrates instantly with Chat Kit so I could avoid writing an orchestration server for the demo.

💬 A Few Memorable Lines from the Demo

“We’re OpenAI. We need to have AI in our Dev Day website.” — I used that as a tongue-in-cheek motivator for building Ask Froge.

“Ask Froge.” — The name I chose for the agent to keep the experience on brand and playful.

Those quotes capture two important truths: (1) internal teams want smarter experiences, and (2) branding and persona can be fun but also a practical tool for keeping communications consistent.

📝 Code Export vs. Workflow ID: Choosing the Right Integration

AgentKit gives you two deployment options once you publish a workflow:

1) Use the workflow ID (what I chose)

Pros:

Fastest path to integration via Chat Kit.
No need to manage orchestration servers; the platform handles it.
Great for quick demos, events, or when you want to iterate rapidly.

Cons:

Less control over low-level orchestration details.
May not fit complex enterprise constraints that require running in your own environment.

2) Export code and run locally

Pros:

Full control over execution, logging, and integrations.
Better fit for organizations with strict compliance or infrastructure requirements.

Cons:

Requires more engineering effort to host, monitor, and scale.
Longer time to iterate compared to the visual publish-and-embed flow.

📚 Practical Checklist for Your First Agent

If you want to replicate my approach, here’s a checklist you can follow:

Define the problem and split responsibilities into focused agents (e.g., Sessions Agent, Info Agent).
Prepare small, authoritative documents for each agent’s knowledge source.
Design or select widgets for common output types (session cards, speaker bios, maps).
Wire an intent classifier and add routing logic in the visual workflow builder.
Enable guardrails (PII detection, moderation) and create an explicit refusal or redirection path.
Preview and run sample queries to confirm the pipeline works end-to-end.
Run automated evals for correctness and safety.
Publish the workflow and choose an integration path (workflow ID for quick embed, code export for self-hosting).
Embed using Chat Kit and customize the UI to match your brand and persona.
Monitor metrics, collect feedback, and iterate.

🔭 Next Steps: Expanding Ask Froge Post-Event

After the event, the same foundation can be repurposed in many ways:

Turn session recommendations into personalized follow-up emails.
Combine session attendance data with feedback surveys to build recommender systems for future events.
Offer post-event resources like recordings and slides via the same agents or a new “On-Demand” agent.
Make Ask Froge a permanent community assistant for ongoing meetups and webinars.

📣 Final Thoughts and Call to Action

Building Ask Froge in under eight minutes showcased a few important realities about modern agent development: with the right platform abstractions, you can go from idea to production without an army of engineers or months of work. Visual workflow builders, composable tools, guardrails, and widget-driven UX make creating practical, safe, and delightful assistants achievable for a wide range of teams.

If you’re building an assistant for an event, product, or internal tooling, start small and iterate. Focus on the most frequent user needs, attach curated documents, add safety checks, and make results visually actionable. That approach gives you measurable wins quickly and provides a path to expand functionality over time.

✨ Closing Quote

“In just a few minutes, we designed an agent workflow visually, added tools and widgets, previewed it, deployed it, tested it, and now it’s live.” — I said this during the demo to emphasize how quickly you can move from concept to production with a platform designed for building agents.

📌 Where to Start Today

If you’d like to try this approach yourself, take these immediate steps:

Identify one narrow use case (e.g., session lookup, FAQ, or onboarding).
Create a concise document that contains the authoritative answers or catalog you need.
Use the workflow builder to create a classifier → specialized agent pipeline.
Attach a simple widget to present results cleanly.
Enable basic guardrails and preview your workflow.
Publish and embed the workflow using a client integration like Chat Kit.

With that, you can deliver a meaningful, safe, and branded assistant experience quickly—and iterate from there.

📬 Need help replicating Ask Froge?

If you want a checklist or a starter template for a Sessions Agent and widget, I can provide a step-by-step guide and sample system prompts. Tell me the use case you want to build and I’ll outline a tailored plan you can implement in AgentKit today.
