AgentKit Demo: Building and Embedding an AI Agent in Minutes

🚀 Introduction: Why I built Ask Froge for Dev Day
I’m Christina Huang, and I recently demonstrated how quickly and confidently I can build an intelligent assistant using AgentKit on the OpenAI platform. In the demo, I put together a practical, production-ready agent called “Ask Froge” and embedded it directly into the Dev Day website in under eight minutes. The goal was simple: show how developers can design, test, and deploy agents that answer questions about sessions, speakers, logistics, and personalized agendas—without needing to write dozens of pages of orchestration code.
This article is a detailed, step-by-step report of that process. I’ll walk you through the conceptual design, the visual workflow builder, the tools and widgets I used, the guardrails I put in place, and how I embedded the agent into a live site using ChatKit. I’ll also share practical tips, design rationale, and best practices so you can replicate what I did or build something even more advanced for your own events and use cases.
🧭 Executive Summary
In short, I built a multi-agent workflow that:
- Classifies incoming queries to route them to specialized agents.
- Uses a Sessions Agent to pull schedule details from an attached document and render a visually engaging widget.
- Uses a Dev Day Agent to answer generic event questions and emulate the friendly "Froge" personality.
- Applies guardrails to detect and block sensitive personal data (PII) and to moderate responses.
- Can be previewed, tested, and published from the builder, then embedded into the Dev Day site via ChatKit using a workflow ID, with no extensive custom backend code required.
The result was a live, user-facing assistant embedded in the site that recommends sessions, answers logistics questions, and maintains a branded persona—all built visually in the AgentKit workflow builder.
📋 The Problem I Solved
Event websites are traditionally static: schedules, speaker bios, venue maps. That’s great for browsing, but not ideal if you want personalized help or fast answers tailored to a user's interests. Attendees often want:
- Personalized recommendations on which sessions match their interests.
- Quick answers about speakers, start times, and locations.
- Clear guidance on logistics (meals, badge scanning, Wi‑Fi, accessibility).
- A consistent brand experience and friendly voice.
I wanted to show how those needs can be solved by a small, targeted agent that indexes the event information and routes intent correctly. Making it visual, testable, and safe—within minutes—was an important part of the demonstration.
🔧 Why AgentKit: The Design Philosophy
AgentKit is built around a few guiding principles that I followed during the demo:
- Visual orchestration: Instead of starting from code, I used the workflow builder to wire nodes and model control flow. That made complex logic approachable and fast to iterate on.
- Composable tools and agents: You can combine specialized agents (e.g., a sessions agent and a general information agent) and attach tools (file search, widgets) so each agent has focused responsibilities.
- Safety first: Pre-built guardrails help block PII, handle moderation, and reduce hallucination risk. That allowed me to trust the agent in a live event setting.
- Fast integration: Once published, the workflow can be embedded via ChatKit using its workflow ID or exported as code for custom environments. That makes deployment flexible.
🛠️ Step-by-step Build: From Blank Page to Published Agent
Below I unpack the exact sequence I used during the demo and explain why each choice matters. I started the clock with a strict eight-minute limit to show how fast a modern workflow-driven environment can be.
1) Planning and architecture
Before I touched any nodes, I sketched a simple architecture in my head:
- An intent classifier to decide whether the user is asking about a specific session or general Dev Day information.
- A Sessions Agent dedicated to schedule queries, with a document source that contains session details.
- A DevDay Agent for broader questions—logistics, speakers, venue info—also backed by a document source.
- A Widget for session results so responses are visually engaging and not just plain text.
- Guardrails to detect and block PII and moderate unsafe content.
2) Wiring the classifier and router
I started in the workflow builder and added a categorizing agent node. This node's role was straightforward: examine the incoming message and categorize intent into at least two buckets—session-specific or general. Then I added an if-else node to route behavior based on that classifier output.
Why this matters: Intent classification early in the pipeline reduces ambiguity. It ensures the sessions agent isn’t wasting effort searching the schedule for a question like “Where’s lunch?” Conversely, the DevDay agent avoids returning a session card when the user asks about Wi‑Fi credentials.
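To make the classify-then-route idea concrete, here is a minimal sketch in plain Python. It is not AgentKit code: the keyword list, the function names, and the two stub agents are all hypothetical stand-ins, and a real classifier node would rely on a model rather than keyword matching.

```python
# Hypothetical keyword hints; a real classifier node would use a model instead.
SESSION_HINTS = ("session", "talk", "schedule", "attend", "workshop")


def classify_intent(message: str) -> str:
    """Return 'session' for schedule-style questions, otherwise 'general'."""
    text = message.lower()
    return "session" if any(hint in text for hint in SESSION_HINTS) else "general"


def sessions_agent(message: str) -> str:
    # Stub standing in for the Sessions Agent.
    return f"[Sessions Agent] looking up the schedule for: {message}"


def devday_agent(message: str) -> str:
    # Stub standing in for the DevDay Agent.
    return f"[DevDay Agent] answering a general question: {message}"


def route(message: str) -> str:
    """The if/else step: choose a branch from the classifier output."""
    if classify_intent(message) == "session":
        return sessions_agent(message)
    return devday_agent(message)


if __name__ == "__main__":
    print(route("What session should I attend to learn about building agents?"))
    print(route("Where's lunch?"))
```

The point of the sketch is the shape of the flow: one cheap decision up front, then exactly one specialized handler per message.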
3) Building the Sessions Agent
The sessions agent is the most specialized part of the workflow. I dragged an agent node onto the canvas, named it “Sessions Agent,” and gave it contextual instructions: pull schedule information, find sessions, and format responses with a widget when appropriate.
Then I attached a document that contained all session details—titles, descriptions, speakers, times, and locations. In AgentKit, attaching a document creates a searchable tool (think file search) that the agent can use to ground its answers in the exact content supplied.
Next, to avoid a wall of text in replies, I opted to render session search results using a dedicated widget. Instead of inventing a design from scratch, I opened the widget builder and browsed the gallery. I had previously prepared an onboarding session widget for “Froge” (our playful Dev Day mascot), which showed the session’s title, time, location, and a short summary in a compact card layout. I downloaded that widget and attached it as an output format for the Sessions Agent.
By doing that, whenever the Sessions Agent returned results, the system could render them using the widget consistently and attractively.
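As a rough illustration of what "grounding answers in an attached document" means, the sketch below does a naive keyword-overlap search over a tiny in-memory list. The data structure, the scoring, and the summaries are assumptions made for illustration; the platform's file search is not implemented this way, and only the first session's title, time, and speakers come from the demo.

```python
import re
from typing import Dict, List

# Hypothetical stand-in for the attached session document.
# Summaries are placeholders, not real session descriptions.
SESSIONS: List[Dict[str, str]] = [
    {
        "title": "Orchestrating Agents at Scale",
        "time": "11:15",
        "speakers": "James and Rohan",
        "summary": "Placeholder summary: building and running agents.",
    },
    {
        "title": "Example Session B",
        "time": "13:00",
        "speakers": "TBD",
        "summary": "Placeholder summary: a talk on another topic.",
    },
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_sessions(query: str, sessions: List[Dict[str, str]], limit: int = 1) -> List[Dict[str, str]]:
    """Rank sessions by how many query words appear in the title or summary."""
    query_words = tokenize(query)
    scored = []
    for session in sessions:
        overlap = len(query_words & tokenize(session["title"] + " " + session["summary"]))
        if overlap > 0:
            scored.append((overlap, session))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [session for _, session in scored[:limit]]


if __name__ == "__main__":
    hits = search_sessions("What session should I attend to learn about building agents?", SESSIONS)
    print(hits[0]["title"] if hits else "No match")
```

Because the answer can only come from entries in the list, the agent has nothing to draw on except the supplied content, which is the property that matters here.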
4) Building the DevDay Agent
Next I added a second agent node: “DevDay Agent.” This agent’s job was to answer general event questions. I populated its context with instructions on the agent’s responsibilities, and I told it to speak in the Froge persona to maintain a consistent user experience across both agents.
Then I attached a document containing all general Dev Day information—logistics, venue maps, badge instructions, schedules (summary), and any other relevant details. With that file attached, the DevDay Agent had a knowledge source it could query to answer user questions accurately.
5) Adding guardrails for safety and trust
One of the most important steps when building agents is adding guardrails. Agents can be enormously helpful, but they must also be trustworthy. I chose to enable a pre-built PII guardrail and configured it to recognize and block certain sensitive fields (for the demo, I included “name” as a field to test behavior).
I placed the guardrail check at the beginning of the workflow so any incoming message is first inspected. If a message contained PII, it was routed to a dedicated agent that explains the limitation and refuses to assist with the sensitive request—while still speaking in the Froge style. This design keeps the user experience consistent and avoids accidentally exposing or storing personal information.
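The guardrail-first ordering can be sketched in a few lines. This is a simplified stand-in, not the pre-built guardrail: the regular expressions, the function names, and the refusal wording are assumptions, and pattern matching like this only covers emails and phone-like numbers (detecting something like a name needs a model-based check).

```python
import re
from typing import List, Tuple

# Deliberately simple, hypothetical patterns; real PII detection is broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def detect_pii(message: str) -> List[str]:
    """Return the kinds of PII found in the message (possibly empty)."""
    found = []
    if EMAIL.search(message):
        found.append("email")
    if PHONE.search(message):
        found.append("phone")
    return found


def guardrail_step(message: str) -> Tuple[str, str]:
    """Run before any agent: return ('refusal', reply) or ('continue', message)."""
    kinds = detect_pii(message)
    if kinds:
        reply = "Ribbit! I can't help with messages that include personal details (" + ", ".join(kinds) + ")."
        return "refusal", reply
    return "continue", message


if __name__ == "__main__":
    print(guardrail_step("Email me the schedule at jane@example.com"))
    print(guardrail_step("What's at 11:15?"))
```

Running the check before classification means a flagged message never reaches the document-backed agents at all.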
6) Previewing the workflow
AgentKit makes it simple to preview the entire workflow end-to-end. I used the preview feature to simulate a user asking, “What session should I attend to learn about building agents?” The message traversed the workflow: it passed the guardrail, got categorized, was routed to the Sessions Agent, searched the session document, selected an appropriate match, and rendered the result via the onboarding session widget.
Because I set the DevDay Agent and the Sessions Agent to speak as Froge, the responses included the playful “ribbit” references (a branded touch). The preview console also shows each step—useful for debugging and confirming that tools and data sources are being used as intended.
7) Evaluations and testing
Although I didn’t run an extensive suite of automated evals live during the demo (time-limited), AgentKit supports running evaluation tests directly from the builder. You can create test prompts, expected outputs, and criteria to measure correctness, safety, and style adherence. This capability is crucial for production deployments where consistent behavior matters across many users and edge cases.
8) Publishing the workflow
When I was confident the workflow behaved correctly, I hit Publish and named the agent “Ask Froge.” Publishing created a workflow ID that represents the deployed pipeline. The UI also offered a code export so I could run the same workflow from my own servers if necessary. For the demo I chose to use the workflow ID directly to avoid the overhead of running custom orchestration code.
💻 Embedding Ask Froge into the Dev Day site
Publishing is the first step to production. The next is making the agent accessible to users. In the demo, I embedded Ask Froge as a chat experience into the Dev Day website using ChatKit.
1) Creating a ChatKit session
In my site code, I created a ChatKit session and supplied the published workflow ID. This tells ChatKit which workflow to invoke when a user sends a message. Behind the scenes, ChatKit routes messages to the workflow endpoint and renders the responses returned by the workflow or widget outputs.
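On the server side, the session step boils down to sending the workflow ID and getting back a secret for the browser. The sketch below only builds the request body and reads the response; it makes no network call. The field names (`workflow`, `id`, `user`, `client_secret`) and the example IDs are assumptions for illustration, so check the current ChatKit documentation for the exact request shape before relying on them.

```python
import json
from typing import Any, Dict


def build_session_payload(workflow_id: str, user_id: str) -> Dict[str, Any]:
    """Build the body for a session-creation request.

    Field names are assumed for illustration, not taken from the demo.
    """
    if not workflow_id:
        raise ValueError("workflow_id is required")
    return {"workflow": {"id": workflow_id}, "user": user_id}


def client_secret_from_response(body: str) -> str:
    """Pull the client secret out of a JSON response body (assumed field name)."""
    data = json.loads(body)
    return data["client_secret"]


if __name__ == "__main__":
    # Hypothetical IDs; the real workflow ID comes from the Publish step.
    payload = build_session_payload("wf_demo_123", "attendee-1")
    print(json.dumps(payload))
    print(client_secret_from_response('{"client_secret": "secret_abc"}'))
```

Keeping this exchange on the server means the API key never reaches the browser; only the short-lived client secret does.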
2) Client-side integration and customization
I added the ChatKit React component to the page and connected it to a server-generated client secret so it could authenticate and create secure sessions. Then I customized the visual appearance to be Froge-themed:
- Set the placeholder text to continue the “ribbit” persona.
- Adjusted colors and fonts to match the Dev Day branding.
- Added starter prompts to help attendees ask useful questions quickly.
3) Placement on the page
To make the assistant easy to find, I placed it in a bottom sheet that slides up from the bottom of the screen. I also added a prominent “Ask Froge” link to the top of the site so attendees could launch the chat quickly. Within seconds, the site hosted a fully operational and branded assistant accessible to visitors.
4) Live usage example
A typical interaction looked like this:
- User: “What session should I attend to learn about building agents?”
- Agent: Checks guardrails → categorizes intent → queries session document → returns a session card via the widget that lists “Orchestrating Agents at Scale” at 11:15 with James and Rohan, and signs off in Froge’s voice with a ribbit.
That answer is both actionable (session name, time, speakers) and engaging (branded persona and visual card), which helps attendees quickly decide what to do next.
🛡️ Guardrails, Safety, and Trust
When you build agents that interact with users in public settings, safety is paramount. I used AgentKit's built-in guardrails to protect user data and keep the experience safe and trustworthy.
What guardrails I enabled
- PII detection: The workflow checks incoming messages for personal data (e.g., names, email addresses, phone numbers). If detected, the message is redirected to a response that explains the agent cannot assist with sensitive requests.
- Moderation: Optional content moderation filters can block or flag abusive or inappropriate content.
- Context dropping: When sensitive details appear, that context is removed or redacted before the agent continues processing.
Why guardrails matter
Agent outputs are only as good as the data and controls you put in place. Guardrails reduce the risk of accidental disclosure and ensure regulatory or organizational policies are honored. Moreover, they enhance user trust—visitors are more likely to rely on an assistant if it behaves predictably and respects privacy.
🎨 Widgets and UX: Why visuals matter
One of the highlights of my demo was demonstrating how widgets transformed plain text into succinct, clickable, and visually appealing session cards. Widgets matter because they:
- Improve readability and scannability. A short card with time, location, and a short blurb is easier to act on than a multi-paragraph answer.
- Promote consistency. Using a template ensures results render the same way across multiple queries and agents.
- Bridge to actions. Widgets can contain links or buttons that deep-link to registration pages, maps, or calendar invites.
Design choices for the session card widget
For the onboarding session widget I used the following fields:
- Session title
- Time and duration
- Location (e.g., Golden Gate Park)
- Speaker(s)
- Short description
- Optional action buttons (Add to calendar, Open map)
These fields address the typical attendee’s immediate needs: who, when, where, and why it matters.
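One way to pin down those fields is a small typed record that an agent's output must fit before it is rendered. This is a sketch of the data shape only, not the widget builder's actual schema; the class name, field names, and the example room are hypothetical.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SessionCard:
    """Hypothetical data shape for a session card; not the real widget schema."""

    title: str
    time: str
    location: str
    speakers: List[str]
    description: str
    duration_minutes: Optional[int] = None
    actions: List[str] = field(default_factory=list)  # e.g. "Add to calendar"

    def to_payload(self) -> Dict[str, Any]:
        """Convert the card to a plain dict that a renderer could consume."""
        return asdict(self)


if __name__ == "__main__":
    card = SessionCard(
        title="Orchestrating Agents at Scale",
        time="11:15",
        location="Example room",  # placeholder location
        speakers=["James", "Rohan"],
        description="Placeholder description.",
        actions=["Add to calendar", "Open map"],
    )
    print(card.to_payload())
```

A fixed structure like this is what lets every session result render the same way, whichever query produced it.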
📊 Testing and Evaluation Strategies
Even though my demo included a quick preview, a complete production rollout benefits from deliberate evaluation. Here are the testing stages I recommend and how AgentKit supports them:
1) Unit testing agent capabilities
Create a set of prompt-response pairs that exercise the agent’s core responsibilities. For the Sessions Agent, these include:
- Direct session lookups by session title.
- Queries like “show me talks about agents” or “what’s at 11:15?”
- Edge cases: misspellings, synonyms, and time zone references.
2) Safety testing
Run prompts that intentionally include PII or request disallowed actions and verify the guardrails behave as expected. Check that the workflow redirects these to a refusal or redaction path.
3) Integration testing
Simulate the end-to-end user experience in the ChatKit front end. Confirm that widgets render correctly, links work, and the branded persona remains consistent.
4) Automated evals
AgentKit allows you to create automated evals that score correctness, style, and safety. These can run on a schedule or as part of a deployment pipeline so you don’t introduce regressions when you update documents or agent behavior.
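To show the general shape of such a check, here is a tiny, self-contained eval harness. It is a sketch rather than the platform's eval feature: the `EvalCase` structure, the substring-based grading, and the `fake_agent` stub are all assumptions, and real evals typically use richer graders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    prompt: str
    must_include: List[str]                      # correctness and style checks
    must_not_include: List[str] = field(default_factory=list)  # safety checks


def run_evals(agent: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, object]:
    """Run each case through the agent and grade the reply by substring checks."""
    failures = []
    for case in cases:
        reply = agent(case.prompt).lower()
        included = all(s.lower() in reply for s in case.must_include)
        excluded = not any(s.lower() in reply for s in case.must_not_include)
        if not (included and excluded):
            failures.append(case.prompt)
    return {"passed": len(cases) - len(failures), "total": len(cases), "failures": failures}


def fake_agent(prompt: str) -> str:
    # Hypothetical stub standing in for the published workflow.
    if "agents" in prompt.lower():
        return "Ribbit! Try Orchestrating Agents at Scale at 11:15."
    return "Ribbit! I'm not sure about that one."


if __name__ == "__main__":
    cases = [
        EvalCase("What session should I attend to learn about building agents?",
                 ["orchestrating agents at scale", "ribbit"]),
        EvalCase("What's at 11:15?", ["11:15"]),
    ]
    print(run_evals(fake_agent, cases))
```

With the stub above, the second case fails, which is exactly the kind of regression a scheduled run is meant to surface.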
🧩 Design Patterns and Best Practices
Over the course of the demo and through my experience building agent workflows, several design patterns emerged that I rely on for reliable, maintainable agents.
1) Specialize agents by task
Rather than having one gigantic agent that must handle every question, split responsibilities. I used a Sessions Agent for schedule lookups and a DevDay Agent for broader event questions. Advantages:
- Each agent can be optimized and tuned separately.
- Faster, more precise retrieval from targeted document sources.
- Clearer testing and isolation of failure modes.
2) Put intent classification upfront
An early classifier reduces ambiguity. Route user messages to the most appropriate agent so each agent spends resources only on queries it is meant to handle.
3) Attach small, authoritative documents
Attach concise, curated documents rather than dumping massive amounts of unstructured data. Short, authoritative files are cheaper to search and reduce hallucination risk because the agent has fewer irrelevant facts to choose from.
4) Use widgets where appropriate
Widgets provide clarity and reduce back-and-forth. For actions like session recommendations, a widget card gives users the core information they need in one glance.
5) Human-in-the-loop for sensitive actions
For actions that modify state (e.g., registration, payment, or sending emails) include a human approval step or explicit confirmation. Agents should inform users what they can or cannot do and request a human intervention for high-risk tasks.
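A confirmation gate for state-changing actions can be as simple as the sketch below. The action names, the `ActionRequest` type, and the messages are hypothetical; the idea is only that high-risk actions do nothing until someone has explicitly confirmed them.

```python
from dataclasses import dataclass

# Hypothetical set of actions that change state and need explicit sign-off.
HIGH_RISK_ACTIONS = {"register", "pay", "send_email"}


@dataclass
class ActionRequest:
    name: str
    details: str
    confirmed: bool = False  # set to True only after a human approves


def execute(action: ActionRequest) -> str:
    """Run low-risk actions directly; hold high-risk ones until confirmed."""
    if action.name in HIGH_RISK_ACTIONS and not action.confirmed:
        return f"Needs confirmation before running '{action.name}': {action.details}"
    return f"Executed '{action.name}'"


if __name__ == "__main__":
    request = ActionRequest("register", "Sign up for the 11:15 session")
    print(execute(request))   # held for confirmation
    request.confirmed = True
    print(execute(request))   # runs after approval
```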
🔁 Iteration, Maintenance, and Continuous Improvement
Once your agent is live, iteration is crucial. The initial deployment is just the starting point. I typically iterate across three dimensions:
1) Content updates
Keep documents up to date—session changes, speaker swaps, room reassignments. With AgentKit, updating an attached document can immediately affect agent answers without changing the workflow logic.
2) Behavioral tuning
If users report that answers are off-tone or too verbose, I tweak the agent's system instructions and run evals to check for adherence to the persona and response length guidelines.
3) New tools and capabilities
Gradually add tools and widgets to expand functionality. For instance, after the event you might:
- Hook into a ticketing system for registration or refunds.
- Enable calendar actions, such as adding sessions to personal calendars.
- Attach location-aware maps to help people find rooms in real time.
⚠️ Common Pitfalls and How I Avoided Them
Building agents quickly can lead to mistakes. Here are common pitfalls I watch for and the mitigations I used during the demo.
Pitfall: Over-broad knowledge sources
If you attach massive or poorly structured documents, the agent may return inaccurate or vague answers. I mitigate this by attaching small, curated files and using clear agent instructions to limit the search scope.
Pitfall: No guardrails
Without guardrails, agents can inadvertently provide disallowed content or expose sensitive data. I enabled PII detection and added a refusal path for sensitive requests so the assistant refuses gracefully.
Pitfall: Unclear persona or style drift
When multiple agents respond in different tones, the user experience feels disjointed. I set both agents to speak in the Froge persona to ensure a consistent brand voice across all interactions.
Pitfall: Relying solely on free-form LLM responses
Purely text responses can be less usable. Adding widgets reduces the cognitive load on users and provides structured results that are easier to act upon.
📈 Real-world Use Cases Beyond Dev Day
Although I built Ask Froge specifically for Dev Day, the same architecture and patterns apply broadly. Here are potential real-world applications:
- Conferences and trade shows: personalized agendas, booth navigation, and speaker Q&As.
- Customer support portals: triage user issues, fetch knowledge base articles, and escalate complex cases to humans.
- Onboarding tools: guide new employees through HR tasks, policy documents, and training schedules.
- Public-facing product docs: answer technical questions, surface relevant API references, and link to tutorials.
🔍 Measured Outcomes and Expectations
When you deploy a targeted assistant, you can expect concrete benefits if you measure the right metrics. I typically look at:
- Engagement: number of unique users interacting with the assistant and session length.
- Resolution rate: percent of queries answered without human escalation.
- Time-to-answer: average time from question to useful response.
- User satisfaction: simple thumbs-up/down feedback or short surveys after an interaction.
With Ask Froge, the quick wins include faster session discovery for attendees and fewer support interruptions for event staff, both of which translate to better attendee experience and lower operational load.
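These metrics are straightforward to compute once interactions are logged. The sketch below assumes a hypothetical log record with one entry per answered question; the field names and sample values are made up for illustration and are not results from the demo.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Interaction:
    """Hypothetical log record: one entry per answered question."""

    user_id: str
    seconds_to_answer: float
    escalated: bool                   # True if a human had to step in
    thumbs_up: Optional[bool] = None  # None when the user left no feedback


def summarize(log: List[Interaction]) -> Dict[str, Optional[float]]:
    """Compute engagement, resolution rate, time-to-answer, and satisfaction."""
    total = len(log)
    if total == 0:
        return {"unique_users": 0, "resolution_rate": None,
                "avg_seconds_to_answer": None, "satisfaction": None}
    rated = [entry for entry in log if entry.thumbs_up is not None]
    return {
        "unique_users": len({entry.user_id for entry in log}),
        "resolution_rate": sum(not entry.escalated for entry in log) / total,
        "avg_seconds_to_answer": sum(entry.seconds_to_answer for entry in log) / total,
        "satisfaction": (sum(bool(entry.thumbs_up) for entry in rated) / len(rated)) if rated else None,
    }


if __name__ == "__main__":
    sample = [
        Interaction("a", 2.0, False, True),
        Interaction("a", 4.0, False),
        Interaction("b", 6.0, True, False),
        Interaction("c", 8.0, False, True),
    ]
    print(summarize(sample))
```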
🧠 Behind the Scenes: What Makes This Fast?
Three platform features made the eight-minute build possible:
- Visual workflow builder: I didn’t need to hand-code orchestrations. Dragging nodes and wiring them up removed a lot of friction.
- Composable tools and widgets: Pre-built tools like file search and reusable widget templates accelerated development.
- Publish and embed flow: Publishing yields a workflow ID that integrates instantly with ChatKit so I could avoid writing an orchestration server for the demo.
💬 A Few Memorable Lines from the Demo
“We’re OpenAI. We need to have AI in our Dev Day website.” — I used that as a tongue-in-cheek motivator for building Ask Froge.
“Ask Froge.” — The name I chose for the agent to keep the experience on brand and playful.
Those quotes capture two important truths: (1) internal teams want smarter experiences, and (2) branding and persona can be fun but also a practical tool for keeping communications consistent.
📝 Code Export vs. Workflow ID: Choosing the Right Integration
AgentKit gives you two deployment options once you publish a workflow:
1) Use the workflow ID (what I chose)
Pros:
- Fastest path to integration via ChatKit.
- No need to manage orchestration servers; the platform handles it.
- Great for quick demos, events, or when you want to iterate rapidly.
Cons:
- Less control over low-level orchestration details.
- May not fit complex enterprise constraints that require running in your own environment.
2) Export code and run it in your own environment
Pros:
- Full control over execution, logging, and integrations.
- Better fit for organizations with strict compliance or infrastructure requirements.
Cons:
- Requires more engineering effort to host, monitor, and scale.
- Longer time to iterate compared to the visual publish-and-embed flow.
📚 Practical Checklist for Your First Agent
If you want to replicate my approach, here’s a checklist you can follow:
- Define the problem and split responsibilities into focused agents (e.g., Sessions Agent, Info Agent).
- Prepare small, authoritative documents for each agent’s knowledge source.
- Design or select widgets for common output types (session cards, speaker bios, maps).
- Wire an intent classifier and add routing logic in the visual workflow builder.
- Enable guardrails (PII detection, moderation) and create an explicit refusal or redirection path.
- Preview and run sample queries to confirm the pipeline works end-to-end.
- Run automated evals for correctness and safety.
- Publish the workflow and choose an integration path (workflow ID for quick embed, code export for self-hosting).
- Embed using ChatKit and customize the UI to match your brand and persona.
- Monitor metrics, collect feedback, and iterate.
🔭 Next Steps: Expanding Ask Froge Post-Event
After the event, the same foundation can be repurposed in many ways:
- Turn session recommendations into personalized follow-up emails.
- Combine session attendance data with feedback surveys to build recommender systems for future events.
- Offer post-event resources like recordings and slides via the same agents or a new “On-Demand” agent.
- Make Ask Froge a permanent community assistant for ongoing meetups and webinars.
📣 Final Thoughts and Call to Action
Building Ask Froge in under eight minutes showcased a few important realities about modern agent development: with the right platform abstractions, you can go from idea to production without an army of engineers or months of work. Visual workflow builders, composable tools, guardrails, and widget-driven UX make creating practical, safe, and delightful assistants achievable for a wide range of teams.
If you’re building an assistant for an event, product, or internal tooling, start small and iterate. Focus on the most frequent user needs, attach curated documents, add safety checks, and make results visually actionable. That approach gives you measurable wins quickly and provides a path to expand functionality over time.
✨ Closing Quote
“In just a few minutes, we designed an agent workflow visually, added tools and widgets, previewed it, deployed it, tested it, and now it’s live.” — I said this during the demo to emphasize how quickly you can move from concept to production with a platform designed for building agents.
📌 Where to Start Today
If you’d like to try this approach yourself, take these immediate steps:
- Identify one narrow use case (e.g., session lookup, FAQ, or onboarding).
- Create a concise document that contains the authoritative answers or catalog you need.
- Use the workflow builder to create a classifier → specialized agent pipeline.
- Attach a simple widget to present results cleanly.
- Enable basic guardrails and preview your workflow.
- Publish and embed the workflow using a client integration like ChatKit.
With that, you can deliver a meaningful, safe, and branded assistant experience quickly—and iterate from there.