Combine Skills and MCP to Close the Context Gap — Pedro Rodrigues, Supabase

Slide 1 — 0:08 (watch)

I have the green light, so we can get started.

Slide 2 — 0:30 (watch)

Hello, everyone. You may have noticed that the title has changed slightly from what is listed on the schedule. When I submitted the talk, the debate between MCP and skills was still ongoing and quite a hot topic. I believe we have now settled on the understanding that they are different and have their own roles. The current debate seems to focus more on MCP versus CLI.

Slide 3 — 0:58 (watch)

I thought it would be useful to explain how we developed our Supabase skill and the lessons we learned from writing this document. I have not spent this much time on a single document since I wrote my master's thesis. While writing a skill may seem simple, it can be quite complex, especially for a product as intricate as Supabase.

Slide 4 — 1:24 (watch)

I'm Pedro, an AI tooling engineer at Supabase and an MCP enthusiast. I'm also a co-founder of the Lisbon AI Week, which will take place in late October this year. By the way, I prefer dark mode for presentations. How many of you prefer dark mode over light mode? I thought the majority would.

Slide 5 — 1:42 (watch)

Let's proceed with this presentation in dark mode.

Slide 6 — 2:00 (watch)

We can all agree that agents are already quite capable of performing mundane tasks independently. However, when presented with a task involving something new or updated since their training, such as your product, they require appropriate guidance.

Slide 7 — 2:30 (watch)

At Supabase, we observed that LLMs often overlook security pitfalls, such as the row-level security (RLS) policies necessary to protect your application. They tend to rely on outdated knowledge from their training data and can be reluctant to acknowledge their need for updated information. We aim to guide them on specific workflows that we believe are optimized for our products. To start, how many of you have written a skill before?

Slide 8 — 3:12 (watch)

Skills are folders that contain instructions, scripts, and resources that agents discover progressively. This is the main selling point of skills. Each skill includes an envelope called front matter, which contains the name and description and helps the agent decide when to load the skill. The actual instructions are stored in the main file, SKILL.md, along with optional bundled resources: scripts for actions, or reference files that provide additional information without needing to be loaded into context immediately.
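To make the front matter idea concrete, here is a minimal sketch in Python of how a harness might read only the name and description first and defer the body until the skill is selected. The example skill text, the `SkillSummary` type, and both helper functions are hypothetical illustrations, not Supabase's skill or any agent's actual loader, and the parser handles only simple `key: value` lines.

```python
from dataclasses import dataclass


@dataclass
class SkillSummary:
    name: str
    description: str


def read_front_matter(skill_text: str) -> SkillSummary:
    """Parse only the leading '---' delimited block of a skill file."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter; the body is not read here
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    else:
        raise ValueError("front matter is not closed")
    return SkillSummary(fields["name"], fields["description"])


def read_body(skill_text: str) -> str:
    """Return the instructions that follow the front matter."""
    _, _, rest = skill_text.partition("---\n")  # drop the opening delimiter
    _, _, body = rest.partition("---\n")        # drop the front matter itself
    return body.strip()


# Hypothetical skill file, for illustration only.
EXAMPLE_SKILL = """---
name: example-product
description: Use when the task touches the example product's database or auth.
---
# Example product

Always check the security checklist before changing the schema.
"""

summary = read_front_matter(EXAMPLE_SKILL)  # cheap: always available to the agent
body = read_body(EXAMPLE_SKILL)             # loaded only once the skill is chosen
```

The split mirrors progressive discovery: the summary is small enough to keep in context for every skill, while the body and any reference files are paid for only when needed.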

Slide 9 — 4:16 (watch)

At Supabase, we tested the same agent, Claude Sonnet 4.6, with a simple task involving a collaborative app. We wanted to create a new SQL view on a table that had row-level security (RLS) enabled, ensuring that users could only see their own information. We gave the agent only the MCP server in one condition, and the MCP server plus the agent skill in the other. The results were as expected. In PostgreSQL, when you create a view over a table with RLS enabled, the view bypasses RLS unless you explicitly set the security_invoker option to true. Consequently, the view will expose data that the table itself would not expose.

Slide 10 — 4:52 (watch)

The agent with the skill and its knowledge implemented the view correctly and safely, while the agent with access only to the integration and the MCP tools did not.
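The pitfall from this test can be checked mechanically. The following is a rough sketch, in Python, of a lint that flags CREATE VIEW statements that do not turn on security_invoker. The table and view names are made up, the statement splitting is deliberately naive (it splits on semicolons), and it does not recognize the bare `with (security_invoker)` form, so treat it as an illustration rather than a real SQL parser.

```python
import re

# Hypothetical view definitions; the table and column names are assumptions.
SAFE_VIEW = """
create view public.my_documents
with (security_invoker = true) as
select id, title from public.documents;
"""

UNSAFE_VIEW = """
create view public.my_documents as
select id, title from public.documents;
"""

_CREATE_VIEW = re.compile(r"create\s+(?:or\s+replace\s+)?view\b", re.IGNORECASE)
_INVOKER_ON = re.compile(r"security_invoker\s*=\s*(?:true|on|1)\b", re.IGNORECASE)


def views_missing_security_invoker(sql: str) -> list[str]:
    """Return each CREATE VIEW statement that does not enable security_invoker."""
    statements = [s.strip() for s in sql.split(";") if s.strip()]
    return [
        s for s in statements
        if _CREATE_VIEW.search(s) and not _INVOKER_ON.search(s)
    ]


flagged = views_missing_security_invoker(SAFE_VIEW + UNSAFE_VIEW)
```

Here `flagged` contains only the second definition, the one that would run with the view owner's privileges and so could return rows the caller's RLS policies would hide.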

Slide 11 — 5:08 (watch)

To address this, we aimed to enable agents to work effectively with Supabase. Today, we are officially announcing the Supabase agent skill that I have been developing over the past few months. To make this announcement more engaging, I will try something new.

Slide 12 — 5:28 (watch)

I'm going to live-tweet it on stage to make it official.

Slide 13 — 5:48 (watch)

This skill focuses on building capabilities for your products. You can create a skill similar to ours, and I'm open to discussing the details. Currently, we use free text, and we have not yet established a standard for it, so feel free to ask questions later.

Slide 14 — 6:18 (watch)

To outline some principles we adhere to, the first is to avoid duplicating information. Treat skills as documentation for yourself, and just as you wouldn't duplicate your documentation, you shouldn't duplicate information within your skills. You already have documentation for your product; direct the agent to the most up-to-date version. Be persistent in guiding the model to search the web or your documentation. Provide clear instructions on where and how to find the documentation, and remain steadfast in this approach.

Slide 15 — 6:52 (watch)

We conducted an experiment where we exposed our documentation through SSH. This allows agents to search for documentation as if it were part of a file system. The main reason for this approach is to leverage the agents' familiarity with file systems, enabling them to navigate and find information more effectively. You can read more about this on our blog.

Slide 16 — 7:08 (watch)

They are very familiar with file systems and are adept at navigating them to find files and information using Linux-based tools.

Slide 17 — 7:24 (watch)

By providing a remote interface that behaves like a file system, we believe agents will find it easier to navigate the documentation. We would also appreciate your feedback on this idea during or after the conference.

Slide 18 — 7:38 (watch)

The second principle I want to share is that if something can be skipped, it will be skipped.

Slide 19 — 8:10 (watch)

What I mean by this is that, in addition to being reluctant to search online, agents treat fetching information or calling tools as expensive, so they mostly default to their training data. The same applies to reference files. You may have noticed that even when the agent loads a skill, it tends to be reluctant to load reference files. If it does load one reference file, and your problem requires information from multiple files, it will likely not load more than one. In fact, it becomes almost impossible to get it to load three or four files. Therefore, you need to be very critical about what you include in your SKILL.md file from the beginning.

Slide 20 — 8:54 (watch)

Include information that is unlikely to change, such as a security checklist for Supabase, which we wanted the agent to prioritize. Initially, we placed this in a reference file, but the agent often overlooked it, so we moved it to the SKILL.md file. Any information that the agent must not miss and that defines your product should be included in the SKILL.md file rather than in a reference file.

Slide 21 — 9:28 (watch)

The third principle for writing a product skill is to be opinionated. You know your product best and understand how to work with it. You should also be aware of how your users interact with it. Don’t hesitate to guide the agents on workflows that you believe are the most effective for your products. For example, consider managing a database schema.

Slide 22 — 10:14 (watch)

How many of you know Supabase and what it does? For those who don’t, we are essentially a back-end service that provides storage, a database, authentication, and other features necessary for back-end development. We offer a database, and the agent can interact with it and manipulate your schema. We found the following workflow to be the most effective way for agents to manage the schema. The agent runs DDL operations directly, changing the schema on your development or staging database. Once you are satisfied with the changes, it runs an advisor that we provide, which identifies any security or performance issues with the database. The agent addresses those issues and only then creates the migration file.

Slide 23 — 10:50 (watch)

This approach prevents the agent from creating a migration file each time the schema changes. We found this to be the best workflow for managing the schema, so it should be included in the skill when working with Supabase.
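The ordering of that workflow can be sketched in Python as follows. Everything here is hypothetical scaffolding: `run_ddl`, `run_advisors`, and `fix_issue` stand in for whatever tools the agent actually calls, and the function is only meant to show that a single migration is produced at the end, after the advisor reports no remaining issues.

```python
from typing import Callable, List


def iterate_schema(
    ddl_statements: List[str],
    run_ddl: Callable[[str], None],        # hypothetical: executes DDL on the dev database
    run_advisors: Callable[[], List[str]], # hypothetical: returns open advisor findings
    fix_issue: Callable[[str], str],       # hypothetical: returns DDL that resolves a finding
    max_rounds: int = 3,
) -> str:
    """Apply DDL directly, resolve advisor findings, then return ONE migration."""
    applied: List[str] = []

    # Step 1: change the dev or staging schema directly, with no migration files yet.
    for statement in ddl_statements:
        run_ddl(statement)
        applied.append(statement)

    # Step 2: loop until the advisor reports nothing, or give up.
    for _ in range(max_rounds):
        issues = run_advisors()
        if not issues:
            break
        for issue in issues:
            fix = fix_issue(issue)
            run_ddl(fix)
            applied.append(fix)
    else:
        raise RuntimeError("advisor issues remain; not writing a migration")

    # Step 3: only now emit a single migration covering every change.
    return ";\n".join(applied) + ";"
```

The design choice worth noting is that the migration text is assembled once, at the end, instead of after every schema change.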

Slide 24 — 11:16 (watch)

We tested this skill. We live in a fascinating time in which we can write tests for documents and documentation, which would have seemed unimaginable years ago. I specifically tested a Markdown file using evals, short for evaluations. Evals are tests that you can run similarly to how you would run tests in your CI, but instead of evaluating code, you evaluate an agent, an LLM, and its behavior, including the tools it calls and its reasoning.

Slide 25 — 12:00 (watch)

We ran six Supabase-specific scenarios, focusing on ongoing projects in various contexts. These scenarios were tested with four different agents from two vendors across three test conditions: a baseline with no MCP or skills, the MCP server alone, and the MCP server plus the skills. The evaluation was based on a test completeness score, which is a four-point scale, and was executed using Braintrust. If you’re unfamiliar with them, they are sponsors of this conference, so I encourage you to visit their booth and learn more about their product.
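As a quick worked example of the size of that evaluation, assuming every scenario was run with every agent under every condition (the talk does not state this explicitly), the matrix can be enumerated in Python. The scenario names are placeholders, the model labels are shorthand for the four models named in the talk, and `mean_score_by_condition` is a hypothetical helper, not part of Braintrust.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

SCENARIOS = [f"scenario-{i}" for i in range(1, 7)]  # placeholders for the six scenarios
AGENTS = ["opus-4.6", "sonnet-4.6", "gpt-5.4", "gpt-5.4-mini"]
CONDITIONS = ["baseline", "mcp", "mcp+skill"]

# One eval run per (scenario, agent, condition) combination.
RUNS = list(product(SCENARIOS, AGENTS, CONDITIONS))


def mean_score_by_condition(scores):
    """Average scores per condition.

    `scores` maps (scenario, agent, condition) to a score on the rubric scale.
    """
    grouped = defaultdict(list)
    for (_scenario, _agent, condition), score in scores.items():
        grouped[condition].append(score)
    return {condition: mean(values) for condition, values in grouped.items()}


print(len(RUNS))  # 6 * 4 * 3 = 72
```

Under that assumption, a full pass is 72 runs, which is small enough to rerun whenever the skill file changes.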

Slide 26 — 12:34 (watch)

The results indicate that the combination of skills and the MCP outperforms all other conditions across every model tested. We ran the tests on Claude Code with Opus 4.6 and Sonnet 4.6, and on Codex with GPT 5.4 and GPT 5.4 mini.

Slide 27 — 13:06 (watch)

We can conclude that the skills significantly improved performance and the test completeness score by providing the right guidance to the agents. We already had the tools and an MCP server; we just needed the correct guidance on how to operate with Supabase. This approach is agent-agnostic, and more agents are adopting this open standard for skills. Currently, the bottleneck is not the context but the guidance.

Slide 28 — 13:34 (watch)

When building a skill for your product, establish a single source of truth by pointing to your documentation. Be opinionated; you know your product, so don’t hesitate to showcase it. Start with a minimal approach.

Slide 29 — 13:50 (watch)

Any model vendor, whether it's Anthropic or OpenAI, will advise you to start minimal and slow when developing skills. Then, iterate and expand as needed. Don't hesitate to create new versions.

Slide 30 — 14:06 (watch)

If you want to learn more, we have published a blog post that is live today. You can find it on my Twitter account, on the Supabase blog, or you can run this command to install it in your project and start using it now.

Slide 31 — 14:22 (watch)

Thank you very much. I'll be around if you have any questions.

Slide 32 — 14:36 (watch)

I have one more thing to show you. We are running a giveaway.

Slide 33 — 14:46 (watch)

To enter the giveaway for a Mac mini, scan the QR codes and sign up for Supabase. Good luck! Do I have time for any questions?

Slide 34 — 16:32 (watch)

It's an interesting question. The number of customers has been increasing over time, particularly regarding the use cases for vectors, which are mainly for embeddings. One of the most compelling applications is semantic search. For example, by exposing documentation through SSH, we can enhance traditional bash tools to provide semantic search capabilities instead of just allowing navigation with basic commands. I see significant potential in using vectors, and our customers are definitely exploring this solution more.

Regarding the distribution of skills within an organization, that’s a great question. Currently, one of the challenges we face is the distribution system for skills. Various players are attempting to establish a registry or distribution method. Vercel has introduced a skills package, and we are seeing plugins that can be bundled with MCP servers, although these are model-specific. This remains an unsolved problem.

At present, we package skills within the repositories themselves. If you want to create a plugin, you can create a .claude-plugin or a .cursor-plugin in that repository, making it available or discoverable if the repository is open-sourced or accessible. Other companies are also distributing their skills in a similar manner, using repositories to package their skills. Thank you. Are there any more questions? I think we have time for one more. Yes?

I recently built self-improving skills. Should we collaborate?

Sure, let’s talk after. Once again, thank you very much. It was a pleasure to be here, and I’ll be around. Thank you.
