Creating a Copilot declarative agent with VS Code and the Teams Toolkit – baeke.info

Spread the love

If you are a Microsoft 365 Copilot user, you have probably seen that the words “agent” and “Copilot agent” are popping up here and there. For example, if you chat with Copilot there is an Agents section in the top right corner:

Above, there is a Visual Creator agent that’s built-in. It’s an agent dedicated to generating images. Below Visual Creator, there are agents deployed to your organisation and ways to add and create agents.

A Copilot agent in this context, runs on top of Microsoft 365 Copilot and uses the Copilot orchestrator and underlying model. An agent is dedicated to a specific task and has the following properties. Some of these properties are optional:

Name: name of the agent
Description: you guessed it, the description of the agent
Instructions: instructions for the agent about how to do its work and respond to the user; you can compare this to a system prompt you give an LLM to guide its responses
Conversation starters: prompts to get started like the Learn More and Generate Ideas in the screenshot above
Documents: documents the agent can use to provide the user with answers; this will typically be a SharePoint site or a OneDrive location
Actions: actions the agents can take to provide the user with an answer; these actions will be API calls that can fetch information from databases, create tickets in a ticketing system and much more…

There are several ways to create these agents:

Start from SharePoint and create an agent based on the documents you select
Start from Microsoft 365 Copilot chat
Start from Copilot Studio
Start from Visual Studio Code

Whatever you choose, you are creating the agent declaratively. You do not have to write code to create the agent. Depending on the tool you use, not all capabilities are exposed. For example, if you want to add actions to your agent, you need Copilot Studio or Visual Studio Code. You could start creating the agent from SharePoint and then add actions with Copilot Studio.

In this post, we will focus on creating a declarative agent with Visual Studio Code.

Table of Contents

Getting Started

You need Visual Studio Code or a compatible editor and add the Teams Toolkit extension. Check Microsoft Learn to learn about all requirements. After installing it in VS Code, click the extension. You will be presented with the options below:

To create a declarative agent, click Create a New App. Select Copilot Agent.

Next, select Declarative Agent. You will be presented with the choices below:

Creating an agent with API plugin so we can call APIs

To make this post more useful, we will add actions to the agent. Although the word “action” is not mentioned above, selecting Add plugin will give us that functionality.

We will create our actions from an OpenAPI 3.0.x specification. Select Start with an OpenAPI Description Document as shown below.

When you select the above option, you can either:

Use a URL that returns the OpenAPI document
Browse for an OpenAPI file (json or yaml) on your file system

I downloaded the OpenAPI specification for JSON Placeholder from https://arnu515.github.io/jsonplaceholder-api-docs/. JSON Placeholder is an online dummy API that provides information about blog posts. After downloading the OpenAPI spec, browse for the swagger.json file via the Browse for an OpenAPI file option. In the next screen, you can select the API operations you want to expose:

Select the operations you want the agent to use

I only selected the GET /posts operation (getPosts). Next, you will be asked for a folder location and a name for your project. I called mine DemoAgent. After specifying the name, a new VS Code window will pop up:

You might get questions about installing additional extensions and even to provision the app.

How does it work?

Before explaining some of the internals, let’s look at the end result in Copilot chat. Below is the provisioned app, provisioned only to my own account. This is the app as created by the extension, without modifications on my part.

Agent in Copilot Chat; sample API we use returns Latin 😉

Above, I have asked for three posts. Copilot matches my intent to the GET /posts API call and makes the call. The JSONPlaceholder API does not require authentication so that’s easy. Authentication is supported but that’s for another post. If it’s the first time the API is used, you will be asked for permission to use it.

In Copilot, I turned on developer mode by typing -developer on in the chat box. When you click Show plugin developer info, you will see something like the below screenshot:

Above, the Copilot orchestrator has matched the function getPosts from the DemoAgent plugin. Plugin is just the general name for Copilot extensions that can perform actions (or functions). Yes, naming is hard. The Copilot orchestrator selected the getPosts function to execute. The result was a 200 OK from the underlying API. If you click the 200 OK message, you see the raw results returned from the API.

Now let’s look at some of the files that are used to create this agent. The main file, from the agent’s point of view, is declarativeAgent.json in the appPackage folder. It contains the name, description, instructions and actions of the agent:

{
    "$schema": "https://developer.microsoft.com/json-schemas/copilot/declarative-agent/v1.0/schema.json",
    "version": "v1.0",
    "name": "DemoAgent",
    "description": "Declarative agent created with Teams Toolkit",
    "instructions": "$[file('instruction.txt')]",
    "actions": [
        {
            "id": "action_1",
            "file": "ai-plugin.json"
        }
    ]
}

The instructions property references another file which contains the instructions for the agent. One of the instructions is: You should start every response and answer to the user with “Thanks for using Teams Toolkit to create your declarative agent!”. That’s the reason why my question had that in the response to start with.

Of course, the actions are where the magic is. You can provide your agent with multiple actions. Here, we only have one. These actions are defined in a file that references the OpenAPI spec. Above, that file is ai-plugin.json. This file tells the agent what API call to make. It contains a functions array with only one function in this case: getPosts. It’s important you provide a good description for the function because Copilot selects the function to call based on its description. See the Matched functions list in the plugin developer info section.

Below the functions array is a runtimes array. It specifies what operation to call from the referenced OpenAPI specification. In here, you also specify the authentication to the API. In this case, the auth type is None but agents support HTTP bearer authentication with a simple key or OAuth.

Here’s the entire file:

{
    "$schema": "https://developer.microsoft.com/json-schemas/copilot/plugin/v2.1/schema.json",
    "schema_version": "v2.1",
    "name_for_human": "DemoAgent",
    "description_for_human": "Free fake API for testing and prototyping.",
    "namespace": "demoagent",
    "functions": [
        {
            "name": "getPosts",
            "description": "Returns all posts",
            "capabilities": {
                "response_semantics": {
                    "data_path": "$",
                    "properties": {
                        "title": "$.title",
                        "subtitle": "$.id"
                    },
                    "static_template": {
                        "type": "AdaptiveCard",
                        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
                        "version": "1.5",
                        "body": [
                            {
                                "type": "TextBlock",
                                "text": "id: ${if(id, id, 'N/A')}",
                                "wrap": true
                            },
                            {
                                "type": "TextBlock",
                                "text": "title: ${if(title, title, 'N/A')}",
                                "wrap": true
                            },
                            {
                                "type": "TextBlock",
                                "text": "body: ${if(body, body, 'N/A')}",
                                "wrap": true
                            },
                            {
                                "type": "TextBlock",
                                "text": "userId: ${if(userId, userId, 'N/A')}",
                                "wrap": true
                            }
                        ]
                    }
                }
            }
        }
    ],
    "runtimes": [
        {
            "type": "OpenApi",
            "auth": {
                "type": "None"
            },
            "spec": {
                "url": "apiSpecificationFile/openapi.json"
            },
            "run_for_functions": [
                "getPosts"
            ]
        }
    ],
    "capabilities": {
        "localization": {},
        "conversation_starters": [
            {
                "text": "Returns all posts"
            }
        ]
    }
}

As you can see, you can also control how the agent responds by providing an adaptive card. Teams toolkit decided on the format above based on the API specification and the data returned by the getPosts operation. In this case, the card looks like this:

Addaptive card showing the response from the API: id, title, body and userId of the fake blog post

Adding extra capabilities

You can add conversation starters to the agent in declarativeAgent.json. They are shown in the opening screen of your agent:

These starters are added to declarativeAgent.json:

{
    "$schema": "https://developer.microsoft.com/json-schemas/copilot/declarative-agent/v1.0/schema.json",
    "version": "v1.0",
    "name": "DemoAgent",
    "description": "Declarative agent created with Teams Toolkit",
    "instructions": "$[file('instruction.txt')]",
    "actions": [
        ...
    ],
    "conversation_starters": [
    {
        "title": "Recent posts",
        "text": "Show me recent posts"
    },
    {
        "title": "Last post",
        "text": "Show me the last post"
    }
]
}

In addition to conversation starters, you can also enable web searches. Simply add the following to the file above,

"capabilities": [
    {
        "name": "WebSearch"
    }
]

With this feature enabled, the agent can search the web for answers via Bing. It will do so when it thinks it needs to or when you tell it to. For instance: “Search the web for recent news about AI” gets you something like this:

In the plugin developer info, you will see that none of your functions were executed. Developer info does not provide additional information about the web search.

Next to starter prompts and WebSearch, here are some of the other things you can do:

Add OneDrive and SharePoint content: extra capability with name OneDriveAndSharePoint; the user using the agent needs access to these files or they cannot be used to generate an answer
Add Microsoft Graph Connectors content: extra capability with name GraphConnectors; Graph Connectors pull in data from other sources in Microsoft Graph; by specifying the connector Ids, that data can then be retrieved by the agent

More information about the above settings can be found here: https://learn.microsoft.com/en-us/microsoft-365-copilot/extensibility/declarative-agent-manifest.

Provisioning

To provision the agent just for you, open VS Code’s command palette and search for Teams: Provision. You will be asked to log on to Microsoft 365. When all goes well, you should see the messages below in the Output pane:

If you are familiar with app deployment to Teams in general, you will notice that this is the same.

When the app is provisioned, it should appear in the developer portal at https://dev.teams.microsoft.com/apps:

Note that the extension adds dev to the agent when you provision the app. When you publish the app, this is different. You can also see this in VS Code in the build folder:

Note: we did not discuss the manifest.json file which is used to configure the Teams app as a whole. Use it to set developer info, icons, name, description and more.

There are more steps to take to publish the app and make it available to your organisation. See https://learn.microsoft.com/en-us/microsoftteams/platform/toolkit/publish for more information

Conclusion

The goal of this blogpost was to show how easy it is to create a declarative agent on top of Microsoft 365 Copilot in VS Code. Remember that these agents use the underlying Copilot orchestrator and model and that is something you cannot change. If you need more freedom (e.g., control over LLM, its parameters, advanced prompting techniques etc…) and you want to create such an app in Teams, there’s always the Custom Engine Agent.

Declarative agents don’t require you to code although you do need to edit multiple files to get it to work?

In a follow-up post, we will take a look at adding a custom API with authentication. I will also show you how to easily add additional actions to an agent without too much manual editing. Stay tuned!