In my previous blog post, I showed how you can create your own MCP server with C#. In this post we are going to build a Blazor web chat application that connects to that server and uses a local LLAMA 3.1 model through Ollama. This means that our client, the MCP server, and the LLM will all be running in our local environment.
What is the MCP C# SDK?
We are going to use the new MCP C# SDK to consume MCP servers from our code, but first let's see what that SDK really is. According to its GitHub page, this project is the official SDK for MCP. I don't know exactly what "official" means in this context, but there are people from Microsoft behind the project. As of today (21.4.2025) the latest release is 0.1.0-preview.10, so we are talking about quite a fresh thing that is not yet production-ready.
With this SDK we can easily build both MCP servers and MCP clients. On the client side we work with the IMcpClient abstraction, and on the server side things lean heavily on attributes. The client supports both local (stdio) servers and SSE, so you can also use servers hosted on the internet with this project.
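To give a feel for the attribute-based server side, here is a minimal sketch of how a tool can be exposed with the SDK's attributes. The EchoTool class and its Echo method are just an example of mine, not something from the GitHub server of the previous post:

using System.ComponentModel;
using ModelContextProtocol.Server;

[McpServerToolType]
public static class EchoTool
{
    // The Description attribute tells the LLM what the tool does and when to call it.
    [McpServerTool, Description("Echoes the given message back to the caller.")]
    public static string Echo(string message) => $"Echo: {message}";
}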
Building Our Chat with Blazor
I didn't want to go with a plain console app this time, so I asked AI to generate me a starting point for an LLM chat application. The AI added an LLMChat.razor page to my empty Blazor project with the following HTML code:
<PageTitle>Chat with Ollama MCP</PageTitle>

<div class="chat-container">
    <h3>Chat with Ollama MCP</h3>

    <div id="messages" class="messages">
        @foreach (var message in chatHistory)
        {
            <div class="message @(message.Role == ChatRole.User ? "user-message" : "ai-message")">
                <strong>@GetDisplayName(message.Role):</strong> @message.Text
            </div>
        }
        @if (isProcessing)
        {
            <div class="message ai-message">
                <strong>AI:</strong> <span class="loading-indicator">Thinking...</span>
            </div>
        }
    </div>

    <div class="input-container">
        <input @bind="userInput" @bind:event="oninput" @onkeydown="HandleKeyPress" placeholder="Type your message..." />
        <button @onclick="SendMessage" disabled="@isProcessing">Send</button>
    </div>
</div>
So, we have a div that contains the chat messages, and each message is presented as its own div. We also have a div for processing, which is shown while the AI is generating a response. Finally, at the bottom of the page, we have a text box used for entering questions into the chat. The CSS for the HTML above looks like this:
.chat-container {
    display: flex;
    flex-direction: column;
    height: 80vh;
    min-height: 600px;
    border: 1px solid #ccc;
    border-radius: 8px;
    padding: 10px;
    margin: 20px;
}

.messages {
    flex: 1;
    overflow-y: auto;
    padding: 10px;
    display: flex;
    flex-direction: column;
    gap: 10px;
}

.message {
    padding: 10px;
    border-radius: 8px;
    max-width: 80%;
}

.user-message {
    align-self: flex-end;
    background-color: #e6f7ff;
}

.ai-message {
    align-self: flex-start;
    background-color: #f6f6f6;
}

.input-container {
    display: flex;
    margin-top: 10px;
    padding: 10px;
}

.input-container input {
    flex: 1;
    padding: 10px;
    border-radius: 4px;
    border: 1px solid #ccc;
    margin-right: 10px;
}

.input-container button {
    padding: 10px 20px;
    border-radius: 4px;
    border: none;
    background-color: #0366d6;
    color: white;
    cursor: pointer;
}

.input-container button:disabled {
    background-color: #cccccc;
    cursor: not-allowed;
}

.loading-indicator {
    display: inline-block;
    animation: ellipsis 1.5s infinite;
}

@keyframes ellipsis {
    0% { content: "."; }
    33% { content: ".."; }
    66% { content: "..."; }
}
So nothing special in here, and all of this code is AI generated (I used Cline with help from GitHub Copilot to get a good start for this sample project).
Building the MCP Client
For the MCP client we need a few NuGet packages that will help us along the way. We need to add ModelContextProtocol, which is the main SDK, and Microsoft.Extensions.AI so we can use the OllamaChatClient to connect to our local LLM.
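If you are adding these from the command line, the commands should look something like this. Both packages were still prerelease at the time of writing, and depending on the package layout the OllamaChatClient may come from the separate Microsoft.Extensions.AI.Ollama package:
- dotnet add package ModelContextProtocol --prerelease
- dotnet add package Microsoft.Extensions.AI --prerelease
- dotnet add package Microsoft.Extensions.AI.Ollama --prerelease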

In the Program.cs file we simply want to create the OllamaChatClient that will be injected into our chat page. We could also handle the MCP client initialization in the Program class, but I wanted to put it in the chat page for better visibility into what is happening on that page.
So this is our OllamaChatClient initialization in Program.cs:
IChatClient client = new OllamaChatClient(new Uri("http://localhost:11434/"), "llama3.1:latest")
    .AsBuilder()
    .UseFunctionInvocation()
    .Build();

builder.Services.AddSingleton(client);
The most important part here is UseFunctionInvocation. Without it, our client app won't be able to call the MCP server, because the tool calls coming back from the model would never be executed. Next, let's see what the C# part of the chat page looks like.
In the LLMChat.razor page we want to maintain an isProcessing state, which indicates when the AI is thinking, and the chatHistory, which is sent along whenever something is asked from the LLM.
I will first show the whole code and then go through the lines.
private string userInput = "";
private List<ChatMessage> chatHistory = new();
private bool isProcessing = false;
private IList<McpClientTool> tools = new List<McpClientTool>();
private IMcpClient mcpClient;

protected override void OnInitialized()
{
    // Add a system message to start the conversation
    chatHistory.Add(new ChatMessage(ChatRole.System, "You are a helpful AI assistant."));
}

protected override async Task OnInitializedAsync()
{
    mcpClient = await McpClientFactory.CreateAsync(new ModelContextProtocol.McpServerConfig
    {
        TransportType = "stdio",
        Id = "GitHub",
        Name = "GitHub",
        Location = @"I:\own\MCPGitHubServer\bin\Debug\net9.0\MCPGitHubServer.exe"
    });

    IList<McpClientTool> clientTools = await mcpClient.ListToolsAsync();
    tools = clientTools;
}

private async Task HandleKeyPress(KeyboardEventArgs e)
{
    if (e.Key == "Enter" && !isProcessing)
    {
        await SendMessage();
    }
}

private async Task SendMessage()
{
    if (string.IsNullOrWhiteSpace(userInput) || isProcessing)
        return;

    var userMessage = userInput.Trim();
    userInput = "";
    isProcessing = true;

    // Add user message to chat history
    chatHistory.Add(new ChatMessage(ChatRole.User, userMessage));

    try
    {
        // Create the request to Ollama MCP
        var response = await ChatClient.GetResponseAsync(chatHistory,
            new ChatOptions()
            {
                Tools = [.. tools]
            });

        chatHistory.Add(new ChatMessage(ChatRole.Assistant, response.Text));
    }
    catch (Exception ex)
    {
        chatHistory.Add(new ChatMessage(ChatRole.System, $"Error: {ex.Message}"));
    }
    finally
    {
        isProcessing = false;
        StateHasChanged();
        await JSRuntime.InvokeVoidAsync("scrollToBottom", "messages"); // This ensures that textbox is scrolled to the end
    }
}

private string GetDisplayName(ChatRole? role)
{
    if (role == ChatRole.User)
    {
        return "You";
    }
    else if (role == ChatRole.Assistant)
    {
        return "AI";
    }
    else if (role == ChatRole.System)
    {
        return "System";
    }
    else
    {
        return role.ToString();
    }
}
That seems like a lot of code, but actually, the MCP part is rather small. A big part of the code in this sample is related to Blazor.
In the OnInitialized method we add a system message that tells the AI how to behave. This is a simple way to show the message in the UI as well, but it also increases the message context that is sent to the LLM.
The OnInitializedAsync method creates the MCP client with McpClientFactory. We could wire that into the DI container in Program.cs, but I think it is simpler and cleaner to do it near the place of usage, as it defines what kind of servers we are going to use. In this sample I am using the GitHub MCP server that I implemented in my earlier blog post. For Location we need to point to the MCP server executable that is produced when the server project is built. Finally, in this method we extract the tools into a local variable. We will pass these tools to our OllamaChatClient so that it knows what kind of methods it can invoke and what they do.
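For reference, if you did want to wire it up in Program.cs instead, a minimal sketch could look something like this. It reuses the exact same preview-SDK configuration as in the page above and simply awaits the factory during startup:

// Sketch only: create the MCP client once at startup and register it as a singleton.
var mcpClient = await McpClientFactory.CreateAsync(new ModelContextProtocol.McpServerConfig
{
    TransportType = "stdio",
    Id = "GitHub",
    Name = "GitHub",
    Location = @"I:\own\MCPGitHubServer\bin\Debug\net9.0\MCPGitHubServer.exe"
});
builder.Services.AddSingleton(mcpClient);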
HandleKeyPress is a simple helper method that sends the message when the user presses Enter.
SendMessage is a bit longer method. At the beginning we do some sanity checks to validate the input and then add it to the chatHistory as a new ChatMessage. Then we invoke the Ollama chat client with the whole chatHistory and attach the tools to the request. The LLM can now use the tools if it needs any of them to produce a response. If we compare this to the old way of defining LLM functions, we can now just ask the server for its list of tools and hand them to the LLM. With functions we had to implement them inside our own app and provide them to the LLM ourselves. Now we can use any existing server implementation, no matter what programming language it was written in. A sketch of the old, local-function approach is shown below for comparison.
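This is roughly what that old approach looks like with Microsoft.Extensions.AI, where the tool lives inside your own app. The GetIssueCount function here is a made-up example of mine, not part of the GitHub server:

// The "old way": implement the function locally and describe it to the model yourself.
AIFunction localTool = AIFunctionFactory.Create(
    (string repository) => $"{repository} has 3 open issues", // hypothetical implementation
    name: "GetIssueCount",
    description: "Returns the number of open issues in a repository.");

// Hand the locally defined tool to the chat client, just like the MCP tools above.
var options = new ChatOptions { Tools = [localTool] };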
The rest of the SendMessage function is just getting the response and updating the Blazor UI.
The GetDisplayName method converts the roles into a more readable format: Assistant becomes AI and User becomes You. This is not mandatory, but it gives a nice little touch to the UI.
Ollama
To run this application we first need to fire up Ollama and load the LLAMA 3.1 model. Ollama usage is quite straightforward: the ollama serve command runs Ollama as a "service", and models are loaded on demand when they are requested. The LLAMA 3.1 model is quite heavy, but it still runs decently on my NVIDIA GeForce RTX 2080 Super. You can also use a lighter model, but it needs to support function invocation (tool calling). You might want to check that, because not all models do.

The basic Ollama commands that you need are:
- ollama pull llama3.1:latest
- ollama serve (starts the Ollama server; the model itself is loaded when the first request comes in)
- ollama ps
- ollama stop llama3.1:latest
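If you want to quickly verify from code that Ollama is actually up before wiring up the chat client, a small check against its HTTP API could look like this (the /api/tags endpoint lists the locally available models):

// Minimal sanity check: ping the local Ollama API before starting the chat application.
using var http = new HttpClient();
var response = await http.GetAsync("http://localhost:11434/api/tags");
Console.WriteLine(response.IsSuccessStatusCode
    ? "Ollama is running."
    : $"Ollama is not reachable (status {(int)response.StatusCode}).");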
Result
This is what our final result looks like. The chat uses the LLAMA model running on my local computer, and it is able to invoke the MCP server and return the list of GitHub issues. You can see that the response uses the formatting I defined in my previous blog post, so we know that it is really calling the server implementation.

Summary
MCP servers are a powerful way to provide tools for LLMs. We can run these servers locally, or use hosted ones over the SSE protocol (basically HTTP). Clients for these servers are really easy to implement, and the ModelContextProtocol SDK provides a nice and very dotnetish API for hooking up with servers. I personally think that MCP servers are going to be a huge part of LLM solutions in the near future, and when we get agent-to-agent support on top of it, it is going to explode the AI world.