I've been reading Stephen Toub's blog post about building a simple console-based .NET chat application from the ground up with Semantic Kernel. I'm following the examples, but instead of OpenAI I want to use Microsoft's Phi-3 and the nomic embedding model. The first examples in the blog post I could recreate using the Semantic Kernel Hugging Face plugin, but I can't seem to run the text embedding example.
I've downloaded Phi-3 and nomic-embed-text and am running them on a local server with LM Studio.
Here's the code I came up with that uses the Hugging Face plugin:
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using System.Numerics.Tensors;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel.ChatCompletion;
#pragma warning disable SKEXP0070, SKEXP0003, SKEXP0001, SKEXP0011, SKEXP0052, SKEXP0055, SKEXP0050 // Type is for evaluation purposes only and is subject to change or removal in future updates.
internal class Program
{
    private static async Task Main(string[] args)
    {
        // Initialize the Semantic Kernel
        IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.Services.ConfigureHttpClientDefaults(c => c.AddStandardResilienceHandler());
        var kernel = kernelBuilder
            .AddHuggingFaceTextEmbeddingGeneration("nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
                new Uri("http://localhost:1234/v1"),
                apiKey: "lm-studio",
                serviceId: null)
            .Build();
        var embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
        var memoryBuilder = new MemoryBuilder();
        memoryBuilder.WithTextEmbeddingGeneration(embeddingGenerator);
        memoryBuilder.WithMemoryStore(new VolatileMemoryStore());
        var memory = memoryBuilder.Build();

        // Store each example sentence in memory, generating an embedding for it
        string input = "What is an amphibian?";
        string[] examples = [ "What is an amphibian?",
            "Cos'è un anfibio?",
            "A frog is an amphibian.",
            "Frogs, toads, and salamanders are all examples.",
            "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
            "They are four-limbed and ectothermic vertebrates.",
            "A frog is green.",
            "A tree is green.",
            "It's not easy bein' green.",
            "A dog is a mammal.",
            "A dog is a man's best friend.",
            "You ain't never had a friend like me.",
            "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];
        for (int i = 0; i < examples.Length; i++)
            await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");

        // Generate an embedding for the input question
        var embed = await embeddingGenerator.GenerateEmbeddingsAsync([input]);
        ReadOnlyMemory<float> inputEmbedding = embed[0];

        // Generate embeddings for each example
        IList<ReadOnlyMemory<float>> embeddings = await embeddingGenerator.GenerateEmbeddingsAsync(examples);

        // Print the cosine similarity between the input and each example, most similar first
        float[] similarity = embeddings.Select(e => TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span)).ToArray();
        similarity.AsSpan().Sort(examples.AsSpan(), (f1, f2) => f2.CompareTo(f1));
        Console.WriteLine("Similarity Example");
        for (int i = 0; i < similarity.Length; i++)
            Console.WriteLine($"{similarity[i]:F6} {examples[i]}");
    }
}
At the line:
for (int i = 0; i < examples.Length; i++)
await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");
I get the following exception:
JsonException: The JSON value could not be converted to Microsoft.SemanticKernel.Connectors.HuggingFace.Core.TextEmbeddingResponse
Does anybody know what I'm doing wrong?
I've added the following NuGet packages to the project:
| Id | Versions | ProjectName |
|---|---|---|
| Microsoft.SemanticKernel.Core | 1.15.0 | LocalLlmApp |
| Microsoft.SemanticKernel.Plugins.Memory | 1.15.0-alpha | LocalLlmApp |
| Microsoft.Extensions.Http.Resilience | 8.6.0 | LocalLlmApp |
| Microsoft.Extensions.Logging | 8.0.0 | LocalLlmApp |
| Microsoft.SemanticKernel.Connectors.HuggingFace | 1.15.0-preview | LocalLlmApp |
| Newtonsoft.Json | 13.0.3 | LocalLlmApp |
| Microsoft.Extensions.Logging.Console | 8.0.0 | LocalLlmApp |
I found a solution to this problem thanks to Bruno Capuano's blog post about building a local RAG scenario using Phi-3 and SemanticKernel.
The code up to the `string input = "What is an amphibian?";` line now looks like this:
// Initialize the Semantic Kernel
IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
Kernel kernel = kernelBuilder
    .AddOpenAIChatCompletion(
        modelId: "phi3",
        endpoint: new Uri("http://localhost:1234"),
        apiKey: "lm-studio")
    .AddLocalTextEmbeddingGeneration()
    .Build();

// Get the embeddings generator service
var embeddingGenerator = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();
var memory = new SemanticTextMemory(new VolatileMemoryStore(), embeddingGenerator);
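For completeness, a sketch of how the rest of the example can query that memory once the examples have been saved. The collection name "net7perf" follows the question's code; the `limit` and `minRelevanceScore` values here are illustrative and may need tuning:

```csharp
// Search the collection for the entries most similar to the input.
// SaveInformationAsync embeds and stores each example; SearchAsync embeds
// the query and ranks stored entries by relevance.
string input = "What is an amphibian?";
await foreach (MemoryQueryResult result in
    memory.SearchAsync("net7perf", input, limit: 5, minRelevanceScore: 0.0))
{
    Console.WriteLine($"{result.Relevance:F6} {result.Metadata.Text}");
}
```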
So although we're not using OpenAI, we can still use the AddOpenAIChatCompletion method, because LM Studio exposes an OpenAI-compatible endpoint.
The AddLocalTextEmbeddingGeneration() method comes from the SmartComponents.LocalEmbeddings.SemanticKernel NuGet package, which generates embeddings in-process instead of calling an HTTP endpoint.
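With that in place, the cosine-similarity experiment from the question should work unchanged, since the local service returns the same `ReadOnlyMemory<float>` embeddings. A minimal sketch (sentences reused from the question; no local server involved for the embeddings):

```csharp
// Resolve the local embedding service registered by AddLocalTextEmbeddingGeneration()
var embeddingGenerator = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();

// Embed a query and two candidate sentences entirely in-process
ReadOnlyMemory<float> inputEmbedding =
    (await embeddingGenerator.GenerateEmbeddingsAsync(["What is an amphibian?"]))[0];
IList<ReadOnlyMemory<float>> embeddings =
    await embeddingGenerator.GenerateEmbeddingsAsync(
        ["A frog is an amphibian.", "A tree is green."]);

// Compare each candidate to the query; higher cosine similarity = more related
foreach (var e in embeddings)
    Console.WriteLine(TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span));
```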
I wrote a small console program with most of the examples from the blog posts. You can find it on GitHub.