Google Home and Alexa are very useful personal assistants that listen to your voice and execute the commands you give them.

Basically, such an assistant is just composed of a microphone, speech-to-text recognition and some logic to understand the user's intents. So, in fact, using the Cognitive Services Speech SDK and LUIS (Language Understanding Service), we can easily build the same thing, as all the features are available out-of-the-box!

So, let’s code 🙂

LUIS Configuration

The first step is to create a LUIS resource in the Azure portal (you can find it in the Marketplace), as LUIS is now integrated into it:

Once the resource is created, go to the LUIS portal and create a new application. We won't go into deep detail about how to create the LUIS part, but here are the main steps:

Once the application is created, it's time to create the intents you want to recognize. An intent is simply the action you want your engine to recognize, like "Turn on lights", "Give me weather", etc.

Of course, each intent depends on what you want to recognize, according to your use-case. Just don't forget to add, for each intent, enough utterances to train the model.
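For example, for the intents used later in this article, the training utterances could look like this (purely illustrative):

  • TurnOnLights: "Turn on the lights", "Switch the lights on, please"
  • TurnOffLights: "Turn off the lights", "Switch the lights off"
  • GiveMeWeather: "Give me the weather", "What's the weather like today?"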

OK, your LUIS model is now ready, so it's time to assign the resource you created in the Azure portal to the application created in the LUIS portal. Basically, you'll tell the LUIS application which subscription and resource will be used, so you can be charged 🙂

Please note: at the time of writing this article, I was only able to select a LUIS resource that had been created in the West US region.

Speech SDK demo application

Now that the LUIS part is done, it's time for some code! Thanks to the Cognitive Services Speech SDK, we'll create an app that will listen to what the user is saying, translate it from speech to text, call LUIS to detect the user's intent and return the corresponding information.

Let's start by creating a new Console app and adding the Speech SDK NuGet package (Microsoft.CognitiveServices.Speech) to it:

Once the package has been added, we can add some code to the application:

// Build the configuration from the LUIS subscription key and service region.
var config = SpeechConfig.FromSubscription(LUIS_API_KEY, LUIS_API_SERVICE_REGION);

// Language the user will speak.
config.SpeechRecognitionLanguage = LUIS_API_LANGUAGE_RECOGNITION;

First, we build a configuration object, passing the API key and service region of the LUIS resource created previously. We also set the SpeechRecognitionLanguage property to specify the language the user will speak (French, so "fr-FR", in our case).
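For clarity, the placeholders used in this snippet could be declared like this (the values are, of course, hypothetical; the region matches the West US constraint mentioned earlier):

// Hypothetical values: replace them with the key/region of your own LUIS resource.
const string LUIS_API_KEY = "<your-luis-subscription-key>";
const string LUIS_API_SERVICE_REGION = "westus";
const string LUIS_API_LANGUAGE_RECOGNITION = "fr-FR";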

Then, we'll create an IntentRecognizer and instruct it to use our LUIS model, so it'll be able to detect which intent corresponds to what the user said:

// The recognizer that will both transcribe the audio and query LUIS.
using var recognizer = new IntentRecognizer(config);

// Attach our LUIS application, then map each LUIS intent name
// to a local ID that we'll get back in the results.
var model = LanguageUnderstandingModel.FromAppId(LUIS_APP_KEY);
recognizer.AddIntent(model, "TurnOffLights", "TurnOff");
recognizer.AddIntent(model, "TurnOnLights", "TurnOn");
recognizer.AddIntent(model, "GiveMeWeather", "Weather");

And, finally, we can just perform the recognition:

var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

if (result.Reason == ResultReason.RecognizedIntent)
{
    // Raw JSON answer from LUIS, useful to retrieve the entities.
    var jsonResults = result.Properties.GetProperty(
        PropertyId.LanguageUnderstandingServiceResponse_JsonResult);

    // jsonResults can be parsed to find the entities
    switch (result.IntentId)
    {
        case "TurnOff":
            Console.WriteLine($"You said: {result.Text} so let me turn off the lights for you");
            break;

        case "TurnOn":
            Console.WriteLine($"You said: {result.Text} so let me turn on the lights for you");
            break;
    }
}

As you'll notice, the engine helps you recognize the intent, but there is no direct way to get the entities found by LUIS. What you can do is parse the JSON result returned by LUIS and find the corresponding information there.
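As an illustration, here is one way to parse that JSON, assuming the classic LUIS response shape with an "entities" array (a minimal sketch using System.Text.Json; adapt the property names to your LUIS endpoint version):

using System;
using System.Text.Json;

// Hypothetical helper: extracts the entities from the raw LUIS JSON result.
static void PrintEntities(string jsonResults)
{
    using var document = JsonDocument.Parse(jsonResults);

    // The classic LUIS response exposes the detected entities in an "entities" array.
    if (document.RootElement.TryGetProperty("entities", out var entities))
    {
        foreach (var entity in entities.EnumerateArray())
        {
            Console.WriteLine(
                $"Entity: {entity.GetProperty("entity").GetString()} " +
                $"(type: {entity.GetProperty("type").GetString()})");
        }
    }
}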
Also, note the method used to perform the recognition (RecognizeOnceAsync): while this one is useful for "one-shot" scenarios, it's not the recommended way to perform "real-time" recognition. If that's your use-case, it's better to use StartContinuousRecognitionAsync.
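If you go down that road, a minimal sketch of continuous recognition, using the same IntentRecognizer, could look like this:

// The Recognized event fires each time an utterance has been processed.
recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedIntent)
    {
        Console.WriteLine($"Intent: {e.Result.IntentId} - Text: {e.Result.Text}");
    }
};

await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

// Keep listening until the user presses a key, then stop.
Console.ReadKey();
await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);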

And voilà! You now have a system that can:

  • Retrieve what the user said
  • Convert it from speech to text
  • Send the text to LUIS
  • Determine the intent
  • Send it back to the application

Of course, it's up to you to finish the work and perform the actions according to the intents, but this is a good start!
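To help you get started, here is a consolidated sketch of the whole console app (the constants are placeholders, as above):

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Intent;

class Program
{
    // Hypothetical values: replace them with your own LUIS key, region and app ID.
    const string LUIS_API_KEY = "<your-luis-subscription-key>";
    const string LUIS_API_SERVICE_REGION = "westus";
    const string LUIS_API_LANGUAGE_RECOGNITION = "fr-FR";
    const string LUIS_APP_KEY = "<your-luis-app-id>";

    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription(LUIS_API_KEY, LUIS_API_SERVICE_REGION);
        config.SpeechRecognitionLanguage = LUIS_API_LANGUAGE_RECOGNITION;

        using var recognizer = new IntentRecognizer(config);

        var model = LanguageUnderstandingModel.FromAppId(LUIS_APP_KEY);
        recognizer.AddIntent(model, "TurnOffLights", "TurnOff");
        recognizer.AddIntent(model, "TurnOnLights", "TurnOn");
        recognizer.AddIntent(model, "GiveMeWeather", "Weather");

        Console.WriteLine("Say something...");
        var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

        if (result.Reason == ResultReason.RecognizedIntent)
        {
            Console.WriteLine($"Intent: {result.IntentId} - Text: {result.Text}");
        }
    }
}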

If you're interested, the full code is available here: https://github.com/ThomasLebrun/SpeechSDKWithLuis


Happy coding 🙂

