Where is the Sayaka voice in Speech API OneCore?


Windows 10. I've installed the Japanese TTS voices in the Settings. Now, when I use voice enumeration in Speech API 5.4 OneCore (not in 5.4 proper though), I get 6 voices:

  • David
  • Zira
  • Ayumi
  • Haruka
  • Mark
  • Ichiro

The Speech settings page also shows those 6. But there's clearly a seventh one in the registry, Sayaka (HKLM\SOFTWARE\WOW6432Node\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_SayakaM). Its files are present under C:\windows\Speech_OneCore\Engines\TTS\ja-JP. Compared to the rest, there's an extra file, .heq. Why doesn't it enumerate?

The enumeration code goes:

    #import "libid:E6DA930B-BBA5-44DF-AC6F-FE60C1EDDEC8" rename_namespace("SAPI") //v5.4 OneCore

    HRESULT hr;
    SAPI::ISpVoicePtr v;
    v.CreateInstance(__uuidof(SAPI::SpVoice));
    SAPI::ISpObjectTokenPtr tok;
    hr = v->GetVoice(&tok); //Retrieve the default voice
    SAPI::ISpObjectTokenCategoryPtr cat;
    hr = tok->GetCategory(&cat); //Retrieve the voices category
    SAPI::IEnumSpObjectTokensPtr toks;
    hr = cat->EnumTokens(0, 0, &toks);

    //And enumerate
    unsigned long i, n;
    hr = toks->GetCount(&n);
    LPWSTR ws;
    for (i = 0; i < n; i++)
    {
        hr = toks->Item(i, &tok);
        hr = tok->GetId(&ws);
        CoTaskMemFree(ws);
    }

The only other mention of Sayaka online that I could find is here

Edit

Enumerating by Reset()/Next() gives the same 6. Trying to create a token directly around the registry path gives error 0x8004503a (SPERR_NOT_FOUND). Doing so while watching with Process Monitor reveals an interesting fact: rather than Sayaka under HKLM, the process interrogates the following key:

HKCU\Software\Microsoft\Speech_OneCore\Isolated\7WUiMB20NMV5Y7TgZ2WJXbUw32iGZQSvSkeaf0AevtQ\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_SayakaM

There's indeed a key like that under HKCU, and it contains a copy of HKLM and HKCU settings for SAPI, and there's indeed no Sayaka under Voices in that key. Just the six I've mentioned.
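As an aside on the 0x8004503a code above, here is a minimal sketch of how an HRESULT splits into its parts. `DecodeHresult` and `HresultParts` are names made up for this illustration. As far as I can tell from the SDK's sperror.h, SAPI errors use the interface-specific facility (4) with codes offset from 0x5000, but treat that as an assumption rather than documented behavior.

```cpp
#include <cstdint>

// Hypothetical helper type: the three fields of an HRESULT that matter here.
struct HresultParts
{
    bool failure;           // Top bit: set for errors
    std::uint32_t facility; // Same mask as the HRESULT_FACILITY macro
    std::uint32_t code;     // Low 16 bits
};

// Splits a raw HRESULT value into severity, facility and code.
constexpr HresultParts DecodeHresult(std::uint32_t hr)
{
    return { (hr >> 31) != 0, (hr >> 16) & 0x1FFF, hr & 0xFFFF };
}
```

For 0x8004503a this gives a failure with facility 4 and code 0x503a, i.e. offset 0x3a from the assumed SAPI base of 0x5000.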

So there's some kind of isolation going on, with SAPI settings in several copies. There are 7 different subkeys under Isolated, and the voice sets are different under those. Two contain voices that have nothing in common with the ones we know, and those have to do with Cortana. Hard to tell what's the unit of isolation - maybe a user, maybe an app package (in the UWP sense).
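The redirection seen in Process Monitor can be sketched as a plain path mapping. This is only an illustration of the observed layout: `IsolatedKeyPath` is a made-up helper, the isolation ID is treated as an opaque string (how SAPI derives it is unknown to me), and nothing here is documented behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the per-application registry path that SAPI OneCore
// was observed to query in place of an HKLM subkey. The layout is inferred from
// the Process Monitor trace, not from any documentation.
std::wstring IsolatedKeyPath(const std::wstring& isolationId,
                             const std::wstring& hklmSubkey)
{
    assert(!isolationId.empty()); // An empty ID would not name a profile
    return L"HKCU\\Software\\Microsoft\\Speech_OneCore\\Isolated\\" + isolationId +
           L"\\HKEY_LOCAL_MACHINE\\" + hklmSubkey;
}
```

Calling it with the ID 7WUiMB20NMV5Y7TgZ2WJXbUw32iGZQSvSkeaf0AevtQ and the subkey SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_SayakaM reproduces the key quoted above.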

Edit

As I suspected, the isolation is per application. I created a brand new project with the same code, ran it, and got a different isolation key - F2yLLxINh6S1e3y3MkJo4ilfh036RB_9pHLEVL88yL0. It looks like every time you run a SAPI-enabled application, SAPI derives an isolation profile from the current executable. A moment ago that isolation profile wasn't there; now it is, so SAPI created it on the fly. I don't think the voices are hard-coded, so SAPI must have copied them into the isolation profile from some master list.

Where is the master list? It's not HKLM\...\Speech_OneCore, since one can see Sayaka is there. It could be tokens_TTS_ja-JP.xml under C:\Windows\SysWOW64\Speech_OneCore\Common\ja-JP, since Ayumi/Ichiro/Haruka are listed there but Sayaka isn't. The security on that file is quite draconian, though: I'm having trouble editing it even with admin rights. Also, it's a second hardlink to C:\Windows\WinSxS\wow64_microsoft-windows-t..peech-ja-jp-onecore_31bf3856ad364e35_10.0.18362.1_none_46741f8a666da90a.

The SysWOW64\Speech_OneCore folder allows write for administrators, but SysWOW64\Speech_OneCore\Common doesn't. Only TrustedInstaller can write it.

By the way, the isolation logic is specific to OneCore. SetId() in SAPI 5.4 proper looks in the key that matches the provided Id.


Alternative approach: the SAPI 5.4 docs mention the ISpRegDataKey interface, which lets one initialize a token directly from an HKEY. It's not in the typelib, though.

text-to-speech
speech
sapi
asked on Stack Overflow Mar 10, 2020 by Seva Alekseyev • edited Mar 19, 2020 by halfer

2 Answers


This answer is about enabling Sayaka for those SAPI apps that don't explicitly opt in.

The master list of Japanese TTS voices is under C:\Windows\System32\Speech_OneCore\Common\ja-JP. It's not just one file - SAPI enumerates all XMLs there. The problem is, in order to write files to that folder one will need a utility that lets one run programs as TrustedInstaller. Those exist; there's a list here. I've used the one called PowerRun.

You need to create a file called something like tokens_TTS_ja-JP_Sayaka.xml (the exact name doesn't really matter) with the following content:

<?xml version="1.0" encoding="utf-8"?>
<Tokens>
  <Category name="Voices" categoryBase="HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore">
    <Token name="MSTTS_V110_jaJP_SayakaM">
      <String name="" value="Microsoft Sayaka - Japanese (Japan)" />
      <String name="LangDataPath" value="%windir%\Speech_OneCore\Engines\TTS\ja-JP\MSTTSLocjaJP.dat" />
      <String name="VoicePath" value="%windir%\Speech_OneCore\Engines\TTS\ja-JP\M1041Sayaka" />
      <String name="411" value="Microsoft Sayaka - Japanese (Japan)" />
      <String name="CLSID" value="{179F3D56-1B0B-42B2-A962-59B7EF59FE1B}" />
      <Attribute name="Version" value="11.0" />
      <Attribute name="Language" value="411" />
      <Attribute name="Gender" value="Female" />
      <Attribute name="Age" value="Adult" />
      <Attribute name="DataVersion" value="11.0.2016.0221" />
      <Attribute name="SharedPronunciation" value="" />
      <Attribute name="Name" value="Microsoft Sayaka" />
      <Attribute name="Vendor" value="Microsoft" />
      <Attribute name="SayAsSupport" value="spell=NativeSupported; cardinal=GlobalSupported; ordinal=NativeSupported; date=GlobalSupported; time=GlobalSupported; telephone=NativeSupported; address=NativeSupported; message=NativeSupported; url=NativeSupported; currency=NativeSupported; alphanumeric=NativeSupported" />
      <Attribute name="SampleText" value="既定の音声として%1を選びました" />
    </Token>
  </Category>
</Tokens>

And then copy that file, as TrustedInstaller, to C:\Windows\System32\Speech_OneCore\Common\ja-JP. On 64-bit Windows, also place a copy into C:\Windows\SysWOW64\Speech_OneCore\Common\ja-JP to cover the 32-bit applications.

After that, all desktop SAPI applications get Sayaka too, even the ones that already had an isolated settings key. It looks like SAPI refreshes the isolated settings from the master list when necessary.

Sayaka will show up in the voice list under Settings/Speech, too, and say her greeting if asked.

answered on Stack Overflow Mar 12, 2020 by Seva Alekseyev • edited May 20, 2020 by Seva Alekseyev

If the isolation registry key doesn't have Sayaka, but HKLM does, an application can copy the Sayaka token to the isolation key on the first run. The key insight here is that the isolation key is writable without elevation, and SAPI supports creating and populating tokens. This doesn't rely on the specifics of isolation. Create a token with a hard-coded ID for Sayaka, and copy the properties and the attributes from HKLM. Like this:

#import "libid:E6DA930B-BBA5-44DF-AC6F-FE60C1EDDEC8" rename_namespace("SAPI") //v5.4 OneCore

//Get the default voice to avoid hard-coding the category
SAPI::ISpVoicePtr v;
SAPI::ISpObjectTokenPtr tok;
v.CreateInstance(__uuidof(SAPI::SpVoice));
v->GetVoice(&tok);
LPWSTR ws;
tok->GetId(&ws);
wchar_t TokID[200];
wcscpy_s(TokID, ws);
CoTaskMemFree(ws);

//Check if Sayaka is already registered in SAPI
SAPI::ISpObjectTokenCategoryPtr cat;
tok->GetCategory(&cat); //The category of voices
SAPI::IEnumSpObjectTokensPtr toks;
cat->EnumTokens(L"name=Microsoft Sayaka", 0, &toks);
unsigned long n;
toks->GetCount(&n);

if (n == 0) //Sayaka is not registered already
{
    //Is Sayaka present under HKLM\..\Voices\Tokens?
    HKEY hkSayaka, hkAttrs;
    if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Microsoft\\Speech_OneCore\\Voices\\Tokens\\MSTTS_V110_jaJP_SayakaM", 0, KEY_READ, &hkSayaka) == ERROR_SUCCESS)
    {
        if (RegOpenKeyEx(hkSayaka, L"Attributes", 0, KEY_READ, &hkAttrs) == ERROR_SUCCESS)
        {
            //If yes, create a Sayaka token where SAPI OneCore thinks it should be!

            //Replace the final path component of the default voice's ID with Sayaka
            LPWSTR pbs = wcsrchr(TokID, L'\\');
            wcscpy_s(pbs + 1, _countof(TokID) - (pbs - TokID) - 1, L"MSTTS_V110_jaJP_SayakaM");
            tok.CreateInstance(__uuidof(SAPI::SpObjectToken));
            //Note the 1 in the third parameter - "create if needed"
            HRESULT hr = tok->SetId(0, (LPWSTR)TokID, 1);

            DWORD dwi;
            wchar_t ValName[100]; //Enough
            wchar_t ValData[500]; //Enough; wchar_t so it can be passed on as a string
            DWORD ValNameLen, ValDataLen, Type;

            //Copy all values from the Sayaka key
            //They are all strings
            //One wchar_t of the buffer is held back so that a terminator always fits
            for (dwi = 0; RegEnumValue(hkSayaka, dwi, ValName, &(ValNameLen = _countof(ValName)), 0, &Type, (LPBYTE)ValData, &(ValDataLen = sizeof(ValData) - sizeof(wchar_t))) == ERROR_SUCCESS; dwi++)
            {
                ValData[ValDataLen / sizeof(wchar_t)] = 0; //Registry strings are not guaranteed to be null-terminated
                tok->SetStringValue(ValName, ValData);
            }

            //Copy all attributes from the Sayaka\Attributes key
            //All strings too.
            SAPI::ISpDataKeyPtr attrs;
            tok->CreateKey((LPWSTR)L"Attributes", &attrs);
            for (dwi = 0; RegEnumValue(hkAttrs, dwi, ValName, &(ValNameLen = _countof(ValName)), 0, &Type, (LPBYTE)ValData, &(ValDataLen = sizeof(ValData) - sizeof(wchar_t))) == ERROR_SUCCESS; dwi++)
            {
                ValData[ValDataLen / sizeof(wchar_t)] = 0; //Same terminator precaution
                attrs->SetStringValue(ValName, ValData);
            }

            RegCloseKey(hkAttrs);
        }
        RegCloseKey(hkSayaka);
    }
}
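
The "replace the final path component" step can also be written with std::wstring, which sidesteps the fixed-size buffer and the case where the ID contains no backslash (wcsrchr would return NULL there). This is a minimal sketch; `ReplaceTokenName` is a made-up helper, and the David token ID in the usage note is only an illustrative guess at what the default voice's ID looks like.

```cpp
#include <string>

// Hypothetical helper: swaps the last backslash-separated component of a
// token ID for another token name, keeping everything up to the separator.
std::wstring ReplaceTokenName(const std::wstring& tokenId,
                              const std::wstring& newName)
{
    const std::wstring::size_type pos = tokenId.rfind(L'\\');
    if (pos == std::wstring::npos)
        return newName; // No separator: the whole ID is just a name
    return tokenId.substr(0, pos + 1) + newName;
}
```

For example, passing an ID ending in \Voices\Tokens\MSTTS_V110_enUS_DavidM together with MSTTS_V110_jaJP_SayakaM yields the same ID ending in \Voices\Tokens\MSTTS_V110_jaJP_SayakaM, which can then be handed to SetId().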

A similar approach to exposing the hidden TTS voices is described here: https://www.ghacks.net/2018/08/11/unlock-all-windows-10-tts-voices-system-wide-to-get-more-of-them/


Since my original problem was limited to one TTS-enabled app, I'm going to accept this answer and not the other one. That said, the whole issue with not inviting Sayaka to the party is probably a Microsoft oversight that they should ultimately address. Feel free to upvote my Feedback Hub request. Windows 10 users only.

answered on Stack Overflow Mar 13, 2020 by Seva Alekseyev • edited May 20, 2020 by Seva Alekseyev

User contributions licensed under CC BY-SA 3.0