Voicepeak API

From Novoyuuparosk Wiki
Revision as of 13:23, 26 September 2023 by Mikkeli

Regarding the title: this page is really more like 'Using VOICEPEAK via the command line'.


References:

  1. https://atarms.hatenablog.com/entry/2023/03/12/164118
  2. https://takashiski.hatenablog.com/entry/2023/01/13/235249


Apparently, neither AHS nor Dreamtonics bothers to provide extensive documentation or a manual for using VOICEPEAK without the GUI. For comparison, VOICEVOX isn't great at this either, but at least you can get a grip by scavenging through what they have in the repo.

So half of this page is a translation of the Japanese blog posts listed above, and the other half is what I found out by trying.

Basic usage

./voicepeak.exe [OPTION..]

Starting with ./voicepeak.exe -h:

 -s, --say Text               Text to say
 -t, --text File              Text file to say
 -o, --out File               Path of output file
 -n, --narrator Name          Name of narrator, check --list-narrator
 -e, --emotion Expr           Emotion expression, for example:
                              happy=50,sad=50. Also check --list-emotion
     --list-narrator          Print narrator list
     --list-emotion Narrator  Print emotion list for given narrator
 -h, --help                   Print help
     --speed Value            Speed (50 - 200)
     --pitch Value            Pitch (-300 - 300)
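
To avoid retyping these options, the call can be wrapped in a small script. The following is only a sketch under my own assumptions: the function names, the VOICEPEAK_EXE path, and the choice to always pass -o are mine, not anything VOICEPEAK provides. Only the flags and the speed/pitch ranges come from the help text above.

```python
import subprocess
from typing import List, Optional

# Assumption: location of the executable; adjust to your install.
VOICEPEAK_EXE = "./voicepeak.exe"


def build_command(
    text: str,
    out: str = "output.wav",
    narrator: Optional[str] = None,
    emotion: Optional[str] = None,
    speed: Optional[int] = None,
    pitch: Optional[int] = None,
    exe: str = VOICEPEAK_EXE,
) -> List[str]:
    """Assemble an argument list from the options shown in the help text."""
    # Ranges are taken from the help output: speed 50 - 200, pitch -300 - 300.
    if speed is not None and not 50 <= speed <= 200:
        raise ValueError("speed must be within 50 - 200")
    if pitch is not None and not -300 <= pitch <= 300:
        raise ValueError("pitch must be within -300 - 300")

    cmd = [exe, "-s", text, "-o", out]
    if narrator is not None:
        cmd += ["-n", narrator]
    if emotion is not None:
        cmd += ["-e", emotion]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    if pitch is not None:
        cmd += ["--pitch", str(pitch)]
    return cmd


def synthesize(text: str, **options) -> None:
    """Run VOICEPEAK once; subprocess raises if the exit code is non-zero."""
    subprocess.run(build_command(text, **options), check=True)
```

Passing the arguments as a list (rather than one shell string) means the text and narrator name need no manual quoting, which matters for names containing spaces.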

One thing to notice is that command-line execution is painfully slow, probably because an unseen GUI gets initialized and torn down every time voicepeak.exe is called. I don't know the details.

Some further breakdown on the options and arguments:

-s, --say Text: The Text part is simply a string. Reference #1 says the maximum length is 140 characters (possibly modeled on Twitter's limit, but that is just a wild guess), so nothing comically long will go through in one call.
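
For longer text, one workaround is to split it into pieces of at most 140 characters and synthesize each piece separately. Below is a minimal sketch; split_for_say and MAX_SAY_CHARS are names I made up, the limit is the one reported by reference #1 (I have not verified it), and I am assuming that Python's character count matches how VOICEPEAK counts.

```python
import re
from typing import List

MAX_SAY_CHARS = 140  # limit reported by reference #1; not verified here


def split_for_say(text: str, limit: int = MAX_SAY_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break after sentence-ending punctuation and falls back to a
    hard cut when a single sentence is longer than the limit.
    """
    # Keep each sentence together with its trailing punctuation.
    sentences = re.findall(r"[^。！？.!?]+[。！？.!?]*|[。！？.!?]+", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut sentences that cannot fit in one chunk on their own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if len(current) + len(sentence) > limit:
            chunks.append(current)
            current = sentence
        else:
            current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then need its own output file, since every call writes a single wav.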

-t, --text File: Basically the same as -s, but reads the text to speak from a text file.

-o, --out File: Specifies the name (path) of the output file. If not supplied, the default is 'output.wav' in the current (shell) working directory.

-n, --narrator Name: Specifies the narrator (character). I think this defaults to the first one in the narrator list, but I only have one narrator installed (Koharu Rikka), so I can't confirm.

-e, --emotion Expr: Emotion ratios. Only works if the narrator (manually selected or default) supports the given emotions. Note that Koharu Rikka has totally different emotion names from the standard 6 nameless voices. These are listed in the Emotions section of this page.
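
The Expr string follows the name=value,name=value shape shown in the help text (happy=50,sad=50). A tiny helper can build it from a dictionary; format_emotions is a hypothetical name of mine, and I deliberately do not check the value range because the help text does not state one.

```python
from typing import Mapping


def format_emotions(levels: Mapping[str, int]) -> str:
    """Join emotion names and values into the name=value,name=value form."""
    return ",".join(f"{name}={value}" for name, value in levels.items())


# Example using the emotion names from the help text.
expr = format_emotions({"happy": 50, "sad": 50})
print(expr)
```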

--list-narrator: Prints a list of the available, locally installed narrators. These names can and should be used wherever a narrator is required as an argument, such as in -n, --narrator.

--list-emotion Narrator: Prints a list of all emotion names (handles, variables, tags, you name it) available for the given narrator.
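
These two listing options can also be queried from a script. The sketch below assumes that the lists are printed to stdout one name per line and that the default text decoding works on your system; neither is documented, so treat both as guesses. The function names and VOICEPEAK_EXE are my own.

```python
import subprocess
from typing import List

# Assumption: location of the executable; adjust to your install.
VOICEPEAK_EXE = "./voicepeak.exe"


def parse_list_output(stdout: str) -> List[str]:
    """Split printed output into names, assuming one name per line."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def list_narrators(exe: str = VOICEPEAK_EXE) -> List[str]:
    """Run --list-narrator and return the printed names."""
    result = subprocess.run(
        [exe, "--list-narrator"], capture_output=True, text=True, check=True
    )
    return parse_list_output(result.stdout)


def list_emotions(narrator: str, exe: str = VOICEPEAK_EXE) -> List[str]:
    """Run --list-emotion for one narrator and return the printed names."""
    result = subprocess.run(
        [exe, "--list-emotion", narrator],
        capture_output=True, text=True, check=True,
    )
    return parse_list_output(result.stdout)
```

Given how slow each call is, it is probably worth caching these results rather than querying them before every synthesis.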

Emotions

Koharu Rikka

 hightension
 livid
 lamenting
 despising
 narration

Generic VOICEPEAK voices

The 'Japanese Male 1/2/3' and 'Japanese Female 1/2/3' voices, plus one called 'Japanese Female Child'.

 happy
 fun
 angry
 sad
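
The two lists above can be kept as a lookup table to catch a mismatched emotion name before paying for a slow call. This is a sketch only: the narrator name strings are my assumption of what --list-narrator prints (expanded from 'Japanese Male/Female 1/2/3'), and unknown_emotions is a hypothetical helper.

```python
from typing import Dict, List, Tuple

GENERIC_EMOTIONS: Tuple[str, ...] = ("happy", "fun", "angry", "sad")

# Assumption: narrator names as keys; verify them against --list-narrator.
EMOTIONS_BY_NARRATOR: Dict[str, Tuple[str, ...]] = {
    "Koharu Rikka": ("hightension", "livid", "lamenting", "despising", "narration"),
    "Japanese Female Child": GENERIC_EMOTIONS,
}
for gender in ("Male", "Female"):
    for number in (1, 2, 3):
        EMOTIONS_BY_NARRATOR[f"Japanese {gender} {number}"] = GENERIC_EMOTIONS


def unknown_emotions(narrator: str, names: List[str]) -> List[str]:
    """Return the names that are not in the narrator's emotion list."""
    allowed = EMOTIONS_BY_NARRATOR[narrator]
    return [name for name in names if name not in allowed]
```

An empty result means every requested emotion appears in the table for that narrator.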