The British company Sonantic has developed a neural network that can convincingly imitate “deep human emotions”, such as crying or sighing, when voicing a text. In this respect, it goes far beyond voice assistants such as Siri or Alexa, which can hardly be called emotional or expressive.
Sonantic’s sound editing software uses many different voice models based on the voices of real actors. The company demonstrated the results of its work in a published video. All the voices heard in the dialogue at the beginning of the video, in which a mother and daughter say goodbye to each other, are generated by a computer algorithm.
In April, the company raised €2.3 million in investment, and it is currently working with several game producers. Game producers often have to record thousands of lines of dialogue for voice-over, so using Sonantic’s audio editor should make game development cheaper and faster. The program will be able to adapt the voice to different in-game circumstances, for example when a character speaks while running, without losing “naturalness” when the scenario requires crying or screaming.
The developers do not believe that their technology will completely replace voice actors. Rather, they expect it to become for voice acting what computer-generated imagery (CGI) became for film production. Sonantic’s specialists say that their technology will make it possible to tell new stories in a fantastic way.