Opened 4 years ago

Last modified 3 years ago

#11919 closed defect

TTS/screenreader support — at Version 1

Reported by: ObjectInSpace
Owned by:
Priority: normal
Component: --Other--
Version:
Keywords:
Cc: ObjectInSpace
Game:

Description (last modified by ObjectInSpace)

ScummVM should provide the ability for in-game text to be read aloud via synthesized speech. Apart from the convenience of playing interactive fiction without having to look at a screen, this would enable all games using this engine to be enjoyed by the estimated 289 million people around the world with vision loss, some of whom enjoy adventure games. Several Z-code interpreters support TTS, as does RetroArch, so it should be feasible for this project also.

Windows has an open-source library available called Tolk which should do most of the heavy lifting itself: https://github.com/dkager/tolk/
There is another called Universal Speech which appears to do a similar thing, but I don't think it has been updated as recently: https://github.com/qtnc/UniversalSpeech
Both of these libraries support JAWS and NVDA, which are the most popular screenreaders. They also offer SAPI speech as a fallback for universal compatibility.

Microsoft also provides a text-to-speech API via the Xbox SDK. This is specific to Narrator, which is also included on Windows. Similarly, on OS X, iOS and Android, support for the built-in screenreaders is provided via native accessibility APIs.

Ideally, a player of an IF game should be able to hear the response to their input read back to them. Graphical adventure games should read out the name of the currently selected object or action, along with any text as it is displayed on screen, such as conversation prompts.

Scenario 1: player of Zork 1 types "open mailbox", hears back "the mailbox is now open. There's a leaflet inside."

Scenario 2: Day of the Tentacle player moves the cursor over the clock, hears "grandfather clock." Player presses o, hears "open."
Scenario 3: player of Grim Fandango decides to talk to Carla, hears: "1. Busy night? 2. What's the shuttle waiting for? 3. Can I try out your metal detector?"
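
To make Scenarios 2 and 3 a bit more concrete, here is a rough sketch of how the spoken strings might be produced. It is only an illustration: ScummVM is written in C++, Python is used here for brevity, and the names `format_choices` and `HoverAnnouncer` are hypothetical, not existing ScummVM code. The sketch assumes the engine can report the name under the cursor once per frame and hand over the list of conversation options.

```python
from typing import Callable, List, Optional


def format_choices(choices: List[str]) -> str:
    """Join conversation options into one numbered utterance (Scenario 3)."""
    return " ".join(f"{i}. {text}" for i, text in enumerate(choices, start=1))


class HoverAnnouncer:
    """Speak the name under the cursor only when it changes (Scenario 2)."""

    def __init__(self, speak: Callable[[str], None]) -> None:
        self._speak = speak               # any function that voices a string
        self._last: Optional[str] = None  # what was under the cursor last frame

    def update(self, name: Optional[str]) -> None:
        # Assumed to be called every frame; stay silent unless the hovered
        # object changed, otherwise the same name would repeat endlessly.
        if name != self._last:
            self._last = name
            if name:
                self._speak(name)
```

Tracking the last spoken name matters because a graphical engine redraws the cursor label continuously; without it, hovering over the clock would queue "grandfather clock" many times per second.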

There are a few different ways to achieve this. RetroArch uses optical character recognition (OCR), which converts the text in screenshots into a machine-readable format via pattern-matching algorithms. Another project, SoniFight, essentially reverse-engineers certain games to read the text from known memory addresses (https://github.com/FedUni/SoniFight). However, I feel that both approaches are somewhat hackish. The real solution should be to find when and where those strings are referenced in the game and then have ScummVM expose them to the platform's assistive technology.
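
As a minimal sketch of that last approach, assuming engines can call a single hook whenever they put text on screen: the hook forwards the string to whichever speech backend fits the platform. All names here (`SpeechBackend`, `RecordingBackend`, `Accessibility`, `on_text_displayed`) are hypothetical and only illustrate the shape of the idea; a real backend would wrap something like Tolk, SAPI or a native accessibility API rather than a list.

```python
from abc import ABC, abstractmethod
from typing import List


class SpeechBackend(ABC):
    """Platform-specific output; one subclass per screenreader or TTS API."""

    @abstractmethod
    def say(self, text: str, interrupt: bool = False) -> None:
        ...


class RecordingBackend(SpeechBackend):
    """Stand-in backend that stores utterances instead of speaking them."""

    def __init__(self) -> None:
        self.utterances: List[str] = []

    def say(self, text: str, interrupt: bool = False) -> None:
        if interrupt:
            # Mimic a screenreader dropping speech that is still queued.
            self.utterances.clear()
        self.utterances.append(text)


class Accessibility:
    """Single entry point that engines call when text appears on screen."""

    def __init__(self, backend: SpeechBackend, enabled: bool = True) -> None:
        self.backend = backend
        self.enabled = enabled

    def on_text_displayed(self, text: str) -> None:
        # Collapse line breaks and padding used for on-screen layout.
        cleaned = " ".join(text.split())
        if self.enabled and cleaned:
            self.backend.say(cleaned)


backend = RecordingBackend()
Accessibility(backend).on_text_displayed("The mailbox is\nnow open.")
```

Keeping the engine-facing hook separate from the backend means each game engine only needs to report its strings once, while platform support can be added or swapped without touching engine code.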

Change History (1)

comment:1 by ObjectInSpace, 4 years ago

Description: modified (diff)