Opened 16 months ago

Last modified 16 months ago

#13276 new defect

AGS: Cannot open files with non-ASCII characters in their name

Reported by: criezy
Owned by:
Priority: normal
Component: Engine: AGS
Version:
Keywords:
Cc:
Game:


When trying to play Bustin' the Bastille in French, I noticed that the game was in English and that the following error was logged:

Cannot open translation: Fran?ais.tra

where the ? is actually the byte 0xE7.

The file is actually named Français.tra, and 0xE7 is the ç character in ISO-8859-1. I suspect there is a confusion between the UTF-8 and ISO-8859-1 encodings somewhere in the AGS engine or in the fs code.
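To illustrate the suspected mismatch: in UTF-8, ç is encoded as the two bytes 0xC3 0xA7, so a name containing a lone 0xE7 byte is not a well-formed UTF-8 string at all. The C++ sketch below is not ScummVM or AGS code; isValidUtf8 is a hypothetical helper written only to show the difference between the two byte sequences.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ScummVM): returns true if `s` is a
// well-formed UTF-8 byte sequence. Rejects stray or missing continuation
// bytes, overlong encodings, surrogates and code points above U+10FFFF.
bool isValidUtf8(const std::string &s) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const std::uint32_t kMin[4] = {0x0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0; // number of continuation bytes expected
        std::uint32_t cp = 0;  // decoded code point
        if (lead < 0x80) { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false; // continuation byte or invalid lead byte
        if (s.size() - i <= extra) return false; // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < kMin[extra]) return false;                  // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;      // surrogate
        if (cp > 0x10FFFF) return false;                     // out of range
        i += extra + 1;
    }
    return true;
}
```

With this helper, isValidUtf8("Fran\xe7" "ais") is false (0xE7 looks like the start of a three-byte sequence, but the next byte is a plain 'a'), while isValidUtf8("Fran\xc3\xa7" "ais") is true. The string literals are split because a hex escape would otherwise swallow the following 'a'.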

ScummVM version: current master (162924da00)
System: macOS

Change History (2)

comment:1 by criezy, 16 months ago

Interestingly, if I select Français as the game language in the ScummVM game options before starting the game, then I still get the in-game language selection screen when starting the game, I still get the error in the log, but the game is in French.

comment:2 by criezy, 16 months ago

I now suspect the encoding used is CP-1252 rather than ISO-8859-1, since that is apparently what Windows uses for file name encoding (at least in most Western countries). Both encodings map 0xE7 to ç.

I have also verified that the same issue exists with the standalone AGS interpreter. But I suspect it might work on Windows, although maybe not on Russian or Japanese Windows for example.

Here is the call stack where the error occurs:

  frame #0: AGS3::init_translation(lang, fallback_lang, quit_on_error=false) at translation.cpp:61:6
  frame #1: AGS3::Game_ChangeTranslation(newFilename="Fran\xe7ais") at game.cpp:725:6
  frame #2: AGS3::Sc_Game_ChangeTranslation(params, param_count=1) at game.cpp:1491:2
  frame #3: AGS3::ccInstance::Run(this, curpc=0) at cc_instance.cpp:1023:20
  frame #4: AGS3::ccInstance::CallScriptFunction(this, funcname="hFrancais_AnyClick", numargs=0, params) at cc_instance.cpp:341:15
  frame #5: AGS3::RunScriptFunctionIfExists(sci, tsname="hFrancais_AnyClick", numParam=0, params) at script.cpp:363:32
  frame #6: AGS3::RunTextScript(sci, tsname="hFrancais_AnyClick") at script.cpp:412:14
  frame #7: AGS3::RunScriptFunction(sc_inst=kScInstRoom, fn_name="hFrancais_AnyClick", param_count=0, p1, p2) at script.cpp:269:4
  frame #8: AGS3::post_script_cleanup() at script.cpp:554:3
  frame #9: AGS3::RunScriptFunctionIfExists(sci, tsname="hFrancais_AnyClick", numParam=1, params) at script.cpp:380:2

So basically Game_ChangeTranslation gets called from a script with the filename of the translation to use. I cannot see any encoding conversion, so it looks like the script is expected to provide the filename in whatever encoding is used by the OS. Since the game was only released for Windows (and since that is the main system targeted by AGS), it makes sense to assume CP 1252 encoding.

I also checked that the AGS documentation for Game_ChangeTranslation and for the File class does not mention anything about the encoding.

I also suspect that it doesn't work on Windows in ScummVM. If I understand properly the following two PRs, the ScummVM FS code now expects filenames in UTF-8 on Windows:

UTF-8 is also what is used on macOS, and most of the time on Linux I think.

So maybe we should assume CP 1252 in Game_ChangeTranslation, assume we need UTF-8 in ScummVM, and do the encoding conversion for the file name?
That would not affect translations set in our GUI, as that path calls init_translation directly.
However, are those assumptions correct? Could a Russian game, for example, use a different encoding if it has file names with non-ASCII characters? We could guess the encoding from the game language if needed. But for translations that would be useless: such a game is multilingual, and we don't specify a game language in the detection.

And is this something that should be done at a lower level (e.g. in the AGS File class) so that it is also done when opening other types of files the game may be using? However there we would have another issue as we have filenames that come from ScummVM (e.g. savegame files, translation files when set in the ScummVM GUI) and those would already be in UTF-8.

I am hoping this issue is restricted to only a few games, and that most games don't have the strange idea of using non-ASCII characters in file names. So maybe we don't need to overthink it.

A simple fix here would be to do the CP-1252 to UTF-8 conversion in Game_ChangeTranslation (or maybe in Sc_Game_ChangeTranslation)? It will not break games that only use ASCII characters, and games that use non-ASCII characters in another encoding (if there are any) do not work currently anyway. If we find such a game in the future, the code could be updated. And if we see the same issue for other types of files, we could add a similar conversion in the corresponding script function.
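As a rough sketch of what such a conversion involves, here is a standalone C++ version using only the standard library. It is not the ScummVM implementation; cp1252ToUtf8, appendUtf8 and kCp1252High are hypothetical names, and the choice to map the five bytes that CP-1252 leaves undefined to the C1 control codes of the same value is an assumption. Bytes 0x00-0x7F and 0xA0-0xFF map to the code point with the same value (as in ISO-8859-1); only 0x80-0x9F need a table.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Unicode code points for CP-1252 bytes 0x80..0x9F. Assumption: the bytes
// CP-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are mapped to the
// C1 control code with the same value.
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Appends the UTF-8 encoding of `cp` to `out` (1 to 3 bytes).
static void appendUtf8(std::string &out, std::uint32_t cp) {
    assert(cp <= 0xFFFF); // CP-1252 never maps outside the BMP
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: interprets `in` as CP-1252 and returns it as UTF-8.
// Pure ASCII input is returned unchanged.
std::string cp1252ToUtf8(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (char ch : in) {
        const unsigned char b = static_cast<unsigned char>(ch);
        const std::uint32_t cp =
            (b >= 0x80 && b <= 0x9F) ? kCp1252High[b - 0x80] : b;
        appendUtf8(out, cp);
    }
    return out;
}
```

Applied to the name from the call stack, cp1252ToUtf8("Fran\xe7" "ais") yields the bytes of "Français" in UTF-8 (0xE7 becomes 0xC3 0xA7). Note that the function must only be given names that really are in CP-1252: running it on a name that is already UTF-8 would encode it a second time and corrupt it, which is why doing the conversion in the script-facing function rather than at a lower level seems safer.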

Finally, as an aside, the reason it works when selecting the translation in the ScummVM GUI is that the encoding of the file name is handled consistently there, and the translation is set when starting the game. Then, when the game gets to the language selection screen, it fails to change the translation when French is selected and thus remains on the one previously set.
