Opened 6 years ago

Closed 6 years ago

#6345 closed defect (fixed)

TINSEL: DW2 is broken on big endian hardware

Reported by: SF/canavan Owned by: bluegr
Priority: normal Component: Engine: Tinsel
Keywords: Cc:
Game: Discworld II

Description

As far as I can tell, both Discworld and Discworld II may be broken on big endian hardware since commit ce87175bede46c1bb938b73484e1db05212defbd.

TINSEL: Revert the BE -> LE resource conversion for DW1 Mac

scummvm 1.5.0 works, 1.6.0 doesn't. Both Games work if I check out af667771a918370988cc656412e2ddea3d1d44a3. I'm using N32 binaries on IRIX with the MIPSPro compilers.

discworld (detected as CD/DOS/English)fails with an assertion:

$ ./scummvm
WARNING: Could not find theme 'scummmodern' falling back to builtin!
User picked target 'dw-cd' (gameid 'tinsel')...
Looking for a plugin supporting this gameid... Tinsel
Starting 'Tinsel engine game'
Assertion failed: handle < g_numHandles, file engines/tinsel/handle.cpp, line 329, pid 65049

discworld II (detected as CD/DOS/English (GB)) dies with a segfault:

Process 65090 (scummvm) stopped on signal SIGSEGV: Segmentation violation (handler sig_fixup_mask) at [Tinsel::GetBytes(const unsigned char*,const Tinsel::WorkaroundEntry*&,int&,unsigned int):540 +0x68,0x1049ba24]
540 tmp = code[ip++ * (TinselV0 ? 4 : 1)];
(dbx) active thread 0x10000
Thread 0x10000 is active
(dbx) where

Thread 0x10000
> 0 Tinsel::GetBytes(const unsigned char*,const Tinsel::WorkaroundEntry*&,int&,unsigned int)(scriptCode = 0x113e0e38, wkEntry = 0x7ffb6b04, ip = 0x7ffb6b00, numBytes = 0) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/pcode.cpp":540, 0x1049ba24]
1 Tinsel::Interpret(Common::CoroBaseContext*&,Tinsel::INT_CONTEXT*)(coroParam = 0x10bad4c0, ic = 0x10b36f18) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/pcode.cpp":608, 0x1049bf70]
2 Tinsel::SceneTinselProcess(Common::CoroBaseContext*&,const void*)(coroParam = 0x10bbbb60, param = 0x10bbbb84) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/scene.cpp":176, 0x104d946c]
3 Common::CoroutineScheduler::schedule(void)(this = 0x10bc3950) ["/usr/people/canavan/src/scummvm/git/scummvm/common/coroutines.cpp":231, 0x1080c37c]
4 Tinsel::TinselEngine::NextGameCycle(void)(this = 0x10bb6568) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/tinsel.cpp":1032, 0x1048cd10]
5 Tinsel::TinselEngine::run(void)(this = 0x10bb6568, <no name> = 0x7ffb7538) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/tinsel.cpp":980, 0x1048b6ec]
6 ::runGame(const PluginSubclass<MetaEngine>*,OSystem&,const Common::String&)(<no name> = 0x7ffb7cf8, plugin = 0x10a22528, system = 0x10a28fb0, edebuglevels = 0x7ffb7738) ["/usr/people/canavan/src/scummvm/git/scummvm/base/main.cpp":226, 0x100e4190]
7 ::scummvm_main(argc = 1, argv = 0x7ffb7ee4) ["/usr/people/canavan/src/scummvm/git/scummvm/base/main.cpp":452, 0x100e57a4]
8 ::main(argc = 1, argv = 0x7ffb7ee4) ["/usr/people/canavan/src/scummvm/git/scummvm/backends/platform/sdl/posix/posix-main.cpp":45, 0x100e0410]
9 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":177, 0x100db1a8]

Even when compiling with -g0 (no optimization), I haven't been able to convice the debugger to provide any useable info about local/stack variables in Tinsel::GetBytes.

Ticket imported from: #3614416. Ticket imported from: bugs/6345.

Change History (44)

comment:1 by wjp, 6 years ago

Just to be sure, you have the DOS versions of DW1 and DW2? And do you have their 'platform' in the 'Edit game' dialog for those entries set to DOS?

Since in theory I don't think commit ce87175bede46c1bb938b73484e1db05212defbd has any effect on DOS versions of DW1/2.

comment:2 by SF/canavan, 6 years ago

Autodetection has set the platform to DOS. The CDs for both DW and DW2 bear "PC CDROM" labels, the boxes list "IBM-compatible (PC|computer)" in the requirements section.

comment:3 by bluegr, 6 years ago

I don't have a BE system to test, so could you please try and bisect this?

As wjp said, commit b05fa7f20414d6a7571a9ba52f542e527f598c62 doesn't seem to be the cause of this...

comment:4 by bluegr, 6 years ago

Hm... if you are talking about commit, there's one thing that I found:

Inside palette.cpp:
// get hardware palette pointer
- pPalette = (const PALETTE *)LockMem(pDACtail->pal.hRGBarray);
+ pPalette = (const PALETTE *)LockMem(FROM_32(pDACtail->pal.hRGBarray));

I tried to make sure that I didn't add any endianess handling in that commit, but it seems that line slipped through by accident :(

If that is the offending commit, could you try changing that line to what it was? i.e.
pPalette = (const PALETTE *)LockMem(pDACtail->pal.hRGBarray);

comment:5 by SF/canavan, 6 years ago

I'm not sure if I've interpreted the output of git log correctly. A round of git bisect identifies this release as the cause:

c90d56355fa0bbcdd3122f3e376e5609422338b3 is the first bad commit
commit c90d56355fa0bbcdd3122f3e376e5609422338b3
Author: Filippos Karapetis <bluegr@gmail.com>
Date: Mon Dec 10 17:18:56 2012 +0200

TINSEL: Simplify the scene entrance handling code

This also reverts the rest of the BE resource handling code

:040000 040000 acfc6e90597bacc597c3673e559a69356adc1195 705e229062666e136e601323e74d8fbea6f82d36 M engines

manually testing show that
git checkout c90d56355fa0bbcdd3122f3e376e5609422338b3 is bad
git checkout c6cf4827d719c1833ce4d7e108410db81f00c358 is bad.

comment:6 by bluegr, 6 years ago

So can you verify that:

da971e38a565a52af92c4d419546820c8af9955b works
while
c6cf4827d719c1833ce4d7e108410db81f00c358 doesn't work?

comment:7 by SF/canavan, 6 years ago

ading the FROM_32() to the pPalette assignment in HEAD doesn't change anything - scummvm still crashes.

I may be able to provide access to a big endian system for testing, however that one would be somewhat slowish with only 2x 400MHz CPUs.

comment:8 by digitall, 6 years ago

Summary: TINSEL broken on big endian hardwareTINSEL: Engine broken on big endian hardware

comment:9 by digitall, 6 years ago

There is the method described here which may be of help to allow you to run a BE VM machine on a fast LE (x86) native machine:
http://wiki.scummvm.org/index.php/HOWTO-Debug-Endian-Issues

comment:10 by SF/canavan, 6 years ago

da971e38a565a52af92c4d419546820c8af9955b
and
c6cf4827d719c1833ce4d7e108410db81f00c358 both work.

My claim below that c6cf... is bad appears to be a cut-and-paste error.

c90d56355fa0bbcdd3122f3e376e5609422338b3 doesn't work

comment:11 by SF/canavan, 6 years ago

Summary: TINSEL: Engine broken on big endian hardwareTINSEL broken on big endian hardware

comment:12 by SF/canavan, 6 years ago

Just checked with gcc 4.7, same problem as with the MIPSPro compilers.

comment:13 by digitall, 6 years ago

canavan: Thank you for the further debugging information.

Can you try running a Git bisection to locate the exact point where the breakage/regression was introduced please?
If you are not familar with bisection, see:
http://git-scm.com/book/en/Git-Tools-Debugging-with-Git

comment:14 by SF/canavan, 6 years ago

No new surprises running git bisect again:

scummvm :) $ git bisect good
c90d56355fa0bbcdd3122f3e376e5609422338b3 is the first bad commit
commit c90d56355fa0bbcdd3122f3e376e5609422338b3
Author: Filippos Karapetis <bluegr@gmail.com>
Date: Mon Dec 10 17:18:56 2012 +0200

TINSEL: Simplify the scene entrance handling code

This also reverts the rest of the BE resource handling code

:040000 040000 acfc6e90597bacc597c3673e559a69356adc1195 705e229062666e136e601323e74d8fbea6f82d36 M engines

comment:15 by digitall, 6 years ago

canavan: Thanks for doing that... as suspected. Could you also try reverting / splitting down the change chunks in that commit (This will have to be a manual process as I don't think Git offers that option) and see which of the logical changes is provoking this regression?

comment:16 by bluegr, 6 years ago

@canavan: I've fixed the regression in that commit in commit 4b69071. Please, try fetching the newest changes on master and try again. Many thanks for your bisecting work :)

comment:17 by wjp, 6 years ago

Owner: set to bluegr
Resolution: fixed
Status: newclosed

comment:18 by wjp, 6 years ago

I've just tested DW1 on debian/powerppc, and I can confirm that commit fixed the crash.

comment:19 by SF/canavan, 6 years ago

4b69071 does indeed fix the assertion when starting DW1, but DW2 still segfaults here. None of the "obvious" (to me, e.g. the handling of hProcess, numProcess, hScript, hMusicScript) other changes appear to fix DW2 while keeping DW1 usable.

comment:20 by wjp, 6 years ago

Resolution: fixed
Status: closednew

comment:21 by bluegr, 6 years ago

Can you provide any more information? What kind of error are you getting? Could you please help in bisecting the last commit that works with DW2? You can apply commit 4b69071 while bisecting - IIRC, that part of the code hasn't been touched by any of the subsequent commits

comment:22 by bluegr, 6 years ago

Summary: TINSEL broken on big endian hardwareTINSEL: DW2 is broken on big endian hardware

comment:23 by bluegr, 6 years ago

I assume (from the lack of responses) that DW1 has been fixed on BE systems now, thus I'm changing the bug description to only mention DW2

comment:24 by SF/canavan, 6 years ago

Apologies for the late response. DW1 is indeed working after 4b69071. Even with 4b69071 applied, git bisect still points at c90d56355fa0bbcdd3122f3e376e5609422338b3 as the culprit for the DW2 crash on startup.

c90d56355fa0bbcdd3122f3e376e5609422338b3 is the first bad commit
commit c90d56355fa0bbcdd3122f3e376e5609422338b3
Author: Filippos Karapetis <bluegr@gmail.com>
Date: Mon Dec 10 17:18:56 2012 +0200

TINSEL: Simplify the scene entrance handling code

This also reverts the rest of the BE resource handling code

:040000 040000 acfc6e90597bacc597c3673e559a69356adc1195 705e229062666e136e601323e74d8fbea6f82d36 M engines

The backtrace still looks the same:

> 0 Tinsel::GetBytes(const unsigned char*,const Tinsel::WorkaroundEntry*&,int&,unsigned int)(scriptCode = <illegal>, wkEntry = <illegal>, ip = <illegal>, numBytes = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/pcode.cpp":541, 0x1016cc60]
1 Tinsel::Interpret(Common::CoroBaseContext*&,Tinsel::INT_CONTEXT*)(coroParam = <illegal>, ic = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/pcode.cpp":608, 0x1016ce9c]
2 Tinsel::SceneTinselProcess(Common::CoroBaseContext*&,const void*)(coroParam = <illegal>, param = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/scene.cpp":177, 0x1019d5b0]
3 Common::CoroutineScheduler::schedule(void)(this = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/common/coroutines.cpp":231, 0x104003e8]
4 Tinsel::TinselEngine::NextGameCycle(void)(this = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/tinsel.cpp":1031, 0x10160614]
5 Tinsel::TinselEngine::run(void)(this = <illegal>, <no name> = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/engines/tinsel/tinsel.cpp":976, 0x1015f638]
6 ::runGame(const PluginSubclass<MetaEngine>*,OSystem&,const Common::String&)(<no name> = <illegal>, plugin = <illegal>, system = <illegal>, edebuglevels = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/base/main.cpp":229, 0x10090a38]
7 ::scummvm_main(argc = 0, argv = 0x1054c330) ["/usr/people/canavan/src/scummvm/git/scummvm/base/main.cpp":474, 0x10091d6c]
8 ::main(argc = <illegal>, argv = <illegal>) ["/usr/people/canavan/src/scummvm/git/scummvm/backends/platform/sdl/posix/posix-main.cpp":45, 0x1008db88]
9 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":177, 0x100872e8]

comment:25 by wjp, 6 years ago

bluegr: I've tried to debug this on a BE machine a while ago, but got lost in the maze of commits and reverts mixed together around the commit canavan's bisect points to. Can you summarize what these commits are meant to do?

comment:26 by bluegr, 6 years ago

Sorry for the late reply, just saw this

wjpalestijn: That was the time when I was trying to add support for DW1 Mac (which is BE in many parts in its resources). Initially, I created an intermediate layer that would translate all BE resources to LE and leave the rest of the code untouched.

That didn't work, as we needed to have two separate resource handlers, which would complicate the code. Thus I started adapting the engine itself for the BE resources in DW1 Mac, hence the reverts made that day.

comment:27 by wjp, 6 years ago

So, what shall we do about this? Revert everything back to before this whole situation?

comment:28 by wjp, 6 years ago

Or maybe create a branch purely intended for debugging, in which we first revert everything, and then get it back to the current situation with a chain of clear commits, so it's actually visible what's changing.

comment:29 by bluegr, 6 years ago

No, I don't want to revert these commits, as they are part of the chain of commits that add support for the Macintosh version of DW1. It's very hard to debug tinsel regarding endianess, as the whole engine accesses objects in memory directly.

I don't have a BE system to debug, but the offending commits have been identified, so it's a case of finding the place that has the endianess issues. I can't find what's wrong with that commit, so we will need to trace which part of the DW2 code has the endianess issues.

comment:30 by wjp, 6 years ago

Yes, I tried that, and it's much harder than it needs to be because reverts and new features are mixed in with eachother.

Note that the identified commit has "Simplify the scene entrance handling code. This also reverts the rest of the BE resource handling code" as the commit message.

comment:31 by wjp, 6 years ago

So my suggestion would be to, possibly in a branch in someone's fork somewhere: revert everything, verify it works on BE and LE, and then try this resource thing again, in a clear sequence of commits.

I'm willing to test and debug on BE if the sequence of commits is clear.

comment:32 by raziel-, 6 years ago

Sorry for the duplicate entry on this behaviour
https://sourceforge.net/tracker/?func=detail&atid=418820&aid=3615101&group_id=37116

As it is nowhere mentioned here i'll add the assertion i'm getting with DW2:
assertion "ic->sp >= 0" failed: file "engines/tinsel/pcode.cpp", line 757

comment:33 by wjp, 6 years ago

GetSceneStruc does nothing for DW2, completely breaking Scene support for DW2.

It's also suspicious that not all FindChunk(..., CHUNK_SCENE) calls are wrapped in GetSceneStruc.

comment:34 by wjp, 6 years ago

Filippos: I understand you don't want to revert these commits, but keeping DW2 broken as it is is just not an option. If it's not fixed soon I see little other option than reverting everything back to a working state, even if that does remove Mac DW1 support again.

comment:35 by bluegr, 6 years ago

Hopefully, the BE issues in DW2 have been fixed in commit 5931df4. Please test.

comment:36 by bluegr, 6 years ago

wjp did some brief tests after the commit. Here are his results:

"I briefly tested it during breakfast, and (DOS) DW1 and 2 both start now on Linux
running on a BE PPC G4, but I'm unable to do proper testing until around
Christmas since I'll largely be away from home the next few weeks. I don't have
DW1Mac to test, so I can't do that part."

comment:37 by raziel-, 6 years ago

(I edited out all of the parts that doesn't belong here :-)
Funny that i always find out about a solution to a problem AFTER i hit Post) - Sorry for the noise

I just want to add some information, which may be of help.

AmigaOS4 - PPC - SDL - BE

On starting the games from the launcher it will lead to one of the following three stops of ScummVM

It will always load in the midi information to the attached hardware (visible in the midi hardware display) and only after that come up with one of the following

1) It will silently quit with the message "SetCD() problem!"
2) It will silently quit with the message "Interpret() - Unknown opcode!"
3) It will assert with "assertion "ic ->sp >= 0" failed: file "engines/tinsel/pcode.cpp", line 788

Until a few days ago i was able to start DW 2 and play a bit, but now it wouldn't even let me start the games because of the behaviour described above.

DW 1 is working great (apart from the PSX loading from laucher problem)!

After checking with Fillipos i can add that 1.4.0 and 1.5.0 are able to start and play the game.

comment:38 by wjp, 6 years ago

With which revision of ScummVM are you trying this exactly? DW2 properly started for me on a BE PPC G4 after the changes on 5 Dec. There haven't been any recent changes that would explain a change "a few days ago", as far as I can tell.

comment:39 by wjp, 6 years ago

Or do you mean to say that you can start DW2, but not actually play for any length of time? (I haven't tested that far yet.)

comment:40 by raziel-, 6 years ago

Sorry, my comment were a bit misleading.

I meant to write that it is a game of luck to actually be able to play the game.
On start of the game from launcher i can not reach the game anymore at all.
Either of those three errors come up and stop me from entering to play it.

"A few days ago" (i think it was the 30.12 or 31.12 i was able to enter the game and play a bit, i reached most of the first act rooms before another assertion kicked me out.

Only after that assertion i wasn't able to reach the actual game to play (i wanted to because i wanted to see if the assertion reproduceable)

I have only one save right after the intro.
Using this save or using to start the game new results in midi information being sent to my hardware, screen stays black and either the assertion or the other two error messages happen.

No game since the last time i was able to get in.

I may actually be able to enter the game, but for now i'm always shut out.

comment:41 by SF/canavan, 6 years ago

The current git master (84d4e97d) still works for me with DW2 under IRIX. I was able to start a new game, walk to various places, talk to a few characters and pick up a few items without any obvious flaws and absolutely no assertions.

comment:42 by wjp, 6 years ago

A follow-up on IRC suggested that this was possibly caused by a broken local merge.

Hubert, did you already get the chance to confirm?

comment:43 by raziel-, 6 years ago

@wjp

Sorry for the delay.
Downloaded a todays package, build and played

No more of those behaviour, my local copy is flawed

comment:44 by wjp, 6 years ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.