Opened 7 years ago

Last modified 12 months ago

#5862 new defect

AGOS: Deadlock when quitting Waxworks demo

Reported by: lordhoto Owned by: Mataniko
Priority: high Component: Engine: AGOS
Keywords: Cc:
Game: Waxworks


ScummVM revision 833c692fc96820a47da4ec4068517a85cafb8fc0
Linux/PowerPC (32bit)
g++ 4.6.1

Sound driver: AdLib
Demo from our demos page.

When exiting the Waxworks Demo at the Accolade logo (via closing the Window) I got a deadlock.

I attached the backtraces for the main thread and the sound thread as files.

Ticket imported from: #3419778. Ticket imported from: bugs/5862.

Attachments (4)

thread1.txt (2.2 KB) - added by lordhoto 7 years ago.
Main thread backtrace
thread2.txt (2.0 KB) - added by lordhoto 7 years ago.
Sound thread backtrace
simon2_main_thread.txt (2.6 KB) - added by lordhoto 7 years ago.
simon2_audio_thread.txt (2.1 KB) - added by lordhoto 7 years ago.

Download all attachments as: .zip

Change History (12)

Changed 7 years ago by lordhoto

Attachment: thread1.txt added

Main thread backtrace

Changed 7 years ago by lordhoto

Attachment: thread2.txt added

Sound thread backtrace

comment:1 Changed 7 years ago by digitall

This looks like an issue I had with an engine I was working on.
It occurs when you try to destruct the midiPlayer/midiDriver while the midiDriver is still assigned to the timerCallback of the player and causes sporadic issues at quit i.e. if the callback is running and locks the mutex..
The code in audio/midiplayer.cpp line 45 i.e. default implementation indicates that it is expected for the destructor of a midiPlayer to unhook the callback prior to driver deletion and player destruction.

comment:2 Changed 7 years ago by digitall

I have looked at the AGOS code and this seems likely to be the cause.
Kirben is going to commit a correction, but hard to know if this is "fixed" as the problem is not reliably replicable anyway.

Lordhoto: It should be noted that all other engines in tree should be quickly reviewed by grep to check that their midiPlayer implementations do unhook the callback in their destructors otherwise they wil likely be affected by similar issues.

comment:3 Changed 7 years ago by sev-

This bug is nice to get fixed before the release. Raising priority for keeping the track.

comment:4 Changed 7 years ago by sev-

Priority: normalhigh

comment:5 Changed 7 years ago by digitall

sev-: Kirben has already committed a correction as d5a763b763a39dbd0873f7eb6e17105a9b1cc958. However, it is not east to test that this is fixed as I indicated earlier i.e. "hard to know if this is "fixed" as the problem is not reliably replicable anyway."

However, the bug was also left open on reviewing the other engines for failure to unhook the MidiDriver timer callbacks. A cursory grep seems to indicate that AGI, CGE, Draci, Hugo, M4, MADE, Parallaction, Prisoner, SAGA, SCI. Tinsel and Touche may need corrections to their MIDIPlayer destructors i.e. missing _driver->setTimerCallback(NULL, NULL); or similar prior to _driver deletion?

comment:6 Changed 7 years ago by lordhoto

I was able to reproduce another dead lock in the MIDI code with the recent master in Simon 2. I tried to quit ScummVM via the window close button when the closet appeared in Simon's room in the intro when the deadlock happend.

I attached backtraces from the main and audio thread.

Changed 7 years ago by lordhoto

Attachment: simon2_main_thread.txt added

Changed 7 years ago by lordhoto

Attachment: simon2_audio_thread.txt added

comment:7 Changed 13 months ago by Mataniko

Owner: changed from Kirben to Mataniko
Resolution: worksforme
Status: newclosed

Tried both Waxworks demo and Simon 2 and I couldn't reproduce. Closing this with the assumption that the fix implemented 6 years ago worked

comment:8 Changed 12 months ago by csnover

Resolution: worksforme
Status: closednew

This is still a problem, it is just non-deterministic and requires you to get unlucky and have the engine lock its mutex and make a call to the audio mixer after the audio mixer has already locked its mutex but before it has called the engine’s audio callback. The root problem isn’t solved until the double-mutexes are removed, the data members access by both threads are is replaced with lockless atomics (which OSystem does not have), the audio thread stops making callbacks while holding a lock (C++ Core Guidelines CP.22), or the lock attempts are made non-blocking (also not a feature that OSystem has).

Last edited 12 months ago by csnover (previous) (diff)
Note: See TracTickets for help on using tickets.