Opened 18 years ago

Closed 17 years ago

Last modified 2 years ago

#735 closed defect (worksforme)

ALL: Crash on exit

Reported by: eriktorbjorn Owned by: fingolfin
Priority: high Component: Graphics: Scalers
Keywords: Cc:
Game:

Description

This is going to be one annoyingly vague bug report. I'm filing it partly because I'd like to know if anyone else has seen anything like it.

I was replaying parts of Loom CD, using an April 5 CVS snapshot, and on two occasions it crashed for me. Here's a stack trace from one of the occasions:

#0 Normal3x(unsigned char*, unsigned, unsigned char*, unsigned char*, unsigned, int, int) (srcPtr=0x404aab88 "#", srcPitch=646, null=0x0, dstPtr=0xc91da <Address 0xc91da out of bounds>, dstPitch=1920, width=34, height=25) at common/scaler.cpp:792 #1 0x0804dae5 in OSystem_SDL::update_screen() (this=0x815d798) at backends/sdl/sdl.cpp:315 #2 0x080521ad in Scumm::mainRun() (this=0x816e960) at scumm/scummvm.cpp:1662 #3 0x08052396 in Scumm::go() (this=0x816e960) at scumm/scummvm.cpp:1732 #4 0x080ae56a in main (argc=2, argv=0xbffffbf4) at common/main.cpp:219

The other one was very similar, except I was using a different scaler at the time.

From what I could tell by using gdb on the core dump, _hwscreen->pixels was NULL when the scaler function was called. Which makes no sense at all to me. I thought the only way to change _hwscreen while the game is running was to change scalers, and I wasn't at the time.

Of course, I wasn't able to reproduce it once I added debugging printf()s and assert()s...

Ticket imported from: #716591. Ticket imported from: bugs/735.

Change History (25)

comment:1 by fingolfin, 18 years ago

This doesn't sound that extremly odd, actually: this problem would occur if one tries to acces the pixels of a "real" HW surface when it is not locked. Typical causes for this: 1) the surface isn't locked by the code 2) the code does lock the surface, but then a thread interrupts it and unlocks the surface before the actual blit takes place.

This was it what was (is?) causing us problems in the SMUSH player. We might have to move the mutex from the SMUSH code to the sdl backend itself, so whenever the screen is accessed, we first lock that mutex, then lock the surface/blit/etc., then unlock the surface, then the mutex. However one has to be careful in this, to avoid dead lock possibilities, etc.

comment:2 by fingolfin, 18 years ago

If you run into this again, could you check which other threads where running, and attach there stack traces, too?

comment:3 by fingolfin, 18 years ago

Priority: normalhigh

comment:4 by eriktorbjorn, 18 years ago

Could I have seen that if I had still had the core dumps, or would I have to run ScummVM in the debugger from the beginning? I'm not all that familiar with gdb, I'm afraid...

comment:5 by fingolfin, 18 years ago

No idea if this works with core dumps.

anyway to get a list of actives threads: info threads

to get a backtrace for a specific thread (or a list of them): thread apply 1 bt thread apply 1 2 bt etc. You can use the "thread apply" for other commands, too

comment:6 by fingolfin, 18 years ago

Owner: set to fingolfin
Resolution: fixed
Status: newclosed

comment:7 by fingolfin, 18 years ago

I put a fix for this into CVS. That is, I think my change fixes this... if you ever see it again, please reopen.

comment:8 by eriktorbjorn, 17 years ago

Status: closednew

comment:9 by eriktorbjorn, 17 years ago

I did see it, or something similar, in Loom CD again a few days ago. I (stupidly) didn't save the backtrace or the core dump, but I remember it crashed in one of the scaler functions and the srcPtr parameter looked strange so _tmpscreen->pixels may have been 0.

comment:10 by SF/ender, 17 years ago

Er, is this still happening?

comment:11 by eriktorbjorn, 17 years ago

I don't know. It's only happened a few times, and never in any predictable fashion. I don't know if we should close the bug report, or keep it open during the test period, in case it happens to someone else.

comment:12 by SF/ender, 17 years ago

I'm going to close it for now I think, as I havn't seen it or heard of anyone else encountering it - if you see the problem again just reopen this.

comment:13 by SF/ender, 17 years ago

Owner: changed from fingolfin to SF/ender
Status: newclosed

comment:14 by fingolfin, 17 years ago

Well there is definitely still one (or more) crashing bugs when you exit ScummVM. At least with the SDL backend. I regulary get those, e.g. when I play FOA or MI2 with non-adlib midi driver, then very often when I quit, ScummVM either segfaults or dies with an error from the OS telling me that free() was either called on something already free'd or on a random value (often it's on an odd number; since over here memory returned by malloc() is aligned to 16 byte boundaries, that means it must be a bogus value).

comment:15 by fingolfin, 17 years ago

Owner: SF/ender removed
Status: closednew
Summary: Mysterious crash in SDL backendCrash on exit

comment:16 by eriktorbjorn, 17 years ago

I'm pretty sure there are cases where we free() stuff that was never allocated. As long as the pointer variable is NULL it should be ok (though not, I think, very pretty), but maybe it's completely uninitialized in some case...

comment:17 by fingolfin, 17 years ago

Component: Engine: SCUMM--Unset--
Game: Loom
Summary: Crash on exitALL: Crash on exit

comment:18 by fingolfin, 17 years ago

I haven't seen any of these crashes-on-exit recently. Of course that doesn't meant much :-)

"I'm pretty sure there are cases where we free() stuff that was never allocated" - you are? hm... Valgrind should catch all and every of those, and so does my OS (it complains with a warning message in the console if a program does a double-free, or a free on a random value).

comment:19 by eriktorbjorn, 17 years ago

> you are? hm... Valgrind should catch all and every of those, > and so does my OS (it complains with a warning message in > the console if a program does a double-free, or a free on a > random value).

I meant freeing a NULL pointer. I get plenty of hits if I add a check for ptr == NULL in free_check(). But supposedly that's allowed. (Which doesn't mean I have to like it, of course. :-)

comment:20 by fingolfin, 17 years ago

"free(0);" is legal, just as "delete 0;" is legal. Knowing that, one can replace this code: if (foo) free(foo); by simply free(foo); Don't see why you seem to think adding a (redundant) if is better, but then it's probably a matter of taste :-)

comment:21 by SF/ender, 17 years ago

What is the status of this item?

comment:22 by fingolfin, 17 years ago

Can't reproduce this anymore

comment:23 by fingolfin, 17 years ago

Resolution: fixedworksforme
Status: newclosed

comment:24 by Kirben, 11 years ago

Owner: set to fingolfin

comment:25 by digitall, 2 years ago

Component: --Unset--Graphics: Scalers
Note: See TracTickets for help on using tickets.