Opened 16 years ago

Closed 16 years ago

Last modified 6 months ago

#735 closed defect (worksforme)

ALL: Crash on exit

Reported by: eriktorbjorn Owned by: fingolfin
Priority: high Component: Graphics: Scalers
Keywords: Cc:
Game:

Description

This is going to be one annoyingly vague bug report.
I'm filing it partly because I'd like to know if anyone
else has seen anything like it.

I was replaying parts of Loom CD, using an April 5 CVS
snapshot, and on two occasions it crashed for me.
Here's a stack trace from one of the occasions:

#0 Normal3x(unsigned char*, unsigned, unsigned char*,
unsigned char*, unsigned, int, int) (srcPtr=0x404aab88
"#", srcPitch=646, null=0x0,
dstPtr=0xc91da <Address 0xc91da out of bounds>,
dstPitch=1920, width=34,
height=25) at common/scaler.cpp:792
#1 0x0804dae5 in OSystem_SDL::update_screen()
(this=0x815d798)
at backends/sdl/sdl.cpp:315
#2 0x080521ad in Scumm::mainRun() (this=0x816e960) at
scumm/scummvm.cpp:1662
#3 0x08052396 in Scumm::go() (this=0x816e960) at
scumm/scummvm.cpp:1732
#4 0x080ae56a in main (argc=2, argv=0xbffffbf4) at
common/main.cpp:219

The other one was very similar, except I was using a
different scaler at the time.

From what I could tell by using gdb on the core dump,
_hwscreen->pixels was NULL when the scaler function was
called. Which makes no sense at all to me. I thought
the only way to change _hwscreen while the game is
running was to change scalers, and I wasn't at the time.

Of course, I wasn't able to reproduce it once I added
debugging printf()s and assert()s...

Ticket imported from: #716591. Ticket imported from: bugs/735.

Change History (25)

comment:1 Changed 16 years ago by fingolfin

This doesn't sound that extremly odd, actually: this problem would occur if one tries to acces the pixels of a "real" HW surface when it is not locked. Typical causes for this:
1) the surface isn't locked by the code
2) the code does lock the surface, but then a thread interrupts it and unlocks the surface before the actual blit takes place.

This was it what was (is?) causing us problems in the SMUSH player. We might have to move the mutex from the SMUSH code to the sdl backend itself, so whenever the screen is accessed, we first lock that mutex, then lock the surface/blit/etc., then unlock the surface, then the mutex. However one has to be careful in this, to avoid dead lock possibilities, etc.

comment:2 Changed 16 years ago by fingolfin

If you run into this again, could you check which other threads where running, and attach there stack traces, too?

comment:3 Changed 16 years ago by fingolfin

Priority: normalhigh

comment:4 Changed 16 years ago by eriktorbjorn

Could I have seen that if I had still had the core dumps, or
would I have to run ScummVM in the debugger from the
beginning? I'm not all that familiar with gdb, I'm afraid...

comment:5 Changed 16 years ago by fingolfin

No idea if this works with core dumps.

anyway to get a list of actives threads:
info threads

to get a backtrace for a specific thread (or a list of them):
thread apply 1 bt
thread apply 1 2 bt
etc.
You can use the "thread apply" for other commands, too

comment:6 Changed 16 years ago by fingolfin

Owner: set to fingolfin
Resolution: fixed
Status: newclosed

comment:7 Changed 16 years ago by fingolfin

I put a fix for this into CVS. That is, I think my change fixes this... if
you ever see it again, please reopen.

comment:8 Changed 16 years ago by eriktorbjorn

Status: closednew

comment:9 Changed 16 years ago by eriktorbjorn

I did see it, or something similar, in Loom CD again a few
days ago. I (stupidly) didn't save the backtrace or the core
dump, but I remember it crashed in one of the scaler
functions and the srcPtr parameter looked strange so
_tmpscreen->pixels may have been 0.

comment:10 Changed 16 years ago by SF/ender

Er, is this still happening?

comment:11 Changed 16 years ago by eriktorbjorn

I don't know. It's only happened a few times, and never in
any predictable fashion. I don't know if we should close the
bug report, or keep it open during the test period, in case
it happens to someone else.

comment:12 Changed 16 years ago by SF/ender

I'm going to close it for now I think, as I havn't seen it
or heard of anyone else encountering it - if you see the
problem again just reopen this.

comment:13 Changed 16 years ago by SF/ender

Owner: changed from fingolfin to SF/ender
Status: newclosed

comment:14 Changed 16 years ago by fingolfin

Well there is definitely still one (or more) crashing bugs when you
exit ScummVM. At least with the SDL backend. I regulary get
those, e.g. when I play FOA or MI2 with non-adlib midi driver,
then very often when I quit, ScummVM either segfaults or dies
with an error from the OS telling me that free() was either called
on something already free'd or on a random value (often it's on an
odd number; since over here memory returned by malloc() is
aligned to 16 byte boundaries, that means it must be a bogus
value).

comment:15 Changed 16 years ago by fingolfin

Owner: SF/ender deleted
Status: closednew
Summary: Mysterious crash in SDL backendCrash on exit

comment:16 Changed 16 years ago by eriktorbjorn

I'm pretty sure there are cases where we free() stuff that
was never allocated. As long as the pointer variable is NULL
it should be ok (though not, I think, very pretty), but
maybe it's completely uninitialized in some case...

comment:17 Changed 16 years ago by fingolfin

Component: Engine: SCUMM--Unset--
Game: Loom
Summary: Crash on exitALL: Crash on exit

comment:18 Changed 16 years ago by fingolfin

I haven't seen any of these crashes-on-exit recently. Of course
that doesn't meant much :-)

"I'm pretty sure there are cases where we free() stuff that was
never allocated" - you are? hm... Valgrind should catch all and
every of those, and so does my OS (it complains with a warning
message in the console if a program does a double-free, or a free
on a random value).

comment:19 Changed 16 years ago by eriktorbjorn

> you are? hm... Valgrind should catch all and every of those,
> and so does my OS (it complains with a warning message in
> the console if a program does a double-free, or a free on a
> random value).

I meant freeing a NULL pointer. I get plenty of hits if I
add a check for ptr == NULL in free_check(). But supposedly
that's allowed. (Which doesn't mean I have to like it, of
course. :-)

comment:20 Changed 16 years ago by fingolfin

"free(0);" is legal, just as "delete 0;" is legal. Knowing that, one
can replace this code:
if (foo) free(foo);
by simply
free(foo);
Don't see why you seem to think adding a (redundant) if is better,
but then it's probably a matter of taste :-)

comment:21 Changed 16 years ago by SF/ender

What is the status of this item?

comment:22 Changed 16 years ago by fingolfin

Can't reproduce this anymore

comment:23 Changed 16 years ago by fingolfin

Resolution: fixedworksforme
Status: newclosed

comment:24 Changed 10 years ago by Kirben

Owner: set to fingolfin

comment:25 Changed 6 months ago by digitall

Component: --Unset--Graphics: Scalers
Note: See TracTickets for help on using tickets.