Opened 5 years ago

Closed 3 years ago

Last modified 3 years ago

#6513 closed defect (fixed)

NEVERHOOD: Some nasty bugs

Reported by: SF/badfiles Owned by: johndoe123
Priority: normal Component: Engine: Neverhood
Keywords: Cc:
Game: The Neverhood

Description

  1. After Willie's death there is a buggy letter from Willie among the red ones from Klogg.

  2. The radio hint is heard only once a ride, though it should be played on every entry.

  3. The game randomly crashes while reading the Chronicles on the wall.

Ticket imported from: bugs/6513.

Attachments (5)

neverhood nasty bugs.rar (48.4 KB) - added by SF/diggly 5 years ago.
valgrind.log (89.0 KB) - added by digitall 5 years ago.
neverhood-debugmessages.txt (4.2 KB) - added by eriktorbjorn 5 years ago.
neverhood-radio.txt (506 bytes) - added by eriktorbjorn 5 years ago.
neverhood-radio2.txt (478 bytes) - added by eriktorbjorn 5 years ago.

Download all attachments as: .zip

Change History (34)

comment:1 Changed 5 years ago by Strangerke

Summary: Some nasty bugsNEVERHOOD: Some nasty bugs

comment:2 Changed 5 years ago by Strangerke

Please next time fill in 3 bug reports, it's much easier to keep track of issues that way. Thanks!

comment:3 Changed 5 years ago by bluegr

Please, savegames where these issues occur

Changed 5 years ago by SF/diggly

Attachment: neverhood nasty bugs.rar added

comment:4 Changed 5 years ago by SF/diggly

I've attached savegames for these.

After Willie's death there is a buggy letter from Willie among the red ones from Klogg.

http://theneverhood.wikia.com/wiki/Kloggs_Letters
The letters appear in the order listed on that wiki page, up until the "I was once naive like you" letter. The next letter after that one should be the "I write well, yes?" letter, but instead the "You should give up" letter appears again, and then the next one is the glitched Willie letter. There should be no Willie letters at this point in the game.
The "I write well, yes?" Klogg letter is completely missing.
Also, if you cycle through the letters again you get a different glitched Willie letter at the same point.

The radio hint is heard only once a ride, though it should be played on every entry.

to recreate this bug from the savegame, go up one screen, then back down. Then go up again, and the radio won't be playing anymore.

The game randomly crashes while reading the Chronicles on the wall.

I read the entire chronicle wall and couldn't recreate this bug. Nevertheless, I've included a savegame of it if anyone else wants to give it a shot.

comment:5 Changed 5 years ago by bluegr

Many thanks for these saved games douglas! Much appreciated :)

I've fixed the buggy Willie letter (an off-by-one error in the function that gets Klogg's letters) in commit 35362716334b83e3af61280a1ffa79198c3285e6.

Will check the other two bugs later on

comment:6 Changed 5 years ago by SF/diggly

I'm still seeing glitched Willie letters, just in a different spot.
After you read all the Klogg letters, continue clicking the mailbox to start reading them again. The glitched Willie letter will be in-between the 1st and 2nd Klogg letters.

comment:7 Changed 5 years ago by eriktorbjorn

I think I have fixed the stray Willie letter. What happened was that when the function that decided which Klogg letter to show wrapped around to the beginning of the list, it would return the same ID twice in a row. This was misintepreted so that the game would pick one of Willie's nonsense letters instead.

(There are two kinds of letters from Willie: Ones that comment on specific parts of the game, and nonsense letters. It picks a nonsense letter if it would otherwise show the same specific letter twice in a row. But when checking for same same ID twice in a row, it doesn't make any difference between Willie and Klogg letters.)

Changed 5 years ago by digitall

Attachment: valgrind.log added

comment:8 Changed 5 years ago by digitall

Loaded savestate 006 from the nasty bugs rar above in the latest Git master on x86_64 to test the random crash reading the chronicles in the hall of records.

This is module 2200. Can't replicate the crash, but tested with Valgrind. No memory access issues, but SEVERE memory leaks are present. This is probably the cause of the crash. My run leaked over 20Mb over 3-4 screens, so reading all 40-ish screens of the full hall will probably leak at least 200MB+ ... Attaching valgrind log so that these leaks can be fixed.

comment:9 Changed 5 years ago by eriktorbjorn

I believe I've fixed a memory leak, hopefully without causing any new and exciting regressions. Digitall, could you check if it made any difference? My computer isn't quite fast enough to do any extensive Valgrind testing of The Neverhood.

comment:10 Changed 5 years ago by digitall

Checking the hall of records again now...

comment:11 Changed 5 years ago by digitall

Right. Major memory leaks are now fixed. One smaller one remains in static data loading and there is now one invalid write :/

==30412== Invalid write of size 1
==30412== at 0x4CD6B3: Neverhood::BaseSurface::copyFrom(Graphics::Surface, short, short, Neverhood::NDrawRect&) (graphics.cpp:124)
==30412== by 0x4CDC15: Neverhood::FontSurface::drawChar(Neverhood::BaseSurface
, short, short, unsigned char) (graphics.cpp:175)
==30412== by 0x4CDC7C: Neverhood::FontSurface::drawString(Neverhood::BaseSurface, short, short, unsigned char const, int) (graphics.cpp:184)
==30412== by 0x481359: Neverhood::Scene2208::drawRow(short) (module2200.cpp:1435)
==30412== by 0x480BA0: Neverhood::Scene2208::Scene2208(Neverhood::NeverhoodEngine, Neverhood::Module, int) (module2200.cpp:1346)
==30412== by 0x47A3F6: Neverhood::Module2200::createScene(int, int) (module2200.cpp:104)
==30412== by 0x47C1CB: Neverhood::Module2200::updateScene() (module2200.cpp:448)
==30412== by 0x43334F: Neverhood::Entity::handleUpdate() (entity.cpp:64)
==30412== by 0x43F040: Neverhood::Module::updateChild() (module.cpp:104)
==30412== by 0x435ACF: Neverhood::GameModule::updateModule() (gamemodule.cpp:620)
==30412== by 0x43334F: Neverhood::Entity::handleUpdate() (entity.cpp:64)
==30412== by 0x41DA6D: Neverhood::NeverhoodEngine::mainLoop() (neverhood.cpp:196)
==30412== Address 0xca44267 is not stack'd, malloc'd or (recently) free'd

==30412== 5,136 (1,040 direct, 4,096 indirect) bytes in 65 blocks are definitely lost in loss record 913 of 981
==30412== at 0x4C2C037: operator new(unsigned long) (vg_replace_malloc.c:298)
==30412== by 0x429182: Neverhood::StaticData::load(char const) (staticdata.cpp:47)
==30412== by 0x41D034: Neverhood::NeverhoodEngine::run() (neverhood.cpp:85)
==30412== by 0x40B00D: runGame(PluginSubclass const
, OSystem&, Common::String const&) (main.cpp:242)
==30412== by 0x40BFFF: scummvm_main (main.cpp:487)
==30412== by 0x409B7F: main (posix-main.cpp:45)

comment:12 Changed 5 years ago by eriktorbjorn

Hmph. I was afraid I might cause regressions like that. The change I made was that when a BaseSurface is deleted, it also deletes the buffer it allocates. Maybe something uses the surface's buffer even after the surface itself has been deleted?

(I still think my change should be the correct thing to do, since I believe it follows the principle of least astonishment.)

comment:13 Changed 5 years ago by digitall

I don't think it actually was you... The issue is only a single bad 1 byte write at the iteration in void BaseSurface::copyFrom(Graphics::Surface sourceSurface, int16 x, int16 y, NDrawRect &sourceRect) in engines/neverhood/graphics.cpp, so I suspect this is one of the images in the hall of records in some case can be read one byte off the end of the buffer i.e. it is widthheight, but the data is only (width*height)-1 long. Probably was there before, but needs an exact case to trigger.

I'll look at closing the other memory leak at engine exit.

comment:14 Changed 5 years ago by digitall

Have located the cause of the static data memory leak and closed in f0a0537095e0b309f768de0b44cd20d05376c1d9.

This is due to entries with duplicate id values in the messageLists part of the neverhood.dat file. These now produce a warning and the previous entry is correctly deleted before the new one is added, removing the leaks.

I have also added a destructor to clear the data allocated. However, it appears this is never called due to the _system->quit() premature exit call in the main loop of neverhood.cpp.

If I remove the EVENT_QUIT handling, then this works as expected with shouldQuit() breaking the main loop and the destruction running.. However, in this case, there are some further leaks shown up by Valgrind to be addressed.

comment:15 Changed 5 years ago by eriktorbjorn

Looking at the first three columns from left to right triggers a Valgrind warning for me. Going straight to the third does not.

comment:16 Changed 5 years ago by eriktorbjorn

Or maybe not... I was able to duplicate the Valgrind warning twice, but then I tried it another two times without success...

comment:17 Changed 5 years ago by eriktorbjorn

It's getting a bit too late at night for me to want to look further into this right now, so I'm just writing down a few things that may or may not be relevant.

Right now, I'm thinking that it's the third column that's special, and that the first two aren't really involved. But I didn't realize at first that where you click on the column affects which rows of the text you will be presented with when the close-up first appears.

I'm still trying to figure out a reliable way to trigger it. Adding out-of-bounds checking to BaseSurface::copyFrom() seems to indicate that simply scrolling through the text should trigger the problem.

One possible reason why the third column is different from the first two is that _maxRowIndex is calculated differently for "stLineagex" in Scene2208's constructor. Perhaps this means that the height of _surface should also be adjusted accordingly in that case?

Changed 5 years ago by eriktorbjorn

Attachment: neverhood-debugmessages.txt added

comment:18 Changed 5 years ago by eriktorbjorn

Digitall asked me to attach a patch with the debug messages I had added while searching for the cause of the invalid writes. This is the most recent version.

comment:19 Changed 5 years ago by digitall

Thanks eriktorbjorn for the investigation so far. Will try using this to work out a stable replication case...

comment:20 Changed 5 years ago by eriktorbjorn

A few more notes. From what I understand of Scene2208::drawRow(), the largest value y will ever assume is 480. (That happens when 'rowIndex' is 10. At 11, it wraps around to 0.)

The character height for the font used on the walls (at least in the version I have) is 60 pixels, though it's possible that the bottom rows are always transparent. Still to be safe from overflows in this case, shouldn't the _background surface be 540 pixels tall, instead of its current 528 pixels?

By the way, this probably holds true for all the columns. While there appears to be a special case for the third one, I think the reason the other ones didn't trigger the error is that the row of text drawn at the bottommost Y coordinate (which apparently has nothing to do with where it eventually appears on the screen) happened to be blank.

In the debug messages I described earlier, the largest Y coordinate I saw used was 532, which means the surface needed to be only 533 pixels tall. But it's probably better to be safe than sorry.

comment:21 Changed 5 years ago by johndoe123

I had a look and the values seem correct after checking the idb (48 for row height and 528 for the bitmap height).
The only thing which is most likely the cause for the bug is that for some reason I didn't implement clipping to BaseSurface::copyFrom. It clips the coordinates in the original.
I'll check more later.

comment:22 Changed 5 years ago by digitall

@john_doe: Thanks for the attention. I haven't had time to look further at this, but if I get any conclusion / patches, I will update here.

Changed 5 years ago by eriktorbjorn

Attachment: neverhood-radio.txt added

comment:23 Changed 5 years ago by eriktorbjorn

This fixes the radio bug for me, but I don't know if it's the correct fix.

Changed 5 years ago by eriktorbjorn

Attachment: neverhood-radio2.txt added

comment:24 Changed 5 years ago by eriktorbjorn

On second thought, this one might make more sense.

The idea, as I understand it, is that _radioMusicInitialized is used to keep track of whether or not the radio music started playing yet since you entered that room. (Not, as I first assumed, to keep track whether the correct radio music had been decided yet.)

The first patch I posted clears this variable when you leave the room. The second clears it when you enter the room. (Which is why I now think the second one makes more sense.)

comment:25 Changed 5 years ago by eriktorbjorn

I've committed the neverhood-radio.txt patch. I've been told that this is how the original game worked.

So that only leaves the Valgrind warnings, I guess.

comment:26 Changed 5 years ago by digitall

eriktorbjorn: Thanks for this. Yes, just that and I think the EVENT_QUIT handling and _system->quit() should be removed. They cause a premature exit which means the engine destructor is never called... However, removing these shows a few new leaks that should be fixed as well.

comment:27 Changed 5 years ago by digitall

A likely fix for the hall of records crashes was committed yesterday by john_doe as bdee71f27922d1ba146323374e4501ec462d687d

eriktorbjorn: So just the _system->quit() removal, leaks and other valgrind issues to go?

comment:28 Changed 3 years ago by sev-

Owner: set to johndoe123
Resolution: fixed
Status: newclosed

comment:29 Changed 3 years ago by sev-

Closing this. Valgriind could be addressed any time.

Note: See TracTickets for help on using tickets.