Opened 2 years ago

Closed 2 years ago

#10113 closed enhancement (wontfix)

TITANIC: Optimization improvements for low-end systems

Reported by: dafioram Owned by: dafioram
Priority: normal Component: Engine: Titanic
Keywords: Cc:
Game: Starship Titanic

Description

I was able to complete the game on my Raspberry Pi 3 and it was very playable. I think the user experience on this and other low-end systems could be greatly improved by targeting the parts of the game that were the laggiest during my playthrough.

I will now list the specific parts, ordered from most to least laggy:

  1. Bellbot walking in/out when called Demo: https://streamable.com/qlq81
  2. Deskbot animating Demo: https://streamable.com/96asy
  3. Opening SGT room and riding SGT minilifts up/down
  4. Arborteum, opening/closing vines to season exhibit
  5. Arborteum, raining animation in season exhibit
  6. Starlings Flying around before fan is on
  7. Saving a games take 6 seconds (loading is fast)
  8. Riding Pellerator
  9. Parrot Animating
  10. Final Star Puzzle

I had --release-mode as a config parameter.

My playthrough started on 660f7bf11479222689cf77060077fe47d65a465e (August 8th) and then jumped to 2df37f4eb0dc661c0159ca397e66ef6fe19de64c (August 10th).

Attachments (8)

titanic-win.030 (109.5 KB ) - added by dafioram 2 years ago.
Arboretum Winter (Rowbot singing)
titanic-win.021 (106.6 KB ) - added by dafioram 2 years ago.
Starlings flying around
titanic-win.044 (105.5 KB ) - added by dafioram 2 years ago.
After finishing titania, bridge is accessible and star puzzle
profileNoPR.txt (11.8 KB ) - added by dafioram 2 years ago.
profile without PR975 applied
profilePR975.txt (12.1 KB ) - added by dafioram 2 years ago.
profile with PR975 applied
prof1fc708b43f.txt (12.0 KB ) - added by dafioram 2 years ago.
Profile with blit changes
profiled7d75d97fda6.txt (11.6 KB ) - added by dafioram 2 years ago.
Profile without blit changes
gprofRPI3_1fc708b43f.zip (433.1 KB ) - added by dafioram 2 years ago.
gprof of titanic on rpi3 with release for commit with wjp blit changes


Change History (18)

by dafioram, 2 years ago

Attachment: titanic-win.030 added

Arboretum Winter (Rowbot singing)

by dafioram, 2 years ago

Attachment: titanic-win.021 added

Starlings flying around

by dafioram, 2 years ago

Attachment: titanic-win.044 added

After finishing titania, bridge is accessible and star puzzle

comment:1 by hamakei, 2 years ago

The game does seem to use a LOT of CPU time... I think a lot of it is from video playback. I'm running a quad-core 3.9 GHz system here, and frequently see my CPU usage go up to anywhere from 80-130% during elevator or rowboat rides. Considering the original could run on a 100 MHz Pentium, that's quite a leap.

comment:2 by wjp, 2 years ago

Two large parts of this are video stream reading ( https://github.com/scummvm/scummvm/pull/975 tries to take care of part of this), and the blitting of video frames, which is currently rather inefficient due to an unnecessarily large amount of logic in the blitter inner loops.
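To illustrate the kind of inner-loop overhead meant here, the following is a minimal sketch only; it is not ScummVM's blitter. The `Surface` type and the functions `blitNaive` and `blitHoisted` are hypothetical names invented for this example, and 16-bit pixels with a simple color key are assumptions. The first function repeats clipping and mode checks for every pixel; the second clips once per blit and chooses the copy mode once per row, so the opaque case reduces to a `memcpy`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical surface type for illustration; NOT Graphics::ManagedSurface.
struct Surface {
    int w, h;
    std::vector<uint16_t> pixels; // row-major, pitch equals width
    Surface(int w_, int h_, uint16_t fill = 0)
        : w(w_), h(h_), pixels(static_cast<size_t>(w_) * h_, fill) {}
    uint16_t *row(int y) { return pixels.data() + static_cast<size_t>(y) * w; }
    const uint16_t *row(int y) const { return pixels.data() + static_cast<size_t>(y) * w; }
};

// Slow variant: bounds and transparency are re-checked for every pixel.
void blitNaive(Surface &dst, const Surface &src, int dx, int dy,
               bool useKey, uint16_t key) {
    for (int y = 0; y < src.h; ++y) {
        for (int x = 0; x < src.w; ++x) {
            int tx = dx + x, ty = dy + y;
            if (tx < 0 || ty < 0 || tx >= dst.w || ty >= dst.h)
                continue; // per-pixel clipping
            uint16_t p = src.row(y)[x];
            if (useKey && p == key)
                continue; // per-pixel mode check
            dst.row(ty)[tx] = p;
        }
    }
}

// Faster variant: clip the rectangle once, then run a tight loop per row.
void blitHoisted(Surface &dst, const Surface &src, int dx, int dy,
                 bool useKey, uint16_t key) {
    int x0 = std::max(0, -dx), y0 = std::max(0, -dy);
    int x1 = std::min(src.w, dst.w - dx), y1 = std::min(src.h, dst.h - dy);
    if (x0 >= x1 || y0 >= y1)
        return; // nothing visible after clipping
    const int n = x1 - x0;
    for (int y = y0; y < y1; ++y) {
        const uint16_t *s = src.row(y) + x0;
        uint16_t *d = dst.row(dy + y) + dx + x0;
        if (!useKey) {
            // Opaque rows need no per-pixel logic at all.
            std::memcpy(d, s, static_cast<size_t>(n) * sizeof(uint16_t));
        } else {
            for (int i = 0; i < n; ++i)
                if (s[i] != key)
                    d[i] = s[i];
        }
    }
}
```

Both functions produce the same destination pixels; only the amount of work per pixel differs. Whether a change of this shape helps in practice would need to be confirmed by profiling on the target hardware.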

comment:3 by dafioram, 2 years ago

I tried an earlier commit on my rpi3 with PR975 merged in and I didn't notice a significant difference in the lagginess for the first two spots in the game that I identified above (bellbot and deskbot).

So it is a step in the right direction, but I hope more work is done beyond PR975. 90% of the game runs smooth as butter, so I am not asking for optimization for the sake of optimization, but am really trying to focus it on key aspects where a speed improvement would be noticeable from a player's standpoint.

I need to add another item to the list and it is:

  11. Waiting for the text parser to respond after the user hits enter on typed-in text.

comment:4 by wjp, 2 years ago

Have you tried profiling any of these scenes on your rpi3?

comment:5 by dafioram, 2 years ago

I haven't done that yet. Were you thinking that I should build scummvm with profiling on, or use something more like valgrind?

I don't think it would be possible to run valgrind with scummvm running on my rpi3. I can run valgrind on a more powerful machine; that won't reveal the exact limitations that my Raspberry Pi 3 is subject to, but it should reveal which resources are being used heavily.

comment:6 by wjp, 2 years ago

The former.

comment:7 by dafioram, 2 years ago

I built scummvm with the profiler option, and when I run it I get a gmon.out file, but when I try to run gprof on it, gprof says that scummvm has no symbols. Any ideas?

by dafioram, 2 years ago

Attachment: profileNoPR.txt added

profile without PR975 applied

by dafioram, 2 years ago

Attachment: profilePR975.txt added

profile with PR975 applied

comment:8 by dafioram, 2 years ago

I was able to reproduce the lag on my Linux VM. It does seem to be a CPU limitation. While playing, top showed the CPU hitting 100% while memory use stayed at only about 3-4% of 1GB (the RPI3 has 1GB). I ran the profiler on the 1GB, 1-CPU (3.2GHz) setup against commit 42c6f68f7a14772ade8902826a308d3fbc2f82cf (Aug 20), first without PR975 applied and then with it applied.

For the test I spent about 60 seconds in scummvm. I called the bellbot and talked to him a bunch of times, since he animates when you talk to him and that causes lag.

I only kept about the first 100 lines so that the files would not be huge.

by dafioram, 2 years ago

Attachment: prof1fc708b43f.txt added

Profile with blit changes

by dafioram, 2 years ago

Attachment: profiled7d75d97fda6.txt added

Profile without blit changes

comment:9 by dafioram, 2 years ago

I ran both versions on my Linux VM in release mode. Both ran smoothly, so next I will need to run them on the RPI3 to see if there is a visible difference.

I was talking to the bellbot a lot so that is why there are so many TTscriptMappings::load() calls. I did that because talking to the bellbot has him animate, so the TTscriptMappings::load() calls can be ignored, although that might be an area to improve also.

Graphics::ManagedSurface::blitFrom went from the 5th top item to the 6th and it takes about 80% as long as it did before (50 msecs shorter).

Titanic::CVideoSurface::transBlitRect was called about the same number of times in both, but it takes nearly twice as long in the build with the blit changes (230 msecs vs. 390 msecs). So maybe inlining Titanic::CVideoSurface::transBlitRect would help.

by dafioram, 2 years ago

Attachment: gprofRPI3_1fc708b43f.zip added

gprof of titanic on rpi3 with release for commit with wjp blit changes

comment:10 by dafioram, 2 years ago

Owner: set to dafioram
Resolution: wontfix
Status: new → closed