Opened 15 years ago

Closed 15 years ago

Last modified 5 years ago

#4345 closed defect (fixed)

Loading large Smacker movies is slow

Reported by: eriktorbjorn Owned by: bluegr
Priority: normal Component: Video
Version: Keywords:
Cc: Game: Broken Sword 2


Current SVN snapshot

Loading a large Smacker movie, such as the Broken Sword 2 intro, is pretty slow at the moment. (By slow, I mean there's a one-second delay or so on my fairly new computer.) It looks to me like most of the delay is in the BigHuffmanTree::decodeTree() function, but I'm not sure exactly what it's doing so I can't suggest any way of speeding it up.

Ticket imported from: #2794216. Ticket imported from: bugs/4345.

Change History (15)

comment:1 by Kirben, 15 years ago

This problem is even more noticeable in the opening sequences of The Feeble Files, which uses several video files.

The Feeble Files demo (which includes opening sequence) could be used for testing, if anyone doesn't have the game.

comment:2 by fingolfin, 15 years ago

Our smacker decoding code looks a lot less efficient then what FFmpeg has (see <;a=blob;f=libavcodec/smacker.c>, their SVN repository seems to be gone?!). In particular, we do bad things like using Common::Array where they had a simple fixed size array, etc.

Assuming that the FFmpeg code is performant / efficient, I think it should be possible to improve the speed of our code to similar levels.

comment:3 by eriktorbjorn, 15 years ago

The header has fields for mMapSize, mClrSize, fullSize and typeSize. They can probably be used to calculate the size of the array that's needed. If you multiply them by four you get *almost*, but not quite, the final size of the _tree arrays. At least for the few cases I've tried.

comment:4 by SF/madm00se, 15 years ago

The BigHuffmanTree uses a prefix lookup table to speed up decodes, but the SmallHuffmanTrees do not. I didn't do that at first because the small trees are only used during setup and I never got around to it because it wasn't necessary for me. I would suggest that adding this, and figuring out if memory can be pre-allocated, be the first approach to speeding up initial setup.

The bigtree prefix tables use 8 bits of lookup, some googling suggests 9 bits is recommended. This might be a secondary improvement.

comment:5 by fingolfin, 15 years ago

Also note that our Common::Array code is riddled with bounds checking and other assert() statements, which won't help speed either.

Anyway, I just reactivated my old 400 Mhz G4 Mac, which should serve as a great testbed for profiling the smacker code on a low end device (and a big endian one at that, though I'd hope the code is already endian safe ;)

comment:6 by eriktorbjorn, 15 years ago

I timed it a bit more carefully, and on my computer (which claims to have a 3.4 GHz CPU), the delay before the Broken Sword 2 intro was actually a little over 3 seconds.

There were some optimizations made earlier today (pre-allocating a little bit of memory for the trees, and fixes to the Array class, I believe), which has brought the time down to a little over 100 milliseconds, which certainly is good enough on my hardware. I'd be curious to hear the before/after figures for Fingolfin's Mac, though.

comment:7 by SF/madm00se, 15 years ago

I implemented prefix lookup for the SmallHuffmanTree class like it's done already for the BigHuffmanTree.

I can't attach to this bug report so I pasted it here:

Please test if it improves the speed of initialization.

comment:8 by eriktorbjorn, 15 years ago

That patch brings down the load time for the Broken Sword 2 intro from a little over 100 ms to a little over 40 ms on my computer.

comment:9 by SF/madm00se, 15 years ago

This patch implements both the prefix lookup for the SmallHuffmanTree class and it preallocates the tree storage as plain c++ arrays.

It's been tested (lightly) on the BS1 intro sequence.

comment:10 by SF/madm00se, 15 years ago

I apparently forgot to delete[] the _tree-member from the SmallHuffmanTree. That's what you get for switching to plain arrays :-)

I don't have a source tree on this machine (or even svn) so if anybody applies this patch, please add a destructor to the SmallHuffmanTree-class that delete[]s the _tree-member.

comment:11 by SF/madm00se, 15 years ago

Here's another patch against trunk.

This changes the BitStream class to not require a two byte look-ahead, the _trees are now fixed sized, and there's some cleanup.

As before, it works on the BS1 intro sequence. I have not tested it on anything else.

comment:12 by eriktorbjorn, 15 years ago

The latest patch brings the time down to a bit less than 30 ms. (Actually, I've only been timing the creation of the four big Huffman trees, since that was the bottleneck before.) I no longer trust the values to be exact, but it's at least an indication that the patch speeds things up a bit further.

I'd still be curious to hear what difference the patch and the recent changes make on low-end hardware, though.

comment:13 by bluegr, 15 years ago

Owner: set to bluegr
Resolution: fixed
Status: newclosed

comment:14 by bluegr, 15 years ago

Applied madm00se's patch with some minor modifications. I've tested this with the Smacker files of BS1, BS2, FF, FTA2 and Dinotopia, and they all seem to be working, with 0 loading times, so the slow loading issue seems to be gone with this patch.

Thanks for your work once again, madm00se :)


comment:15 by digitall, 5 years ago

Component: --Unset--Video
Game: Broken Sword 2
Note: See TracTickets for help on using tickets.