Opened 17 years ago

Closed 17 years ago

Last modified 17 months ago

#8171 closed patch

CMI: Experimental language.tab optimization

Reported by: eriktorbjorn Owned by: SF/ender
Priority: normal Component: Engine: SCUMM
Keywords: Cc:
Game: Monkey Island 3

Description

This is an experimental patch to speed of the language.tab handling for CMI. Instead of keeping the whole language.tab file in memory and do a linear search every time a text has to be translated, I build an index (a sorted array) of it and use binary search to look up the offset.

At first I had planned to use a balanced tree, but I abandoned that idea when I realized it would easily double the size of the index. Since it's already 118 KB I don't want to do that. Besides, qsort() and bsearch() are almost certainly both less error-prone and better optimized than anything I could come up with.

I guess it shouldn't be too hard to extend this mechanism to also cover The Dig, but I don't have any way to test that.

The difference is particularly noticeable when the game is displaying a conversation menu. The CPU usage still rises alarmingly, leading me to believe that there are further optimizations to do there.

Ticket imported from: #682981. Ticket imported from: patches/276.

Attachments (2)

cmi-language.diff (4.8 KB ) - added by eriktorbjorn 17 years ago.
Patch against a February 7 CVS snapshot
cmi-language2.diff (4.8 KB ) - added by eriktorbjorn 17 years ago.
Updated patch against a February 8 CVS snapshot

Download all attachments as: .zip

Change History (11)

by eriktorbjorn, 17 years ago

Attachment: cmi-language.diff added

Patch against a February 7 CVS snapshot

comment:1 by fingolfin, 17 years ago

If you used the Map<> template, the code would probably be simpler & shorter, wouldn't it?

comment:2 by fingolfin, 17 years ago

Of course that would be just the balanced tree you chose to avoid, although I question the memory savings.<shrug> Endy will disagree because he still believes in the illusion that we can one day make a GBA port, but I think that we have many more pressing issues than that. Oh well.

comment:3 by eriktorbjorn, 17 years ago

Quite possible, but I don't understand templates and I don't have any C++ reference book. Feel free to improve it. ;-)

comment:4 by fingolfin, 17 years ago

Fair enough, though the usage is pretty easy, I am sure you could pick it up.. for an example, look at scumm.h:

typedef ScummVM::Map<ScummVM::String, int> ObjectIDMap;

-> we just declared a new type, ObjectIDMap, which maps ScummVM::Strings (the keys) to ints. The member var _objectIDMap is of that type. In resource.cpp, we insert entries by writing: _objectIDMap[buffer] = i; And in object.cpp we do lookups this way: id3 = _objectIDMap[imhd->v8.name];

That's it. The lookup will assert if you specify a key which is not contained in the map, to find out if a key is in the map you can write _objectIDMap.contains(key). See also common/map.h for other methods.

The biggest drawback of the current implementation: it doesn't rebalance the tree when you insert/add keys, ouch. That's not a problem if the keys are in semi-random order, but if they are sorted, you end up of course with an extremly unbalanced tree. That should be rewritten... though of course life would be much easier if we used std::map which uses a highly optimized red-black tree.

comment:5 by eriktorbjorn, 17 years ago

I didn't realize the Map<> template was our own code.

The balancing may be a problem. At least in the English version, language.tab is "semi-sorted", i.e. there are blocks of sorted keys, but the blocks aren't necessarily sorted.

Gah! I just noticed that some of the keys in language.tab are *eight* characters long. I thought max was *seven*. That means the node structure has to be modified to store a nine-byte tag (since I store null-terminated strings).

The way I figured a tree would double the size of the node structure is that to implement an AVL tree (because unlike red-black trees, I own a book which describes AVL trees reasonably well :-) I'd have to add three pointers (left, right and parent) and balancing information. With four-byte pointers that adds up to at least 13 extra bytes, in addition to the 12 there already are.

Of course, this could be optimized. It should be possible to replace the pointers with two-byte array indexes, use a hash function for the tag, etc. But I wanted to keep it simple, at least to begin with.

by eriktorbjorn, 17 years ago

Attachment: cmi-language2.diff added

Updated patch against a February 8 CVS snapshot

comment:6 by eriktorbjorn, 17 years ago

I think this version fixes the error I mentioned in my previoius comment. I've changed the definition of struct langIndexNode, and loadLanguageBundle().

Looks like I accidentally already had eight (instead of seven) as the upper limit in translateText(), so I've left that unchanged.

comment:7 by SF/ender, 17 years ago

Owner: set to SF/ender
Status: newclosed

comment:8 by SF/ender, 17 years ago

Second version commited, so we have something to work by. Closing - reopen it if you have a newer version :)

comment:9 by digitall, 17 months ago

Component: Engine: SCUMM
Game: Monkey Island 3
Note: See TracTickets for help on using tickets.