Opened 16 years ago

Closed 16 years ago

Last modified 8 months ago

#8171 closed patch

CMI: Experimental language.tab optimization

Reported by: eriktorbjorn Owned by: SF/ender
Priority: normal Component: Engine: SCUMM
Keywords: Cc:
Game: Monkey Island 3

Description

This is an experimental patch to speed of the
language.tab handling for CMI. Instead of keeping the
whole language.tab file in memory and do a linear
search every time a text has to be translated, I build
an index (a sorted array) of it and use binary search
to look up the offset.

At first I had planned to use a balanced tree, but I
abandoned that idea when I realized it would easily
double the size of the index. Since it's already 118 KB
I don't want to do that. Besides, qsort() and bsearch()
are almost certainly both less error-prone and better
optimized than anything I could come up with.

I guess it shouldn't be too hard to extend this
mechanism to also cover The Dig, but I don't have any
way to test that.

The difference is particularly noticeable when the game
is displaying a conversation menu. The CPU usage still
rises alarmingly, leading me to believe that there are
further optimizations to do there.

Ticket imported from: #682981. Ticket imported from: patches/276.

Attachments (2)

cmi-language.diff (4.8 KB) - added by eriktorbjorn 16 years ago.
Patch against a February 7 CVS snapshot
cmi-language2.diff (4.8 KB) - added by eriktorbjorn 16 years ago.
Updated patch against a February 8 CVS snapshot

Download all attachments as: .zip

Change History (11)

Changed 16 years ago by eriktorbjorn

Attachment: cmi-language.diff added

Patch against a February 7 CVS snapshot

comment:1 Changed 16 years ago by fingolfin

If you used the Map<> template, the code would probably be simpler & shorter, wouldn't it?

comment:2 Changed 16 years ago by fingolfin

Of course that would be just the balanced tree you chose to avoid, although I question the memory savings.<shrug> Endy will disagree because he still believes in the illusion that we can one day make a GBA port, but I think that we have many more pressing issues than that. Oh well.

comment:3 Changed 16 years ago by eriktorbjorn

Quite possible, but I don't understand templates and I don't
have any C++ reference book. Feel free to improve it. ;-)

comment:4 Changed 16 years ago by fingolfin

Fair enough, though the usage is pretty easy, I am sure you could pick it up.. for an example, look at scumm.h:

typedef ScummVM::Map<ScummVM::String, int> ObjectIDMap;

-> we just declared a new type, ObjectIDMap, which maps ScummVM::Strings (the keys) to ints. The member var _objectIDMap is of that type.
In resource.cpp, we insert entries by writing:
_objectIDMap[buffer] = i;
And in object.cpp we do lookups this way:
id3 = _objectIDMap[imhd->v8.name];

That's it. The lookup will assert if you specify a key which is not contained in the map, to find out if a key is in the map you can write _objectIDMap.contains(key). See also common/map.h for other methods.

The biggest drawback of the current implementation: it doesn't rebalance the tree when you insert/add keys, ouch. That's not a problem if the keys are in semi-random order, but if they are sorted, you end up of course with an extremly unbalanced tree. That should be rewritten... though of course life would be much easier if we used std::map which uses a highly optimized red-black tree.

comment:5 Changed 16 years ago by eriktorbjorn

I didn't realize the Map<> template was our own code.

The balancing may be a problem. At least in the English
version, language.tab is "semi-sorted", i.e. there are
blocks of sorted keys, but the blocks aren't necessarily sorted.

Gah! I just noticed that some of the keys in language.tab
are *eight* characters long. I thought max was *seven*. That
means the node structure has to be modified to store a
nine-byte tag (since I store null-terminated strings).

The way I figured a tree would double the size of the node
structure is that to implement an AVL tree (because unlike
red-black trees, I own a book which describes AVL trees
reasonably well :-) I'd have to add three pointers (left,
right and parent) and balancing information. With four-byte
pointers that adds up to at least 13 extra bytes, in
addition to the 12 there already are.

Of course, this could be optimized. It should be possible to
replace the pointers with two-byte array indexes, use a hash
function for the tag, etc. But I wanted to keep it simple,
at least to begin with.

Changed 16 years ago by eriktorbjorn

Attachment: cmi-language2.diff added

Updated patch against a February 8 CVS snapshot

comment:6 Changed 16 years ago by eriktorbjorn

I think this version fixes the error I mentioned in my
previoius comment. I've changed the definition of struct
langIndexNode, and loadLanguageBundle().

Looks like I accidentally already had eight (instead of
seven) as the upper limit in translateText(), so I've left
that unchanged.

comment:7 Changed 16 years ago by SF/ender

Owner: set to SF/ender
Status: newclosed

comment:8 Changed 16 years ago by SF/ender

Second version commited, so we have something to work by.
Closing - reopen it if you have a newer version :)

comment:9 Changed 8 months ago by digitall

Component: Engine: SCUMM
Game: Monkey Island 3
Note: See TracTickets for help on using tickets.