Opened 12 years ago

Closed 12 years ago

Last modified 8 months ago

#8660 closed patch (outdated)

ARM: Inline assembly for saturated addition of sound

Reported by: SF/robinwatts Owned by: SF/knakos
Priority: normal Component: Audio
Keywords: Cc:
Game:

Description

The "clampedAdd" function is used to do saturated addition of sound sample values. There is a trick that can be used on ARM chips to do this operation that cannot easily be expressed in C.

This patch uses gcc's inline assembler to implement the trick.

Robin

Ticket imported from: #1717959. Ticket imported from: patches/765.

Attachments (1)

diff (672 bytes) - added by SF/robinwatts 12 years ago.
Inline assembly ARM version of clampedAdd

Download all attachments as: .zip

Change History (14)

Changed 12 years ago by SF/robinwatts

Attachment: diff added

Inline assembly ARM version of clampedAdd

comment:1 Changed 12 years ago by fingolfin

Owner: set to SF/knakos

comment:2 Changed 12 years ago by SF/knakos

Summary: WINCE Inline ARM assembly for saturated addition of soundARM: Inline assembly for saturated addition of sound

comment:3 Changed 12 years ago by SF/knakos

1) I'm dropping the 'WINCE' summary prefix as this applies to all arm ports.

2) The routine looks good, implementing the clamped addition by means of the overflow flag and conditional execution trickery. Note that the ST_SAMPLE_MAX and ST_SAMPLE_MIN are implied since all calculations take place in the 16 msbits of the regs. Aside from the code compactness, the main gain here is the non-conditional branching way this is coded, which results in no instruction pipeline stalls. The compiler won't implement such stuff this efficiently although I haven't check it to make sure.

So two questions arise: 1) Does this make so much a difference in speed of execution? I get that it's real-time etc, but do we really need it? Don't get me wrong, I get anything I can take and would incorporate such a routine immediately, but considering also question 2) Should we clutter the scummvm core with port-specific routines? This is not something I can answer and would expect a response from the team leaders before going through with it. Also to be taken into account is the fact that this code resides in a header file, which gets less in the way when browsing code.

comment:4 Changed 12 years ago by SF/robinwatts

1) Does it make that much difference to the speed of execution - well, staring at the assembler output, it looks to save a couple of cycles per sample. Normally, I'd reckon that was well worth having in a sound fill routine - but looking at how the rest of the routine is compiled by gcc, I'm not sure it'll actually make that much difference :(

2) Should we clutter the core with port specific routines - I understand your reservations here, and I agree with them in principle - nothing can be more nasty to read than a huge mass of ifdefs. If the scummvm team wish to refuse the patch for this reason I'd entirely understand.

Other parts of the system (such as scalers) do appear to have assembler versions, but these are mostly in separate files - maybe I should consider just ARM coding the complete sound fill routines?

comment:5 Changed 12 years ago by SF/robinwatts

I have ARM code versions of the Simple and Copy rate controllers working here. I intend to look at an ARM version of the Linear rate controller this weekend. So this patch is probably irrelavent now.

comment:6 Changed 12 years ago by SF/knakos

Deassigning myself until:

1) Robin submits another patch with more functions (in which case we can close this), or
2) Until we get a response from the team leaders about their intention include such code fragments in the core (see cluttering argument below)

comment:7 Changed 12 years ago by SF/knakos

Owner: SF/knakos deleted

comment:8 Changed 12 years ago by sev-

Well, clobbering our core code with assembly code is not acceptable.

It could be addressed at least this way:

#if !defined(ARM)
static inline void clampedAdd(int16& a, int b) {
...
#else
#include "my/arm/specific/code.cpp"
#endif

At least this will be less distractive.

comment:9 Changed 12 years ago by SF/robinwatts

Status: newclosed

comment:10 Changed 12 years ago by SF/robinwatts

I have raised a new patch (1721826) with fully ARM code optimised versions of the rate conversion/mixing code. These work on WinCE (for all the games I have tested them with), and should work on other ARM based platforms too.

Hopefully the way that patch is structured (all changes in new files) will be acceptable to your 'neatness' criteria.

I am therefore closing this patch. I trust this is OK?

Thanks,

Robin

comment:11 Changed 12 years ago by sev-

Yes, this is fine.

comment:12 Changed 12 years ago by sev-

Owner: set to SF/knakos
Resolution: outdated

comment:13 Changed 8 months ago by digitall

Component: Audio
Note: See TracTickets for help on using tickets.