Opened 13 years ago

Closed 13 years ago

Last modified 13 months ago

#8660 closed patch (outdated)

ARM: Inline assembly for saturated addition of sound

Reported by: SF/robinwatts Owned by: SF/knakos
Priority: normal Component: Audio
Keywords: Cc:
Game:

Description

The "clampedAdd" function is used to do saturated addition of sound sample values. There is a trick that can be used on ARM chips to do this operation that cannot easily be expressed in C.

This patch uses gcc's inline assembler to implement the trick.

Robin

Ticket imported from: #1717959. Ticket imported from: patches/765.

Attachments (1)

diff (672 bytes ) - added by SF/robinwatts 13 years ago.
Inline assembly ARM version of clampedAdd

Download all attachments as: .zip

Change History (14)

by SF/robinwatts, 13 years ago

Attachment: diff added

Inline assembly ARM version of clampedAdd

comment:1 by fingolfin, 13 years ago

Owner: set to SF/knakos

comment:2 by SF/knakos, 13 years ago

Summary: WINCE Inline ARM assembly for saturated addition of soundARM: Inline assembly for saturated addition of sound

comment:3 by SF/knakos, 13 years ago

1) I'm dropping the 'WINCE' summary prefix as this applies to all arm ports.

2) The routine looks good, implementing the clamped addition by means of the overflow flag and conditional execution trickery. Note that the ST_SAMPLE_MAX and ST_SAMPLE_MIN are implied since all calculations take place in the 16 msbits of the regs. Aside from the code compactness, the main gain here is the non-conditional branching way this is coded, which results in no instruction pipeline stalls. The compiler won't implement such stuff this efficiently although I haven't check it to make sure.

So two questions arise: 1) Does this make so much a difference in speed of execution? I get that it's real-time etc, but do we really need it? Don't get me wrong, I get anything I can take and would incorporate such a routine immediately, but considering also question 2) Should we clutter the scummvm core with port-specific routines? This is not something I can answer and would expect a response from the team leaders before going through with it. Also to be taken into account is the fact that this code resides in a header file, which gets less in the way when browsing code.

comment:4 by SF/robinwatts, 13 years ago

1) Does it make that much difference to the speed of execution - well, staring at the assembler output, it looks to save a couple of cycles per sample. Normally, I'd reckon that was well worth having in a sound fill routine - but looking at how the rest of the routine is compiled by gcc, I'm not sure it'll actually make that much difference :(

2) Should we clutter the core with port specific routines - I understand your reservations here, and I agree with them in principle - nothing can be more nasty to read than a huge mass of ifdefs. If the scummvm team wish to refuse the patch for this reason I'd entirely understand.

Other parts of the system (such as scalers) do appear to have assembler versions, but these are mostly in separate files - maybe I should consider just ARM coding the complete sound fill routines?

comment:5 by SF/robinwatts, 13 years ago

I have ARM code versions of the Simple and Copy rate controllers working here. I intend to look at an ARM version of the Linear rate controller this weekend. So this patch is probably irrelavent now.

comment:6 by SF/knakos, 13 years ago

Deassigning myself until:

1) Robin submits another patch with more functions (in which case we can close this), or
2) Until we get a response from the team leaders about their intention include such code fragments in the core (see cluttering argument below)

comment:7 by SF/knakos, 13 years ago

Owner: SF/knakos removed

comment:8 by sev-, 13 years ago

Well, clobbering our core code with assembly code is not acceptable.

It could be addressed at least this way:

#if !defined(ARM)
static inline void clampedAdd(int16& a, int b) {
...
#else
#include "my/arm/specific/code.cpp"
#endif

At least this will be less distractive.

comment:9 by SF/robinwatts, 13 years ago

Status: newclosed

comment:10 by SF/robinwatts, 13 years ago

I have raised a new patch (1721826) with fully ARM code optimised versions of the rate conversion/mixing code. These work on WinCE (for all the games I have tested them with), and should work on other ARM based platforms too.

Hopefully the way that patch is structured (all changes in new files) will be acceptable to your 'neatness' criteria.

I am therefore closing this patch. I trust this is OK?

Thanks,

Robin

comment:11 by sev-, 13 years ago

Yes, this is fine.

comment:12 by sev-, 13 years ago

Owner: set to SF/knakos
Resolution: outdated

comment:13 by digitall, 13 months ago

Component: Audio
Note: See TracTickets for help on using tickets.