[4/5] msmpeg$: Drop inline assembly from ff_msmpeg4_pred_dc

Message ID 1405977670-749434-5-git-send-email-diego@biurrun.de
State New
Headers show

Commit Message

Diego Biurrun July 21, 2014, 9:21 p.m.
Inline assembly has no place in the general C code and the utility
of this microoptimization is doubtful.
---

This is a troll patch, I did not bother to benchmark.

 libavcodec/msmpeg4.c | 27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

Comments

Vittorio Giovara July 23, 2014, 7:12 a.m. | #1
On Mon, Jul 21, 2014 at 10:21 PM, Diego Biurrun <diego@biurrun.de> wrote:
> Inline assembly has no place in the general C code and the utility
> of this microoptimization is doubtful.
> ---
this might be ok, a very small benchmark would be appreciated anyway.
also typo in commit, it should be m$mpeg4 ;)
Diego Biurrun Aug. 8, 2014, 8:38 p.m. | #2
On Mon, Jul 21, 2014 at 02:21:09PM -0700, Diego Biurrun wrote:
> Inline assembly has no place in the general C code and the utility
> of this microoptimization is doubtful.
> ---
> 
> This is a troll patch, I did not bother to benchmark.

I had another look at this, the function is only used in other functions
that are part of init code and in vc1_decode_frame under
'if (!context_initialized)', i.e. only once.

This is a silly microoptimization and thus not worth benchmarking IMO...

Diego
Luca Barbato Aug. 8, 2014, 9:39 p.m. | #3
On 08/08/14 22:38, Diego Biurrun wrote:
> On Mon, Jul 21, 2014 at 02:21:09PM -0700, Diego Biurrun wrote:
>> Inline assembly has no place in the general C code and the utility
>> of this microoptimization is doubtful.
>> ---
>>
>> This is a troll patch, I did not bother to benchmark.

Since you are on linux you should use perf, it makes such stuff
non-issues. (as in, DO benchmark even silly stuff).

> I had another look at this, the function is only used in other functions
> that are part of init code and in vc1_decode_frame under
> 'if (!context_initialized)', i.e. only once.
> 
> This is a silly microoptimization and thus not worth benchmarking IMO...

context_initalized is one of those strange stuff from mpegvideo...

lu
Diego Biurrun Aug. 8, 2014, 9:58 p.m. | #4
On Fri, Aug 08, 2014 at 10:38:01PM +0200, Diego Biurrun wrote:
> On Mon, Jul 21, 2014 at 02:21:09PM -0700, Diego Biurrun wrote:
> > Inline assembly has no place in the general C code and the utility
> > of this microoptimization is doubtful.
> > ---
> > 
> > This is a troll patch, I did not bother to benchmark.
> 
> I had another look at this, the function is only used in other functions
> that are part of init code and in vc1_decode_frame under
> 'if (!context_initialized)', i.e. only once.
> 
> This is a silly microoptimization and thus not worth benchmarking IMO...

Scratch that, what was I smoking ...

Diego
Kieran Kunhya Aug. 9, 2014, 12:09 a.m. | #5
>> This is a silly microoptimization and thus not worth benchmarking IMO...
>
> Scratch that, what was I smoking ...

Why are you not benchmarking using the timer functions like we would normally?
Luca Barbato Aug. 9, 2014, 1:25 a.m. | #6
On 09/08/14 02:09, Kieran Kunhya wrote:
>>> This is a silly microoptimization and thus not worth benchmarking IMO...
>>
>> Scratch that, what was I smoking ...
> 
> Why are you not benchmarking using the timer functions like we would normally?

Perf doesn't require hack the code and is incredibly simple to use.

lu
Kieran Kunhya Aug. 9, 2014, 1:31 a.m. | #7
> Perf doesn't require hack the code and is incredibly simple to use.

It doesn't give you a per function measurement as far as I know.
Luca Barbato Aug. 9, 2014, 1:44 a.m. | #8
On 09/08/14 03:31, Kieran Kunhya wrote:
>> Perf doesn't require hack the code and is incredibly simple to use.
> 
> It doesn't give you a per function measurement as far as I know.

perf annotate can go down to the single instructions that I recall.

perf report is enough for the purpose at hand.

lu

Patch

diff --git a/libavcodec/msmpeg4.c b/libavcodec/msmpeg4.c
index 95b5c93..caad6a4 100644
--- a/libavcodec/msmpeg4.c
+++ b/libavcodec/msmpeg4.c
@@ -218,31 +218,6 @@  int ff_msmpeg4_pred_dc(MpegEncContext *s, int n,
         b=c=1024;
     }
 
-    /* XXX: the following solution consumes divisions, but it does not
-       necessitate to modify mpegvideo.c. The problem comes from the
-       fact they decided to store the quantized DC (which would lead
-       to problems if Q could vary !) */
-#if ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE
-    __asm__ volatile(
-        "movl %3, %%eax         \n\t"
-        "shrl $1, %%eax         \n\t"
-        "addl %%eax, %2         \n\t"
-        "addl %%eax, %1         \n\t"
-        "addl %0, %%eax         \n\t"
-        "mull %4                \n\t"
-        "movl %%edx, %0         \n\t"
-        "movl %1, %%eax         \n\t"
-        "mull %4                \n\t"
-        "movl %%edx, %1         \n\t"
-        "movl %2, %%eax         \n\t"
-        "mull %4                \n\t"
-        "movl %%edx, %2         \n\t"
-        : "+b" (a), "+c" (b), "+D" (c)
-        : "g" (scale), "S" (ff_inverse[scale])
-        : "%eax", "%edx"
-    );
-#else
-    /* Divisions are costly everywhere; optimize the most common case. */
     if (scale == 8) {
         a = (a + (8 >> 1)) / 8;
         b = (b + (8 >> 1)) / 8;
@@ -252,7 +227,7 @@  int ff_msmpeg4_pred_dc(MpegEncContext *s, int n,
         b = FASTDIV((b + (scale >> 1)), scale);
         c = FASTDIV((c + (scale >> 1)), scale);
     }
-#endif
+
     /* XXX: WARNING: they did not choose the same test as MPEG4. This
        is very important ! */
     if(s->msmpeg4_version>3){