h264: change deblock_h_chroma_8_mmxext() to prevent valgrind confusion.

Message ID 1329523001-96942-1-git-send-email-martin@martin.st
State Superseded
Headers show

Commit Message

Martin Storsjö Feb. 17, 2012, 11:56 p.m.
From: Michael Niedermayer <michaelni@gmx.at>

This fixes valgrind warnings about using uninitialized data
(bug 181).
---
 libavcodec/x86/h264_deblock.asm |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

Comments

Luca Barbato Feb. 18, 2012, 3:13 a.m. | #1
On 2/18/12 12:56 AM, Martin Storsjö wrote:
> From: Michael Niedermayer<michaelni@gmx.at>
>
> This fixes valgrind warnings about using uninitialized data
> (bug 181).

Why that happens? And more importantly, if ff_chroma_inter_body_mmxext 
gets inlined it should be stated in the comment.

I doubt that function will be changed but still, looks strange.

lu

> ---
>   libavcodec/x86/h264_deblock.asm |    6 +++++-
>   1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
> index f264edb..8335177 100644
> --- a/libavcodec/x86/h264_deblock.asm
> +++ b/libavcodec/x86/h264_deblock.asm
> @@ -835,7 +835,11 @@ cglobal deblock_h_chroma_8_mmxext, 5,7
>       TRANSPOSE4x8_LOAD  bw, wd, dq, PASS8ROWS(t5, r0, r1, t6)
>       movq  buf0, m0
>       movq  buf1, m3
> -    call ff_chroma_inter_body_mmxext
> +    LOAD_MASK  r2d, r3d
> +    movd       m6, [r4] ; tc0
> +    punpcklbw  m6, m6
> +    pand       m7, m6
> +    DEBLOCK_P0_Q0
>       movq  m0, buf0
>       movq  m3, buf1
>       TRANSPOSE8x4B_STORE PASS8ROWS(t5, r0, r1, t6)
Diego Biurrun Feb. 18, 2012, 10:44 a.m. | #2
On Sat, Feb 18, 2012 at 01:56:41AM +0200, Martin Storsjö wrote:
> 
> This fixes valgrind warnings about using uninitialized data
> (bug 181).

Does bug 181 refer to our Bugzilla?

Diego
Martin Storsjö Feb. 18, 2012, 10:45 a.m. | #3
On Sat, 18 Feb 2012, Diego Biurrun wrote:

> On Sat, Feb 18, 2012 at 01:56:41AM +0200, Martin Storsjö wrote:
>>
>> This fixes valgrind warnings about using uninitialized data
>> (bug 181).
>
> Does bug 181 refer to our Bugzilla?

Yes.

// Martin
Martin Storsjö Feb. 18, 2012, 10:46 a.m. | #4
On Sat, 18 Feb 2012, Luca Barbato wrote:

> On 2/18/12 12:56 AM, Martin Storsjö wrote:
>> From: Michael Niedermayer<michaelni@gmx.at>
>> 
>> This fixes valgrind warnings about using uninitialized data
>> (bug 181).
>
> Why that happens?

Not sure exactly, but this patch seems to fix it.

> And more importantly, if ff_chroma_inter_body_mmxext gets inlined it 
> should be stated in the comment.

Ok, I can add it to the commit message.

// Martin
Ronald Bultje Feb. 19, 2012, 12:15 a.m. | #5
Hi,

On Fri, Feb 17, 2012 at 3:56 PM, Martin Storsjö <martin@martin.st> wrote:
> From: Michael Niedermayer <michaelni@gmx.at>
>
> This fixes valgrind warnings about using uninitialized data
> (bug 181).
> ---
>  libavcodec/x86/h264_deblock.asm |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
> index f264edb..8335177 100644
> --- a/libavcodec/x86/h264_deblock.asm
> +++ b/libavcodec/x86/h264_deblock.asm
> @@ -835,7 +835,11 @@ cglobal deblock_h_chroma_8_mmxext, 5,7
>     TRANSPOSE4x8_LOAD  bw, wd, dq, PASS8ROWS(t5, r0, r1, t6)
>     movq  buf0, m0
>     movq  buf1, m3
> -    call ff_chroma_inter_body_mmxext
> +    LOAD_MASK  r2d, r3d
> +    movd       m6, [r4] ; tc0
> +    punpcklbw  m6, m6
> +    pand       m7, m6
> +    DEBLOCK_P0_Q0


[8:42am] <BBB>
wbs: I wish Jumpyshoes would comment
[8:42am] <BBB>
Jumpyshoes: ping?
[8:42am] <BBB>
or perhaps pengvado/dark_shikari
[8:53am] <BBB>
wbs: I don't mind the inlining of that function, but it'd help if we
knew what exactly was different between the two and if it's worth
spending space on
[9:30am] <pengvado>
BBB: valgrind false positive on unix64, but real error on win64.
[9:30am] <pengvado>
in particular, valgrind detects that deblock_h_chroma_8_mmxext stores
data on the stack below rsp across a function call
[9:30am] <pengvado>
which is a violation of calling convention, but just fine in asm
[9:30am]
wm4 left the chat room. (Quit: Leaving)
[9:31am] <pengvado>
<BBB> it's not some important optimization <-- as compared to xmm?
there is no xmm version of this function, because it's only 8 bytes
wide
[9:48am] <BBB>
pengvado: "real error on win64" - why? I thought redzone was legal on any win64?
[10:23am] <pengvado>
BBB: no

So in other words, the valgrind warning is a false positive on unix64,
since yes we use red zone and call at the same time, but that's ok
since the called function is aware of it, since both are handwritten
asm.

Win64, on the other hand, has no redzone and thus the behaviour is
illegal. Therefore, the patch above needs a %if WIN64 around it.

Ronald
Jason Garrett-Glaser Feb. 19, 2012, 2:36 a.m. | #6
%if WIN64ing that is incredibly ugly, I'd rather just deinline it
period, or %if WIN64 the stack padding.

Jason
Loren Merritt Feb. 19, 2012, 2:37 a.m. | #7
On Sat, 18 Feb 2012, Ronald S. Bultje wrote:

> Win64, on the other hand, has no redzone and thus the behaviour is
> illegal. Therefore, the patch above needs a %if WIN64 around it.

Or instead of inlining, you could change the spill on win64.

--Loren Merritt

Patch

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index f264edb..8335177 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -835,7 +835,11 @@  cglobal deblock_h_chroma_8_mmxext, 5,7
     TRANSPOSE4x8_LOAD  bw, wd, dq, PASS8ROWS(t5, r0, r1, t6)
     movq  buf0, m0
     movq  buf1, m3
-    call ff_chroma_inter_body_mmxext
+    LOAD_MASK  r2d, r3d
+    movd       m6, [r4] ; tc0
+    punpcklbw  m6, m6
+    pand       m7, m6
+    DEBLOCK_P0_Q0
     movq  m0, buf0
     movq  m3, buf1
     TRANSPOSE8x4B_STORE PASS8ROWS(t5, r0, r1, t6)