[2/2] arm: vp9itxfm: Use cbz for a short jump forward

Message ID 1479456399-18568-2-git-send-email-martin@martin.st
State Rejected

Commit Message

Martin Storsjö Nov. 18, 2016, 8:06 a.m.
---
The uses of cmp rX, #0, beq in the vp9 loop filter can't be converted
to cbz, since the branch targets are too far away.
---
 libavcodec/arm/vp9itxfm_neon.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
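
For context on the range limit mentioned in the note above: in Thumb-2, cbz only branches forward, to a target 4 to 130 bytes after the instruction, and only tests r0-r7. The following is a minimal sketch, not code from the patch; the label near_and_far and the filler instructions are hypothetical, and it assumes GNU as assembling for Thumb-2.

```asm
        .syntax unified
        .arch   armv7-a
        .thumb
        .text

@ Hypothetical function, for illustration only.
        .global near_and_far
        .thumb_func
near_and_far:
        cbz             r3,  1f         @ target is a few bytes ahead: encodable
        adds            r0,  r0,  #1
1:
        cmp             r3,  #0         @ target is far away: keep cmp + beq
        beq             2f
        .rept 100
        add.w           r0,  r0,  #1    @ 400 bytes of filler, beyond cbz range
        .endr
2:
        bx              lr
```

Replacing the second cmp/beq pair with cbz here would be expected to fail at assembly time, since the target is more than 130 bytes ahead.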

Comments

Janne Grunau Nov. 18, 2016, 7:32 p.m. | #1
On 2016-11-18 10:06:39 +0200, Martin Storsjö wrote:
> ---
> The uses of cmp rX, #0, beq in the vp9 loop filter can't be converted
> to cbz, since the branch targets are too far away.
> ---
>  libavcodec/arm/vp9itxfm_neon.S | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
> index 46d91b7..17db69b 100644
> --- a/libavcodec/arm/vp9itxfm_neon.S
> +++ b/libavcodec/arm/vp9itxfm_neon.S
> @@ -724,8 +724,7 @@ function \txfm\()16_1d_4x16_pass2_neon
>  .irp i, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
>          vld1.16         {d\i}, [r2,:64], r12
>  .endr
> -        cmp             r3,  #0
> -        beq             1f
> +        cbz             r3,  1f

Change the 'mov r3, #\i' into 'movs r3, #\i'. Since the
\txfm\()16_1d_4x16_pass2_neon functions are not public, we can pass
arguments in the CPSR bits.

Janne
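
A minimal sketch of the suggestion above, not code from the patch: the caller sets the flags with movs, and the internal function branches on them directly, so neither cmp nor cbz is needed. The names helper_pass2 and caller are hypothetical, the add is a stand-in for the real work, and the sketch assumes nothing between the movs and the beq modifies the flags (bl itself does not).

```asm
        .syntax unified
        .arch   armv7-a
        .thumb
        .text

@ Hypothetical internal helper: expects the caller to have set the Z flag
@ when the value in r3 is zero. Not callable through the normal ABI, which
@ makes no promise about flags on entry.
        .thumb_func
helper_pass2:
        beq             1f              @ Z set: skip the extra work
        add             r0,  r0,  #1    @ stand-in for the work that is skipped
1:
        bx              lr

@ Hypothetical caller.
        .global caller
        .thumb_func
caller:
        push            {r4, lr}
        movs            r3,  #0         @ sets r3 and updates Z in one instruction
        bl              helper_pass2    @ bl leaves the flags untouched
        pop             {r4, pc}
```

This only works while every instruction between the movs and the branch leaves the flags alone, which is why later changes in the same area could break it.
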
Martin Storsjö Nov. 18, 2016, 9:15 p.m. | #2
On Fri, 18 Nov 2016, Janne Grunau wrote:

> On 2016-11-18 10:06:39 +0200, Martin Storsjö wrote:
>> ---
>> The uses of cmp rX, #0, beq in the vp9 loop filter can't be converted
>> to cbz, since the branch targets are too far away.
>> ---
>>  libavcodec/arm/vp9itxfm_neon.S | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
>> index 46d91b7..17db69b 100644
>> --- a/libavcodec/arm/vp9itxfm_neon.S
>> +++ b/libavcodec/arm/vp9itxfm_neon.S
>> @@ -724,8 +724,7 @@ function \txfm\()16_1d_4x16_pass2_neon
>>  .irp i, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
>>          vld1.16         {d\i}, [r2,:64], r12
>>  .endr
>> -        cmp             r3,  #0
>> -        beq             1f
>> +        cbz             r3,  1f
>
> Change the 'mov r3, #\i' into 'movs r3, #\i'. Since the
> \txfm\()16_1d_4x16_pass2_neon functions are not public, we can pass
> arguments in the CPSR bits.

Thanks for the pointer. I've got some other changes potentially coming up
that touch the same area and might clobber the CPSR bits, though, so
I'll hold off on this part for now, at least until I'm done with those
changes.

// Martin

Patch

diff --git a/libavcodec/arm/vp9itxfm_neon.S b/libavcodec/arm/vp9itxfm_neon.S
index 46d91b7..17db69b 100644
--- a/libavcodec/arm/vp9itxfm_neon.S
+++ b/libavcodec/arm/vp9itxfm_neon.S
@@ -724,8 +724,7 @@  function \txfm\()16_1d_4x16_pass2_neon
 .irp i, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
         vld1.16         {d\i}, [r2,:64], r12
 .endr
-        cmp             r3,  #0
-        beq             1f
+        cbz             r3,  1f
 .irp i, 28, 29, 30, 31
         vld1.16         {d\i}, [r2,:64], r12
 .endr