movenc: factor initial_padding into edit lists

Message ID 20170221150428.28031-1-stebbins@jetheaddev.com
State New
Headers show

Commit Message

John Stebbins Feb. 21, 2017, 3:04 p.m.
initial_padding was getting added to the edit list indirectly due to
initial negative dts.  But in cases where the audio is delayed,
all or part of initial_padding would be unaccounted for.  This patch
makes initial_padding explicit.
---
 libavformat/movenc.c | 58 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 16 deletions(-)

Comments

Martin Storsjö Feb. 21, 2017, 9:54 p.m. | #1
On Tue, 21 Feb 2017, John Stebbins wrote:

> initial_padding was getting added to the edit list indirectly due to
> initial negative dts.  But in cases where the audio is delayed,
> all or part of initial_padding would be unaccounted for.  This patch
> makes initial_padding explicit.
> ---
> libavformat/movenc.c | 58 +++++++++++++++++++++++++++++++++++++---------------
> 1 file changed, 42 insertions(+), 16 deletions(-)
>
> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
> index 689291d..d60e194 100644
> --- a/libavformat/movenc.c
> +++ b/libavformat/movenc.c
> @@ -1698,9 +1698,31 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>                                       track->timescale, AV_ROUND_UP);
>     int version = duration < INT32_MAX ? 0 : 1;
>     int entry_size, entry_count, size;
> -    int64_t delay, start_ct = track->start_cts;
> -    delay = av_rescale_rnd(track->start_dts + start_ct, MOV_TIMESCALE,
> -                           track->timescale, AV_ROUND_DOWN);
> +    int64_t delay;
> +    int64_t mediatime;
> +    int64_t skip = 0;
> +
> +    delay = track->start_dts + track->start_cts;
> +
> +    if (track->par->codec_type == AVMEDIA_TYPE_AUDIO &&
> +        track->par->initial_padding > 0) {
> +        /* Adjust delay so that initial_padding gets recorded in the
> +         * MediaTime of an edit list entry even in the case that
> +         * delay is positive. I.e. we don't want initial_padding to be
> +         * absorbed and hidden in the delay. MediaTime must contain
> +         * initial_padding in order to know where the actual media
> +         * timeline begins. A player should drop samples until MediaTime
> +         * is reached */
> +        delay += av_rescale_rnd(track->par->initial_padding, track->timescale,
> +                                track->par->sample_rate, AV_ROUND_UP);
> +        skip = av_rescale_rnd(track->par->initial_padding,
> +                              track->timescale,
> +                              track->par->sample_rate, AV_ROUND_DOWN);
> +    }
> +    /* rescale delay, this was not done earlier to minimize rounding errors */
> +    delay = av_rescale_rnd(delay, MOV_TIMESCALE,
> +                       track->timescale, AV_ROUND_NEAR_INF);

Nice; I thought about pointing this out in the previous review but 
skipped, since it would at most mean an 1 ms rounding error. But this way 
it's even better, especially since the timescale in most cases will be 
equal to the sample rate.

> +
>     version |= delay < INT32_MAX ? 0 : 1;
>
>     entry_size = (version == 1) ? 20 : 12;
> @@ -1731,33 +1753,37 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>         }
>         avio_wb32(pb, 0x00010000);
>     } else {
> -        /* Avoid accidentally ending up with start_ct = -1 which has got a
> -         * special meaning. Normally start_ct should end up positive or zero
> -         * here, but use FFMIN in case dts is a a small positive integer
> -         * rounded to 0 when represented in MOV_TIMESCALE units. */
> -        start_ct  = -FFMIN(track->start_dts, 0);
> -        /* Note, this delay is calculated from the pts of the first sample,
> -         * ensuring that we don't reduce the duration for cases with
> -         * dts<0 pts=0. */
> -        duration += delay;
> +        /* Skip the larger of initial_padding or the initial -PTS.

This is good, this makes it very clear what it does.

> +         * Skipping -PTS when PTS is negative ensures that timestamps
> +         * can be represented with positive values in stts and ctts. */

This still has me confused though - I don't see what that has to do with 
this code here.

Also see the patchset I posted the other day, about adding the 
negative_cts_offsets flag. In mov/mp4, it's ok with negative ctts values, 
which makes it possible to omit the edit lists for normal video with 
b-frames.

The rest of this patch seems fine (still) - you seems to be doing a lot of 
practical testing of it, which gives me confidence. I assume/hope you've 
tested (either manually or later via the movenc unit test) that it 
preserves the current behaviour for video tracks that start with a 
positive offset.

// Martin
John Stebbins Feb. 21, 2017, 11:21 p.m. | #2
On 02/21/2017 02:54 PM, Martin Storsjö wrote:
> On Tue, 21 Feb 2017, John Stebbins wrote:
>
>> initial_padding was getting added to the edit list indirectly due to
>> initial negative dts.  But in cases where the audio is delayed,
>> all or part of initial_padding would be unaccounted for.  This patch
>> makes initial_padding explicit.
>> ---
>> libavformat/movenc.c | 58 +++++++++++++++++++++++++++++++++++++---------------
>> 1 file changed, 42 insertions(+), 16 deletions(-)
>>
>> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
>> index 689291d..d60e194 100644
>> --- a/libavformat/movenc.c
>> +++ b/libavformat/movenc.c
>> @@ -1698,9 +1698,31 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>                                       track->timescale, AV_ROUND_UP);
>>     int version = duration < INT32_MAX ? 0 : 1;
>>     int entry_size, entry_count, size;
>> -    int64_t delay, start_ct = track->start_cts;
>> -    delay = av_rescale_rnd(track->start_dts + start_ct, MOV_TIMESCALE,
>> -                           track->timescale, AV_ROUND_DOWN);
>> +    int64_t delay;
>> +    int64_t mediatime;
>> +    int64_t skip = 0;
>> +
>> +    delay = track->start_dts + track->start_cts;
>> +
>> +    if (track->par->codec_type == AVMEDIA_TYPE_AUDIO &&
>> +        track->par->initial_padding > 0) {
>> +        /* Adjust delay so that initial_padding gets recorded in the
>> +         * MediaTime of an edit list entry even in the case that
>> +         * delay is positive. I.e. we don't want initial_padding to be
>> +         * absorbed and hidden in the delay. MediaTime must contain
>> +         * initial_padding in order to know where the actual media
>> +         * timeline begins. A player should drop samples until MediaTime
>> +         * is reached */
>> +        delay += av_rescale_rnd(track->par->initial_padding, track->timescale,
>> +                                track->par->sample_rate, AV_ROUND_UP);
>> +        skip = av_rescale_rnd(track->par->initial_padding,
>> +                              track->timescale,
>> +                              track->par->sample_rate, AV_ROUND_DOWN);
>> +    }
>> +    /* rescale delay, this was not done earlier to minimize rounding errors */
>> +    delay = av_rescale_rnd(delay, MOV_TIMESCALE,
>> +                       track->timescale, AV_ROUND_NEAR_INF);
> Nice; I thought about pointing this out in the previous review but 
> skipped, since it would at most mean an 1 ms rounding error. But this way 
> it's even better, especially since the timescale in most cases will be 
> equal to the sample rate.
>
>> +
>>     version |= delay < INT32_MAX ? 0 : 1;
>>
>>     entry_size = (version == 1) ? 20 : 12;
>> @@ -1731,33 +1753,37 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>         }
>>         avio_wb32(pb, 0x00010000);
>>     } else {
>> -        /* Avoid accidentally ending up with start_ct = -1 which has got a
>> -         * special meaning. Normally start_ct should end up positive or zero
>> -         * here, but use FFMIN in case dts is a a small positive integer
>> -         * rounded to 0 when represented in MOV_TIMESCALE units. */
>> -        start_ct  = -FFMIN(track->start_dts, 0);
>> -        /* Note, this delay is calculated from the pts of the first sample,
>> -         * ensuring that we don't reduce the duration for cases with
>> -         * dts<0 pts=0. */
>> -        duration += delay;
>> +        /* Skip the larger of initial_padding or the initial -PTS.
> This is good, this makes it very clear what it does.
>
>> +         * Skipping -PTS when PTS is negative ensures that timestamps
>> +         * can be represented with positive values in stts and ctts. */
> This still has me confused though - I don't see what that has to do with 
> this code here.

Better wording is still required I guess.  stts and ctts will be positive regardless, but if the edit list entry is
*not* set to -PTS the resulting timestamps during decode will be wrong.  The only way to get correct timestamps without
setting -PTS as MediaTime would be something like your negative CTS patch.

> Also see the patchset I posted the other day, about adding the 
> negative_cts_offsets flag. In mov/mp4, it's ok with negative ctts values, 
> which makes it possible to omit the edit lists for normal video with 
> b-frames.

Yes, I did see this patch set.

>
> The rest of this patch seems fine (still) - you seems to be doing a lot of 
> practical testing of it, which gives me confidence. I assume/hope you've 
> tested (either manually or later via the movenc unit test) that it 
> preserves the current behaviour for video tracks that start with a 
> positive offset.

I assume you mean delay here where an empty edit is inserted. I've tested video with positive delay, audio with positive
delay and both with 0 delay. There is also another related patch on the mailing list now that extracts initial_padding
from the file while reading.  So I've tested writing->reading->writing->reading ... multiple generations and verified
for every generation that streams stay in sync, start consistently at the same time and remain the same duration.

> // Martin
> _______________________________________________
> libav-devel mailing list
> libav-devel@libav.org
> https://lists.libav.org/mailman/listinfo/libav-devel
Martin Storsjö Feb. 22, 2017, 10:41 a.m. | #3
On Tue, 21 Feb 2017, John Stebbins wrote:

> On 02/21/2017 02:54 PM, Martin Storsjö wrote:
>> On Tue, 21 Feb 2017, John Stebbins wrote:
>>
>>> initial_padding was getting added to the edit list indirectly due to
>>> initial negative dts.  But in cases where the audio is delayed,
>>> all or part of initial_padding would be unaccounted for.  This patch
>>> makes initial_padding explicit.
>>> ---
>>> libavformat/movenc.c | 58 +++++++++++++++++++++++++++++++++++++---------------
>>> 1 file changed, 42 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
>>> index 689291d..d60e194 100644
>>> --- a/libavformat/movenc.c
>>> +++ b/libavformat/movenc.c
>>> @@ -1698,9 +1698,31 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>>                                       track->timescale, AV_ROUND_UP);
>>>     int version = duration < INT32_MAX ? 0 : 1;
>>>     int entry_size, entry_count, size;
>>> -    int64_t delay, start_ct = track->start_cts;
>>> -    delay = av_rescale_rnd(track->start_dts + start_ct, MOV_TIMESCALE,
>>> -                           track->timescale, AV_ROUND_DOWN);
>>> +    int64_t delay;
>>> +    int64_t mediatime;
>>> +    int64_t skip = 0;
>>> +
>>> +    delay = track->start_dts + track->start_cts;
>>> +
>>> +    if (track->par->codec_type == AVMEDIA_TYPE_AUDIO &&
>>> +        track->par->initial_padding > 0) {
>>> +        /* Adjust delay so that initial_padding gets recorded in the
>>> +         * MediaTime of an edit list entry even in the case that
>>> +         * delay is positive. I.e. we don't want initial_padding to be
>>> +         * absorbed and hidden in the delay. MediaTime must contain
>>> +         * initial_padding in order to know where the actual media
>>> +         * timeline begins. A player should drop samples until MediaTime
>>> +         * is reached */
>>> +        delay += av_rescale_rnd(track->par->initial_padding, track->timescale,
>>> +                                track->par->sample_rate, AV_ROUND_UP);
>>> +        skip = av_rescale_rnd(track->par->initial_padding,
>>> +                              track->timescale,
>>> +                              track->par->sample_rate, AV_ROUND_DOWN);
>>> +    }
>>> +    /* rescale delay, this was not done earlier to minimize rounding errors */
>>> +    delay = av_rescale_rnd(delay, MOV_TIMESCALE,
>>> +                       track->timescale, AV_ROUND_NEAR_INF);
>> Nice; I thought about pointing this out in the previous review but
>> skipped, since it would at most mean an 1 ms rounding error. But this way
>> it's even better, especially since the timescale in most cases will be
>> equal to the sample rate.
>>
>>> +
>>>     version |= delay < INT32_MAX ? 0 : 1;
>>>
>>>     entry_size = (version == 1) ? 20 : 12;
>>> @@ -1731,33 +1753,37 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>>         }
>>>         avio_wb32(pb, 0x00010000);
>>>     } else {
>>> -        /* Avoid accidentally ending up with start_ct = -1 which has got a
>>> -         * special meaning. Normally start_ct should end up positive or zero
>>> -         * here, but use FFMIN in case dts is a a small positive integer
>>> -         * rounded to 0 when represented in MOV_TIMESCALE units. */
>>> -        start_ct  = -FFMIN(track->start_dts, 0);
>>> -        /* Note, this delay is calculated from the pts of the first sample,
>>> -         * ensuring that we don't reduce the duration for cases with
>>> -         * dts<0 pts=0. */
>>> -        duration += delay;
>>> +        /* Skip the larger of initial_padding or the initial -PTS.
>> This is good, this makes it very clear what it does.
>>
>>> +         * Skipping -PTS when PTS is negative ensures that timestamps
>>> +         * can be represented with positive values in stts and ctts. */
>> This still has me confused though - I don't see what that has to do with
>> this code here.
>
> Better wording is still required I guess.  stts and ctts will be positive regardless, but if the edit list entry is
> *not* set to -PTS the resulting timestamps during decode will be wrong.  The only way to get correct timestamps without
> setting -PTS as MediaTime would be something like your negative CTS patch.

Yup. Given this, I'd be ok with reducing this comment just to the first 
sentence - that seems like enough to me.

>> Also see the patchset I posted the other day, about adding the
>> negative_cts_offsets flag. In mov/mp4, it's ok with negative ctts values,
>> which makes it possible to omit the edit lists for normal video with
>> b-frames.
>
> Yes, I did see this patch set.

Ok, great. If this goes in first, I'll need to update patch 2/3 to check 
for "start_dts || initial_padding".

>> The rest of this patch seems fine (still) - you seems to be doing a lot of
>> practical testing of it, which gives me confidence. I assume/hope you've
>> tested (either manually or later via the movenc unit test) that it
>> preserves the current behaviour for video tracks that start with a
>> positive offset.
>
> I assume you mean delay here where an empty edit is inserted. I've tested video with positive delay, audio with positive
> delay and both with 0 delay. There is also another related patch on the mailing list now that extracts initial_padding
> from the file while reading.  So I've tested writing->reading->writing->reading ... multiple generations and verified
> for every generation that streams stay in sync, start consistently at the same time and remain the same duration.

Ok, then it seems like this is tested enough to me, so no objection from 
me for pushing it.

// Martin
John Stebbins Feb. 22, 2017, 4:06 p.m. | #4
On 02/22/2017 03:41 AM, Martin Storsjö wrote:
> On Tue, 21 Feb 2017, John Stebbins wrote:
>
>> On 02/21/2017 02:54 PM, Martin Storsjö wrote:
>>> On Tue, 21 Feb 2017, John Stebbins wrote:
>>>
>>>> initial_padding was getting added to the edit list indirectly due to
>>>> initial negative dts.  But in cases where the audio is delayed,
>>>> all or part of initial_padding would be unaccounted for.  This patch
>>>> makes initial_padding explicit.
>>>> ---
>>>> libavformat/movenc.c | 58 +++++++++++++++++++++++++++++++++++++---------------
>>>> 1 file changed, 42 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
>>>> index 689291d..d60e194 100644
>>>> --- a/libavformat/movenc.c
>>>> +++ b/libavformat/movenc.c
>>>> @@ -1698,9 +1698,31 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>>>                                       track->timescale, AV_ROUND_UP);
>>>>     int version = duration < INT32_MAX ? 0 : 1;
>>>>     int entry_size, entry_count, size;
>>>> -    int64_t delay, start_ct = track->start_cts;
>>>> -    delay = av_rescale_rnd(track->start_dts + start_ct, MOV_TIMESCALE,
>>>> -                           track->timescale, AV_ROUND_DOWN);
>>>> +    int64_t delay;
>>>> +    int64_t mediatime;
>>>> +    int64_t skip = 0;
>>>> +
>>>> +    delay = track->start_dts + track->start_cts;
>>>> +
>>>> +    if (track->par->codec_type == AVMEDIA_TYPE_AUDIO &&
>>>> +        track->par->initial_padding > 0) {
>>>> +        /* Adjust delay so that initial_padding gets recorded in the
>>>> +         * MediaTime of an edit list entry even in the case that
>>>> +         * delay is positive. I.e. we don't want initial_padding to be
>>>> +         * absorbed and hidden in the delay. MediaTime must contain
>>>> +         * initial_padding in order to know where the actual media
>>>> +         * timeline begins. A player should drop samples until MediaTime
>>>> +         * is reached */
>>>> +        delay += av_rescale_rnd(track->par->initial_padding, track->timescale,
>>>> +                                track->par->sample_rate, AV_ROUND_UP);
>>>> +        skip = av_rescale_rnd(track->par->initial_padding,
>>>> +                              track->timescale,
>>>> +                              track->par->sample_rate, AV_ROUND_DOWN);
>>>> +    }
>>>> +    /* rescale delay, this was not done earlier to minimize rounding errors */
>>>> +    delay = av_rescale_rnd(delay, MOV_TIMESCALE,
>>>> +                       track->timescale, AV_ROUND_NEAR_INF);
>>> Nice; I thought about pointing this out in the previous review but
>>> skipped, since it would at most mean an 1 ms rounding error. But this way
>>> it's even better, especially since the timescale in most cases will be
>>> equal to the sample rate.
>>>
>>>> +
>>>>     version |= delay < INT32_MAX ? 0 : 1;
>>>>
>>>>     entry_size = (version == 1) ? 20 : 12;
>>>> @@ -1731,33 +1753,37 @@ static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
>>>>         }
>>>>         avio_wb32(pb, 0x00010000);
>>>>     } else {
>>>> -        /* Avoid accidentally ending up with start_ct = -1 which has got a
>>>> -         * special meaning. Normally start_ct should end up positive or zero
>>>> -         * here, but use FFMIN in case dts is a a small positive integer
>>>> -         * rounded to 0 when represented in MOV_TIMESCALE units. */
>>>> -        start_ct  = -FFMIN(track->start_dts, 0);
>>>> -        /* Note, this delay is calculated from the pts of the first sample,
>>>> -         * ensuring that we don't reduce the duration for cases with
>>>> -         * dts<0 pts=0. */
>>>> -        duration += delay;
>>>> +        /* Skip the larger of initial_padding or the initial -PTS.
>>> This is good, this makes it very clear what it does.
>>>
>>>> +         * Skipping -PTS when PTS is negative ensures that timestamps
>>>> +         * can be represented with positive values in stts and ctts. */
>>> This still has me confused though - I don't see what that has to do with
>>> this code here.
>> Better wording is still required I guess.  stts and ctts will be positive regardless, but if the edit list entry is
>> *not* set to -PTS the resulting timestamps during decode will be wrong.  The only way to get correct timestamps without
>> setting -PTS as MediaTime would be something like your negative CTS patch.
> Yup. Given this, I'd be ok with reducing this comment just to the first 
> sentence - that seems like enough to me.
>
>>> Also see the patchset I posted the other day, about adding the
>>> negative_cts_offsets flag. In mov/mp4, it's ok with negative ctts values,
>>> which makes it possible to omit the edit lists for normal video with
>>> b-frames.
>> Yes, I did see this patch set.
> Ok, great. If this goes in first, I'll need to update patch 2/3 to check 
> for "start_dts || initial_padding".

I actually should have added initial_padding to this test since it's possible for an audio track to be delayed by
exactly the initial_padding which would result in start_dts = 0. So I'll update my patch for this.

Patch

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 689291d..d60e194 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -1698,9 +1698,31 @@  static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
                                       track->timescale, AV_ROUND_UP);
     int version = duration < INT32_MAX ? 0 : 1;
     int entry_size, entry_count, size;
-    int64_t delay, start_ct = track->start_cts;
-    delay = av_rescale_rnd(track->start_dts + start_ct, MOV_TIMESCALE,
-                           track->timescale, AV_ROUND_DOWN);
+    int64_t delay;
+    int64_t mediatime;
+    int64_t skip = 0;
+
+    delay = track->start_dts + track->start_cts;
+
+    if (track->par->codec_type == AVMEDIA_TYPE_AUDIO &&
+        track->par->initial_padding > 0) {
+        /* Adjust delay so that initial_padding gets recorded in the
+         * MediaTime of an edit list entry even in the case that
+         * delay is positive. I.e. we don't want initial_padding to be
+         * absorbed and hidden in the delay. MediaTime must contain
+         * initial_padding in order to know where the actual media
+         * timeline begins. A player should drop samples until MediaTime
+         * is reached */
+        delay += av_rescale_rnd(track->par->initial_padding, track->timescale,
+                                track->par->sample_rate, AV_ROUND_UP);
+        skip = av_rescale_rnd(track->par->initial_padding,
+                              track->timescale,
+                              track->par->sample_rate, AV_ROUND_DOWN);
+    }
+    /* rescale delay, this was not done earlier to minimize rounding errors */
+    delay = av_rescale_rnd(delay, MOV_TIMESCALE,
+                       track->timescale, AV_ROUND_NEAR_INF);
+
     version |= delay < INT32_MAX ? 0 : 1;
 
     entry_size = (version == 1) ? 20 : 12;
@@ -1731,33 +1753,37 @@  static int mov_write_edts_tag(AVIOContext *pb, MOVMuxContext *mov,
         }
         avio_wb32(pb, 0x00010000);
     } else {
-        /* Avoid accidentally ending up with start_ct = -1 which has got a
-         * special meaning. Normally start_ct should end up positive or zero
-         * here, but use FFMIN in case dts is a a small positive integer
-         * rounded to 0 when represented in MOV_TIMESCALE units. */
-        start_ct  = -FFMIN(track->start_dts, 0);
-        /* Note, this delay is calculated from the pts of the first sample,
-         * ensuring that we don't reduce the duration for cases with
-         * dts<0 pts=0. */
-        duration += delay;
+        /* Skip the larger of initial_padding or the initial -PTS.
+         * Skipping -PTS when PTS is negative ensures that timestamps
+         * can be represented with positive values in stts and ctts. */
+        skip = FFMAX(skip, -track->start_dts - track->start_cts);
     }
+    mediatime = skip + track->start_cts;
+
+    /* skip is the duration of the media segment that will be dropped
+     * during playback when an edit entry is applied.  The edit entry
+     * duration must be reduced by this amount. */
+    duration -= av_rescale_rnd(skip, MOV_TIMESCALE,
+                               track->timescale, AV_ROUND_UP);
 
     /* For fragmented files, we don't know the full length yet. Setting
      * duration to 0 allows us to only specify the offset, including
      * the rest of the content (from all future fragments) without specifying
      * an explicit duration. */
-    if (mov->flags & FF_MOV_FLAG_FRAGMENT)
+    if (mov->flags & FF_MOV_FLAG_FRAGMENT || duration < 0)
         duration = 0;
 
-    /* duration */
+    /* add edit entry that defines the presentation time of the first
+     * sample to render during playback and the duration of the segment */
     if (version == 1) {
         avio_wb64(pb, duration);
-        avio_wb64(pb, start_ct);
+        avio_wb64(pb, mediatime);
     } else {
         avio_wb32(pb, duration);
-        avio_wb32(pb, start_ct);
+        avio_wb32(pb, mediatime);
     }
     avio_wb32(pb, 0x00010000);
+
     return size;
 }