[1/3] flvdec: Export flv flags for nellymoser through side data

Message ID 1322819839-34912-1-git-send-email-martin@martin.st
State Superseded
Headers show

Commit Message

Martin Storsjö Dec. 2, 2011, 9:57 a.m.
From: Michael Niedermayer <michaelni@gmx.at>

---
 libavcodec/avcodec.h  |    1 +
 libavformat/flvdec.c  |    2 ++
 libavformat/version.h |    2 +-
 3 files changed, 4 insertions(+), 1 deletions(-)

Comments

Luca Barbato Dec. 2, 2011, 10:24 a.m. | #1
On 12/2/11 10:57 AM, Martin Storsjö wrote:
> +        av_packet_new_side_data(pkt, AV_PKT_DATA_FLV_FLAGS, 1)[0] = flags;

Could you please use a less creative syntax?
Luca Barbato Dec. 2, 2011, 1:13 p.m. | #2
On 12/2/11 10:57 AM, Martin Storsjö wrote:

Possibly we could export this information as a defined and generic 
struct, I guess there is more than this container-codec interaction that 
could signal configuration parameters change.

lu
Martin Storsjö Dec. 2, 2011, 1:26 p.m. | #3
On Fri, 2 Dec 2011, Luca Barbato wrote:

> On 12/2/11 10:57 AM, Martin Storsjö wrote:
>
> Possibly we could export this information as a defined and generic struct, I 
> guess there is more than this container-codec interaction that could signal 
> configuration parameters change.

Possibly, but I'm not sure if this particular approach is generic enough 
for all of them. E.g., for AAC in FLV, extradata is in a separate packet. 
If a new packet of that kind appears, I'm not sure if the demuxer should 
try to decode it and parse it to some info struct - instead I'd just pass 
that extradata on e.g. as side data to the next packet, so the decoder can 
notice it. In that sense, this flv flags field could be viewed as one kind 
of extradata for nellymoser/flv.

// Martin
Luca Barbato Dec. 2, 2011, 1:56 p.m. | #4
On 12/2/11 2:26 PM, Martin Storsjö wrote:
> On Fri, 2 Dec 2011, Luca Barbato wrote:
>
>> On 12/2/11 10:57 AM, Martin Storsjö wrote:
>>
>> Possibly we could export this information as a defined and generic
>> struct, I guess there is more than this container-codec interaction
>> that could signal configuration parameters change.
>
> Possibly, but I'm not sure if this particular approach is generic enough
> for all of them. E.g., for AAC in FLV, extradata is in a separate
> packet. If a new packet of that kind appears, I'm not sure if the
> demuxer should try to decode it and parse it to some info struct -
> instead I'd just pass that extradata on e.g. as side data to the next
> packet, so the decoder can notice it. In that sense, this flv flags
> field could be viewed as one kind of extradata for nellymoser/flv.

Ideally we should be able to remux nellymoser in something different 
than flv and preserve the information. I know that's close to nitpicking 
but having too many different kinds of side data might be annoying.

lu
Martin Storsjö Dec. 2, 2011, 10:07 p.m. | #5
On Fri, 2 Dec 2011, Luca Barbato wrote:

> On 12/2/11 2:26 PM, Martin Storsjö wrote:
>> On Fri, 2 Dec 2011, Luca Barbato wrote:
>> 
>>> On 12/2/11 10:57 AM, Martin Storsjö wrote:
>>> 
>>> Possibly we could export this information as a defined and generic
>>> struct, I guess there is more than this container-codec interaction
>>> that could signal configuration parameters change.
>> 
>> Possibly, but I'm not sure if this particular approach is generic enough
>> for all of them. E.g., for AAC in FLV, extradata is in a separate
>> packet. If a new packet of that kind appears, I'm not sure if the
>> demuxer should try to decode it and parse it to some info struct -
>> instead I'd just pass that extradata on e.g. as side data to the next
>> packet, so the decoder can notice it. In that sense, this flv flags
>> field could be viewed as one kind of extradata for nellymoser/flv.
>
> Ideally we should be able to remux nellymoser in something different than flv 
> and preserve the information. I know that's close to nitpicking but having 
> too many different kinds of side data might be annoying.

Indeed - that's a really valid point.

So, should we apply this patchset, or do we wait for a better solution?

// Martin
Martin Storsjö Dec. 13, 2011, 10:14 a.m. | #6
On Sat, 3 Dec 2011, Martin Storsjö wrote:

> On Fri, 2 Dec 2011, Luca Barbato wrote:
>
>> On 12/2/11 2:26 PM, Martin Storsjö wrote:
>>> On Fri, 2 Dec 2011, Luca Barbato wrote:
>>> 
>>>> On 12/2/11 10:57 AM, Martin Storsjö wrote:
>>>> 
>>>> Possibly we could export this information as a defined and generic
>>>> struct, I guess there is more than this container-codec interaction
>>>> that could signal configuration parameters change.
>>> 
>>> Possibly, but I'm not sure if this particular approach is generic enough
>>> for all of them. E.g., for AAC in FLV, extradata is in a separate
>>> packet. If a new packet of that kind appears, I'm not sure if the
>>> demuxer should try to decode it and parse it to some info struct -
>>> instead I'd just pass that extradata on e.g. as side data to the next
>>> packet, so the decoder can notice it. In that sense, this flv flags
>>> field could be viewed as one kind of extradata for nellymoser/flv.
>> 
>> Ideally we should be able to remux nellymoser in something different than 
>> flv and preserve the information. I know that's close to nitpicking but 
>> having too many different kinds of side data might be annoying.

Are there any other containers that support nellymoser and support this 
metadata per frame (that is, supported elsewhere than just what we come up 
with ourselves)? If not, this isn't really doable, even if we consider 
these flv packet flags as extradata for nellymoser.

But generally, a way to pass updated extradata for e.g. AAC might be good, 
too, because that can sure happen in FLV, but then it'd be a different 
kind of side data packet, a plain "new extradata" side data packet.

> So, should we apply this patchset, or do we wait for a better solution?

Any suggestions on how to move forward with this, Justin, others?

// Martin
Justin Ruggles Dec. 13, 2011, 2:03 p.m. | #7
On 12/13/2011 05:14 AM, Martin Storsjö wrote:

> On Sat, 3 Dec 2011, Martin Storsjö wrote:
> 
>> On Fri, 2 Dec 2011, Luca Barbato wrote:
>>
>>> On 12/2/11 2:26 PM, Martin Storsjö wrote:
>>>> On Fri, 2 Dec 2011, Luca Barbato wrote:
>>>>
>>>>> On 12/2/11 10:57 AM, Martin Storsjö wrote:
>>>>>
>>>>> Possibly we could export this information as a defined and generic
>>>>> struct, I guess there is more than this container-codec interaction
>>>>> that could signal configuration parameters change.
>>>>
>>>> Possibly, but I'm not sure if this particular approach is generic enough
>>>> for all of them. E.g., for AAC in FLV, extradata is in a separate
>>>> packet. If a new packet of that kind appears, I'm not sure if the
>>>> demuxer should try to decode it and parse it to some info struct -
>>>> instead I'd just pass that extradata on e.g. as side data to the next
>>>> packet, so the decoder can notice it. In that sense, this flv flags
>>>> field could be viewed as one kind of extradata for nellymoser/flv.
>>>
>>> Ideally we should be able to remux nellymoser in something different than 
>>> flv and preserve the information. I know that's close to nitpicking but 
>>> having too many different kinds of side data might be annoying.
> 
> Are there any other containers that support nellymoser and support this 
> metadata per frame (that is, supported elsewhere than just what we come up 
> with ourselves)? If not, this isn't really doable, even if we consider 
> these flv packet flags as extradata for nellymoser.
> 
> But generally, a way to pass updated extradata for e.g. AAC might be good, 
> too, because that can sure happen in FLV, but then it'd be a different 
> kind of side data packet, a plain "new extradata" side data packet.
> 
>> So, should we apply this patchset, or do we wait for a better solution?
> 
> Any suggestions on how to move forward with this, Justin, others?


I can't think of anything better right now, but it does seem a little
odd to me. Before decoding, the sample_rate is set by the user (via the
flv demuxer). The nellymoser decoder doesn't currently touch
sample_rate. If the point of the flag is to have the change come from
the decoder, what advantage does that have over the demuxer changing the
sample_rate directly? I know that the user changing certain fields after
init() will lead to bad things sometimes, but we don't have any
documented rules about such things do we?

-Justin
Martin Storsjö Dec. 13, 2011, 2:06 p.m. | #8
On Tue, 13 Dec 2011, Justin Ruggles wrote:

> I can't think of anything better right now, but it does seem a little
> odd to me. Before decoding, the sample_rate is set by the user (via the
> flv demuxer). The nellymoser decoder doesn't currently touch
> sample_rate. If the point of the flag is to have the change come from
> the decoder, what advantage does that have over the demuxer changing the
> sample_rate directly? I know that the user changing certain fields after
> init() will lead to bad things sometimes, but we don't have any
> documented rules about such things do we?

The advantage is that sample_rate is updated only once you actually decode 
that packet - you might demux and fill a buffer of packets, but the sample 
rate change isn't made effective until you actually decode them, otherwise 
you'd get wrong sample rate for a few of the preceding packets.

// Martin
Justin Ruggles Dec. 13, 2011, 2:47 p.m. | #9
On 12/13/2011 09:06 AM, Martin Storsjö wrote:

> On Tue, 13 Dec 2011, Justin Ruggles wrote:
> 
>> I can't think of anything better right now, but it does seem a little
>> odd to me. Before decoding, the sample_rate is set by the user (via the
>> flv demuxer). The nellymoser decoder doesn't currently touch
>> sample_rate. If the point of the flag is to have the change come from
>> the decoder, what advantage does that have over the demuxer changing the
>> sample_rate directly? I know that the user changing certain fields after
>> init() will lead to bad things sometimes, but we don't have any
>> documented rules about such things do we?
> 
> The advantage is that sample_rate is updated only once you actually decode 
> that packet - you might demux and fill a buffer of packets, but the sample 
> rate change isn't made effective until you actually decode them, otherwise 
> you'd get wrong sample rate for a few of the preceding packets.


indeed. sounds like an alright solution then. making the side data type
"new extradata" seems ok too.

-Justin
Martin Storsjö Dec. 13, 2011, 2:52 p.m. | #10
On Tue, 13 Dec 2011, Justin Ruggles wrote:

> On 12/13/2011 09:06 AM, Martin Storsjö wrote:
>
>> On Tue, 13 Dec 2011, Justin Ruggles wrote:
>>
>>> I can't think of anything better right now, but it does seem a little
>>> odd to me. Before decoding, the sample_rate is set by the user (via the
>>> flv demuxer). The nellymoser decoder doesn't currently touch
>>> sample_rate. If the point of the flag is to have the change come from
>>> the decoder, what advantage does that have over the demuxer changing the
>>> sample_rate directly? I know that the user changing certain fields after
>>> init() will lead to bad things sometimes, but we don't have any
>>> documented rules about such things do we?
>>
>> The advantage is that sample_rate is updated only once you actually decode
>> that packet - you might demux and fill a buffer of packets, but the sample
>> rate change isn't made effective until you actually decode them, otherwise
>> you'd get wrong sample rate for a few of the preceding packets.
>
>
> indeed. sounds like an alright solution then.

Ok, I'll push this in a little while then.

> making the side data type "new extradata" seems ok too.

Calling the side data "new extradata" probably is better for those codecs 
that use extradata normally, e.g. AAC. I'll try implementing that approach 
at some later time if I manage to produce an FLV file with changed AAC 
extradata inline.

// Martin
Luca Barbato Dec. 13, 2011, 3:02 p.m. | #11
On 13/12/11 15:52, Martin Storsjö wrote:
> On Tue, 13 Dec 2011, Justin Ruggles wrote:
>
>> On 12/13/2011 09:06 AM, Martin Storsjö wrote:
>>
>>> On Tue, 13 Dec 2011, Justin Ruggles wrote:
>>>
>>>> I can't think of anything better right now, but it does seem a little
>>>> odd to me. Before decoding, the sample_rate is set by the user (via the
>>>> flv demuxer). The nellymoser decoder doesn't currently touch
>>>> sample_rate. If the point of the flag is to have the change come from
>>>> the decoder, what advantage does that have over the demuxer changing
>>>> the
>>>> sample_rate directly? I know that the user changing certain fields
>>>> after
>>>> init() will lead to bad things sometimes, but we don't have any
>>>> documented rules about such things do we?
>>>
>>> The advantage is that sample_rate is updated only once you actually
>>> decode
>>> that packet - you might demux and fill a buffer of packets, but the
>>> sample
>>> rate change isn't made effective until you actually decode them,
>>> otherwise
>>> you'd get wrong sample rate for a few of the preceding packets.
>>
>>
>> indeed. sounds like an alright solution then.
>
> Ok, I'll push this in a little while then.

Just make so the flag is generic enough to be reused. new extradata 
sounds generic enough ^^; (maybe a bit too much).

lu
Martin Storsjö Dec. 13, 2011, 3:08 p.m. | #12
On Tue, 13 Dec 2011, Luca Barbato wrote:

> On 13/12/11 15:52, Martin Storsjö wrote:
>> On Tue, 13 Dec 2011, Justin Ruggles wrote:
>>
>>> On 12/13/2011 09:06 AM, Martin Storsjö wrote:
>>>
>>>> On Tue, 13 Dec 2011, Justin Ruggles wrote:
>>>>
>>>>> I can't think of anything better right now, but it does seem a little
>>>>> odd to me. Before decoding, the sample_rate is set by the user (via the
>>>>> flv demuxer). The nellymoser decoder doesn't currently touch
>>>>> sample_rate. If the point of the flag is to have the change come from
>>>>> the decoder, what advantage does that have over the demuxer changing
>>>>> the
>>>>> sample_rate directly? I know that the user changing certain fields
>>>>> after
>>>>> init() will lead to bad things sometimes, but we don't have any
>>>>> documented rules about such things do we?
>>>>
>>>> The advantage is that sample_rate is updated only once you actually
>>>> decode
>>>> that packet - you might demux and fill a buffer of packets, but the
>>>> sample
>>>> rate change isn't made effective until you actually decode them,
>>>> otherwise
>>>> you'd get wrong sample rate for a few of the preceding packets.
>>>
>>>
>>> indeed. sounds like an alright solution then.
>>
>> Ok, I'll push this in a little while then.
>
> Just make so the flag is generic enough to be reused. new extradata 
> sounds generic enough ^^; (maybe a bit too much).

New extradata is generic enough yes, but nellymoser doesn't have extradata 
as such, and I don't want to start packing flv flag fields as nellymoser 
extradata. So these patches mark it simply as what it is - the flv flags 
field passed along with the data. If the source doesn't have flv flags, 
there's no data to pass along (and no standard way of conveying this 
info).

As I said, if I later get to implementing the same for AAC in FLV, I'll do 
that as "new extradata", since the info for AAC actually is passed as 
extradata, and there it is much more clear that new extradata is passed 
along only occasionally when it has changed, not for every packet like it 
is here.

// Martin
Luca Barbato Dec. 13, 2011, 3:15 p.m. | #13
On 13/12/11 16:08, Martin Storsjö wrote:
> As I said, if I later get to implementing the same for AAC in FLV, I'll
> do that as "new extradata", since the info for AAC actually is passed as
> extradata, and there it is much more clear that new extradata is passed
> along only occasionally when it has changed, not for every packet like
> it is here.

I'm against proliferation of random and overly specific side data. If 
there are impelling reasons to introduce it _NOW_ so be it.

lu
Martin Storsjö Dec. 13, 2011, 3:25 p.m. | #14
On Tue, 13 Dec 2011, Luca Barbato wrote:

> On 13/12/11 16:08, Martin Storsjö wrote:
>> As I said, if I later get to implementing the same for AAC in FLV, I'll
>> do that as "new extradata", since the info for AAC actually is passed as
>> extradata, and there it is much more clear that new extradata is passed
>> along only occasionally when it has changed, not for every packet like
>> it is here.
>
> I'm against proliferation of random and overly specific side data. If 
> there are impelling reasons to introduce it _NOW_ so be it.

This is a missing feature I need at $dayjob, that's one reason.

The alternative would be, as far as I see it, to invent some form of 
extradata for nellymoser, check the flags field in the flv demuxer and 
then emit that as "new extradata" side data if it changes.

Since there is no specification of what that extradata would be, it could 
e.g. simply be the flv flags field for nellymoser.

The alternative that you suggested at some point, a struct for "audio 
format change", that could be passed along as side data (which of course 
would have to be byte packed with bytestream reading/writing functions 
instead of just memcpying a struct blindly) - that would of course also be 
an option, where we wouldn't need to invent a new form of extradata for 
nellymoser, and would be easier to interpret e.g. when stream copying. But 
for e.g. AAC, where this info isn't conveyed by the FLV flags but in the 
AAC extradata packets in the stream, the "new extradata" approach feels 
better than "audio format change" side data packets, since we don't have 
to parse the AAC extradata in the FLV demuxer if we just pass it along.

So, what do people think is the best approach?

// Martin
Justin Ruggles Dec. 13, 2011, 3:40 p.m. | #15
On 12/13/2011 10:25 AM, Martin Storsjö wrote:

> On Tue, 13 Dec 2011, Luca Barbato wrote:
> 
>> On 13/12/11 16:08, Martin Storsjö wrote:
>>> As I said, if I later get to implementing the same for AAC in FLV, I'll
>>> do that as "new extradata", since the info for AAC actually is passed as
>>> extradata, and there it is much more clear that new extradata is passed
>>> along only occasionally when it has changed, not for every packet like
>>> it is here.
>>
>> I'm against proliferation of random and overly specific side data. If 
>> there are impelling reasons to introduce it _NOW_ so be it.
> 
> This is a missing feature I need at $dayjob, that's one reason.
> 
> The alternative would be, as far as I see it, to invent some form of 
> extradata for nellymoser, check the flags field in the flv demuxer and 
> then emit that as "new extradata" side data if it changes.
> 
> Since there is no specification of what that extradata would be, it could 
> e.g. simply be the flv flags field for nellymoser.
> 
> The alternative that you suggested at some point, a struct for "audio 
> format change", that could be passed along as side data (which of course 
> would have to be byte packed with bytestream reading/writing functions 
> instead of just memcpying a struct blindly) - that would of course also be 
> an option, where we wouldn't need to invent a new form of extradata for 
> nellymoser, and would be easier to interpret e.g. when stream copying. But 
> for e.g. AAC, where this info isn't conveyed by the FLV flags but in the 
> AAC extradata packets in the stream, the "new extradata" approach feels 
> better than "audio format change" side data packets, since we don't have 
> to parse the AAC extradata in the FLV demuxer if we just pass it along.
> 
> So, what do people think is the best approach?


both?

I like the option of having a generic side data type for basic parameter
changes that isn't container-specific or codec-specific. But I also like
the idea of a new extradata side data type for more complex mid-stream
changes that require codec-specific parsing like for AAC.

-Justin
Martin Storsjö Dec. 13, 2011, 3:45 p.m. | #16
On Tue, 13 Dec 2011, Justin Ruggles wrote:

> On 12/13/2011 10:25 AM, Martin Storsjö wrote:
>
>> On Tue, 13 Dec 2011, Luca Barbato wrote:
>>
>>> On 13/12/11 16:08, Martin Storsjö wrote:
>>>> As I said, if I later get to implementing the same for AAC in FLV, I'll
>>>> do that as "new extradata", since the info for AAC actually is passed as
>>>> extradata, and there it is much more clear that new extradata is passed
>>>> along only occasionally when it has changed, not for every packet like
>>>> it is here.
>>>
>>> I'm against proliferation of random and overly specific side data. If
>>> there are impelling reasons to introduce it _NOW_ so be it.
>>
>> This is a missing feature I need at $dayjob, that's one reason.
>>
>> The alternative would be, as far as I see it, to invent some form of
>> extradata for nellymoser, check the flags field in the flv demuxer and
>> then emit that as "new extradata" side data if it changes.
>>
>> Since there is no specification of what that extradata would be, it could
>> e.g. simply be the flv flags field for nellymoser.
>>
>> The alternative that you suggested at some point, a struct for "audio
>> format change", that could be passed along as side data (which of course
>> would have to be byte packed with bytestream reading/writing functions
>> instead of just memcpying a struct blindly) - that would of course also be
>> an option, where we wouldn't need to invent a new form of extradata for
>> nellymoser, and would be easier to interpret e.g. when stream copying. But
>> for e.g. AAC, where this info isn't conveyed by the FLV flags but in the
>> AAC extradata packets in the stream, the "new extradata" approach feels
>> better than "audio format change" side data packets, since we don't have
>> to parse the AAC extradata in the FLV demuxer if we just pass it along.
>>
>> So, what do people think is the best approach?
>
>
> both?
>
> I like the option of having a generic side data type for basic parameter
> changes that isn't container-specific or codec-specific. But I also like
> the idea of a new extradata side data type for more complex mid-stream
> changes that require codec-specific parsing like for AAC.

Hmm, ok then. I'll try to implement them both and see what it looks like.

// Martin
Luca Barbato Dec. 13, 2011, 3:50 p.m. | #17
On 13/12/11 16:45, Martin Storsjö wrote:
>> both?
>>
>> I like the option of having a generic side data type for basic parameter
>> changes that isn't container-specific or codec-specific. But I also like
>> the idea of a new extradata side data type for more complex mid-stream
>> changes that require codec-specific parsing like for AAC.
>
> Hmm, ok then. I'll try to implement them both and see what it looks like.
>

Thank you for the patience.

lu

Patch

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 43abcd8..b6c47f5 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -851,6 +851,7 @@  typedef struct AVPanScan{
 
 enum AVPacketSideDataType {
     AV_PKT_DATA_PALETTE,
+    AV_PKT_DATA_FLV_FLAGS,
 };
 
 typedef struct AVPacket {
diff --git a/libavformat/flvdec.c b/libavformat/flvdec.c
index 5d19dd8..e36cbdf 100644
--- a/libavformat/flvdec.c
+++ b/libavformat/flvdec.c
@@ -565,6 +565,8 @@  static int flv_read_packet(AVFormatContext *s, AVPacket *pkt)
     pkt->dts = dts;
     pkt->pts = pts == AV_NOPTS_VALUE ? dts : pts;
     pkt->stream_index = st->index;
+    if (st->codec->codec_id == CODEC_ID_NELLYMOSER)
+        av_packet_new_side_data(pkt, AV_PKT_DATA_FLV_FLAGS, 1)[0] = flags;
 
     if (is_audio || ((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_KEY))
         pkt->flags |= AV_PKT_FLAG_KEY;
diff --git a/libavformat/version.h b/libavformat/version.h
index ba1254f..8512c79 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -24,7 +24,7 @@ 
 #include "libavutil/avutil.h"
 
 #define LIBAVFORMAT_VERSION_MAJOR 53
-#define LIBAVFORMAT_VERSION_MINOR 16
+#define LIBAVFORMAT_VERSION_MINOR 17
 #define LIBAVFORMAT_VERSION_MICRO  0
 
 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \