Message ID | 1322781844-31711-1-git-send-email-martin@martin.st
State | Superseded
On 12/01/2011 06:24 PM, Martin Storsjö wrote:

> Earlier, bits per sample was defined as 8, which wasn't chosen
> with any thought, only chosen since it didn't seem to change
> anything compared to not defining it at all.
>
> g722 encodes 2 samples into one byte codeword, therefore the
> bits per sample is 4. By changing this, the generated timestamps
> for streams encoded with g722 become correct.

While technically true, this causes problems with WAV. In WAV files, the
bits-per-sample field in the header is supposed to be 8 (or 7 or 6), not
4. So we need an override for g722 in the WAV header writing function in
lavf to write the correct value.

See Bug #54

> Due to the way avcodec_encode_audio is defined for non frame
> based codecs, calculating the number of input samples in the
> encode function has to be changed once the bits per samples
> definition is changed.

It isn't defined that way for non-frame-based codecs; it's defined that
way for PCM. The g722 encoder is just wrong. All other ADPCM encoders
set frame_size and use it accordingly.

-Justin
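A minimal sketch of the override Justin describes; the helper name is
hypothetical, not the actual lavf WAV muxer code:

    #include "libavcodec/avcodec.h"

    /* Hypothetical helper, not the actual lavf riff/WAV code: WAV's
     * wBitsPerSample field for G.722 should carry the codeword size
     * (8, or 7/6 for the reduced-rate modes), even though
     * av_get_bits_per_sample() reports 4 once this patch is applied. */
    static int wav_bits_per_sample(enum CodecID codec_id)
    {
        if (codec_id == CODEC_ID_ADPCM_G722)
            return 8;   /* one byte codeword per sample pair */
        return av_get_bits_per_sample(codec_id);
    }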
On Thu, 1 Dec 2011, Justin Ruggles wrote:

> On 12/01/2011 06:24 PM, Martin Storsjö wrote:
>
>> Earlier, bits per sample was defined as 8, which wasn't chosen
>> with any thought, only chosen since it didn't seem to change
>> anything compared to not defining it at all.
>>
>> g722 encodes 2 samples into one byte codeword, therefore the
>> bits per sample is 4. By changing this, the generated timestamps
>> for streams encoded with g722 become correct.
>
> While technically true, this causes problems with WAV. In WAV files, the
> bits-per-sample field in the header is supposed to be 8 (or 7 or 6), not
> 4. So we need an override for g722 in the WAV header writing function in
> lavf to write the correct value.

Yeah, I came to think of this detail right after posting the patch - I
guess I chose 8 since it was the default for the "number of bits per
codeword/sample pair", which the decoder uses. My patch also makes the
raw g722 decoder set bits_per_coded_sample to 4, which triggers warnings
in the decoder - something I only realized after posting it.

As for WAV files, I don't have any samples. It'd be good to know what
6/7 bits per codeword samples look like - whether the codewords are
packed tightly, or the extra bits are simply to be ignored. (Currently
the decoder doesn't assume tightly packed codewords.)

> See Bug #54

Oh, I didn't know about this issue.

>> Due to the way avcodec_encode_audio is defined for non frame
>> based codecs, calculating the number of input samples in the
>> encode function has to be changed once the bits per samples
>> definition is changed.
>
> It isn't defined that way for non-frame-based codecs; it's defined that
> way for PCM. The g722 encoder is just wrong. All other ADPCM encoders
> set frame_size and use it accordingly.

Hmm, so are there any other non-PCM encoders that don't set a frame
size? avcodec.h says this behaviour is for PCM codecs only, while
utils.c does it for any codec that has a frame_size <= 1.

My issue (that pts isn't generated properly) is fixed by setting any
frame_size in the encoder init function (and changing the encode_frame
function accordingly), but does that make sense for a codec that can use
any frame size? (For this codec, the user could set any other frame size
he wants after initializing the encoder.) And if we'd want to change
this to use a default frame size, what would be a suitable default?

// Martin
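For reference, a sketch of the byte layout under discussion, assuming
non-tightly-packed codewords as the current decoder does (the function
name is illustrative):

    #include <stdint.h>

    /* Illustrative only: split one G.722 octet into its two sub-band
     * indices.  bits_per_codeword is 8, 7 or 6; in the 7- and 6-bit
     * modes the lowest 1 or 2 bits of the low-band index are simply
     * ignored rather than the codewords being packed tightly. */
    static void split_codeword(uint8_t codeword, int bits_per_codeword,
                               int *ihigh, int *ilow)
    {
        *ihigh = codeword >> 6;                                /* 2-bit high band */
        *ilow  = (codeword & 0x3f) >> (8 - bits_per_codeword); /* 6/5/4-bit low band */
    }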
On Fri, 2 Dec 2011, Martin Storsjö wrote:

> On Thu, 1 Dec 2011, Justin Ruggles wrote:
>
>> On 12/01/2011 06:24 PM, Martin Storsjö wrote:
>>
>>> Earlier, bits per sample was defined as 8, which wasn't chosen
>>> with any thought, only chosen since it didn't seem to change
>>> anything compared to not defining it at all.
>>>
>>> g722 encodes 2 samples into one byte codeword, therefore the
>>> bits per sample is 4. By changing this, the generated timestamps
>>> for streams encoded with g722 become correct.
>>
>> While technically true, this causes problems with WAV. In WAV files, the
>> bits-per-sample field in the header is supposed to be 8 (or 7 or 6), not
>> 4. So we need an override for g722 in the WAV header writing function in
>> lavf to write the correct value.
>
> Yeah, I came to think of this detail right after posting the patch - I
> guess I chose 8 since it was the default for the "number of bits per
> codeword/sample pair", which the decoder uses. My patch also makes the
> raw g722 decoder set bits_per_coded_sample to 4, which triggers warnings
> in the decoder - something I only realized after posting it.

FWIW, changing it to 4 makes the raw g722 demuxer generate proper
timestamps for input data, too. But I'm not sure any such data really
exists in the wild other than for testing the codec, so it's not an
important use case. The raw g722 demuxer's requirement of a non-zero
bits_per_coded_sample was what triggered the addition of the definition
in the first place (in 029f966c3aa73531a90cb14ca95057f2fb3f0a26).

// Martin
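A rough sketch of the dependency Martin mentions, with illustrative
names rather than the actual raw demuxer code: with the stream
time_base at 1/sample_rate, a non-zero bits_per_coded_sample is what
lets the demuxer turn packet sizes into durations.

    /* Illustrative, not the actual rawdec code: a raw G.722 packet's
     * duration in samples (i.e. in 1/sample_rate time_base units).
     * A zero bits_per_coded_sample would divide by zero, which is why
     * the raw demuxer requires it to be set. */
    static int raw_g722_packet_duration(int packet_bytes,
                                        int bits_per_coded_sample)
    {
        return packet_bytes * 8 / bits_per_coded_sample; /* 2 samples/byte at 4 */
    }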
On 12/02/2011 04:32 AM, Martin Storsjö wrote:

> On Thu, 1 Dec 2011, Justin Ruggles wrote:
>
>> On 12/01/2011 06:24 PM, Martin Storsjö wrote:
>>
>>> Earlier, bits per sample was defined as 8, which wasn't chosen
>>> with any thought, only chosen since it didn't seem to change
>>> anything compared to not defining it at all.
>>>
>>> g722 encodes 2 samples into one byte codeword, therefore the
>>> bits per sample is 4. By changing this, the generated timestamps
>>> for streams encoded with g722 become correct.
>>
>> While technically true, this causes problems with WAV. In WAV files, the
>> bits-per-sample field in the header is supposed to be 8 (or 7 or 6), not
>> 4. So we need an override for g722 in the WAV header writing function in
>> lavf to write the correct value.
>
> Yeah, I came to think of this detail right after posting the patch - I
> guess I chose 8 since it was the default for the "number of bits per
> codeword/sample pair", which the decoder uses. My patch also makes the
> raw g722 decoder set bits_per_coded_sample to 4, which triggers warnings
> in the decoder - something I only realized after posting it.
>
> As for WAV files, I don't have any samples. It'd be good to know what
> 6/7 bits per codeword samples look like - whether the codewords are
> packed tightly, or the extra bits are simply to be ignored. (Currently
> the decoder doesn't assume tightly packed codewords.)

Well, the only sample I have been able to find is
http://samples.libav.org/A-codecs/g722/msn-messenger-g722.wav

And it actually has bits-per-sample as 16. :/ I also could not find a
G.722 codec for Windows to see how WMP handles it. I've never seen a
6/7 bits sample either, other than the raw ITU-T test suite.

So maybe until we actually come across such samples in a container to
see how it's handled, we could just assume 4 bits per sample and provide
a decoder option for code_size.

>> See Bug #54
>
> Oh, I didn't know about this issue.
>
>>> Due to the way avcodec_encode_audio is defined for non frame
>>> based codecs, calculating the number of input samples in the
>>> encode function has to be changed once the bits per samples
>>> definition is changed.
>>
>> It isn't defined that way for non-frame-based codecs; it's defined that
>> way for PCM. The g722 encoder is just wrong. All other ADPCM encoders
>> set frame_size and use it accordingly.
>
> Hmm, so are there any other non-PCM encoders that don't set a frame
> size? avcodec.h says this behaviour is for PCM codecs only, while
> utils.c does it for any codec that has a frame_size <= 1.

AFAIK, PCM and G.722 are the only encoders which do not set a
frame_size > 1. But I haven't checked them all yet.

> My issue (that pts isn't generated properly) is fixed by setting any
> frame_size in the encoder init function (and changing the encode_frame
> function accordingly), but does that make sense for a codec that can use
> any frame size? (For this codec, the user could set any other frame size
> he wants after initializing the encoder.) And if we'd want to change
> this to use a default frame size, what would be a suitable default?

If the user changes frame_size after init with almost any encoder (with
the exception of the last frame for CODEC_CAP_SMALL_LAST_FRAME), it will
generally break horribly. I'm actually debating how to handle PCM and
other such codecs in the new encoding API that I'm working on. One
option would be a CODEC_CAP that indicates that the encoder will take
any number of samples from the user (via AVFrame.nb_samples) and encode
them. Another option would be to have them behave like all the others
and set some arbitrary frame_size.

-Justin
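A minimal sketch of the second option Justin mentions: the encoder
advertises a fixed frame_size at init, and encode_frame then consumes
exactly that many samples per call. The value 320 and the function name
are illustrative; the thread does not settle on a default.

    #include "libavcodec/avcodec.h"

    /* Illustrative sketch, not the actual g722enc.c code.  320 samples
     * (20 ms at G.722's 16 kHz sample rate) is an arbitrary choice made
     * for this example. */
    static int g722_encode_init_sketch(AVCodecContext *avctx)
    {
        avctx->frame_size = 320;   /* samples consumed per encode call */
        return 0;
    }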
On Fri, 2 Dec 2011, Justin Ruggles wrote:

> On 12/02/2011 04:32 AM, Martin Storsjö wrote:
>
>> On Thu, 1 Dec 2011, Justin Ruggles wrote:
>>
>>> On 12/01/2011 06:24 PM, Martin Storsjö wrote:
>>>
>>>> Earlier, bits per sample was defined as 8, which wasn't chosen
>>>> with any thought, only chosen since it didn't seem to change
>>>> anything compared to not defining it at all.
>>>>
>>>> g722 encodes 2 samples into one byte codeword, therefore the
>>>> bits per sample is 4. By changing this, the generated timestamps
>>>> for streams encoded with g722 become correct.
>>>
>>> While technically true, this causes problems with WAV. In WAV files, the
>>> bits-per-sample field in the header is supposed to be 8 (or 7 or 6), not
>>> 4. So we need an override for g722 in the WAV header writing function in
>>> lavf to write the correct value.
>>
>> Yeah, I came to think of this detail right after posting the patch - I
>> guess I chose 8 since it was the default for the "number of bits per
>> codeword/sample pair", which the decoder uses. My patch also makes the
>> raw g722 decoder set bits_per_coded_sample to 4, which triggers warnings
>> in the decoder - something I only realized after posting it.
>>
>> As for WAV files, I don't have any samples. It'd be good to know what
>> 6/7 bits per codeword samples look like - whether the codewords are
>> packed tightly, or the extra bits are simply to be ignored. (Currently
>> the decoder doesn't assume tightly packed codewords.)
>
> Well, the only sample I have been able to find is
> http://samples.libav.org/A-codecs/g722/msn-messenger-g722.wav
>
> And it actually has bits-per-sample as 16. :/ I also could not find a
> G.722 codec for Windows to see how WMP handles it. I've never seen a
> 6/7 bits sample either, other than the raw ITU-T test suite.
>
> So maybe until we actually come across such samples in a container to
> see how it's handled, we could just assume 4 bits per sample and provide
> a decoder option for code_size.

Hmm, that might make sense. On the one hand, the bits per codeword is
something that should be passed from demuxer to decoder in the cases
where the container format actually knows it (which we don't really have
any good cases of anyway), and for that, a codec private option doesn't
really work. But for now, an option might be good enough to decouple
this from the bits per sample definition, until we have real samples
that need the more detailed handling.

>>> See Bug #54
>>
>> Oh, I didn't know about this issue.
>>
>>>> Due to the way avcodec_encode_audio is defined for non frame
>>>> based codecs, calculating the number of input samples in the
>>>> encode function has to be changed once the bits per samples
>>>> definition is changed.
>>>
>>> It isn't defined that way for non-frame-based codecs; it's defined that
>>> way for PCM. The g722 encoder is just wrong. All other ADPCM encoders
>>> set frame_size and use it accordingly.
>>
>> Hmm, so are there any other non-PCM encoders that don't set a frame
>> size? avcodec.h says this behaviour is for PCM codecs only, while
>> utils.c does it for any codec that has a frame_size <= 1.
>
> AFAIK, PCM and G.722 are the only encoders which do not set a
> frame_size > 1. But I haven't checked them all yet.
>
>> My issue (that pts isn't generated properly) is fixed by setting any
>> frame_size in the encoder init function (and changing the encode_frame
>> function accordingly), but does that make sense for a codec that can use
>> any frame size? (For this codec, the user could set any other frame size
>> he wants after initializing the encoder.) And if we'd want to change
>> this to use a default frame size, what would be a suitable default?
>
> If the user changes frame_size after init with almost any encoder (with
> the exception of the last frame for CODEC_CAP_SMALL_LAST_FRAME), it will
> generally break horribly. I'm actually debating how to handle PCM and
> other such codecs in the new encoding API that I'm working on. One
> option would be a CODEC_CAP that indicates that the encoder will take
> any number of samples from the user (via AVFrame.nb_samples) and encode
> them. Another option would be to have them behave like all the others
> and set some arbitrary frame_size.

Ok - a more straightforward handling of these things in the new API
really is a welcome addition :-)

// Martin
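To make the option idea concrete, a hedged sketch of what a private
decoder option could look like, using present-day AVOption/AVClass
spellings (the 2011 tree differed in some details, and the option and
struct names shown are illustrative):

    #include <stddef.h>
    #include "libavutil/opt.h"

    typedef struct G722DecContext {
        const AVClass *class;
        int bits_per_codeword;          /* 8, 7 or 6 */
        /* ... existing decoder state ... */
    } G722DecContext;

    static const AVOption g722_options[] = {
        { "bits_per_codeword", "Bits per G722 codeword",
          offsetof(G722DecContext, bits_per_codeword), AV_OPT_TYPE_INT,
          { .i64 = 8 }, 6, 8, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_DECODING_PARAM },
        { NULL }
    };

    static const AVClass g722_decoder_class = {
        .class_name = "g722 decoder",
        .item_name  = av_default_item_name,
        .option     = g722_options,
        .version    = LIBAVUTIL_VERSION_INT,
    };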
diff --git a/libavcodec/g722enc.c b/libavcodec/g722enc.c
index bdc30d5..706a7ae 100644
--- a/libavcodec/g722enc.c
+++ b/libavcodec/g722enc.c
@@ -280,12 +280,12 @@ static int g722_encode_frame(AVCodecContext *avctx,
 {
     G722Context *c = avctx->priv_data;
     const int16_t *samples = data;
-    int i;
+    int i, nsamples = buf_size * 8 / av_get_bits_per_sample(avctx->codec->id);
 
     if (avctx->trellis)
         return g722_encode_trellis(avctx, dst, buf_size, data);
 
-    for (i = 0; i < buf_size >> 1; i++) {
+    for (i = 0; i < nsamples >> 1; i++) {
         int xlow, xhigh, ihigh, ilow;
         filter_samples(c, &samples[2*i], &xlow, &xhigh);
         ihigh = encode_high(&c->band[1], xhigh);
diff --git a/libavcodec/utils.c b/libavcodec/utils.c
index 53440e0..59588d7 100644
--- a/libavcodec/utils.c
+++ b/libavcodec/utils.c
@@ -1151,8 +1151,8 @@ int av_get_bits_per_sample(enum CodecID codec_id){
     case CODEC_ID_ADPCM_SWF:
     case CODEC_ID_ADPCM_MS:
     case CODEC_ID_ADPCM_YAMAHA:
-        return 4;
     case CODEC_ID_ADPCM_G722:
+        return 4;
     case CODEC_ID_PCM_ALAW:
     case CODEC_ID_PCM_MULAW:
     case CODEC_ID_PCM_S8: