How do I get the Y component from the CMSampleBuffer produced by an AVCaptureSession?
28-09-2019
Question
Hey, I'm trying to access the raw data from the iPhone camera using AVCaptureSession. I'm following the guide provided by Apple (link here).
The raw data from the sample buffer is in YUV format (am I correct here about the raw video frame format?). How do I directly obtain the data for the Y component out of the raw data stored in the sample buffer?
Solution
When setting up the AVCaptureVideoDataOutput that returns the raw camera frames, you can set the format of the frames using code like the following:
[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
In this case a BGRA pixel format is specified (I used this to match a color format for an OpenGL ES texture). Each pixel in that format has one byte for blue, green, red, and alpha, in that order. Going with this makes it easy to pull out color components, but you do sacrifice a little performance by needing to make the conversion from the camera-native YUV colorspace.
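As a minimal sketch of reading that layout inside the delegate callback (assuming the BGRA setting above; the coordinate is just a placeholder), the individual components of one pixel could be pulled out like this:

    // Sketch: reading one BGRA pixel inside captureOutput:didOutputSampleBuffer:fromConnection:
    // Assumes the output was configured with kCVPixelFormatType_32BGRA as shown above.
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
    unsigned char *base = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);

    size_t x = 0, y = 0; // placeholder coordinate
    unsigned char *pixel = base + y * bytesPerRow + x * 4;
    unsigned char blue  = pixel[0];
    unsigned char green = pixel[1];
    unsigned char red   = pixel[2];
    unsigned char alpha = pixel[3];

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);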
Other supported colorspaces are kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange and kCVPixelFormatType_420YpCbCr8BiPlanarFullRange on newer devices, and kCVPixelFormatType_422YpCbCr8 on the iPhone 3G. The VideoRange or FullRange suffix simply indicates whether the bytes are returned in the range of 16 - 235 for Y and 16 - 240 for UV, or the full 0 - 255 for each component.
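If you would rather skip the BGRA conversion, a minimal sketch of requesting one of the biplanar YUV formats instead might look like this (assuming a device that supports it, i.e. not an iPhone 3G):

    // Sketch: asking for the camera-native biplanar YUV format instead of BGRA.
    // Assumes the device supports kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange.
    [videoOutput setVideoSettings:
        [NSDictionary dictionaryWithObject:
            [NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange]
                                    forKey:(id)kCVPixelBufferPixelFormatTypeKey]];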
I believe the default colorspace used by an AVCaptureVideoDataOutput instance is the YUV 4:2:0 planar colorspace (except on the iPhone 3G, where it's YUV 4:2:2 interleaved). This means that there are two planes of image data contained within the video frame, with the Y plane coming first. For every pixel in your resulting image, there is one byte for the Y value at that pixel.
You could get at this raw Y data by implementing something like the following in your delegate callback:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    unsigned char *rawPixelBase = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);

    // Do something with the raw pixels here

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
}
You could then figure out the location in the frame data for each X, Y coordinate of the image and pull out the byte that corresponds to the Y component at that coordinate.
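As a rough sketch of that lookup (assuming one of the biplanar YUV formats so that plane 0 is the Y plane, and using a placeholder coordinate), the byte could be pulled out like this:

    // Sketch: looking up the luminance byte at (x, y) in the Y plane.
    // Assumes a biplanar YUV pixel format, so plane 0 holds one Y byte per pixel;
    // bytesPerRow may be larger than the width because of row padding.
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    unsigned char *yPlane = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);

    size_t x = 0, y = 0; // placeholder coordinate
    unsigned char luminance = yPlane[y * bytesPerRow + x];

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);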
Apple's FindMyiCone sample from WWDC 2010 (accessible along with the session videos) shows how to process raw BGRA data from each frame. I also created a sample application, whose code you can download here, that performs color-based object tracking using the live video from the iPhone camera. Both show how to process raw pixel data, but neither of them works in the YUV colorspace.
Other tips
In addition to Brad's answer (and your own code), you also want to consider the following:
Since your image has two separate planes, the function CVPixelBufferGetBaseAddress will not return the base address of a plane but rather the base address of an additional data structure. It's probably due to the current implementation that you get an address close enough to the first plane that you can see the image. But that's also the reason it is shifted and has garbage in the top left. The correct way to receive the first plane is:
unsigned char *rowBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
A row in the image may be longer than the width of the image (due to rounding). That's why there are separate functions for getting the width and the number of bytes per row. You don't have this problem at the moment, but that might change with the next version of iOS. So your code should be:
int bufferHeight = CVPixelBufferGetHeight(pixelBuffer);
int bufferWidth = CVPixelBufferGetWidth(pixelBuffer);
int bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
int size = bufferHeight * bytesPerRow;

unsigned char *pixel = (unsigned char *)malloc(size);
unsigned char *rowBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
memcpy(pixel, rowBase, size);
Please also note that your code will fail miserably on an iPhone 3G.
If you only need the luminance channel, I recommend against using the BGRA format, as it comes with a conversion overhead. Apple suggests using BGRA if you're doing rendering work, but you don't need it for extracting the luminance information. As Brad has already mentioned, the most efficient format is the camera-native YUV format.
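If you are unsure which formats a particular device offers, one way to check is AVCaptureVideoDataOutput's availableVideoCVPixelFormatTypes, whose first entry is documented to be the most efficient (native) format. A minimal sketch, assuming videoOutput is your data output:

    // Sketch: listing the pixel formats the data output supports.
    // The first entry of availableVideoCVPixelFormatTypes is the most
    // efficient (camera-native) format according to Apple's documentation.
    for (NSNumber *formatNumber in videoOutput.availableVideoCVPixelFormatTypes) {
        OSType format = [formatNumber unsignedIntValue];
        NSLog(@"Supported pixel format: %c%c%c%c",
              (char)(format >> 24), (char)(format >> 16),
              (char)(format >> 8),  (char)format);
    }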
However, extracting the right bytes from the sample buffer is a bit tricky, especially on the iPhone 3G with its interleaved YUV 4:2:2 format. So here is my code, which works fine with the iPhone 3G, 3GS, iPod touch 4, and iPhone 4S.
#pragma mark -
#pragma mark AVCaptureVideoDataOutputSampleBufferDelegate Methods
#if !(TARGET_IPHONE_SIMULATOR)
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Get the image buffer reference
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    // Extract the needed information from the image buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    size_t bufferSize = CVPixelBufferGetDataSize(imageBuffer);
    unsigned char *baseAddress = (unsigned char *)CVPixelBufferGetBaseAddress(imageBuffer);
    CGSize resolution = CGSizeMake(CVPixelBufferGetWidth(imageBuffer), CVPixelBufferGetHeight(imageBuffer));

    // Variables for the grayscale buffer
    unsigned char *grayscaleBuffer = NULL;
    size_t grayscaleBufferSize = 0;

    // The pixel format differs between the iPhone 3G and later models
    OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);

    if (pixelFormat == '2vuy') { // iPhone 3G
        // kCVPixelFormatType_422YpCbCr8 = '2vuy',
        /* Component Y'CbCr 8-bit 4:2:2, ordered Cb Y'0 Cr Y'1 */

        // Copy every second byte (the luminance bytes forming the Y channel) to a new buffer
        grayscaleBufferSize = bufferSize / 2;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return;
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);

        unsigned char *sourceMemPos = baseAddress + 1;
        unsigned char *destinationMemPos = grayscaleBuffer;
        unsigned char *destinationEnd = grayscaleBuffer + grayscaleBufferSize;
        while (destinationMemPos < destinationEnd) {
            *destinationMemPos = *sourceMemPos;
            destinationMemPos += 1;
            sourceMemPos += 2;
        }
    }

    if (pixelFormat == '420v' || pixelFormat == '420f') {
        // kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange = '420v',
        // kCVPixelFormatType_420YpCbCr8BiPlanarFullRange  = '420f',
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, video-range (luma=[16,235] chroma=[16,240]).
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, full-range (luma=[0,255] chroma=[1,255]).
        // baseAddress points to a big-endian CVPlanarPixelBufferInfo_YCbCrBiPlanar struct,
        // so get the base address of the Y plane (plane 0) instead.
        size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0);
        baseAddress = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0);

        grayscaleBufferSize = resolution.height * bytesPerRow;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return;
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);
        memcpy(grayscaleBuffer, baseAddress, grayscaleBufferSize);
    }

    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    // Do whatever you want with the grayscale buffer
    // ...

    // Clean up
    free(grayscaleBuffer);
}
#endif
This is simply the culmination of everyone else's hard work, above and in other threads, converted to Swift 3 for anyone who finds it useful.
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    if let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags.readOnly)

        let pixelFormatType = CVPixelBufferGetPixelFormatType(pixelBuffer)
        if pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
            || pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange {

            let bufferHeight = CVPixelBufferGetHeight(pixelBuffer)
            let bufferWidth = CVPixelBufferGetWidth(pixelBuffer)

            let lumaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
            let size = bufferHeight * lumaBytesPerRow
            let lumaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
            let lumaByteBuffer = unsafeBitCast(lumaBaseAddress, to: UnsafeMutablePointer<UInt8>.self)

            let releaseDataCallback: CGDataProviderReleaseDataCallback = { (info: UnsafeMutableRawPointer?, data: UnsafeRawPointer, size: Int) -> () in
                // https://developer.apple.com/reference/coregraphics/cgdataproviderreleasedatacallback
                // N.B. 'CGDataProviderRelease' is unavailable: Core Foundation objects are automatically memory managed
                return
            }

            if let dataProvider = CGDataProvider(dataInfo: nil, data: lumaByteBuffer, size: size, releaseData: releaseDataCallback) {
                let colorSpace = CGColorSpaceCreateDeviceGray()
                let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.noneSkipFirst.rawValue)

                let cgImage = CGImage(width: bufferWidth, height: bufferHeight, bitsPerComponent: 8, bitsPerPixel: 8, bytesPerRow: lumaBytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo, provider: dataProvider, decode: nil, shouldInterpolate: false, intent: CGColorRenderingIntent.defaultIntent)
                let greyscaleImage = UIImage(cgImage: cgImage!)
                // do what you want with the greyscale image.
            }
        }

        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags.readOnly)
    }
}