使用DPCM进行图像压缩的C++实现方法
我们知道,对于图像或视频的每一帧而言,相邻的像素之间有着较强的相关性——除了在边缘、轮廓等位置,相邻像素的像素值相差并不大,也就是说,图像或视频的一帧中存在着很大的空间冗余。而DPCM(Differential Pulse Code Modulation,差分脉冲编码调制)便是一种简单而高效的去冗余算法。
基本原理
DPCM
对于去除引言中提到的空间冗余,一种很自然的想法便是,使用前一个像素的像素值作为当前像素值的预测,然后只存储并传输当前像素值和预测值的差值。
以8 bit量化的图像为例,差值的范围在 [ − 255 , 255 ] [-255, 255] [−255,255],完全表示需要9 bit,但并不是在所有图像中都存在由纯黑突变到纯白的情况,因而如果实际的动态范围只有 [ − 63 , 63 ] [-63, 63] [−63,63],那么就可以只用7 bit表示,从而实现初步的压缩。
但除此之外,我们可以更多地依靠将差值量化,以进行更进一步的压缩。量化的原理实际上非常简单:例如对于一组取值为 [ 0 , 255 ] [0,255] [0,255]的数据,若要进行4 bit量化,我们只需要将它们分别除以 2 8 − 4 = 16 2^{8-4}=16 28−4=16并向下取整即可,这样数据范围变成了 [ 0 , 15 ] [0,15] [0,15],也就完成了量化。对于差值信号 d ∈ [ − 255 , 255 ] d\in[-255, 255] d∈[−255,255],若同样要进行4 bit量化,我们就要先将双极性数据转化为单极性,再重复前面的操作即可:
d ^ = ⌊ d + 255 2 9 − 4 ⌋ \hat d = \left\lfloor \dfrac{d+255} {2^{9-4}} \right\rfloor d^=⌊29−4d+255⌋
我们再将量化后的预测误差与预测值叠加,就得到了该像素的重建(reconstruction)值,同时作为下一个像素的预测值。
至此,我们就完成了DPCM编码。实际上,在编码器中已经包含了解码器,因为解码同样只需要将预测值和量化预测误差相加。
其对应的框图如下:
DPCM编解码器原理框图
PSNR
为了便于我们对实验结果进行定量分析,在这里再简单介绍一下PSNR(Peak Signal to Noise Ratio,峰值信号噪声比)。
首先计算均方误差 MSE(Mean Square Error):
M S E = ∑ i = 0 M ∑ j = 0 N [ f ( i , j ) − f ′ ( i , j ) ] 2 M N {\rm MSE} = \frac{\sum_{i = 0}^M \sum_{j=0}^N \left[f(i,j)-f'(i,j)\right]^{2}}{MN} MSE=MN∑i=0M∑j=0N[f(i,j)−f′(i,j)]2
其中 M , N M,N M,N为图像的宽和高, f ( i , j ) , f ’ ( i , j ) f(i,j),f’(i,j) f(i,j),f’(i,j)为原图像和重建图像在 ( i , j ) (i,j) (i,j)的像素值。PSNR(单位:dB)以如下方式计算:
P S N R = 10 lg ( 2 b i t s − 1 ) 2 M S E {\rm PSNR} = 10\lg \dfrac {\left(2^{\rm bits}-1\right)^2}{\rm MSE} PSNR=10lgMSE(2bits−1)2
其中 b i t s {\rm bits} bits为原图像地量化比特数,即8。
一般来说:
- PSNR ≥ 40 dB \text {PSNR} \ge 40 \text {dB} PSNR≥40dB时,图像质量非常好,接近于原图像;
- 30 dB ≤ PSNR < 40 dB 30 \text {dB} \le \text {PSNR} < 40 \text {dB} 30dB≤PSNR<40dB时,图像有可察觉的失真,但质量仍可接受;
- 20 dB ≤ PSNR < 30 dB 20 \text {dB} \le \text {PSNR} < 30 \text {dB} 20dB≤PSNR<30dB时,图像质量较差;
- PSNR < 20 dB \text {PSNR} < 20 \text {dB} PSNR<20dB时,图像质量已经无法接受。
Huffman编码
Huffman是一种无失真信源编码算法,编码后可以得到即时的最佳码。在这里我们直接调用现有的Huffman编码程序,将原图像及量化后的预测误差图像进行压缩编码。
核心代码
前面没有详细说明对于每一行的第一个像素的预测值取值的问题,对于该问题有两种处理方法:
- 每行第一个像素的预测值均取128;
- 每一行第一个像素的预测误差直接取0,并将该像素值赋给重建图像对应像素。
这里我们选择以一种方法,并根据前面的原理可以写出如下的代码:
int w = 256;
int h = 256;
int PixelOverflow(int value, int thLower, int thUpper) {
if (value < thLower) {
return thLower;
} else if (value > thUpper) {
return thUpper;
} else {
return unsigned char(value);
}
}
void DpcmEncoding(unsigned char* yBuff, unsigned char* qPredErrBuff, unsigned char* reconBuff, int qBits) {
int prediction;
int predErr; // Prediction error
int invPredErr; // Inverse quantised value of quantised prediction error
for (int i = 0; i < h; i++) {
prediction = 128; // The prediction of the first pixel of each row set to be 128
predErr = yBuff[i * w] - prediction; // predErr with the domain of [-128, 128] (8-bit)
int temp = (predErr + 128) / pow(2, 8 - qBits); // qBits-bit quantisation
// (predErr + 128) with the domain of [0, 256]
qPredErrBuff[i * w] = PixelOverflow(temp, 0, pow(2, qBits) - 1);
invPredErr = qPredErrBuff[i * w] * pow(2, 8 - qBits) - 128; // Inverse quantisation
reconBuff[i * w] = PixelOverflow(invPredErr + prediction, 0, 255); // Reconstruction level
for (int j = 1; j < w; j++) { // Strat from the second pixel of each row
prediction = reconBuff[i * w + j - 1]; // The previous pixel value set as prediction
predErr = yBuff[i * w + j] - prediction; // predErr with the domain of [-255, 255] (9-bit)
int temp = (predErr + 255) / pow(2, 9 - qBits); // qBits-bit quantisation
// (predErr + 255) with the domain of [0, 510]; [0, 2^(qBits) - 1] after division
qPredErrBuff[i * w + j] = PixelOverflow(temp, 0, (pow(2, qBits) - 1)); // (predErr + 255) with the domain of [0, 255]
invPredErr = qPredErrBuff[i * w + j] * pow(2, 9 - qBits) - 255;
reconBuff[i * w + j] = PixelOverflow(invPredErr + prediction, 0, 255); // Reconstruction level
}
}
}
这里需要说明的是,由于量化后得到的是一组索引值,因而需要通过反量化恢复正确的灰度值范围。
实验结果分析
量化预测误差图像、重建图像
以8 bit Lena灰度图作为测试图像,进行DPCM + 8 bit量化时的原图、预测误差图像和重建图像如下:
原图、预测误差图像、重建图像(8 bit量化)
下面再对比一下7 bit—1 bit量化的效果:
7 bit—1 bit量化的重建图像
在5 bit量化时,不仔细分辨不会看出较大差异;4 bit时,已经出现可察觉的失真;3 bit及以下出现较大的失真,图像质量难以接受。
由于量化预测误差范围为 [ 0 , 2 b i t s ] \left[0,2^{{\rm bits}}\right] [0,2bits],因而随着量化比特数降低,预测误差图像中的人物边缘将越来越不明显,因此这里就不再展示。
在这里我们也可以发现一点,例如对于量化误差进行1 bit量化,并不代表重建图像只有0和255两个亮度电平,这也是相比于直接对原图像进行量化的优势之一。
定量分析
计算PMF和熵
首先可以计算原图和预测误差的概率密度函数以及对应的信源熵。具体方法可参考求RGB图像各分量的概率分布和熵。
可以看到,原图像中存在较大的相关性,进行了DPCM编码后,相关性得到了较好的去除,从而实现了压缩。
计算PSNR
我们可以通过计算重建图像的PSNR来定量计算图像的压缩质量,相应的代码如下:
void PrintPSNR(unsigned char* oriBuffer, unsigned char* recBuffer, int qBits, const char* psnrFileName) {
double mse;
double sum = 0;
double temp;
double psnr;
for (int i = 0; i < w * h; i++) {
temp = pow((oriBuffer[i] - recBuffer[i]), 2);
sum += temp;
}
mse = sum / (w * h);
psnr = 10 * log10(255 * 255 / mse);
FILE* outFilePtr;
if (fopen_s(&outFilePtr, psnrFileName, "ab") == 0) {
cout << "Successfully opened \"" << psnrFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << psnrFileName << "\".\n";
exit(-1);
}
fprintf(outFilePtr, "%d,%lf\n", qBits, psnr);
fclose(outFilePtr);
}
计算结果如下:
可以看出,从客观评价(定量)角度对图像质量的分析与前面从主观评价角度对图像质量的分析基本吻合。
计算压缩比
最后我们可以将经过了Huffman编码的原图像,与进行了DPCM和Huffman两种编码的图像(量化后的预测误差图像)进行对比,并计算压缩比,结果如图。
完整代码
最后附上完整代码,供大家参考,如有问题欢迎评论区交流、指正。
declarations.h
#pragma once
extern int w; // Width of image
extern int h; // Height of image
void DpcmEncoding(unsigned char* yBuff, unsigned char* qPredErrBuff, unsigned char* reconBuff, int qBits);
void PrintPMF_Entropy(unsigned char* buffer, int qBits, const char* pmfFileName, const char* entrFileName);
void PrintPSNR(unsigned char* oriBuffer, unsigned char* recBuffer, int qBits, const char* psnrFileName);
DPCM.cpp
#include <iostream>
#include "declarations.h"
int w = 256;
int h = 256;
int PixelOverflow(int value, int thLower, int thUpper) {
if (value < thLower) {
return thLower;
} else if (value > thUpper) {
return thUpper;
} else {
return unsigned char(value);
}
}
void DpcmEncoding(unsigned char* yBuff, unsigned char* qPredErrBuff, unsigned char* reconBuff, int qBits) {
int prediction;
int predErr; // Prediction error
int invPredErr; // Inverse quantised value of quantised prediction error
for (int i = 0; i < h; i++) {
prediction = 128; // The prediction of the first pixel of each row set to be 128
predErr = yBuff[i * w] - prediction; // predErr with the domain of [-128, 128] (8-bit)
int temp = (predErr + 128) / pow(2, 8 - qBits); // qBits-bit quantisation
// (predErr + 128) with the domain of [0, 256]
qPredErrBuff[i * w] = PixelOverflow(temp, 0, pow(2, qBits) - 1);
invPredErr = qPredErrBuff[i * w] * pow(2, 8 - qBits) - 128; // Inverse quantisation
reconBuff[i * w] = PixelOverflow(invPredErr + prediction, 0, 255); // Reconstruction level
for (int j = 1; j < w; j++) { // Strat from the second pixel of each row
prediction = reconBuff[i * w + j - 1]; // The previous pixel value set as prediction
predErr = yBuff[i * w + j] - prediction; // predErr with the domain of [-255, 255] (9-bit)
int temp = (predErr + 255) / pow(2, 9 - qBits); // qBits-bit quantisation
// (predErr + 255) with the domain of [0, 510]; [0, 2^(qBits) - 1] after division
qPredErrBuff[i * w + j] = PixelOverflow(temp, 0, (pow(2, qBits) - 1)); // (predErr + 255) with the domain of [0, 255]
invPredErr = qPredErrBuff[i * w + j] * pow(2, 9 - qBits) - 255;
reconBuff[i * w + j] = PixelOverflow(invPredErr + prediction, 0, 255); // Reconstruction level
}
}
}
Stats.cpp
#include <iostream>
#include "declarations.h"
using namespace std;
void PrintPMF_Entropy(unsigned char* buffer, int qBits, const char* pmfFileName, const char* entrFileName) {
int count[256] = { 0 }; // Counter
double freq[256] = { 0 }; // Frequency
double entropy = 0;
for (int i = 0; i < w * h; i++) {
int index = (int)buffer[i];
count[index]++;
}
for (int i = 0; i < 256; i++) {
freq[i] = (double)count[i] / (w * h);
if (freq[i] != 0) {
entropy += (-freq[i]) * log(freq[i]) / log(2);
}
}
FILE* pmfFilePtr;
FILE* entrFilePtr;
if (fopen_s(&pmfFilePtr, pmfFileName, "wb") == 0) {
cout << "Successfully opened \"" << pmfFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << pmfFileName << "\".\n";
exit(-1);
}
if (fopen_s(&entrFilePtr, entrFileName, "ab") == 0) {
cout << "Successfully opened \"" << entrFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << entrFileName << "\".\n";
exit(-1);
}
fprintf(pmfFilePtr, "Symbol,Frequency\n");
for (int i = 0; i < 256; i++) {
fprintf(pmfFilePtr, "%-3d,%-8.2e\n", i, freq[i]); // 将数据输出到文件中(csv文件以“,”作为分隔符)
}
fprintf(entrFilePtr, "%d,%.4lf\n", qBits, entropy);
fclose(pmfFilePtr);
fclose(entrFilePtr);
}
void PrintPSNR(unsigned char* oriBuffer, unsigned char* recBuffer, int qBits, const char* psnrFileName) {
double mse;
double sum = 0;
double temp;
double psnr;
for (int i = 0; i < w * h; i++) {
temp = pow((oriBuffer[i] - recBuffer[i]), 2);
sum += temp;
}
mse = sum / (w * h);
psnr = 10 * log10(255 * 255 / mse);
FILE* outFilePtr;
if (fopen_s(&outFilePtr, psnrFileName, "ab") == 0) {
cout << "Successfully opened \"" << psnrFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << psnrFileName << "\".\n";
exit(-1);
}
fprintf(outFilePtr, "%d,%lf\n", qBits, psnr);
fclose(outFilePtr);
}
main.cpp
#include <iostream>
#include <string>
#include <sstream>
#include "declarations.h"
using namespace std;
int main(int argc, char* argv[]) {
int qBits = 1;
const char* orFileName = "Lena.yuv";
const char* qpeFileName = "Lena_QPE (1 bit).yuv"; // Name of quantised prediction error file
const char* recFileName = "Lena_reconstruction (1 bit).yuv"; // Name of reconstruction level file
FILE* oriFilePtr;
FILE* qpeFilePtr;
FILE* recFilePtr;
if (fopen_s(&oriFilePtr, orFileName, "rb") == 0) {
cout << "Successfully opened \"" << orFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << orFileName << "\".\n";
exit(-1);
}
if (fopen_s(&qpeFilePtr, qpeFileName, "wb") == 0) {
cout << "Successfully opened \"" << qpeFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << qpeFileName << "\".\n";
exit(-1);
}
if (fopen_s(&recFilePtr, recFileName, "wb") == 0) {
cout << "Successfully opened \"" << recFileName << "\".\n";
} else {
cout << "WARNING!! Failed to open \"" << recFileName << "\".\n";
exit(-1);
}
unsigned char* oriYBuff = new unsigned char[w * h];
unsigned char* qpeYBuff = new unsigned char[w * h];
unsigned char* recYbuff = new unsigned char[w * h];
unsigned char* uBuff = new unsigned char[w * h / 4];
unsigned char* vBuff = new unsigned char[w * h / 4];
fread(oriYBuff, sizeof(unsigned char), w * h, oriFilePtr);
DpcmEncoding(oriYBuff, qpeYBuff, recYbuff, qBits);
memset(uBuff, 128, w * h / 4);
memset(vBuff, 128, w * h / 4);
fwrite(qpeYBuff, sizeof(unsigned char), w * h, qpeFilePtr);
fwrite(uBuff, sizeof(unsigned char), w * h / 4, qpeFilePtr); // Greyscale image
fwrite(vBuff, sizeof(unsigned char), w * h / 4, qpeFilePtr);
fwrite(recYbuff, sizeof(unsigned char), w * h, recFilePtr);
fwrite(uBuff, sizeof(unsigned char), w * h / 4, recFilePtr); // Greyscale image
fwrite(vBuff, sizeof(unsigned char), w * h / 4, recFilePtr);
PrintPMF_Entropy(oriYBuff, qBits, "Lena-PMF (1 bit).csv", "Lena-entropy.csv");
PrintPMF_Entropy(qpeYBuff, qBits, "Lena_QPE-PMF (1 bit).csv", "Lena_QPE-entropy.csv");
PrintPSNR(oriYBuff, recYbuff, qBits, "Lena_reconstruction-PSNR.csv");
fclose(oriFilePtr);
fclose(qpeFilePtr);
fclose(recFilePtr);
delete[]oriYBuff;
delete[]qpeYBuff;
delete[]recYbuff;
delete[]uBuff;
delete[]vBuff;
}