mpeg标准8x8DCT变换以及定点化实现
Mpeg1/2标准的DCT变换与H264的整数DCT不太一样,会有小数运算。硬件实现时会做定点化处理,因此也会产生误差,误差主要体现在某些数值小数部分在0.5左右。比如小数运算时数值为4.4999,四舍五入最终像素值为4;如果定点化结果为4.50001,实际上与小数运算差异不大,但最终像素值却相差1。1.DCT简介DCT和DCT反变换可用如下公式表示:其中,Xij是图像块 X 中第 i 行第 j
Mpeg1/2标准的DCT变换与H264的整数DCT不太一样,会有小数运算。硬件实现时会做定点化处理,因此也会产生误差,误差主要体现在某些数值小数部分在0.5左右。比如小数运算时数值为4.4999,四舍五入最终像素值为4;如果定点化结果为4.50001,实际上与小数运算差异不大,但最终像素值却相差1。
1.DCT简介
DCT和DCT反变换可用如下公式表示:
其中,Xij是图像块 X 中第 i 行第 j 列图像或残差值,Ymn 是变换结果矩阵 Y 相应频率点上的 DCT 系 数。可以用矩阵表示
Y=AXATY=AXA^TY=AXAT
X=ATYAX=A^TYAX=ATYA
Mpeg1/2标准的DCT是8x8大小的。
令a=1/8a=\sqrt{1/8}a=1/8,
b=12cos(π8)b=\frac{1}{2}\cos{\left(\frac{\pi}{8}\right)}b=21cos(8π),
c=12cos(3π8)c=\frac{1}{2}\cos{\left(\frac{3\pi}{8}\right)}c=21cos(83π) ,
d=12cos(π16)d=\frac{1}{2}\cos{\left(\frac{\pi}{16}\right)}d=21cos(16π),
e=12cos(316π)e=\frac{1}{2}\cos{\left(\frac{3}{16}\pi\right)}e=21cos(163π),
f=12cos516πf=\frac{1}{2}\cos{\frac{5}{16}\pi}f=21cos165π,
g=12cos716πg=\frac{1}{2}\cos{\frac{7}{16}\pi}g=21cos167π,
则 矩阵A可以如下表示
(aaaaaaaadefg−g−f−e−dbc−c−b−b−ccbe−g−d−ffdg−ea−a−aaa−a−aaf−dge−e−gd−fc−bb−c−cb−bcg−fe−dd−ef−g)\begin{pmatrix} a & a & a & a& a& a& a& a \\ d & e&f &g&-g&-f&-e&-d \\ b&c&-c&-b&-b&-c&c&b \\ e&-g&-d&-f&f&d&g&-e \\ a& -a&-a&a&a&-a&-a&a \\ f&-d&g&e&-e&-g&d&-f \\ c&-b&b&-c&-c&b&-b&c\\ g&-f&e&-d&d&-e&f&-g \end{pmatrix}⎝⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎛adbeafcgaec−g−a−d−b−faf−c−d−agbeag−b−fae−c−da−g−bfa−e−cda−f−cd−a−gb−ea−ecg−ad−bfa−db−ea−fc−g⎠⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎞
矩阵运算可以使用蝶形变换加速,以8x8反变换为例,分析ATYA^TYATY的第一列,
其中间结果用acc0-acc7表示,如下
(acc0acc1acc2acc3)=(abacac−a−ba−c−aba−ba−c)(x0x2x4x6)\begin{pmatrix} acc0 \\ acc1 \\ acc 2 \\ acc3 \\ \end{pmatrix} =\begin{pmatrix} a&b& a& c\\ a&c&-a&-b \\ a&-c&-a&b\\ a&-b&a&-c \end{pmatrix}\begin{pmatrix} x0\\x2 \\ x4\\ x6 \end{pmatrix}⎝⎜⎜⎛acc0acc1acc2acc3⎠⎟⎟⎞=⎝⎜⎜⎛aaaabc−c−ba−a−aac−bb−c⎠⎟⎟⎞⎝⎜⎜⎛x0x2x4x6⎠⎟⎟⎞
(acc4acc5acc6acc7)=(defge−g−d−ff−dgeg−fe−d)(x1x3x5x7)\begin{pmatrix} acc4 \\ acc5 \\ acc 6 \\ acc7 \\ \end{pmatrix} =\begin{pmatrix} d&e& f& g\\ e&-g&-d&-f \\ f&-d&g&e\\ g&-f&e&-d \end{pmatrix}\begin{pmatrix} x1\\x3 \\ x5\\ x7 \end{pmatrix}⎝⎜⎜⎛acc4acc5acc6acc7⎠⎟⎟⎞=⎝⎜⎜⎛defge−g−d−ff−dgeg−fe−d⎠⎟⎟⎞⎝⎜⎜⎛x1x3x5x7⎠⎟⎟⎞
最终结果为
y0=acc0+acc4y0 = acc0+acc4y0=acc0+acc4
y1=acc1+acc5y1 = acc1+acc5y1=acc1+acc5
y2=acc2+acc6y2 = acc2+acc6y2=acc2+acc6
y3=acc3+acc7y3 = acc3+acc7y3=acc3+acc7
y4=acc3−acc7y4 = acc3-acc7y4=acc3−acc7
y5=acc2−acc6y5 = acc2-acc6y5=acc2−acc6
y6=acc1−acc5y6= acc1-acc5y6=acc1−acc5
y7=acc0−acc4y7 = acc0-acc4y7=acc0−acc4
DCT反变换的几种python实现
- 按照公式直接计算
def normal_dct(block):
A = np.zeros((8, 8)) # 生成0矩阵
shape = src.shape[1] # 获取维数
for i in range(8):
for j in range(8):
if (i == 0):
x = sqrt(1 / shape)
else:
x = sqrt(2 / shape)
A[i][j] = x * cos(pi * (j + 0.5) * i / shape)
return np.dot(np.dot(A.T, src), A)
- 使用scipy的dct函数
from scipy.fftpack import dct, idct
def idct2(block):
return idct(idct(block.T, norm = 'ortho').T, norm = 'ortho')
2.定点化实现
定点主要是把小数转成整型数据范围,通过整形加减以及移位操作替换小数运算。DCT运算过程中矩阵AAA中的参数都是小数,需要转为整数进行运算。
小数可以用2的负数次幂相加来表示,以a=1/8a=\sqrt{1/8}a=1/8为例,可以表示为2−2+2−4+2−5+2−7+2−92^{-2} + 2^{-4} + 2^{-5} + 2^{-7} + 2^{-9}2−2+2−4+2−5+2−7+2−9。实现过程中以16个元素的数组来表示,每个数组元素所在数组下标对应2的幂次,则a的表项为a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]。
# a = sqrt(1/8) = 0.35355 = 2^-2 + 2^-4 + 2^-5 + 2^-7 + 2^-9
a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]
# b = (1/2*cos(PI/8)) = 0.46194 = 2^-1 - 2^-5 - 2^-7 + 2^-10
b_table = [0,1,0,0,0,-1,0,-1,0,0,1,0,0,0,0,0]
# c = (1/2*cos(PI*3/8)) = 0.19134 = 2^-3 + 2^-4 + 2^-8 - 2^-14
c_table = [0,0,0,1,1,0,0,0,1,0,0,0,0,0,-1,0]
# d = 1/2*cos(PI/16) = 0.49039 = 2^-1 - 2^-7 - 2^-9 + 2^-13
d_table = [0,1,0,0,0,0,0,-1,0,-1,0,0,0,1,0,0]
# e = 1/2*cos(PI*3/16) = 0.41573 = 2^-2 + 2^-3 + 2^-5 + 2^-7 + 2^-9 - 2^-12
e_table = [0,0,1,1,0,1,0,1,0,1,0,0,-1,0,0,0]
# f = 1/2*cos(PI*5/16) = 0.27779 = 2^-2 + 2^-5 - 2^-8 + 2^-11
f_table = [0,0,1,0,0,1,0,0,-1,0,0,1,0,0,0,0]
# g = 1/2*cos(PI*7/16) = 0.09755 = 2^-4 + 2^-5 + 2^-8 - 2^-13
g_table = [0,0,0,0,1,1,0,0,1,0,0,0,0,-1,0,0]
# 定点化IDCT
def fixpoint_idct(block):
input = block.reshape(64, 1)
idct_data1 = [0 for i in range(64)]
idct_data2 = [0 for i in range(64)]
tmp_data1 = idct_data1
tmp_data2 = idct_data2
for i in range(64):
tmp_data1[i] = input[i] << 10
#两次蝶形变换
for k in range(2):
for hor in range(8):
a_x0 = b_x2 = a_x4 = c_x6 = 0
c_x2 = b_x6 = 0
d_x1 = e_x3 = f_x5 = g_x7 = 0
e_x1 = g_x3 = d_x5 = f_x7 = 0
f_x1 = d_x3 = g_x5 = e_x7 = 0
g_x1 = f_x3 = e_x5 = d_x7 = 0
i = hor << 3
x0 = tmp_data1[i]
x1 = tmp_data1[i+1]
x2 = tmp_data1[i+2]
x3 = tmp_data1[i+3]
x4 = tmp_data1[i+4]
x5 = tmp_data1[i+5]
x6 = tmp_data1[i+6]
x7 = tmp_data1[i+7]
#遍历长度为16的查找表
for j in range(16):
a_x0 += (x0 >> j) * a_table[j]
b_x2 += (x2 >> j) * b_table[j]
a_x4 += (x4 >> j) * a_table[j]
c_x6 += (x6 >> j) * c_table[j]
c_x2 += (x2 >> j) * c_table[j]
b_x6 += (x6 >> j) * b_table[j]
d_x1 += (x1 >> j) * d_table[j]
e_x1 += (x1 >> j) * e_table[j]
f_x1 += (x1 >> j) * f_table[j]
g_x1 += (x1 >> j) * g_table[j]
d_x3 += (x3 >> j) * d_table[j]
e_x3 += (x3 >> j) * e_table[j]
f_x3 += (x3 >> j) * f_table[j]
g_x3 += (x3 >> j) * g_table[j]
d_x5 += (x5 >> j) * d_table[j]
e_x5 += (x5 >> j) * e_table[j]
f_x5 += (x5 >> j) * f_table[j]
g_x5 += (x5 >> j) * g_table[j]
d_x7 += (x7 >> j) * d_table[j]
e_x7 += (x7 >> j) * e_table[j]
f_x7 += (x7 >> j) * f_table[j]
g_x7 += (x7 >> j) * g_table[j]
acc0 = a_x0 + b_x2 + a_x4 + c_x6
acc1 = a_x0 + c_x2 - a_x4 - b_x6
acc2 = a_x0 - c_x2 - a_x4 + b_x6
acc3 = a_x0 - b_x2 + a_x4 - c_x6
acc4 = d_x1 + e_x3 + f_x5 + g_x7
acc5 = e_x1 - g_x3 - d_x5 - f_x7
acc6 = f_x1 - d_x3 + g_x5 + e_x7
acc7 = g_x1 - f_x3 + e_x5 - d_x7
tmp_data2[hor] = acc0 + acc4
tmp_data2[(1<<3)+hor] = acc1 + acc5
tmp_data2[(2<<3)+hor] = acc2 + acc6
tmp_data2[(3<<3)+hor] = acc3 + acc7
tmp_data2[(4<<3)+hor] = acc3 - acc7
tmp_data2[(5<<3)+hor] = acc2 - acc6
tmp_data2[(6<<3)+hor] = acc1 - acc5
tmp_data2[(7<<3)+hor] = acc0 - acc4
if k == 0:
tmp_data1 = idct_data2
tmp_data2 = idct_data1
for i in range(64):
tmp_data2[i] = (tmp_data2[i] + 512) >> 10
if tmp_data2[i] > 255:
tmp_data2[i] = 255
if tmp_data2[i] < -256:
tmp_data2[i] = -256
return np.asarray(tmp_data2).reshape(8, 8)
更多推荐


所有评论(0)