Original article: http://www.learnopencv.com/average-face-opencv-c-python-tutorial/

Let's look at the result first: the face on the far right is the average of the six faces on the left.
Step 1: Facial Landmark Detection

Use dlib's landmark detector to obtain 68 landmark points for each face.
Step 2: Coordinate Transformation

The input images vary in size, and so do the face regions inside them, so the faces must be normalized by a coordinate transform. Here we warp each face region to 600x600 pixels, placing the outer corner of the left eye at (180, 200) and the outer corner of the right eye at (420, 200). These positions were chosen because we want the face centered, with the two eyes level and sitting one third of the way down the image; so the left eye corner goes to (0.3*width, height/3) and the right eye corner to (0.7*width, height/3).
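The destination positions above follow directly from the rule; a small sketch (the helper name is my own, not from the original code):

```python
# Destination eye-corner positions for a w x h output canvas, following the
# rule above: eyes on the horizontal line at h/3, at 0.3*w and 0.7*w.
def eye_corner_destinations(w, h):
    left = (int(0.3 * w), int(h / 3))
    right = (int(0.7 * w), int(h / 3))
    return left, right

left, right = eye_corner_destinations(600, 600)
# left  -> (180, 200)
# right -> (420, 200)
```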
In dlib's 68-point annotation, the outer corner of the left eye is point 36 and the outer corner of the right eye is point 45, as shown below:
These two point pairs are used to compute a similarity transform, a 2x3 matrix. For a transform that scales by sx and sy in the x and y directions, rotates by an angle θ, and then translates by tx in x and ty in y, the matrix can be written as:

    [ sx*cos(θ)   -sy*sin(θ)   tx ]
    [ sx*sin(θ)    sy*cos(θ)   ty ]

The first two columns encode the scaling and rotation, and the third column the translation. Given a point (x, y), this similarity transform moves it to the point (x', y') using:

    x' = sx*cos(θ)*x - sy*sin(θ)*y + tx
    y' = sx*sin(θ)*x + sy*cos(θ)*y + ty
OpenCV's estimateRigidTransform can recover such a matrix, but unfortunately it requires at least three point pairs, so we have to fabricate a third point. We construct it so that it forms an equilateral triangle with the two known points: rotating the vector from point 2 to point 1 by 60 degrees about point 2 gives the third point's coordinates:

    x3 = cos(60°)*(x1 - x2) - sin(60°)*(y1 - y2) + x2
    y3 = sin(60°)*(x1 - x2) + cos(60°)*(y1 - y2) + y2
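The equilateral-triangle construction can be sketched as follows (the helper name is mine; the arithmetic mirrors the similarityTransform function in the full code below):

```python
import math

# Third-point construction: rotate the vector from p1 to p0 by 60 degrees
# about p1, yielding a point that forms an equilateral triangle with p0, p1.
def third_point(p0, p1):
    s60 = math.sin(60 * math.pi / 180)
    c60 = math.cos(60 * math.pi / 180)
    x = c60 * (p0[0] - p1[0]) - s60 * (p0[1] - p1[1]) + p1[0]
    y = s60 * (p0[0] - p1[0]) + c60 * (p0[1] - p1[1]) + p1[1]
    return (x, y)

# With eye corners at (180, 200) and (420, 200), the apex lands at
# x = 300, equidistant (240 px) from both eye corners.
apex = third_point((180, 200), (420, 200))
```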
Once the similarity transform matrix has been computed, it can be used to transform both the image and its landmarks: the image is transformed with warpAffine, and the points are transformed with cv2.transform.
Step 3: Face Alignment

After step 2, every image has the same size and the two eyes sit at the same positions. If we averaged the faces at this point we would get the result below: only the eyes are aligned, and since no other points are, the rest of the face smears. How do we align the remaining points? We already know the positions of the 68 points in each input image; if we also knew where those 68 points should land in the output image, alignment would be straightforward. Unfortunately, apart from the eye positions, which we defined in advance, it is hard to give reasonable predefined positions for the other points.
The solution is Delaunay triangulation, as follows:
(1) Calculate Mean Face Points
Compute the average position of each landmark over the N similarity-transformed images: the i-th point of the mean face is the average of the i-th points of all the transformed images.
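Step (1) is a per-index mean over images, which is a single NumPy reduction (the tiny two-point arrays below are synthetic, just to show the shape convention; the real arrays are N x 76 x 2):

```python
import numpy as np

# Mean face points: given landmarks for N normalized images as an
# (N, 76, 2) array, the mean face's i-th point is the average of every
# image's i-th point. (76 = 68 face points + 8 boundary points.)
def mean_landmarks(all_points):
    return np.mean(np.asarray(all_points, dtype=np.float32), axis=0)

pts_a = [[0, 0], [10, 10]]
pts_b = [[2, 2], [14, 18]]
avg = mean_landmarks([pts_a, pts_b])
# avg -> [[ 1.  1.], [12. 14.]]
```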
(2) Calculate Delaunay Triangulation
Using the mean face's 68 landmarks plus 8 points on the image boundary, compute the Delaunay triangulation, as shown below. The result is a list of triangles, each represented by indices into the 76-point array (68 face points + 8 boundary points).
(3) Warp Triangles
Compute the Delaunay triangulation for both the input image (after the similarity transform) and the mean face, as in the left and middle images of Figure 7. Triangle 1 in the left image corresponds to triangle 1 in the middle image; from this pair of triangles we can compute an affine transform from the input-image triangle 1 to the mean-face triangle 1, and use it to map every pixel inside the input triangle onto the mean-face triangle. Doing the same for every other triangle produces the result shown in the right image.
After warping, sum the corresponding pixel values across all the faces and divide by the number of faces: the result is the average face image.
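The final averaging step is a per-pixel mean over the warped images, as this small sketch shows (the helper name and toy 2x2 images are mine):

```python
import numpy as np

# Average face: once every face has been warped onto the mean-face geometry,
# the result is simply the per-pixel mean of the warped images.
def average_images(warped):
    stack = np.stack([np.asarray(im, dtype=np.float32) for im in warped])
    return np.mean(stack, axis=0)

a = np.full((2, 2, 3), 0.2, np.float32)
b = np.full((2, 2, 3), 0.6, np.float32)
avg = average_images([a, b])
# every pixel of avg is 0.4
```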
The original author's results:

Another fun experiment: averaging a picture of Obama with its left-right mirror image produces a symmetric face, shown below:

Full code:
#!/usr/bin/env python
# Copyright (c) 2016 Satya Mallick <spmallick@learnopencv.com>
# All rights reserved. No warranty, explicit or implicit, provided.

import os
import cv2
import numpy as np
import math
import sys

# Read points from text files in directory
def readPoints(path):
    # Create an array of array of points.
    pointsArray = []
    # List all files in the directory (sorted so points pair up with images)
    # and read points from text files one by one
    for filePath in sorted(os.listdir(path)):
        if filePath.endswith(".txt"):
            # Create an array of points.
            points = []
            # Read points from filePath
            with open(os.path.join(path, filePath)) as file:
                for line in file:
                    x, y = line.split()
                    points.append((int(x), int(y)))
            # Store array of points
            pointsArray.append(points)
    return pointsArray

# Read all jpg images in folder.
def readImages(path):
    # Create array of array of images.
    imagesArray = []
    # List all files in the directory and read images one by one
    for filePath in sorted(os.listdir(path)):
        if filePath.endswith(".jpg"):
            # Read image found.
            img = cv2.imread(os.path.join(path, filePath))
            # Convert to floating point
            img = np.float32(img) / 255.0
            # Add to array of images
            imagesArray.append(img)
    return imagesArray

# Compute similarity transform given two sets of two points.
# OpenCV requires 3 pairs of corresponding points.
# We are faking the third one.
def similarityTransform(inPoints, outPoints):
    s60 = math.sin(60 * math.pi / 180)
    c60 = math.cos(60 * math.pi / 180)

    inPts = np.copy(inPoints).tolist()
    outPts = np.copy(outPoints).tolist()

    xin = c60 * (inPts[0][0] - inPts[1][0]) - s60 * (inPts[0][1] - inPts[1][1]) + inPts[1][0]
    yin = s60 * (inPts[0][0] - inPts[1][0]) + c60 * (inPts[0][1] - inPts[1][1]) + inPts[1][1]
    inPts.append([int(xin), int(yin)])

    xout = c60 * (outPts[0][0] - outPts[1][0]) - s60 * (outPts[0][1] - outPts[1][1]) + outPts[1][0]
    yout = s60 * (outPts[0][0] - outPts[1][0]) + c60 * (outPts[0][1] - outPts[1][1]) + outPts[1][1]
    outPts.append([int(xout), int(yout)])

    # cv2.estimateRigidTransform (used in the original code) was removed in
    # OpenCV 4; estimateAffinePartial2D computes the same similarity transform.
    tform, _ = cv2.estimateAffinePartial2D(np.array([inPts], np.float32),
                                           np.array([outPts], np.float32))
    return tform

# Check if a point is inside a rectangle
def rectContains(rect, point):
    if point[0] < rect[0]:
        return False
    elif point[1] < rect[1]:
        return False
    elif point[0] > rect[2]:
        return False
    elif point[1] > rect[3]:
        return False
    return True

# Calculate Delaunay triangles
def calculateDelaunayTriangles(rect, points):
    # Create subdiv
    subdiv = cv2.Subdiv2D(rect)

    # Insert points into subdiv
    for p in points:
        subdiv.insert((float(p[0]), float(p[1])))

    # List of triangles. Each triangle is a list of 3 points ( 6 numbers )
    triangleList = subdiv.getTriangleList()

    # Find the indices of triangles in the points array
    delaunayTri = []
    for t in triangleList:
        pt = [(t[0], t[1]), (t[2], t[3]), (t[4], t[5])]
        if rectContains(rect, pt[0]) and rectContains(rect, pt[1]) and rectContains(rect, pt[2]):
            ind = []
            for j in range(0, 3):
                for k in range(0, len(points)):
                    if abs(pt[j][0] - points[k][0]) < 1.0 and abs(pt[j][1] - points[k][1]) < 1.0:
                        ind.append(k)
            if len(ind) == 3:
                delaunayTri.append((ind[0], ind[1], ind[2]))
    return delaunayTri

def constrainPoint(p, w, h):
    p = (min(max(p[0], 0), w - 1), min(max(p[1], 0), h - 1))
    return p

# Apply affine transform calculated using srcTri and dstTri to src and
# output an image of size.
def applyAffineTransform(src, srcTri, dstTri, size):
    # Given a pair of triangles, find the affine transform.
    warpMat = cv2.getAffineTransform(np.float32(srcTri), np.float32(dstTri))

    # Apply the Affine Transform just found to the src image
    dst = cv2.warpAffine(src, warpMat, (size[0], size[1]), None,
                         flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)
    return dst

# Warps and alpha blends triangular regions from img1 and img2 to img
def warpTriangle(img1, img2, t1, t2):
    # Find bounding rectangle for each triangle
    r1 = cv2.boundingRect(np.float32([t1]))
    r2 = cv2.boundingRect(np.float32([t2]))

    # Offset points by left top corner of the respective rectangles
    t1Rect = []
    t2Rect = []
    t2RectInt = []
    for i in range(0, 3):
        t1Rect.append(((t1[i][0] - r1[0]), (t1[i][1] - r1[1])))
        t2Rect.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))
        t2RectInt.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))

    # Get mask by filling triangle
    mask = np.zeros((r2[3], r2[2], 3), dtype=np.float32)
    cv2.fillConvexPoly(mask, np.int32(t2RectInt), (1.0, 1.0, 1.0), 16, 0)

    # Apply warpImage to small rectangular patches
    img1Rect = img1[r1[1]:r1[1] + r1[3], r1[0]:r1[0] + r1[2]]
    size = (r2[2], r2[3])
    img2Rect = applyAffineTransform(img1Rect, t1Rect, t2Rect, size)
    img2Rect = img2Rect * mask

    # Copy triangular region of the rectangular patch to the output image
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = \
        img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] * ((1.0, 1.0, 1.0) - mask)
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = \
        img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] + img2Rect

if __name__ == '__main__':
    path = 'presidents/'

    # Dimensions of output image
    w = 600
    h = 600

    # Read points for all images
    allPoints = readPoints(path)

    # Read all images
    images = readImages(path)

    # Eye corners
    eyecornerDst = [(int(0.3 * w), int(h / 3)), (int(0.7 * w), int(h / 3))]

    imagesNorm = []
    pointsNorm = []

    # Add boundary points for Delaunay triangulation
    boundaryPts = np.array([(0, 0), (w / 2, 0), (w - 1, 0), (w - 1, h / 2),
                            (w - 1, h - 1), (w / 2, h - 1), (0, h - 1), (0, h / 2)])

    # Initialize location of average points to 0s
    pointsAvg = np.array([(0, 0)] * (len(allPoints[0]) + len(boundaryPts)), np.float32)

    n = len(allPoints[0])
    numImages = len(images)

    # Warp images and transform landmarks to output coordinate system,
    # and find average of transformed landmarks.
    for i in range(0, numImages):
        points1 = allPoints[i]

        # Corners of the eye in input image
        eyecornerSrc = [allPoints[i][36], allPoints[i][45]]

        # Compute similarity transform
        tform = similarityTransform(eyecornerSrc, eyecornerDst)

        # Apply similarity transformation
        img = cv2.warpAffine(images[i], tform, (w, h))

        # Apply similarity transform on points
        points2 = np.reshape(np.array(points1), (68, 1, 2))
        points = cv2.transform(points2, tform)
        points = np.float32(np.reshape(points, (68, 2)))

        # Append boundary points. Will be used in Delaunay Triangulation
        points = np.append(points, boundaryPts, axis=0)

        # Calculate location of average landmark points.
        pointsAvg = pointsAvg + points / numImages

        pointsNorm.append(points)
        imagesNorm.append(img)

    # Delaunay triangulation
    rect = (0, 0, w, h)
    dt = calculateDelaunayTriangles(rect, np.array(pointsAvg))

    # Output image
    output = np.zeros((h, w, 3), np.float32)

    # Warp input images to average image landmarks
    for i in range(0, len(imagesNorm)):
        img = np.zeros((h, w, 3), np.float32)
        # Transform triangles one by one
        for j in range(0, len(dt)):
            tin = []
            tout = []
            for k in range(0, 3):
                pIn = pointsNorm[i][dt[j][k]]
                pIn = constrainPoint(pIn, w, h)

                pOut = pointsAvg[dt[j][k]]
                pOut = constrainPoint(pOut, w, h)

                tin.append(pIn)
                tout.append(pOut)
            warpTriangle(imagesNorm[i], img, tin, tout)

        # Add image intensities for averaging
        output = output + img

    # Divide by numImages to get average
    output = output / numImages

    # Display result
    cv2.imshow('image', output)
    cv2.waitKey(0)