funMV: Bag of Visual Words

Summary of Bag of Words:

# Toy example for understanding BoW

# BoW algorithm analysis

# 2013/06/28, revised 2014/12/03

# by funmv

from PIL import Image

from pylab import *

import os

from numpy import *

from scipy.cluster.vq import *

import sift

import vocabulary

# There are 3 object classes: images 0-3, 4-7, and 8-11 each show the same object photographed in 4 different poses (see the figure below)

imlist = ['ukbench00000.jpg', 'ukbench00001.jpg', 'ukbench00002.jpg', 'ukbench00003.jpg', 'ukbench00004.jpg', 'ukbench00005.jpg', 'ukbench00006.jpg', 'ukbench00007.jpg', 'ukbench00008.jpg', 'ukbench00009.jpg', 'ukbench00010.jpg', 'ukbench00011.jpg']

nbr_images=len(imlist)

featlist=[ imlist[i][:-3]+'sift' for i in range(nbr_images)]

"""for i in range(nbr_images):

    sift.process_image(imlist[i], featlist[i])

"""

descr = []

descr.append(sift.read_features_from_file(featlist[0])[1])

descriptors = descr[0]

# sift.read_features_from_file(featlist[i])[0]:
#   [pixel coordinates, scale, rotation angle] of each feature point in the i-th image
#   size: (number of feature points x 4)
# sift.read_features_from_file(featlist[i])[1]:
#   descriptor values of each feature point in the i-th image
#   size: (number of feature points x 128)

for i in arange(1, nbr_images):
    descr.append(sift.read_features_from_file(featlist[i])[1])
    descriptors = vstack((descriptors, descr[i])) # stack the descriptor rows of all images

#len(descr[0]): number of feature points in the 1st image -> 2276

#len(descr[0][1]): length of a single feature vector -> 128

voc, distortion = kmeans(descriptors[::10,:],3,1) # use every 10th row; extract 3 words (cluster centers)

len(voc) #3, voc = 3x128

len(voc[0]) #128

len(voc[1]) #128

nbr_words = voc.shape[0] #3

# (# of images, bins of histogram(= # of words))

# = (12x3)

imwords=zeros((nbr_images, nbr_words))

print(imwords)

words, distance = vq(descr[0],voc) # vector quantization

# len(words)->2276

# words: for each feature, the index of the cluster (word) it belongs to

# [1, 2, 1, 0, 1, 2, ...]

#voca = vocabulary.Vocabulary('ukbenchtest')

for i in range(nbr_images): # def project
    hist = zeros((nbr_words))
    words, distance = vq(descr[i],voc)
    # For the current image, vq returns the index of the word each feature
    # belongs to and the distance to that word.
    # Use the index to vote for that word and build the histogram.
    for w in words:
        hist[w] += 1
    imwords[i] = hist

print(imwords) # number of features of each image assigned to each cluster

"""

0 1 2 : cluster index (there are 3 words, so the index runs from 0 to 2)

[[ 766. 461. 1049.]: histogram of the 1st image
[ 725. 451. 1020.]: histogram of the 2nd image
[ 671. 461. 1133.]: ...
[ 1101. 630. 1403.]
[ 260. 317. 409.]
[ 267. 308. 370.]
[ 283. 394. 476.]
[ 239. 331. 410.]
[ 1105. 468. 1317.]
[ 116. 191. 390.]
[ 122. 251. 439.]
[ 1183. 597. 1475.]]: histogram of the 12th image

Of all the features in the 12th image, 1183 belong to word 0, 597 to word 1, and 1475 to word 2. The three counts add up to the number of features in that image. So, to recognize a test image, build the histogram of its features and find which of the 12 histograms above has the most similar shape; that is the matching image.

"""

The following code shows the contents of voc.
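As a hedged illustration of what voc holds and how the histograms are used for lookup, here is a minimal sketch. It assumes synthetic 128-dimensional vectors in place of the ukbench SIFT descriptors, and the names `project`, `best_match`, `make_image`, and `mixes` are hypothetical helpers introduced only for this example. k-means is seeded with one sample per group so the result is reproducible.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Synthetic stand-in for SIFT descriptors (an assumption, not the ukbench data):
# three well-separated groups of 128-dimensional vectors.
rng = np.random.RandomState(0)
centers = (0.0, 5.0, 10.0)
descriptors = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 128))
                         for c in centers])

# Seed k-means with one sample from each group instead of a random start.
init = descriptors[[0, 100, 200]]
voc, distortion = kmeans(descriptors, init)

print(voc.shape)     # one row per visual word, one column per descriptor dimension
print(voc[:, :4])    # first 4 components of each word (cluster center)

def project(descr, voc):
    """Histogram of visual words for one image's descriptors."""
    words, _ = vq(descr, voc)
    return np.bincount(words, minlength=voc.shape[0]).astype(float)

def best_match(query_hist, database_hists):
    """Index of the database histogram closest to the query after normalization."""
    q = query_hist / query_hist.sum()
    db = database_hists / database_hists.sum(axis=1, keepdims=True)
    return int(np.argmin(np.linalg.norm(db - q, axis=1)))

def make_image(mix):
    """Hypothetical 'image': a mixture of descriptors drawn from the three groups."""
    return np.vstack([rng.normal(loc=c, scale=0.5, size=(n, 128))
                      for c, n in zip(centers, mix)])

# Three database images, each dominated by a different word.
mixes = [(80, 10, 10), (10, 80, 10), (10, 10, 80)]
imwords = np.array([project(make_image(m), voc) for m in mixes])

# Query resembles the first image but has half as many features.
query_hist = project(make_image((40, 5, 5)), voc)
match = best_match(query_hist, imwords)
print(imwords)
print(match)
```

Each row of voc is a cluster center in descriptor space, so it has the same length as one descriptor. The histograms are normalized before comparison because images can have very different feature counts; the same `best_match` could be applied to the 12 x 3 imwords array printed above.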


Sunday, November 30, 2014
