Algorithm for finding similar images

https://stackoverflow.com/questions/75891

09-06-2019

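Question

I need an algorithm that can determine whether two images are 'similar' and that recognizes similar patterns of color, brightness, shape, and so on. I might also need some pointers on which parameters the human brain uses to 'categorize' images.

I have looked at Hausdorff-based matching, but that seems to be mainly for matching transformed objects and patterns of shape.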

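Solution

I have done something similar, by decomposing images into signatures using a wavelet transform.

My approach was to pick the n most significant coefficients from each transformed channel and record their locations. This was done by sorting the list of (power, location) tuples according to abs(power). Similar images share similarities in that they have significant coefficients in the same places.

I found it was best to convert the image to YUV format first, which effectively lets you weight similarity in shape (Y channel) and color (UV channels) separately.

You can find my implementation of the above in mactorii, which unfortunately I haven't been working on as much as I should have :-)

Another method, which some friends of mine have used with surprisingly good results, is to simply resize the image down to, say, 4x4 pixels and store that as the signature. How similar two images are can then be scored by, say, computing the Manhattan distance between the two signatures, using corresponding pixels. I don't have the details of how they performed the resizing, so you may have to play with the various algorithms available for that task to find one that is suitable.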
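A minimal sketch of the signature idea, in case it helps. This is not the mactorii code: the Haar wavelet, skipping the overall-average coefficient, keeping the sign of each coefficient, and the channel weights are all my own assumptions, and the function names are made up for illustration. Images are plain nested lists whose sides are a power of two.

```python
def haar_1d(values):
    """Full 1-D Haar decomposition of a sequence whose length is a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2.0 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2.0 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def haar_2d(channel):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in channel]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(channel, n):
    """Positions (and signs) of the n largest-magnitude non-zero coefficients."""
    coeffs = haar_2d(channel)
    ranked = [(abs(v), (r, c), v)
              for r, row in enumerate(coeffs) for c, v in enumerate(row)
              if (r, c) != (0, 0) and v != 0]  # assumption: ignore the average
    ranked.sort(key=lambda item: item[0], reverse=True)
    return {(pos, v > 0) for _, pos, v in ranked[:n]}


def similarity(sig_a, sig_b):
    """Fraction of significant coefficients that sit in the same place."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / max(len(sig_a), len(sig_b))


def split_yuv(rgb):
    """Split a grid of (r, g, b) pixels into Y, U and V grids (BT.601 weights)."""
    ys, us, vs = [], [], []
    for row in rgb:
        y_row = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
        us.append([0.492 * (px[2] - y) for px, y in zip(row, y_row)])
        vs.append([0.877 * (px[0] - y) for px, y in zip(row, y_row)])
        ys.append(y_row)
    return ys, us, vs


def image_similarity(rgb_a, rgb_b, n=8, weights=(0.6, 0.2, 0.2)):
    """Weighted per-channel score; the weights are hypothetical, tune them."""
    channels = zip(weights, split_yuv(rgb_a), split_yuv(rgb_b))
    return sum(w * similarity(signature(a, n), signature(b, n))
               for w, a, b in channels)
```

Raising the first weight favors shape (Y), raising the other two favors color (U and V).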
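The thumbnail approach can be sketched as follows. Since the resizing method is unknown, plain box averaging is assumed here, and the images are assumed to be grayscale grids (lists of rows) at least `size` pixels on each side. The function names are illustrative.

```python
def downscale(gray, size=4):
    """Shrink a grayscale grid to size x size by averaging each box of pixels."""
    height, width = len(gray), len(gray[0])
    thumb = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            box = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(box) / len(box))
        thumb.append(row)
    return thumb


def manhattan(thumb_a, thumb_b):
    """Sum of absolute differences between corresponding thumbnail pixels."""
    return sum(abs(pa - pb)
               for row_a, row_b in zip(thumb_a, thumb_b)
               for pa, pb in zip(row_a, row_b))


def thumbnail_distance(gray_a, gray_b, size=4):
    """0 for identical images; larger values mean less similar."""
    return manhattan(downscale(gray_a, size), downscale(gray_b, size))
```

A threshold for "similar enough" has to be chosen empirically for your image set.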

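Other tips

pHash might interest you.

perceptual hash n. a fingerprint of an audio, video or image file that is mathematically based on the audio or visual content contained within. Unlike cryptographic hash functions, which rely on the avalanche effect of small changes in input leading to drastic changes in the output, perceptual hashes are "close" to one another if the inputs are visually or auditorily similar.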
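To make the "close to one another" property concrete, here is a sketch of an average hash, one of the simplest perceptual hashes. It is not the algorithm the pHash library implements. It assumes the image has already been reduced to a small grayscale grid (8x8 is a common choice), and the function names are my own.

```python
def average_hash(thumb):
    """One bit per pixel: 1 if the pixel is brighter than the grid's mean."""
    pixels = [p for row in thumb for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(hash_a, hash_b):
    """Number of differing bits; a small value means visually similar."""
    return bin(hash_a ^ hash_b).count("1")
```

Because each bit only depends on whether a pixel is above the mean, a uniform brightness change leaves the hash unchanged, whereas a cryptographic hash of the file would change completely.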

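I've used SIFT to re-detect the same object in different images. It is really powerful but rather complex, and might be overkill. If the images are supposed to be pretty similar, some simple parameters based on the difference between the two images can tell you quite a bit. Some pointers:

Normalize the images, i.e. make the average brightness of both images the same by calculating the average brightness of each and scaling the brighter one down according to the ratio (to avoid clipping at the highest level), especially if you're more interested in shape than in color.
Sum the color difference over the normalized images, per channel.
Find edges in the images and measure the distance between edge pixels in both images (for shape).
Divide the images into a set of discrete regions and compare the average color of each region.
Threshold the images at one level (or a set of levels) and count the number of pixels where the resulting black/white images differ.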
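Three of these pointers (brightness normalization, summed difference, and threshold mismatch count) can be sketched like this for a single channel. The images are assumed to be same-sized grids of numbers; the function names and the default threshold of 128 are illustrative.

```python
def mean_brightness(gray):
    pixels = [p for row in gray for p in row]
    return sum(pixels) / len(pixels)


def normalize_pair(a, b):
    """Scale the brighter image down so both share the same mean brightness."""
    mean_a, mean_b = mean_brightness(a), mean_brightness(b)
    if mean_a == 0 or mean_b == 0:
        return a, b  # nothing sensible to scale against
    if mean_a > mean_b:
        factor = mean_b / mean_a
        a = [[p * factor for p in row] for row in a]
    else:
        factor = mean_a / mean_b
        b = [[p * factor for p in row] for row in b]
    return a, b


def sum_abs_difference(a, b):
    """Sum of absolute per-pixel differences for one channel."""
    return sum(abs(pa - pb)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def threshold_mismatch(a, b, level=128):
    """Count pixels that land on different sides of the threshold."""
    return sum((pa >= level) != (pb >= level)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))
```

For color images, run the same functions once per channel and add up the results.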

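You could use Perceptual Image Diff.

It's a command-line utility that compares two images using a perceptual metric. That is, it uses a computational model of the human visual system to determine whether two images are visually different, so minor changes in pixels are ignored. Plus, it drastically reduces the number of false positives caused by OS or machine architecture differences.

It's a difficult problem! It depends on how accurate you need to be, and it depends on what kind of images you are working with. You can use histograms to compare colors, but that obviously doesn't take into account the spatial distribution of those colors within the images (i.e. the shapes). Edge detection followed by some kind of segmentation (i.e. picking out the shapes) can provide a pattern for matching against another image. You can use co-occurrence matrices to compare textures, by considering the images as matrices of pixel values and comparing those matrices. There are some good books out there on image matching and machine vision; a search on Amazon will find some.

Hope this helps!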
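For the co-occurrence idea, a rough sketch: quantize the gray levels, count how often level i appears next to level j at a fixed offset, and compare the resulting matrices. The number of levels, the offset, and the L1 distance are assumptions made for illustration, and pixel values are assumed to be integers in 0..255.

```python
def cooccurrence(gray, levels=4, max_value=255, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy)."""
    q = [[min(p * levels // (max_value + 1), levels - 1) for p in row]
         for row in gray]
    height, width = len(q), len(q[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[q[y][x]][q[ny][nx]] += 1
                total += 1
    if total == 0:
        return counts  # image too small for this offset
    return [[c / total for c in row] for row in counts]


def matrix_distance(m1, m2):
    """L1 distance between two co-occurrence matrices."""
    return sum(abs(a - b)
               for row_a, row_b in zip(m1, m2)
               for a, b in zip(row_a, row_b))
```

Unlike a plain histogram, this separates vertical stripes from horizontal stripes even though both contain the same amounts of black and white.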

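My lab needed to solve this problem as well, and we used Tensorflow. Here's a full app implementation for visualizing image similarity.

For a tutorial on vectorizing images for similarity computation, check out this page. Here's the Python (again, see the post for the full workflow):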

from __future__ import absolute_import, division, print_function

"""

This is a modification of the classify_images.py
script in Tensorflow. The original script produces
string labels for input images (e.g. you input a picture
of a cat and the script returns the string "cat"); this
modification reads in a directory of images and 
generates a vector representation of the image using
the penultimate layer of neural network weights.

Usage: python classify_images.py "../image_dir/*.jpg"

"""

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Simple image classification with Inception.

Run image classification with Inception trained on ImageNet 2012 Challenge data
set.

This program creates a graph from a saved GraphDef protocol buffer,
and runs inference on an input JPEG image. It outputs human readable
strings of the top 5 predictions along with their probabilities.

Change the --image_file argument to any jpg image to compute a
classification of that image.

Please see the tutorial and website for a detailed description of how
to use this script to perform image recognition.

https://tensorflow.org/tutorials/image_recognition/
"""

import os.path
import re
import sys
import tarfile
import glob
import json
import psutil
from collections import defaultdict
import numpy as np
from six.moves import urllib
# Note: this script uses the TensorFlow 1.x API (tf.app, tf.gfile, tf.Session).
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS

# classify_image_graph_def.pb:
#   Binary representation of the GraphDef protocol buffer.
# imagenet_synset_to_human_label_map.txt:
#   Map from synset ID to a human readable string.
# imagenet_2012_challenge_label_map_proto.pbtxt:
#   Text representation of a protocol buffer mapping a label to synset ID.
tf.app.flags.DEFINE_string(
    'model_dir', '/tmp/imagenet',
    """Path to classify_image_graph_def.pb, """
    """imagenet_synset_to_human_label_map.txt, and """
    """imagenet_2012_challenge_label_map_proto.pbtxt.""")
tf.app.flags.DEFINE_string('image_file', '',
                           """Absolute path to image file.""")
tf.app.flags.DEFINE_integer('num_top_predictions', 5,
                            """Display this many predictions.""")

# pylint: disable=line-too-long
DATA_URL = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
# pylint: enable=line-too-long


class NodeLookup(object):
  """Converts integer node ID's to human readable labels."""

  def __init__(self,
               label_lookup_path=None,
               uid_lookup_path=None):
    if not label_lookup_path:
      label_lookup_path = os.path.join(
          FLAGS.model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt')
    if not uid_lookup_path:
      uid_lookup_path = os.path.join(
          FLAGS.model_dir, 'imagenet_synset_to_human_label_map.txt')
    self.node_lookup = self.load(label_lookup_path, uid_lookup_path)

  def load(self, label_lookup_path, uid_lookup_path):
    """Loads a human readable English name for each softmax node.

    Args:
      label_lookup_path: string UID to integer node ID.
      uid_lookup_path: string UID to human-readable string.

    Returns:
      dict from integer node ID to human-readable string.
    """
    if not tf.gfile.Exists(uid_lookup_path):
      tf.logging.fatal('File does not exist %s', uid_lookup_path)
    if not tf.gfile.Exists(label_lookup_path):
      tf.logging.fatal('File does not exist %s', label_lookup_path)

    # Loads mapping from string UID to human-readable string
    proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines()
    uid_to_human = {}
    p = re.compile(r'[n\d]*[ \S,]*')
    for line in proto_as_ascii_lines:
      parsed_items = p.findall(line)
      uid = parsed_items[0]
      human_string = parsed_items[2]
      uid_to_human[uid] = human_string

    # Loads mapping from string UID to integer node ID.
    node_id_to_uid = {}
    proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines()
    for line in proto_as_ascii:
      if line.startswith('  target_class:'):
        target_class = int(line.split(': ')[1])
      if line.startswith('  target_class_string:'):
        target_class_string = line.split(': ')[1]
        node_id_to_uid[target_class] = target_class_string[1:-2]

    # Loads the final mapping of integer node ID to human-readable string
    node_id_to_name = {}
    for key, val in node_id_to_uid.items():
      if val not in uid_to_human:
        tf.logging.fatal('Failed to locate: %s', val)
      name = uid_to_human[val]
      node_id_to_name[key] = name

    return node_id_to_name

  def id_to_string(self, node_id):
    if node_id not in self.node_lookup:
      return ''
    return self.node_lookup[node_id]


def create_graph():
  """Creates a graph from saved GraphDef file and returns a saver."""
  # Creates graph from saved graph_def.pb.
  with tf.gfile.FastGFile(os.path.join(
      FLAGS.model_dir, 'classify_image_graph_def.pb'), 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')


def run_inference_on_images(image_list, output_dir):
  """Runs inference on an image list.

  Args:
    image_list: a list of images.
    output_dir: the directory in which image vectors will be saved

  Returns:
    image_to_labels: a dictionary with image file keys and predicted
      text label values
  """
  image_to_labels = defaultdict(list)

  create_graph()

  with tf.Session() as sess:
    # Some useful tensors:
    # 'softmax:0': A tensor containing the normalized prediction across
    #   1000 labels.
    # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
    #   float description of the image.
    # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
    #   encoding of the image.
    # Runs the softmax tensor by feeding the image_data as input to the graph.
    softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')

    for image_index, image in enumerate(image_list):
      try:
        print("parsing", image_index, image, "\n")
        if not tf.gfile.Exists(image):
          tf.logging.fatal('File does not exist %s', image)

        with tf.gfile.FastGFile(image, 'rb') as f:
          image_data =  f.read()

          predictions = sess.run(softmax_tensor,
                          {'DecodeJpeg/contents:0': image_data})

          predictions = np.squeeze(predictions)

          ###
          # Get penultimate layer weights
          ###

          feature_tensor = sess.graph.get_tensor_by_name('pool_3:0')
          feature_set = sess.run(feature_tensor,
                          {'DecodeJpeg/contents:0': image_data})
          feature_vector = np.squeeze(feature_set)        
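          # Note: np.savetxt writes plain comma-delimited text, despite the
          # ".npz" extension used for the output file name.
          outfile_name = os.path.basename(image) + ".npz"
          out_path = os.path.join(output_dir, outfile_name)
          np.savetxt(out_path, feature_vector, delimiter=',')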

          # Creates node ID --> English string lookup.
          node_lookup = NodeLookup()

          top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
          for node_id in top_k:
            human_string = node_lookup.id_to_string(node_id)
            score = predictions[node_id]
            print("results for", image)
            print('%s (score = %.5f)' % (human_string, score))
            print("\n")

            image_to_labels[image].append(
              {
                "labels": human_string,
                "score": str(score)
              }
            )

        # close the open file handlers
        proc = psutil.Process()
        open_files = proc.open_files()

        for open_file in open_files:
          file_handler = getattr(open_file, "fd")
          os.close(file_handler)
      except Exception as exc:  # keep going, but report why the image failed
        print('could not process image index', image_index, 'image', image, exc)

  return image_to_labels


def maybe_download_and_extract():
  """Download and extract model tar file."""
  dest_directory = FLAGS.model_dir
  if not os.path.exists(dest_directory):
    os.makedirs(dest_directory)
  filename = DATA_URL.split('/')[-1]
  filepath = os.path.join(dest_directory, filename)
  if not os.path.exists(filepath):
    def _progress(count, block_size, total_size):
      sys.stdout.write('\r>> Downloading %s %.1f%%' % (
          filename, float(count * block_size) / float(total_size) * 100.0))
      sys.stdout.flush()
    filepath, _ = urllib.request.urlretrieve(DATA_URL, filepath, _progress)
    print()
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  tarfile.open(filepath, 'r:gz').extractall(dest_directory)


def main(_):
  maybe_download_and_extract()
  if len(sys.argv) < 2:
    print("please provide a glob path to one or more images, e.g.")
    print("python classify_image_modified.py '../cats/*.jpg'")
    sys.exit()

  else:
    output_dir = "image_vectors"
    if not os.path.exists(output_dir):
      os.makedirs(output_dir)

    images = glob.glob(sys.argv[1])
    image_to_labels = run_inference_on_images(images, output_dir)

    with open("image_to_labels.json", "w") as img_to_labels_out:
      json.dump(image_to_labels, img_to_labels_out)

    print("all done")


if __name__ == '__main__':
  tf.app.run()
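The script only writes one feature vector per image; how the vectors are then compared is not shown here. A common choice, assumed in this sketch, is cosine similarity. The helper names are hypothetical, and the vectors are plain lists of floats.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if one is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, vectors):
    """Sort a {name: vector} mapping by similarity to the query, best first."""
    scores = [(name, cosine_similarity(query, vec))
              for name, vec in vectors.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

The files written by the script can be read back with `np.loadtxt(path, delimiter=',')` before being passed to these functions.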

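Some image recognition software solutions are not actually purely algorithm-based, but make use of the neural network concept instead. Check out http://en.wikipedia.org/wiki/Artificial_neural_network and in particular NeuronDotNet, which also includes interesting samples: http://neurondotnet.freehostia.com/index.html

There is related research using Kohonen neural networks/self-organizing maps.

Both more academic systems (Google for PicSOM) and less academic presentations ( http://www.generation5.org/content/2004/aiSomPic.asp , possibly not suitable for all work environments) exist.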

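Calculating the sum of the squares of the differences of the pixel color values of a drastically scaled-down version (e.g. 6x6 pixels) works nicely. Identical images yield 0, similar images yield small numbers, and different images yield big ones.

The idea from the others above of converting to YUV first sounds intriguing. While my idea works great, I want my images to be calculated as "different" so that it yields a correct result even from the perspective of a colorblind observer.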
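As a small illustration of that scoring, assuming both images have already been scaled down to the same tiny size and are given as grids of (r, g, b) tuples (the function name is made up):

```python
def sum_squared_difference(thumb_a, thumb_b):
    """Sum of squared per-channel differences between two small RGB grids."""
    return sum((ca - cb) ** 2
               for row_a, row_b in zip(thumb_a, thumb_b)
               for pixel_a, pixel_b in zip(row_a, row_b)
               for ca, cb in zip(pixel_a, pixel_b))
```

Squaring penalizes a few large differences more heavily than many tiny ones, which is the main difference from a Manhattan distance.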

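This sounds like a vision problem. You might want to look into Adaptive Boosting as well as the Burns Line Extraction algorithm. The concepts in these two should help with approaching this problem. Edge detection is an even simpler place to start if you're new to vision algorithms, as it explains the basics.

As far as parameters for categorization go:

Color palette and location (gradient calculation, histogram of colors)
Contained shapes (Ada. Boosting/training to detect shapes)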

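Depending on how accurate the results need to be, you can simply break the images into n x n pixel blocks and analyze them. If you get different results in the first block, you can stop processing, which gives some performance improvement.

To analyze the squares you can, for example, take the sum of the color values.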
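A sketch of that block-wise comparison with early exit, assuming same-sized grayscale grids. The `tolerance` parameter is my addition, so that small differences do not immediately count as a mismatch.

```python
def block_sums(gray, block):
    """Lazily yield the sum of pixel values of each block x block tile."""
    height, width = len(gray), len(gray[0])
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            yield sum(gray[y][x]
                      for y in range(y0, min(y0 + block, height))
                      for x in range(x0, min(x0 + block, width)))


def blocks_match(a, b, block=2, tolerance=0):
    """Compare tile sums in order and stop at the first tile that differs."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return False
    for sum_a, sum_b in zip(block_sums(a, block), block_sums(b, block)):
        if abs(sum_a - sum_b) > tolerance:
            return False  # early exit: remaining tiles are never computed
    return True
```

Because `block_sums` is a generator, tiles after the first mismatch are never summed.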

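You could perform some sort of block-matching motion estimation between the two images and measure the overall sum of residuals and motion vector costs (much like one would do in a video encoder). This would compensate for motion; for bonus points, do affine-transformation motion estimation (which compensates for zooms, stretching, and similar). You could also use overlapped blocks or optical flow.

As a first pass, you can try using color histograms. However, you really need to narrow down your problem domain. Generic image matching is a very hard problem.

I found this article very helpful in explaining how it works: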
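A very small sketch of exhaustive block matching on grayscale grids of equal size. The sum of absolute differences as the residual and the L1 length of the motion vector times `vector_weight` as the vector cost are assumptions; a real encoder is far more elaborate.

```python
def sad(a, b, ax, ay, bx, by, block):
    """Sum of absolute differences between a block in a and a block in b."""
    return sum(abs(a[ay + j][ax + i] - b[by + j][bx + i])
               for j in range(block) for i in range(block))


def block_match_cost(a, b, block=4, radius=2, vector_weight=1):
    """Total of the best (residual + motion vector cost) over all blocks of a."""
    height, width = len(a), len(a[0])
    total = 0
    for ay in range(0, height - block + 1, block):
        for ax in range(0, width - block + 1, block):
            best = None
            # Try every displacement within the search radius.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    bx, by = ax + dx, ay + dy
                    if 0 <= bx <= width - block and 0 <= by <= height - block:
                        cost = (sad(a, b, ax, ay, bx, by, block)
                                + vector_weight * (abs(dx) + abs(dy)))
                        if best is None or cost < best:
                            best = cost
            total += best
    return total
```

With `radius=0` this degenerates to a plain per-pixel difference, so comparing the two results shows how much of the difference is explained by motion.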
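A first pass with color histograms could look like this. The bin count and the use of histogram intersection as the score are assumptions; pixels are (r, g, b) tuples with integer values in 0..255.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with bins**3 buckets."""
    hist = [0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        index = ((r * bins // 256) * bins + g * bins // 256) * bins + b * bins // 256
        hist[index] += 1
        count += 1
    if count == 0:
        return hist
    return [h / count for h in hist]


def histogram_intersection(hist_a, hist_b):
    """1.0 for identical color distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

Note that the pixel order does not matter, which is exactly why a histogram alone cannot tell shapes apart.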

http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

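Apologies for joining the discussion late.

We can also use the ORB methodology to detect similar feature points between two images. The following link gives a direct implementation of ORB in Python:

http://scikit-image.org/docs/dev/auto_examples/plot_orb.html

OpenCV also has a direct implementation of ORB. If you want more information, follow the research article given below.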

https://www.researchgate.net/publication/292157133_Image_Matching_Using_SIFT_SURF_BRIEF_and_ORB_Performance_Comparison_for_Distorted_Images

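There are some good answers in the other thread on this, but I wonder whether something involving spectral analysis would work? I.e., break the image down into its phase and amplitude information and compare those. This may avoid some of the issues with cropping, transformation, and intensity differences. Anyway, that's just me speculating, since this seems like an interesting problem. If you search http://scholar.google.com I'm sure you could come up with several papers on this.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow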
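To make the speculation a bit more concrete, here is a toy sketch that compares only the amplitude part of a 2-D discrete Fourier transform. It uses a naive DFT that is only practical for tiny grids, and the function names are made up. The amplitude spectrum is unchanged by a circular shift of the image, so real translations or crops would only match approximately.

```python
import cmath


def dft_2d_magnitude(gray):
    """Amplitude of the 2-D DFT of a small grayscale grid (naive O(N^4) loop)."""
    height, width = len(gray), len(gray[0])
    magnitudes = []
    for v in range(height):
        row = []
        for u in range(width):
            total = 0j
            for y in range(height):
                for x in range(width):
                    angle = -2 * cmath.pi * (u * x / width + v * y / height)
                    total += gray[y][x] * cmath.exp(1j * angle)
            row.append(abs(total))
        magnitudes.append(row)
    return magnitudes


def spectrum_distance(gray_a, gray_b):
    """L1 distance between the amplitude spectra of two same-sized grids."""
    mag_a, mag_b = dft_2d_magnitude(gray_a), dft_2d_magnitude(gray_b)
    return sum(abs(p - q)
               for row_a, row_b in zip(mag_a, mag_b)
               for p, q in zip(row_a, row_b))
```

For real images a fast Fourier transform from a numerical library would replace the nested loops.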