Array search code challenge

https://stackoverflow.com/questions/1078770

21-08-2019

Question

Here's my challenge (code golf): Take two byte arrays and determine whether the second array is a substring of the first. If it is, output the index at which the contents of the second array appear in the first. If the second array is not found in the first, output -1.

Example input: { 63, 101, 245, 215, 0 } { 245, 215 }

Expected output: 2

Example input 2: { 24, 55, 74, 3, 1 } { 24, 56, 74 }

Expected output 2: -1

Edit: Someone pointed out that the bool is redundant, so all your function needs to do is return an integer representing the index of the value, or -1 if it is not found.

Solution

J

37 characters for even more functionality than requested: it returns a list of all matching indices.

I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))

Usage:

   NB. Give this function a name
   i =: I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))
   NB. Test #1
   245 215 i 63 101 245 215 0
2
   NB. Test #2 - no results
   24 56 74 i 24 55 74 3 1

   NB. Test #3: matches in multiple locations
   1 1 i 1 1 1 2 1 1 3
0 1 4
   NB. Test #4: only exact substrings match
   1 2 i 0 1 2 3 1 0 2 1 2 0
1 7

NB. list[0 to end], list[1 to end], list[2 to end], ...
<@}."0 _~i.@#

NB. Does the LHS completely match the RHS (truncated to match LHS)?
[-:#@[{.>@]

NB. boolean list of match/no match
([-:#@[{.>@])"_ 0(<@}."0 _~i.@#)

NB. indices of *true* elements
I.@(([-:#@[{.>@])"_ 0(<@}."0 _~i.@#))
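
For readers who don't know J, the four steps above can be followed in Python. This is only a rough sketch that mirrors the decomposition (suffixes, prefix comparison, boolean list, indices of true elements); the function name `all_matches` is made up for illustration, and edge cases such as very short suffixes may behave differently from the J original.

```python
def all_matches(needle, haystack):
    """Return every index at which needle occurs in haystack."""
    needle = list(needle)
    haystack = list(haystack)

    # list[0 to end], list[1 to end], list[2 to end], ...
    suffixes = [haystack[i:] for i in range(len(haystack))]

    # Does the needle match each suffix truncated to the needle's length?
    flags = [needle == suffix[:len(needle)] for suffix in suffixes]

    # Indices of the true elements.
    return [i for i, hit in enumerate(flags) if hit]


print(all_matches([245, 215], [63, 101, 245, 215, 0]))  # [2]
print(all_matches([24, 56, 74], [24, 55, 74, 3, 1]))    # []
print(all_matches([1, 1], [1, 1, 1, 2, 1, 1, 3]))       # [0, 1, 4]
```

As in the J usage, the needle is the left argument and the haystack the right one.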

Other tips

Common Lisp:

(defun golf-code (master-seq sub-seq)
  (search sub-seq master-seq))

PostScript, ~~149~~ ~~146~~ ~~170~~ ~~166~~ ~~167~~ 159 characters (in the "do the work" part):

% define data
/A [63 101 245 215 0] def
/S [245 215] def

% do the work
/d{def}def/i{ifelse}d/l S length 1 sub d/p l d[/C{dup[eq{pop -1}{dup S p
get eq{pop p 0 eq{]length}{/p p 1 sub d C}i}{p l eq{pop}if/p l d C}i}i}d
A aload pop C

% The stack now contains -1 or the position

Note that this finds the last occurrence of the subarray if it is contained more than once.

Revision history:

Replaced false with [[ne and true with [[eq to save three characters
Fixed a bug that could cause a false negative if the last element of S appears twice in A. Unfortunately, this bugfix costs 24 characters.
Made the bugfix a little cheaper, saving four characters.
Had to insert a space again because a dash is a legal character in a name. This syntax error wasn't caught because the test case didn't reach that point.
Stopped returning booleans, since the OP no longer requires them. Saves 8 characters.

Explained version:

Unfortunately, SO's syntax highlighter doesn't support PostScript, so readability is still limited.

/A [63 101 245 215 0] def
/S [245 215 ] def

/Slast S length 1 sub def % save the index of the last element of S,
                          % i.e. length-1
/Spos Slast def % our current position in S; this will vary
[ % put a mark on the bottom of the stack, we need this later.

/check % This function recursively removes values from the stack
       % and compares them to the values in S
{
  dup [ 
  eq
  { % we found the mark on the bottom, i.e. we have no match
    pop -1 % remove the mark and push the results
  }
  { % we're not at the mark yet
    dup % save the top value (part of the bugfix)
    S Spos get
    eq 
    {  % the top element of the stack is equal to S[Spos]
       pop % remove the saved value, we don't need it
       Spos 0
       eq 
       { % we are at the beginning of S, so the whole thing matched.
         ] length % Construct an array from the remaining values
                  % on the stack. This is the part of A before the match,
                  % so its length is equal to the position of the match.
                  % Hence we push the result and we're done.
       }
       { % we're not at the beginning of S yet, so we have to keep comparing
         /Spos Spos 1 sub def % decrease Spos
         check % recurse
       }
       ifelse
    }
    { % the top element of the stack is different from S[Spos]
      Spos Slast eq {pop} if % leave the saved top value on the stack
                             % unless we're at the end of S, because in
                             % this case, we have to compare it to the
                             % last element of S (rest of the bugfix)
      /Spos Slast def % go back to the end of S
      check % recurse
    }
    ifelse
 }
 ifelse
}
def % end of the definition of check

A aload % put the contents of A onto the stack; this will also push A again,
        % so we have to ...
pop % ...remove it again
check % And here we go!
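
Since PostScript is hard to read, here is a hedged Python sketch of the same idea: pop values off a stack built from A and compare them against S from its last element backwards. The name `find_last` and the use of a Python list as the stack are illustrative choices, not part of the original answer, and the sketch assumes S is non-empty.

```python
def find_last(a, s):
    """Scan a from the end, matching s back to front, as the check procedure does."""
    stack = list(a)      # plays the role of the PostScript operand stack
    last = len(s) - 1    # Slast: index of the last element of s
    pos = last           # Spos: current position in s

    while stack:         # an empty stack corresponds to reaching the mark
        top = stack.pop()
        if top == s[pos]:
            if pos == 0:
                # Whole of s matched; what is left on the stack is the part
                # of a before the match, so its length is the match index.
                return len(stack)
            pos -= 1
        else:
            if pos != last:
                # The bugfix: keep the value so it is compared again,
                # this time against the last element of s.
                stack.append(top)
            pos = last
    return -1


print(find_last([63, 101, 245, 215, 0], [245, 215]))  # 2
print(find_last([24, 55, 74, 3, 1], [24, 56, 74]))    # -1
```

Because the scan starts at the end of A, a subarray that occurs more than once is reported at its last position, for example `find_last([1, 2, 1, 2], [1, 2])` gives 2.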

C99

#define _GNU_SOURCE /* memmem() is a GNU extension */
#include <string.h>

void find_stuff(void const * const array1, const size_t array1length, /* Length in bytes, not elements */
                void const * const array2, const size_t array2length, /* Length in bytes, not elements */
                char * bReturnBool,
                int * bReturnIndex)
{
    void * found = memmem(array1, array1length, array2, array2length);
    *bReturnBool = found != NULL;
    /* Cast to char * first: arithmetic on void * is not standard C. */
    *bReturnIndex = *bReturnBool ? (int)((const char *)found - (const char *)array1) : -1;
}

Shorter, and ~~a little~~ a LOT messier:

#include <string.h>
#define f(a,b,c,d,e,f) { void * g = memmem(a, b, c, d); f = (e = !!g) ? g - a : -1; }

Python 2 and 3, ~~73~~ ~~68~~ 58 characters

Based on ~~Nikhil Chelliah's answer~~ kaiser.se's answer:

>>> t=lambda l,s:''.join(map(chr,l)).find(''.join(map(chr,s)))
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

Python 3, ~~41~~ 36 characters

Thanks in part to gnibbler:

>>> t=lambda l,s:bytes(l).find(bytes(s))
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

Haskell, 68 64 characters

Argument order as specified by the OP:

import List;t l s=maybe(-1)id$findIndex id$map(isPrefixOf s)$tails l

As ephemient points out, we can switch the arguments and shorten the code by four characters:

import List;t s=maybe(-1)id.findIndex id.map(isPrefixOf s).tails

In Python:

def test(large, small):
    for i in range(len(large)):
        if large[i:i+len(small)] == small:
            return i
    return -1

But since people want terseness rather than elegance:

def f(l,s):
 for i in range(len(l)):
  if l[i:i+len(s)]==s:return i
 return -1

That's 75 characters, counting whitespace.

Ruby, using Array#pack (41-character body):

def bytearray_search(a,b)
  (i=a.pack('C*').index(b.pack('C*')))?i:-1
end

Perl (36-character body, excluding parameter handling):

sub bytearray_search {
  ($a,$b) = @_;
  index(pack('C*',@$a),pack('C*',@$b))
}

I feel like I'm cheating, but in Perl this does what the OP wants:

sub byte_substr {
    use bytes;
    index shift,shift
}

Normally index() in Perl works on strings with character semantics, but the "use bytes" pragma makes it use byte semantics instead. From the manpage:

When "use bytes" is in effect, the encoding is temporarily ignored, and each string is treated as a series of bytes.
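
The difference between character and byte semantics is easy to see in Python as well. The following is only an illustration of the concept, not a translation of the Perl code; the sample string is made up.

```python
text = "héllo"                # 'é' is one character but two bytes in UTF-8

# Character semantics: positions count characters.
char_index = text.find("l")

# Byte semantics: positions count bytes of the encoded string.
byte_index = text.encode("utf-8").find(b"l")

print(char_index)  # 2
print(byte_index)  # 3
```

The two indices diverge as soon as a multi-byte character appears before the match, which is why a byte-array search has to use byte semantics.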

Another one in Python:

def subarray(large, small):
    strsmall = ' '.join([str(c).zfill(3) for c in small])
    strlarge = ' '.join([str(c).zfill(3) for c in large])
    pos = strlarge.find(strsmall)
    return  ((pos>=0), pos//4)

Ruby 1.9 (44B)

_=->a,b{[*a.each_cons(b.size)].index(b)||-1}

p _[[63, 101, 245, 215, 0], [245, 215]]
p _[[24, 55, 74, 3, 1], [24, 56, 74]]

goruby (29B)

_=->a,b{a.e_(b.sz).dx(b)||-1}

Python: 85 characters

def f(a,b):
 l=[a[i:i+len(b)]for i in range(len(a))]
 return l.index(b)if b in l else-1

Prolog: 84 characters (says "no" instead of returning -1):

s(X,[]).
s([H|T],[H|U]):-s(T,U).
f(X,Y,0):-s(X,Y).
f([_|T],Y,N):-f(T,Y,M),N is M+1.

Python one-liner function definition in 64 characters

def f(l,s): return ''.join(map(chr,l)).find(''.join(map(chr,s)))

Since we are explicitly passed an array of bytes, we can convert it to Python's native byte array type, str, and use str.find.

Python 3, 36 bytes

Based on Stephan202:

>>> t=lambda l,s:bytes(l).find(bytes(s))
... 
>>> t([63, 101, 245, 215, 0], [245, 215])
2
>>> t([24, 55, 74, 3, 1], [24, 56, 74])
-1

In Python:

def SearchArray(input, search):
    # Try every start position, including the last one where search still fits.
    for i in range(0, len(input) - len(search) + 1):
        for j in range(0, len(search)):
            if input[i+j] != search[j]:
                break
        else:
            # The inner loop found no mismatch: full match at i.
            return True, i
    return False, -1

Test:

print(SearchArray([ 63, 101, 245, 215, 0 ], [ 245, 215 ]))
print(SearchArray([ 24, 55, 74, 3, 1 ], [ 24, 56, 74 ]))

Which prints:

(True, 2)
(False, -1)

Note that there is a shorter solution, but it relies on Python language features that aren't really portable.

In C#:

private object[] test(byte[] a1, byte[] a2)
{
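    // Latin-1 maps each byte value 0-255 to exactly one char; ASCII would turn bytes above 127 into '?'.
    System.Text.Encoding latin1 = System.Text.Encoding.GetEncoding("ISO-8859-1");
    string s1 = latin1.GetString(a1);
    string s2 = latin1.GetString(a2);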
    int pos = s1.IndexOf(s2, StringComparison.Ordinal);
    return new object[] { (pos >= 0), pos };
}

Example usage:

byte[] a1 = new byte[] { 24, 55, 74, 3, 1 };
byte[] a2 = new byte[] { 24, 56, 74 };
object[] result = test(a1, a2);
Console.WriteLine("{0}, {1}", result[0], result[1]); // prints "False, -1"

public class SubArrayMatch
{
    private bool _IsMatch;
    private int _ReturnIndex = -1;
    private List<byte> _Input;
    private List<byte> _SubArray;
    private bool _Terminate = false;
#region "Public Properties"
    public List<byte> Input {
        set { _Input = value; }
    }

    public List<byte> SubArray {
        set { _SubArray = value; }
    }

    public bool IsMatch {
        get { return _IsMatch; }
    }

    public int ReturnIndex {
        get { return _ReturnIndex; }
    }
#endregion
#region "Constructor"
    public SubArrayMatch(List<byte> parmInput, List<byte> parmSubArray)
    {
        this.Input = parmInput;
        this.SubArray = parmSubArray;
    }
#endregion
#region "Main Method"
    public void MatchSubArry()
    {
        int _MaxIndex;
        int _Index = -1;
        _MaxIndex = _Input.Count - 1;

        _IsMatch = false;

        foreach (byte itm in _Input) {
            _Index += 1;

            if (_Terminate == false) {
                if (SubMatch(_Index, _MaxIndex) == true) {
                    _ReturnIndex = _Index;
                    _IsMatch = true;
                    return;
                }
            }
            else {
                return;
            }
        }
    }

    private bool SubMatch(int BaseIndex, int MaxIndex)
    {
        int _MaxSubIndex;
        byte _cmpByte;
        int _itr = -1;

        _MaxSubIndex = _SubArray.Count - 1;

        // Stop once the subarray no longer fits in the remaining input.
        if (BaseIndex + _MaxSubIndex > MaxIndex) {
            _Terminate = true;
            return false;
        }

        foreach (byte itm in _SubArray) {
            _itr += 1;

            _cmpByte = _Input[BaseIndex + _itr];

            if (itm != _cmpByte) {
                return false;
            }
        }

        return true;
    }
#endregion

}

Anhar Hussain Miah, edited by: Anhar.Miah @: 07.03.2009

PHP

In 105...

function a_m($h,$n){$m=strstr(join(",",$h),join(",",$n));return$m?(count($h)-substr_count($m,",")-1):-1;}

or more explicitly,

function array_match($haystack,$needle){
  $match = strstr (join(",",$haystack), join(",",$needle));
  return $match?(count($haystack)-substr_count($match,",")-1):-1;
}

GNU C:

#define _GNU_SOURCE /* memmem() is a GNU extension */
#include <string.h>

int memfind(const char * haystack, size_t haystack_size, const char * needle,
    size_t needle_size)
{
    const char * match = memmem(haystack, haystack_size, needle, needle_size);
    return match ? match - haystack : -1;
}

ANSI C, without the library:

int memfind(const char * haystack, size_t haystack_size, const char * needle,
    size_t needle_size)
{
    size_t pos = 0;
    for(; pos < haystack_size; ++pos)
    {
        size_t i = 0;
        while(pos + i < haystack_size && i < needle_size &&
            haystack[pos + i] == needle[i]) ++i;

        if(i == needle_size) return pos;
    }

    return -1;
}

Ruby. Not exactly the shortest in the world, but cool since it's an extension to Array.

class Array
  def contains other=[]
    index = 0
    begin
      matched = 0
      ndx = index
      while other[matched] == self[ndx]
        return index if (matched+1) == other.length
        matched += 1
        ndx += 1
      end
    end until (index+=1) == length
    -1
  end
end

puts [ 63, 101, 245, 215, 0 ].contains [245, 215]
# 2
puts [ 24, 55, 74, 3, 1 ].contains [24, 56, 74 ]
# -1

C#, lists named "a" and "b":

Enumerable.Range(-1, a.Count + 1).Where(n => n == -1
    || a.Skip(n).Take(b.Count).SequenceEqual(b)).Take(2).Last();

If you don't care about returning the first instance, you can just do:

Enumerable.Range(-1, a.Count + 1).Last(n => n == -1
    || a.Skip(n).Take(b.Count).SequenceEqual(b));

int m(byte[]a,int i,int y,byte[]b,int j,int z){return i<y?j<z?a[i]==b[j++]?m(a,++i,y,b,j,z):m(a,0,y,b,j,z):-1:j-y;}

Java, 116 characters. Has a little extra functionality thrown in. Admittedly, pushing the start condition and array length onto the caller is a kludge. Call it like this:

m(byte[] substring, int substart, int sublength, byte[] bigstring, int bigstart, int biglength)

Fredrik has already posted code that uses the string-conversion approach. Here is another way to do it in C#.

jwoolard beat me to it, by the way. I used the same algorithm as he did. This was one of the problems we had to solve in C++ in college.

public static bool Contains(byte[] parent, byte[] child, out int index)
{
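    index = -1;

    // <= so that a match ending at the last element is also checked.
    for (int i = 0; i <= parent.Length - child.Length; i++)
    {
        int j = 0;
        while (j < child.Length && parent[i + j] == child[j])
            j++;

        if (j == child.Length)
        {
            index = i; // first full match; stop searching
            return true;
        }
    }

    return false;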
}

Lisp v1

(defun byte-array-subseqp (subarr arr)
  (let ((found (loop 
                  for start from 0 to (- (length arr) (length subarr))
                  when (loop 
                          for item across subarr
                          for index from start below (length arr)
                          for same = (= item (aref arr index))
                          while same
                          finally (return same))
                  do (return start))))
    (values (when found t) ; "real" boolean
            (or found -1))))

Lisp v2 (note that subseq creates a copy)

(defun byte-array-subseqp (subarr arr)
  (let* ((alength (length arr))
         (slength (length subarr))
         (found (loop 
                   for start from 0 to (- alength slength)
                   when (equalp subarr (subseq arr start (+ start slength)))
                   do (return start))))
    (values (when found t)
            (or found -1))))
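
The copying remark applies to slicing in other languages too. As a hedged illustration in Python (not a translation of the Lisp), a memoryview lets each candidate window be compared without copying it; the function name `find_no_copy` is hypothetical, and the sketch assumes both arguments are bytes objects.

```python
def find_no_copy(haystack: bytes, needle: bytes) -> int:
    """Return the first index of needle in haystack, or -1 if absent."""
    view = memoryview(haystack)   # slicing a memoryview does not copy the data
    n = len(needle)
    for start in range(len(haystack) - n + 1):
        # Compare the window with the needle element by element.
        if view[start:start + n] == needle:
            return start
    return -1


print(find_no_copy(bytes([63, 101, 245, 215, 0]), bytes([245, 215])))  # 2
print(find_no_copy(bytes([24, 55, 74, 3, 1]), bytes([24, 56, 74])))    # -1
```

Slicing the bytes object directly (`haystack[start:start + n]`) would give the same result but allocate a new object for every start position, which is the cost the note about subseq refers to.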

C#:

public static object[] isSubArray(byte[] arr1, byte[] arr2) {
  int o = arr1.TakeWhile((x, i) => !arr1.Skip(i).Take(arr2.Length).SequenceEqual(arr2)).Count();
  return new object[] { o < arr1.Length, (o < arr1.Length) ? o : -1 };
}

In Ruby:

def subset_match(array_one, array_two)
  answer = [false, -1]
  0.upto(array_one.length - 1) do |line|
    right_hand = []
    line.upto(line + array_two.length - 1) do |inner|
      right_hand << array_one[inner]
    end
    if right_hand == array_two then answer = [true, line] end
  end
  return answer
end

Example:

irb(main):151:0> subset_match([24, 55, 74, 3, 1], [24, 56, 74])
=> [false, -1]

irb(main):152:0> subset_match([63, 101, 245, 215, 0], [245, 215])
=> [true, 2]

C#, works with any type that has an equality operator:

first
  .Select((item, index) =>
    first
     .Skip(index)
     .Take(second.Count())
     .SequenceEqual(second)
    ? index : -1)
  .Where(i => i >= 0)
  .DefaultIfEmpty(-1) // no match: fall back to -1
  .Select(i => new { Found = i >= 0, Index = i })
  .First();

(defun golf-code (master-seq sub-seq)
  (let ((x (search sub-seq master-seq)))
    (values (not (null x)) (or x -1))))

Haskell (92 characters, excluding newlines):

import Data.List
import Data.Maybe
g a b = fromMaybe (-1) $ findIndex (isPrefixOf b) (tails a)

Ruby, feeling ashamed after seeing Lar's code.

def contains(a1, a2)
  0.upto(a1.length-a2.length) { |i| return i if a1[i, a2.length] == a2 }
  -1
end

Here is a C# version that uses string comparison. It works correctly, but feels a bit hacky to me.

int FindSubArray(byte[] super, byte[] sub)
{
    int i = BitConverter.ToString(super).IndexOf(BitConverter.ToString(sub));
    return i < 0 ? i : i / 3;
}

// 106 characters
int F(byte[]x,byte[]y){int i=BitConverter.ToString(x)
.IndexOf(BitConverter.ToString(y));return i<0?i:i/3;}
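
Why dividing by 3 works: BitConverter.ToString renders each byte as two hex digits followed by a dash, so a match at string offset i corresponds to byte offset i / 3. A minimal Python sketch of the same trick, assuming Python 3.8 or later for the separator argument of bytes.hex (the name `find_via_hex` is made up):

```python
def find_via_hex(haystack, needle):
    """Search via a hex-string representation, like the C# BitConverter version."""
    hay_text = bytes(haystack).hex("-")       # e.g. "3f-65-f5-d7-00"
    needle_text = bytes(needle).hex("-")      # e.g. "f5-d7"
    i = hay_text.find(needle_text)
    # Each byte occupies three characters ("xx-"), so divide the offset by 3.
    return i if i < 0 else i // 3


print(find_via_hex([63, 101, 245, 215, 0], [245, 215]))  # 2
print(find_via_hex([24, 55, 74, 3, 1], [24, 56, 74]))    # -1
```

The dashes also keep matches aligned to byte boundaries, since two hex digits can never straddle a separator.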

Here is a slightly longer version that performs a true comparison of each individual array element.

int FindSubArray(byte[] super, byte[] sub)
{
    int i, j;
    for (i = super.Length - sub.Length; i >= 0; i--)
    {
        for (j = 0; j < sub.Length && super[i + j] == sub[j]; j++);
        if (j >= sub.Length) break;
    }
    return i;
}

// 135 characters
int F(byte[]x,byte[]y){int i,j;for(i=x.Length-y.Length;i>=0;i--){for
(j=0;j<y.Length&&x[i+j]==y[j];j++);if(j>=y.Length)break;}return i;}

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow