파이썬 - 초기 용량의 목록을 만듭니다

https://stackoverflow.com/questions/311775

10-07-2019
|

문제

이와 같은 코드는 종종 발생합니다.

l = []
while foo:
    #baz
    l.append(bar)
    #qux

목록이 새로운 요소에 맞게 지속적으로 크기를 조정해야하기 때문에 수천 개의 요소를 목록에 추가하려는 경우 정말 느립니다.

Java에서는 초기 용량의 배열리스트를 만들 수 있습니다. 목록이 얼마나 큰지 아이디어가 있다면, 이것은 훨씬 더 효율적일 것입니다.

나는 이와 같은 코드가 종종 목록 이해력으로 재현 될 수 있음을 이해합니다. for/while 루프가 매우 복잡하다면, 이것은 불가능합니다. 미국 파이썬 프로그래머와 동등한 것이 있습니까?

해결책

def doAppend( size=10000 ):
    result = []
    for i in range(size):
        message= "some unique object %d" % ( i, )
        result.append(message)
    return result

def doAllocate( size=10000 ):
    result=size*[None]
    for i in range(size):
        message= "some unique object %d" % ( i, )
        result[i]= message
    return result

결과. (각 기능을 144 배 평가하고 평균 지속 시간)

simple append 0.0102
pre-allocate  0.0098

결론. 간신히 중요합니다.

조기 최적화는 모든 악의 근원입니다.

다른 팁

Python 목록에는 사전 할당이 내장되어 있지 않습니다. 실제로 목록을 만들어야하고 부여의 오버 헤드를 피해야한다면 (그리고 귀하가하는지 확인해야합니다).

l = [None] * 1000 # Make a list of 1000 None's
for i in xrange(1000):
    # baz
    l[i] = bar
    # qux

아마도 발전기를 대신 사용하여 목록을 피할 수있을 것입니다.

def my_things():
    while foo:
        #baz
        yield bar
        #qux

for thing in my_things():
    # do something with thing

이런 식으로 목록이 모든 메모리에 전혀 저장된 것은 아니며 필요에 따라 생성됩니다.

짧은 버전 : 사용

pre_allocated_list = [None] * size

목록을 사전 할당하려면 (즉, 추가로 목록을 점차적으로 형성하는 대신 목록의 '크기'요소를 다룰 수 있도록). 이 작업은 큰 목록에서도 매우 빠릅니다. 나중에 나중에 목록 요소에 할당 될 새로운 개체를 할당하는 것은 훨씬 더 오래 걸리고 프로그램에서 병목 현상이 될 것입니다.

긴 버전 :

초기화 시간을 고려해야한다고 생각합니다. 파이썬에서 모든 것이 참조이므로 각 요소를 문자열로 설정했는지 여부는 중요하지 않습니다. 단지 참조 일뿐입니다. 각 요소에 대해 새 객체를 작성하려면 더 오래 걸리지 만 참조하십시오.

파이썬 3.2 :

import time
import copy

def print_timing (func):
  def wrapper (*arg):
    t1 = time.time ()
    res = func (*arg)
    t2 = time.time ()
    print ("{} took {} ms".format (func.__name__, (t2 - t1) * 1000.0))
    return res

  return wrapper

@print_timing
def prealloc_array (size, init = None, cp = True, cpmethod=copy.deepcopy, cpargs=(), use_num = False):
  result = [None] * size
  if init is not None:
    if cp:
      for i in range (size):
          result[i] = init
    else:
      if use_num:
        for i in range (size):
            result[i] = cpmethod (i)
      else:
        for i in range (size):
            result[i] = cpmethod (cpargs)
  return result

@print_timing
def prealloc_array_by_appending (size):
  result = []
  for i in range (size):
    result.append (None)
  return result

@print_timing
def prealloc_array_by_extending (size):
  result = []
  none_list = [None]
  for i in range (size):
    result.extend (none_list)
  return result

def main ():
  n = 1000000
  x = prealloc_array_by_appending(n)
  y = prealloc_array_by_extending(n)
  a = prealloc_array(n, None)
  b = prealloc_array(n, "content", True)
  c = prealloc_array(n, "content", False, "some object {}".format, ("blah"), False)
  d = prealloc_array(n, "content", False, "some object {}".format, None, True)
  e = prealloc_array(n, "content", False, copy.deepcopy, "a", False)
  f = prealloc_array(n, "content", False, copy.deepcopy, (), False)
  g = prealloc_array(n, "content", False, copy.deepcopy, [], False)

  print ("x[5] = {}".format (x[5]))
  print ("y[5] = {}".format (y[5]))
  print ("a[5] = {}".format (a[5]))
  print ("b[5] = {}".format (b[5]))
  print ("c[5] = {}".format (c[5]))
  print ("d[5] = {}".format (d[5]))
  print ("e[5] = {}".format (e[5]))
  print ("f[5] = {}".format (f[5]))
  print ("g[5] = {}".format (g[5]))

if __name__ == '__main__':
  main()

평가:

prealloc_array_by_appending took 118.00003051757812 ms
prealloc_array_by_extending took 102.99992561340332 ms
prealloc_array took 3.000020980834961 ms
prealloc_array took 49.00002479553223 ms
prealloc_array took 316.9999122619629 ms
prealloc_array took 473.00004959106445 ms
prealloc_array took 1677.9999732971191 ms
prealloc_array took 2729.999780654907 ms
prealloc_array took 3001.999855041504 ms
x[5] = None
y[5] = None
a[5] = None
b[5] = content
c[5] = some object blah
d[5] = some object 5
e[5] = a
f[5] = []
g[5] = ()

보시다시피, 동일한 객체에 대한 큰 참조 목록을 만드는 것만으로는 시간이 거의 걸리지 않습니다.

선입견 또는 연장이 더 오래 걸립니다 (평균은 아무것도 없었지만 몇 번 실행 한 후에는 확장 및 추가가 거의 동시에 걸린다고 말할 수 있습니다).

각 요소에 새 객체를 할당하는 것은 가장 많은 시간이 걸립니다. S.Lott의 답변은 매번 새로운 문자열을 형식화합니다. 엄격하게 필요하지 않습니다. 어떤 공간을 사전 할당하려면, 없음 목록을 작성한 다음 데이터를 마음대로 목록에 할당하십시오. 어느 쪽이든 목록을 추가/확장하는 것보다 데이터를 생성하는 데 더 많은 시간이 걸리거나 목록을 작성하는 동안 또는 그 후에는 목록을 생성하든 그 이후에 더 많은 시간이 걸립니다. 그러나 인구가 거의 인용 된 목록을 원한다면 없음 목록으로 시작하는 것이 확실히 빠릅니다.

이를위한 피시닉 방법은 다음과 같습니다.

x = [None] * numElements

또는 기본값이 무엇이든, 예를 들어

bottles = [Beer()] * 99
sea = [Fish()] * many
vegetarianPizzas = [None] * peopleOrderingPizzaNotQuiche

편집하다: 경고 emptor 그만큼 [Beer()] * 99 구문이 생성됩니다 하나 Beer 그런 다음 동일한 단일 인스턴스에 대한 99 개의 참조로 배열을 채 웁니다.

Python의 기본 접근 방식은 매우 효율적 일 수 있지만 요소 수를 늘리면 효율성이 떨어집니다.

비교하다

import time

class Timer(object):
    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        end = time.time()
        secs = end - self.start
        msecs = secs * 1000  # millisecs
        print('%fms' % msecs)

Elements   = 100000
Iterations = 144

print('Elements: %d, Iterations: %d' % (Elements, Iterations))


def doAppend():
    result = []
    i = 0
    while i < Elements:
        result.append(i)
        i += 1

def doAllocate():
    result = [None] * Elements
    i = 0
    while i < Elements:
        result[i] = i
        i += 1

def doGenerator():
    return list(i for i in range(Elements))


def test(name, fn):
    print("%s: " % name, end="")
    with Timer() as t:
        x = 0
        while x < Iterations:
            fn()
            x += 1


test('doAppend', doAppend)
test('doAllocate', doAllocate)
test('doGenerator', doGenerator)

~와 함께

#include <vector>
typedef std::vector<unsigned int> Vec;

static const unsigned int Elements = 100000;
static const unsigned int Iterations = 144;

void doAppend()
{
    Vec v;
    for (unsigned int i = 0; i < Elements; ++i) {
        v.push_back(i);
    }
}

void doReserve()
{
    Vec v;
    v.reserve(Elements);
    for (unsigned int i = 0; i < Elements; ++i) {
        v.push_back(i);
    }
}

void doAllocate()
{
    Vec v;
    v.resize(Elements);
    for (unsigned int i = 0; i < Elements; ++i) {
        v[i] = i;
    }
}

#include <iostream>
#include <chrono>
using namespace std;

void test(const char* name, void(*fn)(void))
{
    cout << name << ": ";

    auto start = chrono::high_resolution_clock::now();
    for (unsigned int i = 0; i < Iterations; ++i) {
        fn();
    }
    auto end = chrono::high_resolution_clock::now();

    auto elapsed = end - start;
    cout << chrono::duration<double, milli>(elapsed).count() << "ms\n";
}

int main()
{
    cout << "Elements: " << Elements << ", Iterations: " << Iterations << '\n';

    test("doAppend", doAppend);
    test("doReserve", doReserve);
    test("doAllocate", doAllocate);
}

내 Windows 7 i7에서 64 비트 파이썬이 제공됩니다

Elements: 100000, Iterations: 144
doAppend: 3587.204933ms
doAllocate: 2701.154947ms
doGenerator: 1721.098185ms

C ++는 (MSVC, 64 비트, 최적화 활성화)를 제공하는 반면

Elements: 100000, Iterations: 144
doAppend: 74.0042ms
doReserve: 27.0015ms
doAllocate: 5.0003ms

C ++ 디버그 빌드가 생성됩니다.

Elements: 100000, Iterations: 144
doAppend: 2166.12ms
doReserve: 2082.12ms
doAllocate: 273.016ms

여기서 요점은 Python을 사용하면 7-8%의 성능 향상을 달성 할 수 있으며, 고성능 앱을 쓰고 있다고 생각되면 (또는 웹 서비스 나 무언가에 사용되는 것을 작성하는 경우) 그것은 스니핑되어서는 안되지만, 당신은 당신이 선택한 언어를 다시 생각해야 할 수도 있습니다.

또한 여기의 파이썬 코드는 실제로 Python 코드가 아닙니다. 여기에서 진정으로 Pythonesque 코드로 전환하면 성능이 향상됩니다.

import time

class Timer(object):
    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        end = time.time()
        secs = end - self.start
        msecs = secs * 1000  # millisecs
        print('%fms' % msecs)

Elements   = 100000
Iterations = 144

print('Elements: %d, Iterations: %d' % (Elements, Iterations))


def doAppend():
    for x in range(Iterations):
        result = []
        for i in range(Elements):
            result.append(i)

def doAllocate():
    for x in range(Iterations):
        result = [None] * Elements
        for i in range(Elements):
            result[i] = i

def doGenerator():
    for x in range(Iterations):
        result = list(i for i in range(Elements))


def test(name, fn):
    print("%s: " % name, end="")
    with Timer() as t:
        fn()


test('doAppend', doAppend)
test('doAllocate', doAllocate)
test('doGenerator', doGenerator)

주는 것

Elements: 100000, Iterations: 144
doAppend: 2153.122902ms
doAllocate: 1346.076965ms
doGenerator: 1614.092112ms

(32 비트 Dogenerator에서는 DoAllocation보다 낫습니다).

여기서 doAppend와 doAllocate 사이의 격차는 상당히 큽니다.

분명히, 여기의 차이점은 몇 번 이상이 작업을 수행하는 경우에만 적용되거나, 그 숫자가 크기로 크기가 크게 줄어든 시스템에서 또는 처리중인 경우에만 적용됩니다. 상당히 큰 목록.

요점 : 최고의 성능을위한 피스닉 방식을 수행하십시오.

그러나 일반적인 높은 수준의 성능에 대해 걱정하고 있다면 Python은 잘못된 언어입니다. 가장 근본적인 문제는 파이썬 기능 호출이 전통적으로 데코레이터 등과 같은 파이썬 기능으로 인해 다른 언어보다 최대 300 배 느려 졌다는 것입니다.https://wiki.python.org/moin/pythonspeed/performancetips#data_aggregation#data_aggregation).

다른 사람들이 언급했듯이, 목록을 사전 시드하는 가장 간단한 방법은 NoneType 사물.

즉, 파이썬이 실제로 작동하는 방식을 이해해야합니다. CPYTHON 목록 구현에서 기본 배열은 항상 더 큰 크기로 오버 헤드 룸으로 만들어집니다. ( 4, 8, 16, 25, 35, 46, 58, 72, 88, 106, 126, 148, 173, 201, 233, 269, 309, 354, 405, 462, 526, 598, 679, 771, 874, 990, 1120, etc), 목록을 크기를 조정하는 것은 거의 자주 발생하지 않습니다.

이 행동 때문에 대부분 list.append() 기능은입니다 O(1) 이러한 경계 중 하나를 건너면 복잡성이 증가하는 경우에만 복잡성이 있으며,이 시점에서 복잡성은 O(n). 이 동작은 S. Lott의 답변에서 실행 시간이 최소화되는 것입니다.

원천: http://www.laurentluce.com/posts/python-list-implementation/

나는 @s.lott의 코드를 운영하고 사전 배치에 의해 동일한 10% 성반 증가를 생성했다. 발전기를 사용하여 @Jeremy의 아이디어를 시도했으며 유전자의 성능을 doallocate보다 더 잘 볼 수있었습니다. 내 프로젝트의 경우 10% 개선이 중요하므로 모든 분들께 감사드립니다.

def doAppend( size=10000 ):
    result = []
    for i in range(size):
        message= "some unique object %d" % ( i, )
        result.append(message)
    return result

def doAllocate( size=10000 ):
    result=size*[None]
    for i in range(size):
        message= "some unique object %d" % ( i, )
        result[i]= message
    return result

def doGen( size=10000 ):
    return list("some unique object %d" % ( i, ) for i in xrange(size))

size=1000
@print_timing
def testAppend():
    for i in xrange(size):
        doAppend()

@print_timing
def testAlloc():
    for i in xrange(size):
        doAllocate()

@print_timing
def testGen():
    for i in xrange(size):
        doGen()


testAppend()
testAlloc()
testGen()

testAppend took 14440.000ms
testAlloc took 13580.000ms
testGen took 13430.000ms

더 많은 C와 같은 어레이를 가진 Numpy와 함께 일하는 경우 파이썬의 사전 할당에 대한 우려가 발생합니다. 이 경우, 사전 할당 사전 관련 문제는 데이터의 형태와 기본값에 관한 것입니다.

거대한 목록에서 수치 계산을하고 성능을 원한다면 Numpy를 고려하십시오.

일부 응용 프로그램의 경우 사전이 귀하가 찾고있는 것일 수 있습니다. 예를 들어, find_totient 메소드에서는 인덱스가 제로가 없기 때문에 사전을 사용하는 것이 더 편리하다는 것을 알았습니다.

def totient(n):
    totient = 0

    if n == 1:
        totient = 1
    else:
        for i in range(1, n):
            if math.gcd(i, n) == 1:
                totient += 1
    return totient

def find_totients(max):
    totients = dict()
    for i in range(1,max+1):
        totients[i] = totient(i)

    print('Totients:')
    for i in range(1,max+1):
        print(i,totients[i])

이 문제는 Preallocated List로 해결 될 수도 있습니다.

def find_totients(max):
    totients = None*(max+1)
    for i in range(1,max+1):
        totients[i] = totient(i)

    print('Totients:')
    for i in range(1,max+1):
        print(i,totients[i])

나는 우아하게 사용하는 경우 예외를 던질 수있는 사람을 저장하지 않기 때문에 이것이 우아하고 버그가 발생하기 쉬운 것으로 생각하며,지도를 피할 수있는 Edge Cases에 대해 생각해야하기 때문에.

사전이 효율적이지는 않지만 다른 사람들이 언급했듯이 사실입니다. 작은 속도의 차이가 항상 가치가있는 것은 아닙니다 중요한 유지 보수 위험.

내가 이해 한 바에 따르면, Python 목록은 이미 Arraylist와 매우 유사합니다. 그러나 해당 매개 변수를 조정하려면이 게시물을 그물에서 흥미로울 수있는 것을 발견했습니다 (기본적으로 직접 만들어주세요. ScalableList 확대):

http://mail.python.org/pipermail/python-list/2000-may/035082.html

라이센스 : CC-BY-SA ~와 함께 속성

제휴하지 않습니다 StackOverflow