Question

Here are two code samples:

    def hough_transform_1(active_points, size_trame, size_max_song):
        m = np.linspace(0.95, 1.05, 11)
        p = np.linspace(-size_trame, size_max_song, size_max_song + size_trame + 1)
        acc = np.zeros([m.size, p.size])
        for m_i in m:
            for x_i, y_i in active_points:
                p_i = y_i - m_i * x_i + size_trame
                if p_i >= 0 and p_i < p.size:
                    acc[m_i * 100 - 95, p_i] += 1
        #Return some value



    DTYPE_INT = np.int
    DTYPE_FLOAT = np.float

    ctypedef np.int_t DTYPE_INT_t
    ctypedef np.float_t DTYPE_FLOAT_t

    @cython.boundscheck(False)
    def hough_transform_2(np.ndarray[DTYPE_FLOAT_t, ndim=2] activepoints, sizetrame, sizemaxsong):
        cdef size_trame = sizetrame
        cdef size_max_song = sizemaxsong
        cdef np.ndarray[DTYPE_FLOAT_t, ndim=2] active_points = activepoints
        cdef DTYPE_FLOAT_t x_i, y_i, m_i, p_i
        cdef float best_transformed
        cdef np.ndarray[DTYPE_FLOAT_t, ndim=1] m = np.linspace(0.95, 1.05, 11).astype(DTYPE_FLOAT)
        cdef np.ndarray[DTYPE_INT_t, ndim=2] acc = np.zeros([m.size, size_max_song + size_trame + 1], dtype=DTYPE_INT)
        cdef int i_range = m.size
        cdef int j_range = active_points.shape[0]
        for i in range(i_range):
            m_i = m[i]
            for j in range(j_range):
                x_i = active_points[j][0]
                x_i = active_points[j][1]
                p_i = y_i - m_i * x_i + size_trame
                if p_i >= 0 and p_i < size_max_song + size_trame + 1:
                    acc[m_i * 100 - 95, p_i] += 1
        #Return some value

These two functions, which detect lines with a slope between 0.95 and 1.05 given a list of (x, y) input points, are equivalent, though the second one uses Cython optimizations.

I test their speed with the following loop (where x is 1 or 2):

    time1 = time.time()
    for _ in range(100):
        hough_transform_x(points, self.length, self.length)
    time2 = time.time()

I get these results:

35 s for hough_transform_1; 20 s for hough_transform_2

Since using Cython on this type of function should result in a more significant speedup (I expected 100 times instead of 1.75 times), I think something is wrong in my cythonized code, but I can't find it. What did I miss?


Solution

First, type everything. Secondly, actually type them.

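These aren't typed(!): a cdef with no type declares a plain Python object. They should be typed directly in the argument list (for example, int size_trame) instead: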

    cdef size_trame = sizetrame
    cdef size_max_song = sizemaxsong

This is redundant, since the argument is already declared as a typed array:

    cdef np.ndarray[DTYPE_FLOAT_t, ndim=2] active_points = activepoints

This is fine:

    cdef DTYPE_FLOAT_t x_i, y_i, m_i, p_i

You don't use this:

    cdef float best_transformed

This would possibly be better hardcoded as a C array (DTYPE_FLOAT_t[11]):

    cdef np.ndarray[DTYPE_FLOAT_t, ndim=1] m = np.linspace(0.95, 1.05, 11).astype(DTYPE_FLOAT)

These are fine:

    cdef np.ndarray[DTYPE_INT_t, ndim=2] acc = np.zeros([m.size, size_max_song + size_trame + 1], dtype=DTYPE_INT)
    cdef int i_range = m.size
    cdef int j_range = active_points.shape[0]

i is untyped; declare it, for example with cdef int i:

    for i in range(i_range):
        m_i = m[i]

j is untyped; declare it, for example with cdef int j:

        for j in range(j_range):

This assignment is pointless, because the next line overwrites x_i:

            x_i = active_points[j][0]

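This line should assign y_i (which is otherwise never set), and you want active_points[j, 1] rather than active_points[j][1], so that the lookup is a single indexing operation on the typed array: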

            x_i = active_points[j][1]                
            p_i = y_i - m_i * x_i + size_trame

The range check can be written as a chained comparison, 0 <= p_i < size_max_song + size_trame + 1:

            if p_i >= 0 and p_i < size_max_song + size_trame + 1:
                acc[m_i * 100 - 95, p_i] += 1
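
The points above can be cross-checked against a plain-Python reference before re-timing the Cython version. The following is a minimal sketch, not the original poster's code: the name hough_transform_reference is hypothetical, and it assumes that the accumulator row should be the loop counter i (rather than the floating-point expression m_i * 100 - 95) and that the shifted intercept is truncated to an integer bin. It has the same loop structure a fully typed Cython version would have: integer counters, a single [j, k] index per lookup, and a chained range check.

```python
import numpy as np


def hough_transform_reference(active_points, size_trame, size_max_song):
    """Plain-Python reference (hypothetical helper) using integer-only indexing."""
    slopes = np.linspace(0.95, 1.05, 11)
    n_offsets = size_max_song + size_trame + 1
    acc = np.zeros((slopes.size, n_offsets), dtype=np.int64)
    for i in range(slopes.size):
        m_i = slopes[i]
        for j in range(active_points.shape[0]):
            # One tuple index per lookup, not the chained [j][0] form.
            x_i = active_points[j, 0]
            y_i = active_points[j, 1]
            p = y_i - m_i * x_i + size_trame
            if 0 <= p < n_offsets:
                # Assumption: row = slope counter i, column = truncated intercept.
                acc[i, int(p)] += 1
    return acc
```

Comparing its output with the compiled function's output on a small input is a cheap way to confirm that typing the variables did not change the results.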

Licensed under: CC-BY-SA with attribution