Question

I am using Coldfusion, to do a site search for my company. I have a stored procedure being called within a cfc, that returns all results for a keyword, no limits.

Then I do a subquery that either restricts what data I have access to based on security settings, or limits it based on me filtering the result to 1 of 5 subsets of data.

Now we are searching the database using a view, and that view is a union all query, allowing me to search multiple tables at once, and the result returns columns of "id, type, title, url, rank". With different rank values for how it found this specific results, which ends up with duplicate results with different rank values....

Now my boss, wants me to add up all the rank values, update 1 record, and remove the rest of the duplicates...

So for example, if i was searching for hte word business

And I had different results like, found in title, +500, exact match for title +1000, found in description +200

But the problem is that when I tried to loop thru all thru all the results, it added significant performance drain.

So I am looking for an alternative way to sum up the rank values, and then get rid of the results.

Below is my basic logic flow for the whole process

  1. Do Stored Procedure which searches the view for a keyword, searches in different fields in different ways generating duplicate results with different rank values.

  2. Eliminates search results of those that i do not have access to, and if i had chosen to filter results into a certain subset of results, like in books, groceries, plants, make up any categories you like :)

Here i will show you the code:

  <cfloop query="get_matches">
        <!--- check if this already exists --->
        <cfquery name="check_match" dbtype="query">
            select id, sum(rank) as total
            from get_matches
            where url = '#get_matches.url#'
            group by id
        </cfquery>

        <cfif check_match.total gt 1>

            <!--- add the two rank values --->
            <cfset my_rank = val(check_match.total) />

            <!--- update the search results --->
            <cfset get_matches.rank[get_matches.currentrow] = javacast("int",my_rank) />

            <!--- get a list of rows that has that url --->

            <!--- eliminate all other rows --->
            <cfquery name="get_matches_bot" dbtype="query">
                select id, type, title, url, rank
                from get_matches
                where url <> '#get_matches.url#'
                group by id, type, title, url, rank
            </cfquery>

            <cfquery name="get_matches_top" dbtype="query">
                select id, type, title, url, rank
                from get_matches
                where url = '#get_matches.url#'
                and rank = #my_rank#
                group by id, type, title, url, rank
            </cfquery>

            <cfquery name="get_matches" dbtype="query">
                select id, type, title, url, rank
                from get_matches_top
                union
                select id, type, title, url, rank
                from get_matches_bot
                group by id, type, title, url, rank
                order by rank desc, title asc
            </cfquery>


        </cfif>

    </cfloop>

Then after all that is done to help both get unique rows that have total number of rank values for where they had records in the #2 results.

    <cfquery name="get_matches" dbtype="query">
        select id, type, title, url, rank
        from get_matches
        group by id, type, title, url, rank
        order by rank desc, title asc
    </cfquery>

There has to be a better way, performance wise, to sum the rank values, get rid of duplicate row's without looping.

Any thoughts or suggestions?

Was it helpful?

Solution

I figured out the answer was to do a sum, around the rank field, and that made it fast.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top