Question

ProductName Profit

DellVostro--------15714

Nokia510---------13392

SonyE151----------10506

LGE15-------------10326

Nokia20------------8830

samsung1516--------7910

samsung1556-----7560

samsung1517-----7430

fordgt49--------7410

samsung 1560------7350

..............
..................

I have table like this two columns productname and profit so i want to display the top most products which gives 50 percent of the total profit or revenue for this I achieved the solution using Sql server stored procedure but I want the same solution to be done using Either R or GreenPlum

The acutal query is

"Find the top products that produced 50% of annual profit/revenue in the last year, since the beginning of this year, and altogether."

and the logic which i have implemented in sql server is

 create proc [dbo].[prod50pershare] as begin

CREATE TABLE #LocalTempTable(
productname varchar(150),[SumSold] int)

CREATE TABLE #LocalTempTable1(
productname varchar(150),[SumSold] int)

insert #LocalTempTable
select productname,sum(margin)As Profit  from sale
join product on product.productid=sale.productid
where datepurchase between '1/1/2012' and '1/1/2013' 
group by productname order by 2 desc

DECLARE @sum INT
Declare @count int
declare @50PerRevenue int
declare @pcount int
declare @i int

SET @sum = 0
set @i=1
select @50PerRevenue= sum(sumsold)/2 from #LocalTempTable
select @count=COUNT(*) from #LocalTempTable

WHILE (@i<=@count)
BEGIN
   insert #LocalTempTable1 select top (@i)productname,SumSold from #LocalTempTable  
   order  by [SumSold] desc
   select @sum=SUM(SumSold) from #LocalTempTable1

 if(@sum<=@50PerRevenue)
 begin
      set @pcount=@i
      delete from #LocalTempTable1
      set @sum=0
      Set @i=@i+1
 end
 else
     begin
     break
     end
End
select top (@i) Productname,Sumsold from #LocalTempTable1 
drop table #LocalTempTable
drop table #LocalTempTable1
end
Was it helpful?

Solution

This is one method -

df <- read.csv(textConnection(
'DellVostro,15714
Nokia510,13392
SonyE151,10506
LGE15,10326
Nokia20,8830
samsung1516,7910
samsung1556,7560
samsung1517,7430
fordgt49,7410
samsung156,07350'), header = FALSE)

df <- df[order(df$V2,decreasing=TRUE),]
df$V3 <- cumsum(df$V2)
df$V3 <- df$V3/sum(df$V2)

V3 tells you where to mark your 50% -

> df
            V1    V2        V3
1   DellVostro 15714 0.1629610
2     Nokia510 13392 0.3018418
3     SonyE151 10506 0.4107935
4        LGE15 10326 0.5178786
5      Nokia20  8830 0.6094495
6  samsung1516  7910 0.6914797
7  samsung1556  7560 0.7698801
8  samsung1517  7430 0.8469324
9     fordgt49  7410 0.9237773
10  samsung156  7350 1.0000000
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top