Click to See Complete Forum and Search --> : Find Top 10 Algorithm
aajiz
November 24th, 2005, 12:33 PM
Hi all
I have this interesting problem. Assume 20 items (A,B,C,D etc) are owned by a group of 10,000 people in various combinations (permutations are not important). Something like
OwnerId {Items}
1 {A,C}
2 {A,B,C}
3 {B,D}
4 {G}
and so on.
I need to know which top 10 items account for most number of people?
Thanks.
Yves M
November 24th, 2005, 05:45 PM
Well, go through the 10 000 people and keep an array from 1 to 20 which counts how many times each item appears. If person 1 has items A and C then after treating him, you'll have an array like [1,0,1,0,0,0,0,0...], after treating the second person, it will be [2, 1, 2, 0, 0, 0, ...]. Just add 1 whenever the person has that item.
At the end, take the top 10 from the array of results.
aajiz
November 24th, 2005, 05:51 PM
Yves,
Your solution will give me top 10 combinations. I need top 10 items owned by maximum number of people.
Yves M
November 24th, 2005, 05:51 PM
Please explain more clearly what you mean then. What does "top 10 items owned by maximum number of people" mean if not the top 10 of the items that people have?
And no, my solution will not give you the top ten of the subsets. If that is what you mean by "combinations".
aajiz
November 24th, 2005, 06:20 PM
Here is a simpler example:
Data:
Owner1 {B,C}
Owner2 {D,A,E}
Owner3 {A,B,E}
Owner4 {E,G,I}
Owner5 {A,B}
Owner6 {B,C}
Owner7 {A,B,C}
Owner8 {B,G,H}
The top 3 items from above data would be A,B,C because they are owned by 50% (4 out of 8) owners (Owner1, Owner5, Owner6, Owner7)
Hope this helps.
Yves M
November 24th, 2005, 06:28 PM
Still not clear. Why does it not matter that Owner3 has A and B as well? Is it exclusive in some way? If you can post a mathematical definition of what you call "top items" it would be clearer.
aajiz
November 24th, 2005, 07:01 PM
Hmm... Let's think about another analogy. Let's assume customers instead of owners and products instead of items.
So, Customer1{B,C} means that this customer will only buy from a shop if it has BOTH products B and C.
The question now becomes, which are the top 3 products that can satsify needs of maximum number of customers?
Customer3 needs products A,B,E and therefore is NOT satisfied by our top 3 products A,B,C.
Yves M
November 24th, 2005, 07:19 PM
So you need to find a subset of 10 items out of 20 that contains the most sets in the list of 10000? In that case, you could just work on the set of combinations C(10, 20) and do the counting similarly. C(10, 20) is decently big though, so it's not the greatest solution, I agree.
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.