Click to See Complete Forum and Search --> : Finding n-th largest element in two sorted lists
[volk]
March 2nd, 2008, 07:45 AM
I have a little optimization problem, I have two lists of sorted elements and I need to pick n-th largest element from both lists combined. Now, since the data in both lists changes and I do not have control over the changes, I cannot combine the two lists. Both lists can also be quite large. A naive solution would be to just do a linear search from the end and find n-th largest element. However, I have a feeling that I can do it faster since both lists are sorted. I should be able to apply some sort of binary search, but I can't wrap my mind around the idea of using binary search on two lists simultaneously. If I can use binary search then I would have the awesome complexity of log(n) and that would be a huge boost in performance for the application I use this in. Does anyone have any ideas on this?
ProgramThis
March 3rd, 2008, 02:48 PM
This is going to be a dumb questions but: Are the elements unique? If the elements in each list are unique, since they are sorted you merely have to grab the nth elements from them :) (or length - n th)
However, I have a feeling that they are not, or you would not have posted something here. Do you need to get the nth largest item overall between the two lists, or the nth largest from each list?
[volk]
March 3rd, 2008, 06:12 PM
Yes I need to get the nth largest from both lists combined and thats not as trivial as it might sound, at least not if you are gonna do it faster then O(n+m) where n and m are lengths of both lists.
The best idea I've been able to come up with was taking last element from one of the lists and searching where it will go in the other list, then discard all elements below that point and repeat this step untill I get to n'th element.. well, maybe it's better explained in this pseudocode:
IN:
array list1;
array list2;
number n;
OUT:
object largest_element;
BEGIN:
// We do not have to search within more then n elements of each list
// If any of the lists is smaller then n, then you have to extend the search
// region of the other list search region is defined by start and end
// variables assigned to each list
if list1.size > n
list1.start = list1.end-n
else
list2.start = list2.end-n-n+list1.size
if list2.size > n
list2.start = list2.end-n
else
list1.start = list1.end-n-n+list2.size
funtion find_nth_element()
index i
if list1.size == 0
return list2[-n]
else if list2.size == 0
return list1[-n]
// First find a solid tail between the two lists. A solid tail is the end of
// either one of the two lists that contains elements that are all larger
// then the last element in the opposite list - thus we can easily find n-th
// largest element within this solid tail
if list2[-1] > list1[-1]
// Find the position within list2 where the last element in list1 would fit in
i = list2.binary_search(list1[-1]);
solid_tail = list2[i..list2.end]
// Shorten the search interval
list2.end = i
else
i = list1.binary_search(list2[-1]);
solid_tail = list1[i..list1.end]
list1.end = i
if solid_tail.size >= n // n is within current solid tail, woohoo
return solid_list[-n]; //returns n-th from the end
else
// Since solid tail is smaller then n, we will subtract its length from n and continues searching in the remaining data set
n -= i
// search within the remaining elements
return find_nth_element;
end
Voominibear
March 3rd, 2008, 10:17 PM
How about devide and conquer ?
spoon!
March 3rd, 2008, 10:41 PM
maybe i didn't understand your posts; what's wrong with just searching from the ends
int index1 = array1.length-1, index2 = array2.length-1;
for (int i = 0; i < 9; i++) {
if (index1 >= 0 && array1[index1] > array2[index1])
index1--;
else if (index2 >= 0)
index2--;
else
// raise error here - too few elements
}
if (index1 >= 0 && array1[index1] > array2[index1])
return array1[index1];
else if (index2 >= 0)
return array2[index2];
else
// raise error here - too few elements
ProgramThis
March 4th, 2008, 07:57 AM
Yeah, divide and conquer could work, but how would you write it so that it was effecient? For example if the 2nd half of the arrays looked like this:
... 12 56 58 62 74
... 49 72 74 82 89
Here divide and conquer would be messy, and simply recurring down from the end, like spoon suggests, would actually give you the same amortized running time ( O(n) to find the nth largest). I know that this is sort of a worst case scenario but still, I'm not sure divide and conquer is a good solution for this problem, I could be wrong thoug.
6thsentinel
March 4th, 2008, 04:16 PM
I have an idea which might work, which I have noticed while trying to come up with a solution:
If take n and divide it by 2, then get the elements from the two sorted lists at floor(n/2), the nth number will be within that interval.
Hope this helps.
spoon!
March 4th, 2008, 08:51 PM
i'm not sure if this has been said before, but here is an approximate idea on divide and conquer.
call the lists A & B, and call their last 10 elements A[-10..-0] & B[-10..-0]. the 10th largest element must be in there somewhere. Now consider A[-5] and B[-5].
if A[-5] > B[-5], then you know that the A[-5] > (10th largest element) < B[-5]
this is because:
* we can't find 10 elements bigger than A[-5]; there are 5 in A, and less than 5 in B (since B[-5] is less)
* we can't find 10 elements smaller than B[-5]; there are 5 in B, and less than 5 in A (since A[-5] is greater)
so we now only have to look at the 5th largest element in the two lists of length 5: A[-10..-5] & B[-5..-0]
in other words, we divided the problem in half
my notation isn't very correct because i didn't consider the element at the -5 position, but you get the idea
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.