Click to See Complete Forum and Search --> : Is there any known algorithm to efficiently solve this problem?


patkhor
February 10th, 2009, 01:36 PM
Hi,

As the title says, is there any algorithm to efficiently solve this problem. If there isn't one, could you suggest to me how to approach the problem.

The problem is as follows,

Given k sets of size n,
S1 = {S11, S12, S13, S14, ..., S1n}
S2 = {S21, S22, S23, S24, ..., S2n}
....
Sk = {Sk1, Sk2, Sk3, Sk4, ..., Skn}

And set T = {T1, T2, T3, ..., Tn}

Is there exist any 2 sets, Si and Sj, out of k sets above such that,
{Si1 + Sj1, Si2 + Sj2, ..., Sin + Sjn} = {T1, T2, ..., Tn}. And if there exist such pair of sets, what i and j are. (I know many efficient algorithms just answer the question of existance, but in this specific problem, I also need the value of index i and j)

I've successfully solved the problem with the running-time of O(n^2) by adding each element in all possible pairs and compare the sum with each element of T. But I need efficient algorithm to do this.

Any help would be highly appreciated,
Pat

PS. Sorry for my bad English.

ProgramThis
February 10th, 2009, 03:44 PM
I don't see how it could be anything other than O(i*j*k^2). You HAVE to go through every set S11 through Sk1 and compare it to every other one in the set. Now, you obviously don't have to do S11 with S21 and then S21 to S11 (because you have already computed the values).

So you have to go through and grab the data sets S11 , S21 ... S(k-1)1 and compare it to another one from Sn1 ... Sk1. This is an O(n^2) operation. Then you have to compare all of the values S11 + S21 ... Si1 + Sj1. I don't see any way around it.

TheCPUWizard
February 10th, 2009, 03:48 PM
For the general case, I believe ProgramThis is correct.

However, cant you statistically save time by .... aborting a compare at the first mismatch?

Peter_B
February 10th, 2009, 03:53 PM
Given k sets of size n,

...

I've successfully solved the problem with the running-time of O(n^2) by adding each element in all possible pairs and compare the sum with each element of T. But I need efficient algorithm to do this.


Don't you mean O(n*k^2)?

Peter_B
February 10th, 2009, 04:14 PM
For the general case, I believe ProgramThis is correct.

However, cant you statistically save time by .... aborting a compare at the first mismatch?

And you may be able to remove some of the S sets depending upon some general properties of the members of the sets. For example, if all the members are positive numbers you could filter the Si sets before the pair-wise checks. This is because if the value of T1 is, say, 10, then any Si for which Si1>10 can be ignored.

Peter_B
February 10th, 2009, 04:57 PM
Hang on lads, I've had an idea.

We have n sets of length k, so:

1) Put all the S sets in a hash table
2) for each set Si,
calculate (T-Si) and look this up in the hash table
If we find this in the table, we have found a pair of Si,Sj that add to T

Each hash table insertion or lookup is O(1), giving each of step 1 and 2 an overall order O(n).

Therefore the overall order of this algorithm is O(n) from step 1 plus O(n) from step 2, giving O(n) overall.

If we consider k as well, we have O(kn) for the overall order, as each insertion/lookup/T-Si calculation will be O(k).

Anyone see any problems with this?

patkhor
February 10th, 2009, 05:28 PM
First of all, thank you to all of you for excellent answers.

I'll try to implement algorithm that Peter_B suggested and will let you guys know the result.

Pat

Zachm
February 11th, 2009, 06:22 AM
Hang on lads, I've had an idea.

We have n sets of length k, so:

1) Put all the S sets in a hash table
2) for each set Si,
calculate (T-Si) and look this up in the hash table
If we find this in the table, we have found a pair of Si,Sj that add to T

Each hash table insertion or lookup is O(1), giving each of step 1 and 2 an overall order O(n).

Therefore the overall order of this algorithm is O(n) from step 1 plus O(n) from step 2, giving O(n) overall.

If we consider k as well, we have O(kn) for the overall order, as each insertion/lookup/T-Si calculation will be O(k).

Anyone see any problems with this?
One problem:
- what are the keys to the hash-table ?
If you need to calculate unique keys, in the worst case you will have to iterate over all n elements of each set to determine it's key. This leaves you with the same run-time as the algorithm originally suggested (or even worse).

Regards,
Zachm

Peter_B
February 11th, 2009, 12:12 PM
One problem:
- what are the keys to the hash-table ?
If you need to calculate unique keys, in the worst case you will have to iterate over all n elements of each set to determine it's key. This leaves you with the same run-time as the algorithm originally suggested (or even worse).

Regards,
Zachm

Okay, I'll lay out each step to show the overall order:

We have k sets of size n (NB: this is using the description in patkhor's original post, which is the opposite of that in my previous post, so please ignore my previous post to avoid confusion)

Step 1:
for each of k sets {
Calculate the hash - as there are n members, this is O(n)
Add to hash table - this is O(1) (typically, i.e. avoiding collisions)
}

Overall order of step 1: O((n+1)*k) - > O(kn)

Step 2:
for each of k sets {
Calculate (T-Si) - this is O(n)
Calculate hash(T-Si) - this is O(n)
Lookup entry in hash table [I]- this is O(1)
}

Overall order of step 2: O((2n+1)*k) -> O(kn)

Each step is O(kn), so this is the overall order for the whole algorithm.
This is a lot better than pairwise comparison as this gives O(n*k^2). We save a factor of k (the number of sets).

However, for my algorithm the best case, average case and worst case orders are all O(kn), as we create the hash table containing all the sets Si before we start looking for pairs that sum to T. The brute force pair-wise comparison has a best case order O(1), as the first pair we try may work!