Please help me with this algorithm


eniven
February 22nd, 2008, 03:39 PM
Hi there,
I'm working on a PhD in an engineering field and it requires a bit of programming. My programming knowledge is extremely limited so I'm having a bit of a tough time with it.

I'm working on a program that parameterizes the sampling distribution for the correlation coefficient. Anyway, most of it is programmed already. But, I'm having a problem with one part of the program.

Say I have some data with 11 X and Y values. I want to calculate the correlation coefficient between them. OK, that's no problem. I've already coded that.

Now, let's say I want to leave out 1 X and Y data pair so that I'm calculating the correlation coefficient based on 10 pairs of X and Y instead of 11. And, I want to do that 11 times, leaving out a different data pair each time so that I calculate 11 different correlation coefficients based on the remaining 10 data points.

Now, I want to do the same thing, but instead of leaving out only 1 data pair, I want to leave out 2, 3, 4, ... n-3 data pairs.

Why up to n-3 data pairs? Because you need 3 data pairs to calculate a correlation coefficient.


If it matters, I am coding in Fortran. I know it's an old language, but it works efficiently for many of our applications and we have built up an existing library of programs in that language that have served us well and still work.

I'm not necessarily looking for you to write my code for me and maybe there aren't many of you that know Fortran very well, but I would really appreciate some help with some pseudo code at least.

Thanks so much, in advance for any help!!! I'm stumped with this one.

eniven
February 22nd, 2008, 03:41 PM
Just to re-iterate, my main problem is figuring out how to "leave out" the various different combinations of data from the calculation.

MrViggy
February 22nd, 2008, 04:47 PM
Put the pairs in an array; loop from 0 to n-3; inside that loop, perform the calculations on elements 0 to n-1 minus the outer loop index (n = 11 here).

Viggy

KRUNCH
March 5th, 2008, 03:01 PM
I don't know if this would help, but here's a crude solution that would work.
If it's all in a two-dimensional array, then make two loops and perform the standard calculation like this:
for (i = 0; i < n; i++)        // i = index of the pair to leave out
{
    // reset the running sums here
    for (j = 0; j < n; j++)
    {
        if (j == i)
        {
            continue;          // skip the left-out pair
        }
        // add pair j to the running sums
    }
    // compute the correlation coefficient from the sums
}

So that would loop through the whole process, skipping a different pair on each pass of the outer loop.
The continue just moves the inner loop on to its next iteration.
Just modify it for your needs and that should work.
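
The skip-one-pair loop above can be sketched in Python (not Fortran, so treat it as runnable pseudocode to translate). The names pearson and leave_one_out are made up for illustration, and the sketch assumes every reduced data set still has non-zero variance in both X and Y.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Assumption: sxx and syy are non-zero (the data are not constant).
    return sxy / math.sqrt(sxx * syy)

def leave_one_out(xs, ys):
    """Return one correlation coefficient per left-out pair."""
    results = []
    for skip in range(len(xs)):
        # Keep every pair except the one at index `skip`.
        kept_x = [x for j, x in enumerate(xs) if j != skip]
        kept_y = [y for j, y in enumerate(ys) if j != skip]
        results.append(pearson(kept_x, kept_y))
    return results
```

With 11 pairs this returns 11 coefficients, each based on the remaining 10 pairs. It only covers the leave-one-out case; leaving out 2 or more pairs needs every combination of indices, not a single skipped index.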

ProgramThis
March 6th, 2008, 03:14 PM
It sounds like you need all combinations of size 3, 4, 5 ... 10. So you need:

//Size 3
1 2 3
1 2 4
...
//Size 4
1 2 3 4
1 2 3 5
...
...
//Size 10
1 2 3 4 5 6 7 8 9 10

Is this correct? If so, then you should have a method that calculates all n-element combinations of a given set of pairs. You then have a loop that starts at 3 and goes to 10, calling this method like so:

for(i = 3; i <= 10; i++)
    getCombinations(i);

Inside the getCombinations() method/function you will calculate the n-element combinations for the data set.


void getCombinations(int n)
{
    Array arr[] = new Array[numberOfCombinations(n)];
    //where numberOfCombinations(n) calculates the number of n-pair possibilities,
    //i.e. if n = 3: 11!/(3!(11 - 3)!) = 165
    for ( ... )
        place the combinations into the array
}
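
As a rough illustration of the loop-over-sizes idea, here is a minimal Python sketch (again not Fortran). It leans on itertools.combinations from the standard library to generate the index sets instead of storing them in an array first; correlations_by_subset_size and pearson are hypothetical names, and the code assumes no subset has constant X or constant Y values.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Assumption: sxx and syy are non-zero for every subset passed in.
    return sxy / math.sqrt(sxx * syy)

def correlations_by_subset_size(xs, ys, min_size=3):
    """Map each subset size k to the correlations of all k-pair subsets."""
    n = len(xs)
    results = {}
    for k in range(min_size, n):                 # k = 3 .. n-1 pairs kept
        results[k] = [
            pearson([xs[i] for i in idx], [ys[i] for i in idx])
            for idx in combinations(range(n), k)  # every way to keep k pairs
        ]
    return results
```

For 11 pairs the sizes run from 3 to 10, which is 1980 subsets in total (165 of them of size 3), so brute force is cheap here. In Fortran you would have to write the combination generator yourself, for example by advancing an array of k increasing indices.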

pm_kirkham
March 15th, 2008, 07:41 AM
You could also convert each number from 7 to 2**11 - 1 to binary, then create an array with the binary digits, keeping only the numbers with at least three 1 bits (a 1 meaning that pair is included).

Using that, you can then use whole-array products, sums and dot-products to perform the vector definition of Pearson correlation.

(Fortran has highly optimized array intrinsics, so this should look simpler, and on modern CPUs the cost of branching or indirection is much higher than a few extra multiplications)
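
A minimal sketch of the binary-mask idea, in plain Python rather than Fortran. Each sum below is a dot product of the 0/1 mask with a data array, which is roughly what Fortran's SUM and DOT_PRODUCT intrinsics would compute. The names masked_pearson and all_masked_correlations are made up for illustration. Treating a 1 bit as "pair included" and skipping masks with fewer than three 1 bits are assumptions on my part, as is non-zero variance in every subset.

```python
import math

def masked_pearson(xs, ys, mask):
    """Pearson r over the pairs whose bit is set in `mask` (bit i = pair i)."""
    n = len(xs)
    w = [(mask >> i) & 1 for i in range(n)]      # binary digits as 0/1 weights
    m = sum(w)                                   # number of pairs kept
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    syy = sum(wi * y * y for wi, y in zip(w, ys))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    # Computational form of Pearson r; assumes both variance terms are non-zero.
    return (m * sxy - sx * sy) / math.sqrt((m * sxx - sx * sx) * (m * syy - sy * sy))

def all_masked_correlations(xs, ys, min_pairs=3):
    """Map each mask with at least `min_pairs` bits set to its correlation."""
    n = len(xs)
    results = {}
    for mask in range(7, 2 ** n):                # 7 = 0b111 up to 2**n - 1
        if bin(mask).count("1") >= min_pairs:    # skip masks with too few pairs
            results[mask] = masked_pearson(xs, ys, mask)
    return results
```

The mask 2**n - 1 keeps every pair, so the result also contains the full-data coefficient. No pair is ever copied or removed: excluded pairs are simply multiplied by zero.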