Normal distribution question
R_Georg
August 12th, 2005, 03:11 PM
Please help.
I am using a plain vanilla random number generator to populate pHistogram:
//mean
//std - standard deviation
int Gen(const double &mean, const double &std)
{
static const double pii=3.1415927;
static const double r_max=RAND_MAX+1;
return (int)std*sqrt(-2*log((rand()+1)/r_max))*sin(2*pii*rand()/r_max)+mean;
}
The results are then categorized in bins:
//nMin - the minimum of pHistogram
for(rLoop = 0; rLoop < nNumPoints; rLoop++)
{
dVal = pHistogram[rLoop + 1];
for(rCount = 0; rCount < nNumBins; rCount++)
{
if((dVal >= nMin + (rCount * dVal)) &&
(dVal <= nMin + (rCount * dVal) + dVal))
{
pCounts[rCount]++;
break;
}
}
}
//pCount points are drawn as a connected line
The resulting graph, though, is skewed towards the beginning of the plot instead of peaking in the middle, so it does not produce the familiar bell-shaped curve or even a decent approximation of it.
What am I doing wrong? :confused:
Peter Sparlinek
August 15th, 2005, 03:52 PM
Hi,
I do not know the formula you are using, but I have 3 comments on the coding of function Gen:
(1) The formula uses the function rand() twice. Is it really intended to use 2 different random values, or should the same random value be used twice?
(2) Why do you cast your formula down to integer? Converting double to int just truncates the value (e.g. 4.87 to 4) instead of rounding it.
(3) In the return statement you use
return (int)std*sqrt(...)*sin(...)+mean;
This casts only std to integer and then multiplies it by the result of the square root and sin functions. I assume you wanted to cast the complete formula to integer:
return (int)(std*sqrt(...)*sin(...)+mean);
And finally:
Why do you use bins, when your data only contains integer values?
Peter
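For reference, the expression in Gen() looks like the Box-Muller transform, which does take two independent uniform values. That identification is an assumption, since the thread never names the formula. Below is a minimal sketch that converts the whole expression at once and rounds instead of truncating. The name genRounded and the constants are hypothetical, not part of the posted code.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical variant of Gen(): returns one approximately normally
// distributed sample, rounded to the nearest integer.
int genRounded(double mean, double stddev)
{
    static const double kPi = 3.14159265358979323846;
    // Convert to double before adding 1, so the addition cannot
    // overflow on platforms where RAND_MAX equals INT_MAX.
    static const double kRandMax = static_cast<double>(RAND_MAX) + 1.0;

    // Two independent uniform samples: u1 in (0, 1], u2 in [0, 1).
    // The +1.0 keeps u1 away from 0 so that log(u1) stays finite.
    const double u1 = (std::rand() + 1.0) / kRandMax;
    const double u2 = std::rand() / kRandMax;

    // Box-Muller: z is approximately standard normal.
    const double z = std::sqrt(-2.0 * std::log(u1)) * std::sin(2.0 * kPi * u2);

    // Scale and shift first, then convert the complete result.
    return static_cast<int>(std::lround(stddev * z + mean));
}
```

Called as genRounded(100.0, 250.0), this produces both negative and positive values, so whatever consumes the samples has to cope with negative numbers.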
R_Georg
August 15th, 2005, 04:29 PM
I thank you for your comments. I've changed the typecast as you suggested. The generator was plucked from an open-source site, so I can't say anything about the style ;) .
Anyway, I am using the bins to generate a barchart. I think I am doing something superficially wrong when sorting the results.
Do you think you can help me with this?
Thanks
Peter Sparlinek
August 16th, 2005, 02:06 AM
Can you post some more of your code and some information on the values the parameters mean, std, nNumPoints, nNumBins ?
BTW, if you use a bin size of 1, i.e. if you display the histogram directly, do you get the expected graph?
If so, something is wrong in the binning, if not, something is wrong with the generation (or both).
There is also an error in the binning:
if((dVal >= nMin + (rCount * dVal)) &&
(dVal <= nMin + (rCount * dVal) + dVal))
It should be
if((dVal >= nMin + (rCount * dVal)) &&
(dVal < nMin + (rCount * dVal) + dVal)) // use '<'instead of '<='
Otherwise some data is placed into 2 bins. (This function can also be programmed much more efficiently, but that is another issue.)
Peter
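As a sketch of the more efficient approach mentioned above: with a fixed bin width, the bin index can be computed directly instead of scanning all the bins. The names binIndex, minVal and binWidth are hypothetical, and a constant bin width is an assumption. The posted loop appears to use dVal, the data value itself, where a bin width would be expected.

```cpp
#include <cassert>

// Hypothetical helper: maps a value to a bin index in [0, numBins).
// Bin i covers the half-open interval
//   [minVal + i * binWidth, minVal + (i + 1) * binWidth),
// so no value can belong to two bins.
// Returns -1 for values below minVal. Values at or beyond the upper
// edge are clamped into the last bin.
int binIndex(double value, double minVal, double binWidth, int numBins)
{
    assert(binWidth > 0.0 && numBins > 0);
    if (value < minVal) return -1;

    const double offset = (value - minVal) / binWidth;
    if (offset >= numBins) return numBins - 1;

    return static_cast<int>(offset);  // offset >= 0, so truncation is a floor
}
```

The inner loop over rCount then reduces to one call per data point, followed by an increment of pCounts at the returned index when it is not -1.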
R_Georg
August 16th, 2005, 12:20 PM
Thanks a lot for your assistance.
Values are:
mean = 100;
std = 250;
nNumPoints = 300;
nNumBins = 120;
The resulting graph, which is displayed sideways to match my coordinates (it's a prototype for a process control, so in place of the random data I'll be getting some real parameter values in the future), looks like this:
0
70
27
22
8
9
4
3
2
1
3
0
...
Any ideas?
Thank you in advance.
Peter Sparlinek
August 17th, 2005, 02:53 AM
Values are:
mean = 100;
std = 250;
nNumPoints = 300;
nNumBins = 120;
With those values, the function Gen() generates values between -651 and 852. The question is: how do you handle those negative numbers in your histogram pHistogram? That part of the code is missing. Another issue is the binning loop:
for(rLoop = 0; rLoop < nNumPoints; rLoop++)
{
dVal = pHistogram[rLoop + 1];
...
Here you use the histogram values from pHistogram[1] to pHistogram[nNumPoints]. How is pHistogram allocated? Is it really of size nNumPoints+1 or greater?
Looking at the result of your binning I have the following impression:
(1) The negative data is missing.
(2) The 0 is missing.
Add up your numbers and you will get nearly nNumPoints/2.
Peter
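The consistency check described above (the bin counts should add up to nNumPoints) can be sketched as follows. buildHistogram and allSamplesCounted are hypothetical helpers, not the original poster's code. The sketch assumes equal-width bins and takes the range from the data itself, so negative samples get bins too.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sorts integer samples, negative ones included,
// into numBins equal-width bins spanning [min, max] of the data.
std::vector<int> buildHistogram(const std::vector<int>& data, int numBins)
{
    assert(numBins > 0);
    std::vector<int> counts(numBins, 0);
    if (data.empty()) return counts;

    const auto range = std::minmax_element(data.begin(), data.end());
    const double minVal = *range.first;
    const double maxVal = *range.second;
    // Guard against a zero width when all samples are equal.
    const double binWidth = std::max((maxVal - minVal) / numBins, 1e-9);

    for (int value : data) {
        int idx = static_cast<int>((value - minVal) / binWidth);
        idx = std::min(idx, numBins - 1);  // the maximum goes into the last bin
        ++counts[idx];
    }
    return counts;
}

// True if the bin counts add up to the number of samples.
bool allSamplesCounted(const std::vector<int>& counts, std::size_t numSamples)
{
    const long total = std::accumulate(counts.begin(), counts.end(), 0L);
    return total >= 0 && static_cast<std::size_t>(total) == numSamples;
}
```

If allSamplesCounted returns false for the real data, some samples fell outside every bin, which is the symptom described in this post.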
R_Georg
August 18th, 2005, 02:07 PM
Thank you for your help. The allocation of pHistogram is nNumPoints*2.
Indeed I was missing the negative values.
Now the histogram looks decent and approaches the bell shape as the number of samples increases.
Thanks again.
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.