Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js
I was sitting on a plane from Montreal a few years back next to a math professor from McGill University and casually asked him what was new and hot in mathematics. He said: cryptography and encryption. Of course, secrets. In an information age, secrets and secret data are going to be pretty important. More recently I read that the European Union suspended Blackberry text messaging because Blackberry encrypts text messages and routes them through Canada, so the stories seem to suggest. Encrypted messages that can't be intercepted are a risky prospect in this day and age.
Public encryption depends on huge prime numbers. The basic idea is that data encrypted based on primes could theoretically be cracked by brute force, by testing all prime candidates. However, since the primes used for encryption are so big by the time all possibilities are tested for primality the data is no longer relevant, or so that's the basic theory. There is a lot more to prime numbers and encryption, but that represents the broad strokes as I understand them.
Introducing the BigInteger
The retired Major League Baseball player Randy Johnson was referred to as the "Big Unit". This 6' 10 left handed pitcher occasionally threw a fast ball at more than 100 miles per hour. A 100 mile per hour pitch is approximately 147 feet per second; in other words from pitcher to batter a Randy Johnson fastball covered the 90 feet in two-thirds of second. When I was much younger and stronger I could hit an 85 miler as long as I knew it was coming and it came straight down the pipe. I doubt I could even see a 100 mile an hour pitch or get the bat around fast enough. Perhaps it is better I don't have that pressure.
BigInteger refers to a new data type in the .NET framework. Defined in
System.Numerics BigInteger has methods and overloaded operators that make it easier to use like smaller integral native data types, but using it with the Math class and Math's static methods is not supported. For the most part you can use BigInteger like any other integer type.
BigInteger is stored such that the most significant bit of the last byte is the sign bit. If you initialize a BigInteger with an array of bytes then include an additional byte whose value is zero, for the sign bit.
The biggest difference between BigInteger and smaller integer types in .NET is that BigInteger is the only one that doesn't have
MaxValue properties. The absence of these properties means that BigInteger can grow arbitrarily large and consequently take up a huge amount of memory. The real limitation to BigInteger is the physical resources on your computer. BigInteger types are immutable, so if you increment a BigInteger number then what really happens is the creation of a new BigInteger object though this process is invisible to the developer. Too many small operations on very big BigInteger values or if a BigInteger grows too large then an
OutOfMemoryException will be thrown.
Oddly enough there is no Square Root method for BigInteger types. So, the solution I picked to introduce BigInteger is a solution that calculates huge prime numbers using the Sieve of Eratosthenes and a couple of ways of calculating the square root of BigIntegers.
Calculating Square Roots Up To Double.MaxValue
One method for calculating the square root is calculating . You can use the Math class and calculate the square root of a BigInteger up to
Double.MaxValue with the following formula:
Math.Exp(BigInteger.Log(value) / 2)
If the result is greater than
Double.MaxValue then the result is
Double.PositiveInfinity, making this a reasonable approach for calculating the square root of smaller numbers. Think of this as a quick square root solution for some small-valued BigIntegers but not huge ones.
Calculating the Square Root of Arbitrarily Large BigIntegers
If you need to whip out a square root calculator for really BigIntegers then you can use Newton's method which is substantially more complex. The example for calculating the square root of a BigInteger in Listing 1 is based on John Wein's post of Newton's method, posted on social.msdn.microsoft.com. I translated it to VB, so any mistake in translation is my fault.
Private Function Sqrt(ByVal number As BigInteger) As BigInteger ' problem with lower numbers to right bit shift Dim bitLength As Integer = number.ToByteArray().Length * 8 + 1 Dim G As BigInteger = number >> bitLength / 2 Dim LastG As BigInteger = BigInteger.Zero Dim one As BigInteger = New BigInteger(1) While (True) LastG = G G = (number / G + G) >> 1 Dim i As Integer = G.CompareTo(LastG) If i = 0 Then Return G If i < 0 Then If (LastG - G).CompareTo(one) = 0 Then If (G * G).CompareTo(number) < 0 And (LastG * LastG).CompareTo(number) > 0 Then Return G End If ElseIf (G - LastG).CompareTo(one) = 0 Then If (LastG * LastG).CompareTo(number) < 0 And ((G * G).CompareTo(number) > 0) Then Return LastG End If End While Return G End Function
Listing 1: Calculating the square root of really large BigInteger values in VB2010.
Wein's solution didn't seem to work for very low prime candidates. The reason for this seems to be in the fragment number >> bitLength / 2. This fragment shifts the value of the variable number right by the bit length of number. Very small numbers are going to be shifted to be equal to zero and assigned to G. G is used as a divisor in G = (number / G + G) >> 1, which causes a division by zero error. You can follow this link http://social.msdn.microsoft.com/Forums/en-SG/csharplanguage/thread/c13d3fec-21d9-4f74-92de-a6132d5e9915 for John Wein's original post. For more information on Newton's method you can start with the Wikipedia post http://en.wikipedia.org/wiki/Newton-Raphson. (I think of Wikipedia as a pretty good starting point, but scholarship and accuracy is encouraged based altruism not guaranteed.)
Now we can combine all of these elements and employ the Sieve of Eratosthenes to calculate huge prime numbers based on the BigInteger type.