Environment: Should compile with any standard C++ compiler (untested). Developed in VC++.

I originally got the idea for this project after seeing a couple of similar things done on the net: a class based on storing the data in string format, the author of which I cannot recall, and another one by PJ Naughter that was a 96 bit integer. After studying PJ Naughter's code, I began thinking that the system could be made more flexible by making the storage dynamic. I played around with it for a while, and this is the result.
The CBigInt class allows the manipulation of extremely large signed integer values. The data contained in the objects is stored in an array of unsigned longs, which is dynamically allocated on the heap as required. The maximum size of this array is MAXLONG elements, giving a theoretical range of roughly -2^68,719,476,704 to 2^68,719,476,704 - 1 (MAXLONG elements of 32 bits each come to 68,719,476,704 bits, one of which is the sign bit). With numbers this big, the more realistic limit is available memory: a full range number requires 8 GB of RAM to store, before any output formatting. I have not done any checks for overflow, so on the off chance that a number that big is obtained, the results are undefined. Each element of the array is stored in the host machine's native format, and the array as a whole is stored in Little Endian order, with the low order DWORD in element 0 and the high order DWORD in element n. Setting the highest bit of the high order DWORD denotes a negative number.
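To make the layout concrete, here is a minimal sketch of the storage scheme described above. It is not the CBigInt source: the names Limbs, IsNegative and FromInt64 are hypothetical, and a std::vector stands in for the heap-allocated array of unsigned longs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the internal array: 32-bit elements,
// low-order DWORD in element 0, two's complement across the whole array.
using Limbs = std::vector<std::uint32_t>;

// The sign is the highest bit of the highest-order element.
bool IsNegative(const Limbs& v) {
    assert(!v.empty());
    return (v.back() & 0x80000000u) != 0;
}

// Split a 64-bit signed value into two elements, low-order first.
Limbs FromInt64(std::int64_t x) {
    const std::uint64_t bits = static_cast<std::uint64_t>(x);
    Limbs v;
    v.push_back(static_cast<std::uint32_t>(bits));        // element 0: low-order DWORD
    v.push_back(static_cast<std::uint32_t>(bits >> 32));  // element 1: high-order DWORD
    return v;
}
```

Note that a positive value such as 0x80000000 needs the second, zero-valued element here; stored in a single element it would read as negative.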
Maintaining the sign as a number shrinks and grows proved to be one of the more awkward operations in the whole class. Basically, what it means is that if a number is positive, but its internal representation has the high order bit of its highest order DWORD set, then an extra array element containing a zero value must be kept. Also, if a negative number grows in negativity past a DWORD boundary, the new element needs to have all of its "unused" bits set.
There are two private functions that are helpers to organise this. The ExpandTo() function causes the internal array to grow, filling each element with 0xffffffff if the value is negative, or 0 if it is positive. The object's sign is passed as a parameter to this function, rather than being determined by the existing value, to allow for the case where a positive number's high order bit is set. The Expand() function is simply a derivative of ExpandTo(), with the first parameter being a value to expand the array by, rather than to.
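The fill behaviour of these two helpers can be sketched roughly as follows. This is an illustration under assumptions, not the class's actual code: the free-function signatures, the Limbs alias and the use of std::vector are all hypothetical, and the sketch only ever grows the array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Limbs = std::vector<std::uint32_t>;  // hypothetical: low-order DWORD first

// Grow the array to newSize elements. The sign is supplied by the caller,
// so a positive value whose top bit happens to be set is still zero-filled.
void ExpandTo(Limbs& v, std::size_t newSize, bool negative) {
    assert(newSize >= v.size());  // this sketch never shrinks
    v.resize(newSize, negative ? 0xffffffffu : 0u);
}

// Same operation, expressed as "grow by" rather than "grow to".
void Expand(Limbs& v, std::size_t extra, bool negative) {
    ExpandTo(v, v.size() + extra, negative);
}
```

Passing the sign explicitly is what lets the value 0x80000000 be widened to { 0x80000000, 0 } instead of being mistaken for a negative number and filled with 0xffffffff.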
The other helper function is Optimize(). This one simply removes redundant elements from the high order of the array, maintaining the sign (and value, of course) of the resulting value.
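One way to express that trimming rule is shown below. It is a sketch only, assuming the same little-endian two's complement layout; the free-function form and the Limbs alias are hypothetical. A high-order element is redundant when it is pure sign extension of the element beneath it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Limbs = std::vector<std::uint32_t>;  // hypothetical: low-order DWORD first

// Drop high-order elements that carry no information: a 0 above an element
// whose top bit is clear, or 0xffffffff above an element whose top bit is set.
void Optimize(Limbs& v) {
    assert(!v.empty());
    while (v.size() > 1) {
        const std::uint32_t top = v[v.size() - 1];
        const std::uint32_t next = v[v.size() - 2];
        const bool nextTopBitSet = (next & 0x80000000u) != 0;
        const bool redundant = (top == 0u && !nextTopBitSet) ||
                               (top == 0xffffffffu && nextTopBitSet);
        if (!redundant) break;
        v.pop_back();
    }
}
```

For example, { 5, 0, 0 } shrinks to { 5 }, while { 0x80000000, 0 } keeps its zero element because removing it would flip the sign.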
Due to the enormity of some of the numbers that may be generated, I have tried to write the code with speed optimisations, forgoing code size on a number of occasions, figuring that with data this big, an extra K or two in code is pretty insignificant. Consequently, there are a number of overloaded operators and friend functions that take standard C++ operands rather than CBigInt operands, with the associated performance benefit of not having to parse a second array. These benefits are most obvious in multiplication and division operations, where the performance improvement grows rapidly with the size of the operands. The test project demonstrates this with two non-recursive factorial functions, both of which return a CBigInt: one performs a CBigInt * DWORD operation, and the other a CBigInt * CBigInt operation. The former performs over 100 times faster with 1000!, and (try it if you are willing to wait) over 400 times faster with 10000!.
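To illustrate why the DWORD overload is cheap, here is a minimal sketch of multiplying by a single 32-bit value: one pass over the array with a running carry. It is not taken from the CBigInt source. MulSmall and Factorial are hypothetical names, and for brevity the sketch works on a non-negative magnitude and ignores the sign-bit convention.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Multiply a non-negative little-endian magnitude by a 32-bit value in place.
// One 32x32 -> 64 bit multiply per element, so the cost is linear in the length.
void MulSmall(std::vector<std::uint32_t>& mag, std::uint32_t m) {
    assert(!mag.empty());
    std::uint64_t carry = 0;
    for (std::uint32_t& limb : mag) {
        const std::uint64_t p = static_cast<std::uint64_t>(limb) * m + carry;
        limb = static_cast<std::uint32_t>(p);  // low 32 bits stay in place
        carry = p >> 32;                       // high 32 bits move up
    }
    if (carry != 0) mag.push_back(static_cast<std::uint32_t>(carry));
}

// Non-recursive factorial built on the small-operand multiply.
std::vector<std::uint32_t> Factorial(std::uint32_t n) {
    std::vector<std::uint32_t> result{1u};
    for (std::uint32_t i = 2; i <= n; ++i) MulSmall(result, i);
    return result;
}
```

A full array-by-array multiply has to repeat a pass like this for every element of the second operand, which is where the gap between the two factorial functions comes from.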
Apart from all of the operations, the only other thing that was needed was the ability to read and write the numbers in human format. The Format functions convert the number to a string, in any base from 2 to 36. An internal string buffer is allocated as a member for storing this string, and is reallocated each time a Format() function is called, so the pointer should not be stored. There is a version of the function where you may pass your own buffer to be filled, if you need to save or reuse the results.
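A common way to do this kind of conversion is repeated division by the radix, collecting remainders as digits. The sketch below takes that approach; whether the Format functions work this way is an assumption. The function name and signature, the upper-case digits, the separate sign flag and the std::string return value are all hypothetical choices made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert a non-negative little-endian magnitude to text in the given radix.
// The vector is taken by value because the division destroys it.
std::string FormatMagnitude(std::vector<std::uint32_t> mag, unsigned radix,
                            bool negative = false) {
    assert(radix >= 2 && radix <= 36);
    static const char kDigits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::string out;
    while (!mag.empty() && mag.back() == 0) mag.pop_back();
    while (!mag.empty()) {
        // Divide the whole array by radix, high-order element first.
        std::uint64_t rem = 0;
        for (std::size_t i = mag.size(); i-- > 0;) {
            const std::uint64_t cur = (rem << 32) | mag[i];
            mag[i] = static_cast<std::uint32_t>(cur / radix);
            rem = cur % radix;
        }
        out.push_back(kDigits[rem]);  // remainders come out least significant first
        while (!mag.empty() && mag.back() == 0) mag.pop_back();
    }
    if (out.empty()) return "0";
    if (negative) out.push_back('-');
    std::reverse(out.begin(), out.end());
    return out;
}
```

Returning a std::string sidesteps the buffer-lifetime caveat above; the class itself hands back a pointer to an internal buffer instead.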
The FromXxx functions convert a string representation of a number to a CBigInt, as does the constructor that takes a const char* parameter. The latter recognises the "0x", "0", and "0b" prefixes on the strings as being Hex, Octal, and Binary respectively, whereas the FromString function expects the radix to be passed as a parameter. Parsing of the string commences from the first character that is not a space or tab, and continues up to the first character that is not a valid digit for the supplied radix.
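The parsing rules just described can be sketched as below. This is an illustration rather than the class's code: every function name is hypothetical, the result is a non-negative magnitude with no sign handling, and two behaviours are assumptions not stated in the article (strings without a prefix are treated as decimal, and letter digits are accepted in either case).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Value of a digit character, or -1 if it is not a digit in any base up to 36.
int DigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Consume an optional "0x", "0b" or "0" prefix and report the radix it implies.
unsigned DetectRadix(const char*& s) {
    if (s[0] == '0' && s[1] == 'x') { s += 2; return 16; }
    if (s[0] == '0' && s[1] == 'b') { s += 2; return 2; }
    if (s[0] == '0') { s += 1; return 8; }
    return 10;  // assumption: no prefix means decimal
}

// Skip spaces and tabs, then accumulate digits until the first character
// that is not a valid digit for the radix.
std::vector<std::uint32_t> ParseMagnitude(const char* s, unsigned radix) {
    assert(radix >= 2 && radix <= 36);
    while (*s == ' ' || *s == '\t') ++s;
    std::vector<std::uint32_t> mag{0u};
    for (;; ++s) {
        const int d = DigitValue(*s);
        if (d < 0 || static_cast<unsigned>(d) >= radix) break;
        // mag = mag * radix + d, one pass with carry
        std::uint64_t carry = static_cast<std::uint64_t>(d);
        for (std::uint32_t& limb : mag) {
            const std::uint64_t p = static_cast<std::uint64_t>(limb) * radix + carry;
            limb = static_cast<std::uint32_t>(p);
            carry = p >> 32;
        }
        if (carry != 0) mag.push_back(static_cast<std::uint32_t>(carry));
    }
    return mag;
}

// Prefix-detecting variant, in the spirit of the const char* constructor.
std::vector<std::uint32_t> ParseWithPrefix(const char* s) {
    while (*s == ' ' || *s == '\t') ++s;
    const unsigned radix = DetectRadix(s);
    return ParseMagnitude(s, radix);
}
```

With these rules, "0xFF" parses as 255, "017" as 15, and "123abc" stops at the first letter and yields 123.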
Downloads
Download demo project (VC++ specific) - 11Kb
Download source - 9Kb