Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js
Polygon clipping is one of those humble tasks computers do all the time. It's a basic operation in creating graphic output of all kinds.
There are several well-known polygon clipping algorithms, each having its strengths and weaknesses. The oldest one (from 1974) is called the Sutherland-Hodgman algorithm. In its basic form, it is relatively simple. It is also very efficient in two important cases, one being when the polygon is completely inside the boundaries, and the other when it's completely outside.
The Laing-Barsky algorithm (1983) is a good deal more complicated, but in certain cases fewer intersections need to be calculated than for Sutherland-Hodgman. Therefore, it may be somewhat faster when many polygon lines intersect with the clipping boundaries. However, it is more sensitive to 'nasty' polygons.
The Weiler algorithm (1977) is even more complicated, but it is the one you'll need if you want to clip a polygon against a non-rectangular window.
Even more ways to clip a polygon exist. None of them is totally perfect. Often, it is possible to feed a weird polygon to an algorithm and retrieve an incorrect result. One of the vertices will disappear, or a ghost vertex will be created. Therefore, the hunt for the perfect clipping algorithm is still open.
A polygon is generally stored as a collection of vertices. Any clipping algorithm takes one collection, and outputs a new collection. A clipped polygon, after all, is also a polygon. Notice that the clipped polygon often will have more vertices than the unclipped one, but it can also have the same number, or less. If the unclipped polygon lies completely outside the clipping boundary, the clipped polygon even has zero vertices.
After some investigation, I opted for the old and trusted Sutherland-Hodgman method to attack the problem. For the project I was working on, the non-rectangular capabilities of Weiler would be overkill. With the current CPUs and their fast floating point operations, the slightly higher efficiency of Laing-Barsky hardly makes sense. In fact, after a rather crude experiment, I suspect it to be even somewhat slower.
Sutherland-Hodgman uses a divide-and-conquer strategy to attack the problem. First, it clips the polygon against the right clipping boundary. The resulting, partly clipped polygon is then clipped against the top boundary, and then the process is repeated for the two remaining boundaries. (Of course, it also works in another order.) In a way, it is the most natural thing to do. If you had to clip a paper polygon with a pair of scissors, you would probably proceed the same way.
|Sutherland-Hodgman's divide-and-conquer strategy. To clip a polygon against a rectangular boundary window, successively clip against each boundary.|
To clip against one boundary, the algorithm loops through all polygon vertices. At each step, it considers two of them, called 'previous' and 'current.' First, it determines whether these vertices are inside or outside the clipping boundary. This, of course, is simply a matter of comparing the horizontal or vertical position to the boundary's position.
Then, it applies the following simple rules:
If 'previous' and 'current' are both inside: Output 'current.' If 'previous' is inside, and 'current' is outside: Output the intersection point of the corresponding edge and the clipping boundary. If 'previous' and 'current' are both outside: Do nothing. If 'previous' is outside, and 'current' is inside: Output the intersection point, and then output 'current.'
This way, you get a new polygon, clipped against one boundary, and ready to be clipped against the next boundary. The method works, but has one disadvantage: To clip against all four boundaries, you have to store the intermediate, partly clipped polygons. It's evident that this is a costly operation.
Luckily, there are ways to avoid the intermediate storage. Instead of storing an output point, you can send it to the next stage immediately, where it can be clipped against the next boundary.
The classical way to do this is to make one general boundary-clipping function, which calls itself recursively to put a vertex to the next stage. An extra parameter tells the function which boundary to use for clipping. Inside the function, a few switch statements account for the differences.
A somewhat faster option is to create separate functions for the four boundaries, thus eliminating the nasty switch statements. However, this means developing and debugging four very similar functions, which is an open invitation for all kinds of trouble.