I first heard the word "heuristic" while studying at Michigan State University. I can't really recall the class (it was probably a statistics class), but the word itself has stuck with me. Heuristic means "a rule of thumb or educated guess that narrows the search for solutions" (Source: The Dictionary of Computing). For the purposes of this article, heuristic relates to "the rule of thumb or educated guess" one can glean from measurements.
In Part 1 of this three-part series, I examine what can and can't be reliably learned from a simple heuristic measure, lines of code. In Part 2, I will illustrate how you can employ VS.NET IDE macros to obtain lines-of-code measurements and how you can use macros in general as a productivity tool. In Part 3, I will demonstrate a pattern that can discernibly improve a software development team. All of these elements work in concert, as I will show. Each article is useful alone, but together they will provide you with invaluable tools and skills.
Progressing as a programmer or programming team requires improving upon what has been done in the past until some optimal level of skill is attained. But how can any programmer or team improve without measuring the current level of proficiency? Without a measurement, knowing for sure whether improvement is occurring may be difficult, if not impossible.
One heuristic measurement that is easy to obtain is lines of code. Lines of code is a weak heuristic (I will tell you why in a minute), but it is an easy place to start and it does have some value.
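To show how little it takes to obtain this measurement, here is a minimal sketch in Python (chosen only for illustration; Part 2 of this series uses VS.NET macros instead). The article does not define what counts as a "line of code," so the rule used here is an assumption: blank lines and whole-line comments are skipped. The function name `count_code_lines` and the sample snippet are hypothetical.

```python
def count_code_lines(source: str, comment_prefix: str = "'") -> int:
    """Count lines that are neither blank nor whole-line comments.

    Assumption: a whole-line comment starts with comment_prefix
    (the apostrophe used by Visual Basic by default).
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


# Hypothetical sample: one comment, one blank line, three code lines.
sample = """' Adds two numbers
Function Add(a As Integer, b As Integer) As Integer

    Return a + b
End Function
"""

print(count_code_lines(sample))  # 3
```

A different counting rule (for example, counting statements instead of physical lines) would give different totals, which is one reason the raw number is a weak heuristic.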
Good Heuristics and Bad
On a day-to-day basis, you can measure how many lines of code you and your team are producing. However, lines-of-code counts in and of themselves will not tell you anything about the quality of the code produced, whether the code solves the right problem, or whether the team is improving. As such, lines of code can be either a good heuristic or a bad one, depending on whether or not it's used properly. For example, zero lines of code may mean that your people are stuck, spending too much time in meetings, or just goofing off.
Consider this scenario: Yesterday my team of five produced 1,000 lines of code but today produced only 125. Is this meaningful? Maybe or maybe not. To be sure, I know that the team produced 87.5 percent fewer lines of code. They could have spent time doing other things. Perhaps today's problem was substantially more difficult than yesterday's. Or it could mean something else entirely. Who knows for sure? What the measurement does indicate is that you have a heuristic that, if included with other heuristics, is a starting point for measuring progress.
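The day-over-day comparison in this scenario is simple arithmetic. The following Python sketch works it through for the 1,000-line and 125-line days; the function name `percent_change` is hypothetical, and the per-person figures simply divide by the team size of five.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the change from previous to current as a percentage."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


TEAM_SIZE = 5
yesterday, today = 1000, 125

print(percent_change(yesterday, today))  # -87.5
print(yesterday / TEAM_SIZE)             # 200.0 lines per person
print(today / TEAM_SIZE)                 # 25.0 lines per person
```

The number says only how much the count moved. It says nothing about why it moved.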
The lines-of-code measurement is a stick in the sand that, when considered with other factors, is a starting point. It marks the beginning of your productivity tracking. If you use lines of code as the only determinant for whether to reward or punish your programmers, then you are using them incorrectly. However, if you use them as a starting point and evaluate how new lines relate to your schedule (I prefer daily and weekly milestones), they can be useful.
Look at your schedule. What meaningful work was due during a given interval? If the lines of code represent a completed problem for a given interval, the measurement is more meaningful. If the lines of code do not reflect a solution specifically designed to solve part of the overall problem, more investigation is needed. For example, if the lines represent a utility that really enhances productivity, you need to evaluate them in terms of how much of a productivity improvement the utility provides.
The next thing you need to know is whether the new code is refactored and whether it employs known design patterns. Consider the alternate scenario: Yesterday we did 125 lines and today we did 1,000 lines. If the lines of code are not refactored, but instead represent repetitive, copied-and-pasted code, we have no real improvement at all. Lines of code are the gateway to knowing all of these things, but they tell us nothing by themselves.
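One rough way to spot copied-and-pasted code is to check how many lines in the new code repeat exactly. The Python sketch below is a crude proxy under that assumption, not real duplicate detection: it misses pasted code that was edited afterward and flags harmless repeats such as closing statements. The function name `duplicate_line_ratio` and the sample are hypothetical.

```python
from collections import Counter


def duplicate_line_ratio(source: str) -> float:
    """Return the fraction of non-blank lines that repeat an earlier line."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    counts = Counter(lines)
    # Every occurrence after the first counts as a repeat.
    repeated = sum(n - 1 for n in counts.values() if n > 1)
    return repeated / len(lines)


# Hypothetical sample: five lines, two of which repeat earlier ones.
sample_code = """total = total + price(a)
total = total + price(b)
log(total)
total = total + price(a)
log(total)
"""

print(duplicate_line_ratio(sample_code))  # 0.4
```

A high ratio on a 1,000-line day is a prompt to read the code, not a verdict on it.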
Use What You Have Learned to Improve
Given the present scenario, you know your team has a certain number of lines of new code. The questions you need to answer are:
- Were more lines, fewer lines, or the same number of lines produced as the prior equivalent period?
- Do the lines of code solve a problem that is part of the scheduled solution? If not, why not?
- Is the change in productivity proportionate to the schedule? That is, if more meetings were scheduled or people were in training, does the difference match the amount of time spent not programming?
- Was the complexity of work greater or less than the previous period's complexity? New problems take longer than common problems.
- Is the quality of code better than the previous period's quality? Is the team using refactoring more consistently? Are known design patterns employed? Are known anti-patterns avoided? What is the change in code quality?
- Are programmers repeatedly handling a problem that a pattern, component, or code generator could reliably solve much more quickly?
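The schedule question in the list above can be made concrete by dividing new lines by the hours actually available for programming. The Python sketch below does that for two periods. The hour figures are hypothetical (a team of five at eight hours each, with 35 of today's 40 hours assumed lost to meetings and training), as are the names `Period` and `lines_per_coding_hour`.

```python
from dataclasses import dataclass


@dataclass
class Period:
    new_lines: int
    scheduled_hours: float   # total team hours in the period
    non_coding_hours: float  # meetings, training, and so on (assumed figures)

    def lines_per_coding_hour(self) -> float:
        """Return new lines divided by hours left for programming."""
        coding_hours = self.scheduled_hours - self.non_coding_hours
        if coding_hours <= 0:
            raise ValueError("no programming time was available")
        return self.new_lines / coding_hours


yesterday = Period(new_lines=1000, scheduled_hours=40, non_coding_hours=0)
today = Period(new_lines=125, scheduled_hours=40, non_coding_hours=35)

print(yesterday.lines_per_coding_hour())  # 25.0
print(today.lines_per_coding_hour())      # 25.0
```

Under these assumed hours the two days show the same rate, so the drop in raw lines is fully explained by the schedule. With different hours the same counts would point to something else worth investigating.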
If you are examining the new lines of code and not just counting them, you should be able to discern whether the code is good, refactored code or buggy, messy code.
Once you've gathered these measurements, then what? To find the answer, you need to answer more questions. If the code is not refactored, why isn't it? Do the programmers need more training? Do they need to know how to write refactored code? Is this practice being encouraged? Does the code use known patterns or reflect a high degree of code reuse? If not, why not? Are patterns and code reuse being encouraged or discouraged?
Lines of code do not answer any of these questions. They simply provide a starting point for isolating and then evaluating newly produced code. It is the answers to the questions above that will indicate where your team needs improvement.
In Part 2 of this series, I will demonstrate how to use VS.NET macros to measure the lines of code in a solution. You can use other tools, such as Visual SourceSafe (source control), to isolate new lines, and all of this is much easier to do with a project plan and a schedule.
Take the First Step
Lines of code are a heuristic that is akin to drawing a line in the sand. Unfortunately, lines of code are only the first step in a journey of a thousand miles. One must isolate newly produced code and evaluate it relative to the schedule, the complexity of the problem, and the qualitative aspects of the code. Not doing so means working in ignorance of measurable productivity and quality.
Paul Kimmel is the VB Today columnist, has written several books on .NET programming, and is a software architect. You may contact him at email@example.com if you need assistance or are interested in joining the Lansing Area .NET Users Group (glugnet.org).