I recently read Joel Spolsky's Joel on Software after reading Eric Sink's Eric Sink on the Business of Software; I highly recommend both books. Each collects funny and informative essays from the author's blog on the business of software development. Spolsky's discussion of the McDonald's theory of software development is spot on. Spolsky says that, from McDonald's training university all the way to the exacting, uniform delivery of what really has become a bland (albeit loved) product, McDonald's has detailed exactly how to create a uniform product across independently owned franchises. The software equivalent would be an exactingly specified development methodology that in theory could produce software in a predictable and uniform way. Spolsky says it can't and it won't. He's right.
The theory is that if senior developers with a track record of success write down in exhaustive detail how to build software, anyone can follow that methodology and get a uniform result every time. This theory is inherently wrong. Every software project is vastly different from every other and requires so many intangible decisions that ultimately lead to success or failure that it is impossible to create a methodology map that ensures reliable, predictable, and manageable success every time. On every project, it comes down to how the practitioners on that project choose to spend their time on the tangible and intangible elements of the project. Hence, a single methodology dogmatically adhered to simply becomes an albatross around the neck of those forced to follow its tenets.
Disclaimer: Any misunderstanding of Joel Spolsky's meaning or intent is my own.
Geometry, applied in the context of software development project estimation, refers to the size, shape, and relationships among all of the elements of a software project, including those practices whose direct results are tangibles and those time slots that are essentially intangible but necessary in some way. Many estimators get some of the tangible parts of estimation correct, but it is the intangible aspects that are often overlooked and result in delays, disappointments, and undelivered software.
Tangible elements (often called deliverables) can include project plans, requirements, detailed designs, the software, and time spent testing, debugging, documenting, and launching software. Many people know about these elements and some include them in estimations. Sadly, some do not include all of them. Intangibles such as water cooler time, illness, bad requirements, wrong requirements, changed requirements, unforeseen hard bugs in your code or the tool provider's code, learning, white board time, unit tests, arguments, meetings, distractions, failed or replaced hardware, absenteeism, car accidents, sustainable productivity levels (getting in and staying in "the zone"), reading FoxNews.com, checking out research.microsoft.com, and more are all too often overlooked, dismissed, or denied. But these elements are as likely to happen as coding. They chew away at the schedule, willingly or not, and they are what is most often left out of a project schedule. To account for the tangibles, methodologies like RUP were invented. To account for intangibles, methodologies like XP and Scrum were invented. But, again, there is no substitute for having people who have demonstrated an ability to deliver.
Tip: Software projects that succeed often have a single-minded determination and focus on the number one deliverable, the software.
It is the tangible and intangible elements of software development that make up every single project's geometry. Thus, you can break all of the known tangible items down to minute tasks and try to add them, but this will not lead you to a singularly reliable estimate. For how are they added together, consecutively or concurrently? A Gantt chart may suggest the formation, and consequently a starting point and ending point, but sadly Gantt charts can only show what is known at a point in time, and almost every Gantt chart (and project schedule) I have seen has left out the intangible elements of the estimate. And, if you include the intangibles in the estimate, it is likely to lead to the termination of the estimating-slacker. Failure to include intangibles in the project geometry is ignoring the humanity in the humans doing the work.
Unfortunately, believe it or not, it is all of the ways you spend your time that will dictate how long it takes to deliver the software. Few if any of the tangible elements should be negotiated out of the schedule, and no amount of bullying, threatening, or carrot-and-stick approaches will mitigate the intangible elements of software-time in any sustainable or predictable way. This is why software projects are still very unpredictable.
As an alternative approach, one can try complex algorithms such as CoCoMo II, invented and detailed by Dr. Barry Boehm (see CoCoMo on Wikipedia or read Dr. Boehm's seminal book for more information). However, I have found CoCoMo estimates to suggest that successful projects I have completed should have required as much as two times the amount of time and five times the number of resources (people) as the project actually took. Again, the practitioners will ultimately decide the outcome, but tools and point-in-time guesstimates [sic] can be useful and insightful starting points as long as these are made by the practitioners actually tasked to do the work.
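To give a feel for what an algorithmic estimate looks like, here is a minimal sketch. Note the swap: it uses the older Basic COCOMO model from 1981 in organic mode, not COCOMO II, because the basic model fits in a few lines. The 65 KLOC input is only an illustrative size, and the function name is hypothetical.

```python
# Basic COCOMO (1981), organic mode. These are the coefficients of the
# original basic model, NOT the COCOMO II calibration the article mentions.
A, B = 2.4, 1.05   # effort:   person-months = A * KLOC ** B
C, D = 2.5, 0.38   # schedule: months        = C * effort ** D


def basic_cocomo(kloc: float) -> tuple[float, float, float]:
    """Return (effort in person-months, schedule in months, average staff)."""
    effort = A * kloc ** B
    months = C * effort ** D
    return effort, months, effort / months


# 65 KLOC is an illustrative project size, not a calibrated input.
effort, months, staff = basic_cocomo(65.0)
print(f"{effort:.0f} person-months over {months:.1f} months, "
      f"about {staff:.1f} people")
```

A model like this knows nothing about who is doing the work, which is one reason its output is best treated as a starting point rather than a schedule.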
The Process I Use
I can unabashedly tell you that, after twenty years, I have a way that I estimate projects based on dozens of projects that I participated in or observed closely. I can also tell you that some of these elements have met consistently with open or barely contained hostility. You have been cautioned.
Every manager needs to be able to predict outcomes and determine resources. Unfortunately, at the onset of a project this is almost impossible to do in any meaningful way. Too little is known about a project (of any size or complexity) at its onset to permit these estimates to be reliable. This is especially true if the estimates come from above. Manager deadlines are uniformly the worst kinds of guesses, and they are almost always taken to be schedule-by-mandate estimates (what programmers refer to as death marches [Yourdon, Death March]).
Note: If there is a live-or-die quality to a schedule (as in people will live or die or companies will live or die, and these should be very rare), a merciless willingness to trim unnecessary features and a do-the-hard-stuff-first approach can work. This should never, ever, be routine practice, and people know the difference. However, in such a circumstance you need a superhero and said superhero should be richly rewarded.
How do I handle estimates? I want to do them, always. At a project's inception, I guess at how complex a problem is. Then, I compare it to projects I have done before. Next, I chunk the project into macro-sized elements and estimate these. The chunks include tangible parts of every project, such as designing, programming, and testing. Then, I determine the design and implementation chunks based on a loose assemblage of all of the pieces, like building a database, writing stored procedures, coding, unit testing, GUI design, more testing, sanity testing of requirements, and deployment. Finally, I add up all the bits and multiply by 2.6. That last part, if spoken out loud, will get you into trouble. If ignored, it will surely lead to a very late project and probably get you into one kind of trouble or another.
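The arithmetic of that last step can be sketched in a few lines. The chunk names and hour figures below are hypothetical placeholders; only the 2.6 multiplier comes from the text, and your own multiplier may differ.

```python
# Hypothetical chunks and hours, for illustration only.
SCALAR = 2.6  # the author's multiplier; yours may be more or less


def padded_estimate(chunks: dict[str, float], scalar: float = SCALAR) -> float:
    """Add up the raw chunk estimates and apply the scalar."""
    return sum(chunks.values()) * scalar


chunks = {
    "database": 40.0,
    "stored procedures": 30.0,
    "coding": 120.0,
    "unit testing": 40.0,
    "GUI design": 30.0,
    "deployment": 20.0,
}

raw = sum(chunks.values())
total = padded_estimate(chunks)
print(f"raw: {raw:.0f} hours, padded: {total:.0f} hours")
```

Keeping the raw sum and the scalar separate makes it easy to revisit the multiplier later, once estimates have been compared against actual times.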
Your actual multiplier may be more or less, but here are the factors that are always present:
- Every team overestimates their capability. It's true. (If any person on your team says "I have never failed" and it's a death march, put that person in charge and sell popcorn at the firing squad.)
- Productivity is not uniformly sustainable at eight hours per day. Only people in third-world countries can chop off arms and put them in tar; in the real world, hovering over and bullying someone just kills the will to live of someone's baby.
- Everyone always underestimates how long debugging and testing will take, but debugging, testing, and fixing always take a significant time. I plan for 1.5 times the coding time. (Overestimating is the opposite of adding too much salt to food. Salt you can't take out, but time you can put back. This never happens.)
- The intangibles add up and take the willing and unwilling along for the ride, unwillingly.
- And, change is a constant and bugs are ever present.
In short, accounting for a third of every day as actual forward movement is the only reliable bet. The most important thing is that developers have to be given enough time to do a good job. My third-of-the-day productivity number is not an arbitrary number, and it does not leave room for loafing. Every bit of the total estimate, including my 2.6 scalar, will be needed just to get the job done.
Note: Last year, one other programmer and I wrote 65,000 lines of code in production in eight months. This year, I wrote 18,500 lines in three months. This is hand-written code, and these are darn good numbers, so my personal scalar is not based on slow coding. It is based on 20 years of my experience of the intangible.
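Two of the figures above, 1.5 times the coding time for debugging and testing and a third of every day as forward movement, can be turned into a rough calendar calculation. This is a sketch under one loud assumption: that "day" means an eight-hour workday. The function names and the 80-hour input are hypothetical.

```python
import math
from fractions import Fraction

DEBUG_TEST_FACTOR = Fraction(3, 2)    # author's figure: 1.5 times the coding time
PRODUCTIVE_FRACTION = Fraction(1, 3)  # author's figure: a third of every day
WORKDAY_HOURS = 8                     # ASSUMPTION: "day" means an eight-hour workday


def debug_and_test_hours(coding_hours):
    """Hours to budget for debugging, testing, and fixing."""
    return Fraction(coding_hours) * DEBUG_TEST_FACTOR


def working_days(effort_hours):
    """Whole working days needed when only part of each day moves forward."""
    productive_hours_per_day = WORKDAY_HOURS * PRODUCTIVE_FRACTION
    return math.ceil(Fraction(effort_hours) / productive_hours_per_day)


coding = 80  # hypothetical coding estimate, in hours
testing = debug_and_test_hours(coding)
days = working_days(coding + testing)
print(f"coding {coding} h + testing {float(testing):.0f} h -> {days} working days")
```

Exact fractions are used instead of floats so that rounding up to whole days is not thrown off by floating-point error.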
Here is the rub. A conscientious manager will never ever give you 2.6 times your original estimate up front. An insightful manager knows instinctively that some padding is needed but will likely provide a significantly smaller number. A very insightful manager, however, will let the practitioners come up with their own estimates and trust that a genuine effort will be made to meet them. The very insightful, experienced manager will anticipate that the estimates will still be off to some extent. The predicament is that trust takes a long time to build, and teams are so fluid that successful practitioners often move on to higher-paying jobs just as that trust has been established, while unsuccessful ones have often been terminated; hence, the trust-building must begin anew.
An Estimating Process We Can Sell
To tackle the dynamics of trust and transition, you have to have an estimating process you can sell. This process has to begin establishing trust on both sides early. This process must include sufficient time to ultimately succeed. The process must not depend on superheroes—single-handed miracles—as these are not sustainable or reliable, and the estimates must come from the practitioners.
Here is such a process from the perspective of a practitioner:
- Let three of your most reliable developers independently provide an initial estimate.
- Take the average of all three estimates with the caveat that a finite amount of time be permitted for the initial estimate.
- Know that the first averaged-estimate must be re-estimated after all of the requirements are complete and a rudimentary design has been defined.
- Anticipate change and the unknown.
- Continually track initial estimates against actual times during the project, noting whose estimates need adjusting, and encourage everyone to account for these adjustments in future estimates within this project. (Don't punish here. Encourage.)
- If the schedule is the most important thing, be aggressive about trimming features, while encouraging practitioners to build the most important features first and stick to actual requirements. (Written requirements are harder to fudge around with "features".)
- Finally, finish the project no matter what it takes and wrap up with a post mortem to see how everyone did relative to their estimates and use this knowledge on the next project to improve the estimating process.
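The averaging and tracking steps in the list above can be sketched as follows. The developer names, hour figures, and function names are hypothetical; the ratio of actual to estimated time is one simple way to show whose estimates need adjusting, not a prescribed metric.

```python
from statistics import mean


def initial_estimate(estimates):
    """Average the independent estimates from the three developers."""
    return mean(estimates)


def adjustment_ratios(tasks):
    """Per-developer ratio of actual to estimated hours over finished tasks.

    A ratio above 1.0 means that developer tends to underestimate.
    """
    estimated, actual = {}, {}
    for dev, est_hours, act_hours in tasks:
        estimated[dev] = estimated.get(dev, 0.0) + est_hours
        actual[dev] = actual.get(dev, 0.0) + act_hours
    return {dev: actual[dev] / estimated[dev] for dev in estimated}


# Three independent initial estimates, in hours (hypothetical).
first_pass = initial_estimate([300, 420, 360])

# (developer, estimated hours, actual hours) for completed tasks (hypothetical).
tasks = [
    ("Ana", 10, 15),
    ("Ana", 20, 30),
    ("Ben", 16, 16),
    ("Cy", 8, 12),
]
ratios = adjustment_ratios(tasks)
print(first_pass, ratios)
```

Carrying the ratios from one project into the next gives the baseline information that survives even when part of the team changes.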
The implication here is that the current project estimates may be significantly different from the actual results, but if your best estimators are used on future projects, the estimating reliability will improve unless your whole team changes at once. However, even if your whole team changes at once, you will still have some baseline information, which is better than starting from scratch.
Estimating has to go on throughout a project's lifecycle, and tracking estimates to actual results has to span projects.
Managers should never estimate. Any manager six months out of a meaningful project will lose the ability to estimate any better than even the greenest of newbies, unless the manager is hiring chuckleheads. (In that case, his or her estimates will still be bad.) More importantly, when a manager estimates or overrides the developers' estimates, he is signaling that he doesn't trust his developers' estimates (or he thinks they are chuckleheads). Finally, manager estimates are mandates that lead to burned-out superheroes and death marches. Managers are trying to obtain aggressive resource allocations to look good for their bosses, their constituency. Practitioners are trying to build good software for the customer, the real constituency. The software is the only goal that counts right now. Ultimately, everyone will be happy when projects succeed, money comes in, and estimates become more reliable.
Anticipate that more time will be needed for the intangibles than the tangibles, but that most conscientious practitioners are asking for time to do a good job and would gladly beat their own estimates. Arbitrarily huge over-estimates only come from sub-contractors padding their sales figures, but a practice of having fine-grained milestones and watching milestones closely and tracking estimates to actuals will tell you whether you have good contractors or not. Everyone knows what to do with the latter.
Finally, tracking estimates to actuals continually in the spirit of collaboratively improving (rather than punishing) will help everyone get better at estimating. In the absence of empirical numbers now, trust those who have successfully delivered in the past and start tracking estimates now. (To managers: If you don't know how or why your projects are succeeding, you have a problem, not the worst kind of problem but a problem nonetheless.)
If your software gets done, your customers are happy, and your best programmers haven't left in frustration, say a prayer of thanks but start tracking estimates.
About the Author
Paul Kimmel is the VB Today columnist for www.codeguru.com and has written several books on object-oriented programming and .NET. Look for his upcoming book LINQ Unleashed for C# from Sams. Paul is a software architect for Tri-State Hospital Supply Corporation. You may contact him for technology questions at firstname.lastname@example.org.
If you are interested in joining or sponsoring a .NET Users Group, check out www.glugnet.org. Glugnet has two groups in Michigan, Glugnet in East Lansing and Glugnet-Flint (in Flint, of course) and some of the best sponsors (and some of the best companies) in the free world. If you are interested in attending, check out the www.glugnet.org web site for updates or contact me.
Copyright © 2007 by Paul T. Kimmel. All Rights Reserved.