
Sunday, November 3, 2019

Review: Clean Agile, by Robert C. Martin, and More Effective Agile, by Steve McConnell

This started out as a review of McConnell's book, but Just-In-Time, my pre-order of Uncle Bob's book arrived Friday. Ah, sweet serendipity! I read it yesterday, and it fits right in.

I have no idea what the two authors think of each other. I don't know if they're friends, enemies, or frenemies. I don't know if they shake their fists at each other or high-five. But as a software developer, I do believe they're both worth listening to.

I've read most of the books in Martin's Clean Code series. I'm a big fan. He was one of the original signatories of the Agile Manifesto.

A recent post by Phillip Johnston, CEO of Embedded Artistry, set me off on a path reading some of Steve McConnell's books and related material. I've become a big fan of his as well.

Week before last, I read McConnell's Software Estimation: Demystifying the Black Art, 2006. Last week, I read his new book More Effective Agile: A Roadmap for Software Leaders, which just came out in August and is the one I'm reviewing here.

This week, I'm reading his Code Complete: A Practical Handbook of Software Construction, 2nd edition, 2004, and Software Requirements, 3rd edition, 2013, by Karl Wiegers and Joy Beatty (or maybe over the next few weeks, since they total some 1500 pages). I note that in the Netflix documentary series "Inside Bill's Brain: Decoding Bill Gates", one of his friends says Gates reads 150 pages an hour; that's a superpower, and I am totally jealous!

These are areas where software engineering practice has continually run into problems.

The Critical Reading List

Martin's and McConnell's new books are excellent, to the point that I can add them as the other half of this absolutely critical reading list:
  • The Mythical Man-Month, Frederick P. Brooks, Jr.
  • Peopleware: Productive Projects and Teams, Tom DeMarco and Timothy Lister
  • Clean Agile: Back to Basics, Robert C. Martin
  • More Effective Agile: A Roadmap for Software Leaders, Steve McConnell
In fact, I would be so bold as to say that not reading these once you know about them constitutes professional negligence, whether you are an engineer, a manager, or an executive. If you deal with software development in any way, producer or consumer, you must read these.

Brooks' first edition outlined the problems in software engineering in 1975. Twenty years later, his second edition showed that we were still making the same mistakes.

There are a few items that are extremely dated and quaint. Read those for their historical perspective. But don't for a moment doubt the timely relevance of the rest of the book.

Brooks is the venerated old man of this field. Everybody quotes him, particularly Brooks' Law: "Adding manpower to a late software project makes it later."

Every 12 years after Brooks' first edition, DeMarco and Lister addressed the theme from a different perspective in their editions of Peopleware.

Forty-four years later, we are still making the same mistakes, just cloaked in the Agile name. So McConnell's new book addresses those issues in modern, supposedly Agile organizations, with suggestions about what to do about them.

Meanwhile, Martin's book returns us to the roots of Agile, literally back to the basics to reiterate and re-emphasize them. Because many of them have been lost in what Martin Fowler calls "the Agile Industrial Complex," the industry that has grown out of promoting Agile.

The first three books are easy reading. McConnell's is roughly equivalent to two of them put together. It also forms the root of a study tree of additional resources, outlining a very practical and pragmatic approach.

There are clearly some tensions and disagreements between the authors and the way things have developed. Martin goes so far as to include material with dissenting opinions in his book.

Don't just read these once. Re-read them at least once a year. Each time, different details will feel more relevant as you make progress.

Problems

The problems in the industry that have persisted for decades can be summarized as late projects, over budget, and poor software that doesn't do what it's supposed to do or just plain doesn't work.

Tied up in this are many details: poor understanding and management of requirements, woefully underestimated work, poor understanding of hidden complexities, poor testing, and poor people management.

Much of it is the result of applying the Taylor Scientific Management method to software development. Taylorism may work for a predictable production line of well-defined inputs, steps, and outputs, running at a repeatable rate, but it is a terrible model for software management. Software development is not a production line. There are far too many unknowns.

In general, most problems arise because companies practice the IMH software project management method: Insert Miracle Here. With Agile, they have adopted the IAMH variant: Insert Agile Miracle Here.

But as Brooks writes, there are no silver bullets. Relying on miracles is not an effective project management technique. This is a source of no end of misery for all involved with software.

As Sandro Mancuso, author of the Clean Code series book The Software Craftsman: Professionalism, Pragmatism, Pride (Yes! Read it!) writes in chapter 7 of Clean Agile, "Craftsmanship", "the original Agile ideas got distorted and simplified, arriving at companies as the promise of a process to deliver software faster." I.e. miracles.

A Pet Peeve (Insert Rant Here)

One of the areas of disagreement between various authors is the open-plan office. The original Agile concept was co-locating team members so that they could communicate immediately, directly, and informally, at higher bandwidth than through emails or heavy formal documents. It was meant to foster collaboration and remove impediments to effective communication.

Peopleware is extremely critical of the open-plan office, and I couldn't agree more. The prevailing implementation of it is clearly based more on the idea of cutting real-estate and office overhead costs than on encouraging productive communication. The result has all the charm of a cattle feedlot, with everyone getting their four square feet to exist in.

Another distortion of the Agile concepts embraced by management at the cost of actual effective development. That might make the CFO happy, but it's a false economy that should horrify the CTO.

Those capex savings can incur significant non-recurring engineering costs and create technical problems that will incur further downstream development and support costs. And that just means more opex for facilities where the engineering gets done, because the project takes longer.

You're paying me all this money to be productive and concentrate on complex problems, then you deliberately destroy my concentration to save on furniture and floorspace? It's like a real-life version of Kurt Vonnegut's short story Harrison Bergeron. What does that do to the product design and quality? What customer problems does it create, with attendant opportunity costs?

I turned down an excellent job offer in 2012 after the on-site interviews because of this. I was bludgeoned by my impression of the office environment: sweatshop. They probably thought of me as a prima donna.

McConnell also recommends against this, referencing the 2018 article It's Official: Open-Plan Offices Are Now the Dumbest Management Fad of All Time, which summarized the findings of a Harvard study on the topic. The practice appears to me to be the office-space equivalent of Taylorism.

Ok, now that I have all that off my chest, on to the actual reviews.

Clean Agile, Robert C. Martin

Martin's premise is that Agile has gotten muddled. He says it has gotten blurred through misinterpretation and usurpation.

His purpose is to set the record straight, "to be as pragmatic as possible, describing Agile without nonsense and in no uncertain terms."

He starts out with the history of Agile, how it came about, and provides an overview of what it does. He then goes on to cover the reasons for using it, the business practices, the team practices, the technical practices, and becoming Agile.

An important concept is the Iron Cross of project management: good, fast, cheap, done: pick any three. He says that in reality, each of these four attributes has a coefficient, and good management is about managing those coefficients rather than demanding they all be at 100%; that is the kind of management Agile strives to enable, by providing data.

The next concept is Ron Jeffries' Circle of Life: the diagram describing the practices of XP (eXtreme Programming). Martin chose XP for this book because he says it is the best defined, the most complete, and the least muddled of the Agile processes. He references Kent Beck's Extreme Programming Explained: Embrace Change (he prefers the original 2000 edition; my copy is due to arrive week after next).

The enumeration and description of the various practices surprised me, reinforcing his point that things have gotten muddled. While I was aware of them, I was not aware of their original meanings and intent.

The most mind-blowing moment was reading about acceptance tests, under the business practices. Acceptance tests have become a real hand-waver, "one of the least understood, least used, and most confused of all the Agile practices."

But as he describes them, they have the power to be amazing:
  • The business analysts specify the happy paths.
  • QA writes the tests for those cases early in the sprint, along with the unhappy paths (QA engineer walks into a bar; orders a beer; orders 9999 beers; orders NaN beers; orders a soda for Little Bobby Tables; etc.). Because you want your QA people to be devious and creative in showing how your code can be abused, so that you can prevent anyone else from doing it. You want Machiavelli running your QA group.
  • The tests define the targets that developers need to hit.
  • Developers work on their code, running the tests repeatedly, until the code passes them.
Holy crap! Holy crap! This ties actual business-defined requirements end-to-end through to the running code. It is a fractal-zoom-out-one-level application of Test Driven Development (and we all thought TDD was just for the developer-written unit tests!).

It completely changes the QA model. Then the unit and acceptance tests get incorporated into Continuous Build, under the team practices.
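To make the idea concrete, here's a minimal sketch of the shape such tests can take, written with CppUTest (the framework that comes up later in this post); Martin's book isn't prescribing this framework, and BarTab, order(), and itemCount() are invented names, purely for illustration:

    #include "CppUTest/TestHarness.h"

    // Minimal stand-in for the system under test, invented for illustration.
    class BarTab
    {
    public:
        bool order(const char* /*item*/, int quantity)
        {
            if (quantity < 1 || quantity > 99) return false;   // business rule
            count_ += quantity;
            return true;
        }
        int itemCount() const { return count_; }
    private:
        int count_ = 0;
    };

    TEST_GROUP(OrderingBeer)
    {
        BarTab tab;
    };

    // Happy path, specified by the business analysts.
    TEST(OrderingBeer, OrderingOneBeerAddsItToTheTab)
    {
        CHECK_TRUE(tab.order("beer", 1));
        LONGS_EQUAL(1, tab.itemCount());
    }

    // Unhappy path, contributed by QA.
    TEST(OrderingBeer, OrderingNineThousandBeersIsRejected)
    {
        CHECK_FALSE(tab.order("beer", 9999));
        LONGS_EQUAL(0, tab.itemCount());
    }

The point isn't the framework; it's that the business-specified behavior is encoded as executable tests before the code exists, and the developers run them until they pass.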

There are other important business practices that I believe are poorly understood, such as splitting and spikes. Splitting means splitting a complex story into smaller stories, as long as you maintain the INVEST guidelines:
  • Independent
  • Negotiable
  • Valuable
  • Estimable
  • Small
  • Testable
Splitting is important when you realize a story is more complex than originally thought, a common problem. Rather than trying to beat it into submission (or be beaten into submission by the attempt), break it apart and expose the complexity in manageable chunks.

I never knew just what a spike was. It's a meta-story, a story for estimating a story. It's called that because it's a long, thin slice through all the layers of the system. When you don't know how to estimate a story, you create a spike for the sole purpose of figuring that out.

Almost as mind-blowing is his discussion of the technical practices. Mind-blowing because much of this whole area has been all but ignored by most Agile implementations. Reintroducing them is one of the strengths of this book.

Martin has been talking about this for a while. He gave the talk in this video, Robert C. Martin - The Land that Scrum Forgot, at a 2011 conference (very watchable at 2x speed). The main gist is that Scrum covered the Agile management practices, but left out the Agile technical practices, yet they are fundamental to making the methodology succeed.

These are the XP practices:
  • Test-Driven Development (TDD), the double-entry bookkeeping of software development.
  • Refactoring.
  • Simple Design.
  • Pair Programming.
Of these, I would say TDD is perhaps the most-practiced. But all of these have been largely relegated to a dismissive labeling as something only the extremos do. Refactoring is seen as something you do separately when things get so bad that you're forced into it. Pair programming in particular is viewed as a non-starter.

I got my Scrum training in a group class taught by Jeff Sutherland, so pretty much from the horse's mouth. That was 5 years ago, so my memory is a bit faded, but I don't remember any of these practices being covered. I learned about sprints and stories and points, but not about these.

As Martin describes them, they are the individual daily practices that developers should incorporate into every story as they do them. Every story makes use of them in real-time, not in some kind of separate step.


Refactoring builds on the TDD cycle, recognizing that writing code that works is a separate dimension from writing code that is clean:
  1. Create a test that fails.
  2. Make the test pass.
  3. Clean up the code.
  4. Return to step 1.
Simple Design means "writing only the code that is required with a structure that keeps it simplest, smallest, and most expressive." It follows Kent Beck's rules:
  1. Pass all the tests.
  2. Reveal the intent (i.e. readability).
  3. Remove duplication.
  4. Decrease elements.
Pair programming is the one people find most radical and alarming. But as Martin points out, it's not an all-the-time 100% thing. It's an on-demand, as-needed practice that can take a variety of forms as the situation requires.

Who hasn't asked a coworker to look over some code with them to figure something out? Now expand that concept. It's the power of two-heads-are-better-than-one. Maybe trading the keyboard back and forth, maybe one person driving while the other talks. Sharing information, knowledge, and ideas in both directions, as well as reviewing code in real-time. There's some bang for the buck!

The final chapters cover becoming Agile, including some of the danger areas that get in the way, tools, coaching (pro and con), and Mancuso's chapter on craftsmanship, which reminds us that we do this kind of work because we love it. We are constantly striving to be better at it. I am a software developer. I want to be professional about it. This hearkens back to the roots of Agile.

More Effective Agile, Steve McConnell

McConnell has a very direct, pragmatic writing style. He is brutally honest about what works and what doesn't, and the practical realities and difficulties that organizations run into.

His main goal is addressing practical topics that businesses care about, but that are often neglected by Agile purists:
  • Common challenges in Agile implementation.
  • How to implement Agile in only part of the organization (because virtually every company will have parts that simply don't work that way, or will interact with external entities that don't).
  • Agile's support for predictability.
  • Use of Agile on geographically distributed teams.
  • Using Agile in regulated industries.
  • Using Agile on a variety of different types of software projects.
He focuses on techniques that have been proven to work over the past two decades. He generalizes non-Agile approaches as Sequential development, typically in some sort of phased form.

The book contains 23 chapters, organized into these 4 parts:
  • INTRODUCTION TO MORE EFFECTIVE AGILE
  • MORE EFFECTIVE TEAMS
  • MORE EFFECTIVE WORK
  • MORE EFFECTIVE ORGANIZATIONS
It includes a full bibliography and index.

Throughout, he uses the key principle of "Inspect and Adapt": inspect your organization for particular attributes, then adapt your process as necessary to improve those attributes.

Another important concept is that Agile is not one monolithic model that works identically for all organizations. It's not one-size-fits-all, because the full range of software projects covers a variety of situations. So the book covers the various ways organizations can tailor the practices to their needs. Probably to the horror of Agile purists.

Each chapter is organized as follows:
  • Discussion of key principles and details that support them. This includes problem areas and various options for dealing with them.
  • Suggested Leadership Actions
  • Additional Resources
The Suggested Leadership Actions are divided into recommended Inspect and Adapt lists. The Inspect items are specific things to examine in your organization. I suspect they would reveal some rude surprises. The Adapt items cover actions to take based on the issues revealed by inspection.

The Additional Resources list additional reading if you need to delve further into the topics covered.

One of the very useful concepts in the book is the "Agile Boundary". This draws the line between the portion of the organization that uses Agile techniques, and the portion that doesn't. Even if the software process is 100% Agile, the rest of the company may not be.

Misunderstanding the boundary can cause a variety of problems. But understanding it creates opportunities for selecting an appropriate set of practices. This is helpful for ensuring successful Agile implementation across a diverse range of projects.

A significant topic of discussion is the tension between "pure Agile" and the more Sequential methods that might be appropriate for a given organization at a given point in a project.

The Agile Boundary defines the interface where the methods meet, and which methods are appropriate on each side of it under given circumstances. Again, Agile is not a single monolithic method that can be applied identically to every single project. As he says, it's not a matter of "go full Agile or go home".

There's a lot of information to digest here, because it all needs to be taken in the context of your specific environment. The chapters that stand out to me based on my personal experience:
  • More Effective Agile Projects: keeping projects small and sprints short; using velocity-based planning (which means you need accurate velocity measurement), delivering in vertical slices, and managing technical debt; and structuring work to avoid burnout.
  • More Effective Agile Quality: minimizing the defect creation gap (i.e. finding and removing defects quickly, before they get out); creating and using a definition of done (DoD); maintaining a releasable level of quality at all times; reducing rework, which is typically not well accounted for.
  • More Effective Agile Testing: using automated tests created by the development team, including unit and acceptance tests, and monitoring code coverage.
  • More Effective Agile Requirements Creation: stories, product backlog, refining the backlog, creating and using a definition of ready (DoR).
  • More Effective Agile Requirements Prioritization: having an effective product owner, classifying stories by combined business value and development cost.
  • More Effective Agile Predictability: strict and loose predictability of cost, schedule, and feature set; dealing with the Agile Boundary.
  • More Effective Agile Adoptions.
Requirements are an interesting area, because they are often a source of problems. The Agile approach is to elicit just enough requirements up front to be able to size a story, then rely on more detailed elicitation and emergent design when working on the story.

But the problem I've seen with that is one of the classic issues in estimation. Management tends to treat those very rough initial estimates as commitments, not taking into account the fact that further refinement has been deferred. So downstream dependent commitments get made based on them.

The risk comes when further examination of the story reveals that there is more work hidden underneath than originally believed. I've seen this repeatedly. Then the whole chain of dependent commitments gets disrupted, creating chaos as the organization tries to cope.

For example, consumer-product embedded systems are very sensitive to this. The downstream dependent commitments involve hardware manufacturing and the retail pipeline, where products need to be pre-positioned to prepare for major sales cycles such as holidays.

The Christmas sales period means consumer products need to be in warehouses by mid-November at the latest. Both the hardware manufacturing facilities (and their supply chains) and the sales channels are Taylor-style systems, relying on bulk delivery and just-in-time techniques. They need predictability. That's your Agile Boundary right there, on two sides of the software project.

IoT products have fallen into the habit of relying on a day-1 OTA update after the consumer unboxes them, but that's risky. If the massive high-scale OTA of all the fielded devices runs into problems, it creates havoc for consumers, who are not going to be happy. That can have significant opportunity costs if it causes stalled revenue or returns, or some horribly expensive workaround for a failed OTA, not to mention the reputation effect on future sales.

What about commercial/industrial embedded systems? Cars, planes, factory equipment, where sales, installation, and operation are dependent on having the software ready. These can have huge ripple effects.

Online portal rollouts that gate real-world services are also sensitive to it. Martin uses the example of healthcare.gov. People need to have used the system successfully by a certain date in order to access real-world services, with life-safety consequences.

Both of these highlight the real-world deadlines that make business sense for picking software schedule dates. As software engineers, we can't just whine about arbitrary, unreasonable dates. There's a whole chain of dependencies that needs to be managed.

Schedule issues need to be surfaced and addressed as soon as possible, just like software bugs. The later in the process a software bug is identified, the more expensive it is to fix, sometimes by orders of magnitude. Dealing with schedule bugs is no different.

In his book on estimation, McConnell talks about the Cone of Uncertainty: estimates carry great uncertainty early in the project, which narrows over time as more information becomes available. Absolute certainty only comes after completion. But everybody behaves as if the certainty is much better much earlier.

It's clear from the variety of information in this book that Agile is not simply a template that can be laid down across any organization and be successful. It takes work to adapt it to the realities of each organization. There is no simple recipe for success. No silver bullets.

That's why it's necessary to re-read this periodically, because each time you'll be viewing it in the context of your organization's current realities. That's continuing the Inspect and Adapt concept.

Update Nov 10, 2019


My copy of Beck's Extreme Programming Explained arrived yesterday, and I've been reading through it. Here we see the benefits of going back to original sources, in this case on open plan offices. In Chapter 13, "Facilities Strategy", he says:
The best setup is an open bullpen, with little cubbies around the outside of the space. The team members can keep their personal items in these cubbies, go to them to make phone calls, and spend time at them when they don't want to be interrupted. The rest of the team needs to respect the "virtual" privacy of someone sitting in their cubby. Put the biggest, fastest development machines on tables in the middle of the space (cubbies might or might not contain machines).
So it appears what caught on was the group open bullpen part, and what has been left out was the personal space part (and its attendant value).

There's a continuous spectrum on which to interpret Beck's recommendation, with the typical modern open office representing one end (all open space, no private space), and individual offices representing the other (no open space, all private space).

There's a point on the spectrum where I would shift to liking it, if I had a private place to make my own where I could concentrate in relative quiet, with enough space to bring in a pairing partner.

Where I find the open office breaks down is the overall noise level from multiple separate conversations. It can be a near-constant distraction when I'm trying to work (hence the rampant proliferation of headphones in open offices).

Meanwhile, when I need to have a conversation with someone, I want to be able to do it without competing with all those others, and without disturbing those around me.

What seems to me to have the most practical benefit is optimizing space for two-person interactions, acoustically isolated from other two-person interactions. So individual workspaces with room for two to work together. That allows for individual time as well as the pairing method, from simple rubber-duck debugging to full keyboard and mouse back-and-forth.

Those are both high-value, high-quality times. That's the real value proposition for the company.

And in fact, that's precisely the kind of setup Beck says Ward Cunningham told him about.

Given that most developers now work on dedicated individual machines, through which they might be accessing virtualized cloud computing resources, the argument for a centralized bullpen with machines seems less compelling.

The open bullpen space seems to be less optimal, but still useful for times when more than two people might be involved.

This is clearly a philosophical difference from Beck's intent, but I think the costs of open plan offices as they've actually been adopted, stripped of the personal space he described, outweigh their benefits.

Meanwhile, his followup discussion in that chapter is fully in harmony with Peopleware's Part II: "The Office Environment".

Friday, November 23, 2018

Review: Web-Based Course Test-Driven Development For Embedded C/C++, James W. Grenning


Full disclosure: I was given a seat in this course by James Grenning.

I took James Grenning's 3-day web-based course Test-Driven Development for Embedded C/C++ September 4-6, 2018. It was organized as a live online delivery, 5 hours each day. The schedule worked out perfectly for me in Boston, starting at 9AM each morning, but he had attendees from as far west as California and as far east as Germany, spanning 9 time zones.

The participants ranged from junior level embedded developers to those with more than 20 years of experience. One worked in a fully MISRA-compliant environment. This was the first introduction to TDD for some of them.

The course was organized as blocks consisting of presentation and discussion, coding demo, then live coding exercises. It used CppUTest as the TDD framework.

The short answer: this is an outstanding course. It will change the way you work. I highly recommend it, well worth the investment in time and money. The remote delivery method worked great.

I had previously read Grenning's book, Test Driven Development for Embedded C, which I reviewed in August. I covered a lot of his technical information on TDD in the review, so I'll only touch on that briefly here. He covers the same material in the course presentation portions.

The course naturally has a lot of overlap with the book, so each can serve as a standalone resource. But I found the combination of the two to be extremely valuable. They complement each other well because the book provides room to delve more deeply into background information, while the course provides guided practice time with an expert.

Reading the book first meant I was fully grounded in the motivations and technical concepts of TDD, so I was ready for them when he covered them in the course. I was also already convinced of its value. What the live course brings to that part is the opportunity to ask questions and discuss things.

You can certainly take the course without first reading the book, which was the case for several of the participants.


Presentations

For the presentation portions, Grenning covered the issues with development using non-TDD practices, what he calls "debug-later programming" (DLP). This consists of a long phase of debug fire-fighting at the end of development, that often leaves bugs behind.

He introduced the TDD microcycle, comparing the physics of DLP to the physics of TDD. By physics he means the time domain, the time taken from injection of a bug (repeat after me: "I are a ingenuer, I make misteaks, I write bugs") to its removal. This is one of the most compelling arguments for adopting TDD. TDD significantly compresses that time frame.

He covered how to apply the process to embedded code and some of the design considerations. He also talked about the common concerns people have about TDD.

One quote from Kent Beck that I really liked:
TDD is a great excuse to think about the problem before you think about the solution.
He covered the concept of test fakes and the use of "spies" to capture information. He covered mocks as well, including mocking the actual hardware so that you can run your tests off-target.
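To illustrate the spy idea, here's a minimal sketch in the spirit of the course's light-controller exercise; the interface and names are my own invention for illustration, not Grenning's actual code:

    #include "CppUTest/TestHarness.h"

    // A spy does nothing real; it just records the last thing it was told to
    // do, so a test can verify behavior without any hardware. Names invented.
    struct LightControllerSpy
    {
        enum { LIGHT_ID_UNKNOWN = -1 };
        int  lastId = LIGHT_ID_UNKNOWN;
        bool lastOn = false;

        void turnOn(int id)  { lastId = id; lastOn = true; }
        void turnOff(int id) { lastId = id; lastOn = false; }
    };

    // Stand-in for the production code under test: when its scheduled time
    // arrives, it drives the controller it was given.
    void scheduler_wakeUp(LightControllerSpy& controller, int scheduledId)
    {
        controller.turnOn(scheduledId);
    }

    TEST_GROUP(LightScheduler)
    {
        LightControllerSpy spy;
    };

    TEST(LightScheduler, WakeUpTurnsOnTheScheduledLight)
    {
        scheduler_wakeUp(spy, 3);
        LONGS_EQUAL(3, spy.lastId);
        CHECK_TRUE(spy.lastOn);
    }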

He covered refactoring to keep tests easy to follow and maintain. He also covered refactoring of "legacy code" (i.e. any production code that was not built using TDD), including "code smells" and "code rot", using TDD to provide a safety harness. This included a great quote from Donald Knuth (bold emphasis mine):
Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

Coding Demos

Grenning performed two primary live coding demos. First, he used TDD to build a circular buffer, a common data structure in embedded systems. He used this to demonstrate the stepwise process of applying the TDD microcycle.
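For a flavor of that stepwise process, here is roughly what the first turn of the microcycle might look like for a circular buffer; this is my own sketch, not a transcript of the demo:

    // Step 1: write a small failing test first.
    #include "CppUTest/TestHarness.h"
    #include "CircularBuffer.h"   // doesn't exist yet; this test drives it into existence

    TEST_GROUP(CircularBuffer)
    {
    };

    TEST(CircularBuffer, IsEmptyAfterCreation)
    {
        CircularBuffer buffer(8);          // hypothetical capacity-8 buffer
        CHECK_TRUE(buffer.isEmpty());
    }

    // Step 2: write just enough production code to make this pass, e.g.
    //     bool CircularBuffer::isEmpty() const { return true; }
    // Step 3: refactor, then loop back with the next test (put/get, wrap-around, full...).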

Second, he performed a refactoring demo on a set of tests. He used this to show how to apply the refactoring steps to simplify the tests and make the test suite more readable and maintainable.

This was just as valuable as the TDD microcycle, because a clean test suite means it will live a long and useful life. Failing to refactor and keep it clean risks making it a one-off throwaway after getting its initial value.


Coding Exercises

Grenning uses Cyber-Dojo to conduct exercises (as well as his demos). This is a cloud-based, Linux VM, ready-to-use, code-build-run environment that allows each student to work individually, but he can monitor everyone's code as they work. This turned out to be one of the most valuable aspects of the course.

I should also mention that I had read Jeff Langr's book Modern C++ Programming with Test-Driven Development: Code Better, Sleep Better in between reading Grenning's book and taking this course. Langr puts a lot of emphasis on short, easily-readable tests, and that's something that also comes out in Grenning's class.

What was so valuable about doing these exercises in Cyber-Dojo is that Grenning was able to stop someone who was heading off in the wrong direction and quickly bring them back on track, or help them if they weren't sure what to do next. That fast feedback cycle is very much in tune with TDD itself. It works just as well as a teaching method.

So if someone started writing code without a test, or wrote too much code for what the test covered, or had too much duplication in tests, or had too much setup that could be factored out, he let them know and guided them back. In some cases he interrupted the exercise to go through someone's code on the screen with everybody watching, not to put them on the spot, but to cover the issues that we all run into.

That's critical because learning to truly work effectively in TDD style requires a reorientation of your thinking. We all have the coding habits of years that need to be overcome.

That doesn't happen automatically just because you read a book and have decided to follow it. It takes effort; half the effort is just noticing that you're straying from the intended path. That's the value of having a live instructor who can watch over your shoulder. It's like being an apprentice under the watchful eye of a master craftsman.

For me, this was ultimately the greatest value in the class. Having Grenning provide real-time guidance had an immediate effect on my coding, for both the test code and the production code. Whether it was talking about my mistakes or someone else's, I was able to immediately improve my work.

That made a huge difference between the test code I wrote before the class and the test code I wrote by the end of the class.

The coding exercises were building our own circular buffer, building a light controller spy, using TDD with the spy to implement a light scheduler, and implementing a flash chip driver. Note that these exercises are also covered in his book.

I also found that Cyber-Dojo made for an interesting example of pair programming, something I've never done before. Grenning provided initial files to work on, like a pair partner guiding you in the next step, then provided active feedback, like a partner asking questions and making suggestions: "Are you missing something there? What if you tried this? Wait, before you do that...".


The Big Lesson

The big lesson for me from this course was that it finally drove home that TDD is ALL ABOUT DEVELOPMENT! Sometimes I have to be clubbed over the head for something to really sink in, and that's what happened here.

We get so focused on the word "test" in TDD that we jump to the conclusion that it's just a test methodology. We emphasize test, as in TEST-Driven Development.

But really, the emphasis should be reversed, it's Test-Driven DEVELOPMENT. That means you apply design concepts and address the requirements of the product as you engage in a very active development thought process that is driven forward by tests.

Did you ever write some throwaway test code just so you could see how something worked, or to explore some design ideas? Hmmm, well TDD formalizes that.

The fact that you do end up with useful unit tests is almost a side effect of the process. An extremely valuable side effect, but a side effect nonetheless.

The real output of the process is working production code. That's what really matters. That's the real goal.

At some point on the last day of the course, I recognized the change in emphasis deep in my being. Maybe the difference is subtle, but it is critical.

That recognition first started to dawn after I read the book and applied it at work. I was amazed at the cleanliness of the resulting code. It was DRY and DAMP and SOLID, with no further refinement or debugging required.

Yes, I had a unit test suite. But look at the production code! It was breathtaking, right out of the chute. That was motivating.

It was in that receptive frame of mind that I did the coding exercises in the course. That was when the club hit. It was one of those moments of realization where you divide time into what came before, and what came after, the physical moment of grok, providing a whole new lens through which to perceive the work.

Savor that consideration for a moment.

People have been saying for years that TDD is about development, but we tend to focus on the test. Grenning emphasizes development when he talks about developing while "test-driving", meaning he is doing his development driven by tests. I guess it just takes time for the real implications to sink in.

One of Grenning's slides quotes Edsger Dijkstra:
Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with, and as a result, the programming process will become cheaper. If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.
While we all aspire to be like Dijkstra, this seems like a pipe dream. Until you realize that TDD does exactly that. It provides the shortest path to working software. I think he would have liked that.

Now that I've relegated the test aspect of this to second-class citizenship, let me bring it back to prominence.

The testing aspect approaches Dijkstra's ideal, because it finds bugs immediately as part of the code, build, test cycle. So the bugs are squashed before they've had time to scatter and hide in the dark corners. That reduces the dreaded unbounded post-development debug cycle to near zero.

If you don't let bugs get into the code, you won't have to spend time later removing them. Yeah, what Dijkstra said.

This doesn't guarantee bug-free code. There might still be bugs that occur during the integration of parts that are working (for example, one module uses feet, while another uses inches), or the code may not have addressed the requirements properly (the customer wanted a moving average of the last 5 data points, while the code uses the average of all data points), but as a functional unit, each module is internally consistent and working according to its API.

The resulting unit test suite is an extremely valuable resource, just as valuable as the production code. What makes it so valuable? Two things: safety harness, and example usage.

It provides a safety harness to allow you to do additional work on the code, then run the suite as a regression test to prove you haven't broken anything. Or to detect breakage so you can fix it immediately.

Using and extending the suite liberates you to make changes to the code safely. Need to add some functionality? Fix one of those integration or requirements bugs? Refactor for better performance or maintainability? Clean up some tech debt? Have at it.

You can instantly prove to yourself that you haven't screwed anything up, or show that you have, so that you can fix it before it ever gets committed to the codebase. No one will ever see that dirty laundry.

It provides example usage, showing how to use the API: how to call the various functions, in what order, how to set up and drive various behavioral scenarios, how to exercise the interfaces for different functional behaviors, how different parameters affect things, and how to interpret return values.

This is real, live code, showing how to use the production code as a client. You can even get creative and add exploratory tests that push the production code in odd directions to see what happens. Grenning calls these characterization tests and learning tests.

The test suite is actually something quite magical: self-updating documentation! Since you need to invest the time to maintain the tests in order to get the development done, you are also automatically updating the example usage documentation for free.

You might argue that tools like Doxygen offer similar self-updating capability, but they still require updating textual explanations along with the code. They are subject to the same staleness that can happen with any comments, where the comments (or Doxygen annotations) aren't kept up to date with code changes (see Tim Ottinger's Rules for Commenting for advice to help avoid stale comments).

But if you want to really know how to use the production code, go read the tests! If you've truly followed the TDD process as Grenning shows you in this course, they will tell you how to produce every bit of behavior that it is capable of, because every bit of behavior implemented will have been driven by the tests.

That's the full-circle, closed-loop feedback power of test-driven DEVELOPMENT.

Doxygen still has its place. I think of the Doxygen docs as API reference, while the test suite is API tutorial, showing actual usage.


Another Lesson

I've already alluded to the other interesting lesson that I drew from this course: it takes practice! We're not used to working like this, so it takes practice and self-awareness to learn how to do it.

That was particularly driven home by the coding exercises. Even though I had just read his book and followed through the exact same exercises, and read Langr's book, and applied the knowledge at work, I still had trouble getting rolling on the first couple of exercises. It was a matter of instilling the new habits.

It took a few times having Grenning redirect me (or listen to the advice he gave someone else). By the final exercise, after the benefit of his live feedback, I was able to catch myself in time and start applying the habits on my own.

It's still going to take some time. I'll know I've gotten there when I start thinking of the tests automatically as the first step of coding.


Third Time's A Charm

At one point in the discussion I mentioned that Grenning's book and this course represented my third attempt at using TDD, and one of the participants said he would be interested in hearing about my previous attempts.

My first attempt was in 2007, when I was introduced to TDD by a coworker. I read Kent Beck's Test Driven Development: By Example and used it to develop the playback control module for a large video-on-demand server intended for use in cable provider head ends.

This was both a great success and a classic failure. It was a great success in that it accelerated my work on the module, avoiding many bugs and shortening the debug cycle. In that respect it lived up to the promise of TDD completely.

It was a classic failure in that I made the tests far too brittle. I put too much internal knowledge of the module in them, with many internal details that were useful when I was first developing the module, but that became a severe impediment to ongoing maintenance.

The classic symptom of this problem was that a minor change in implementation would cause a cascade of test failures. The production code was fine, but some internal detail such as a counter value that was being checked by the tests had changed. Otherwise the test code itself was also fine. But I had overburdened it with details that should have been hidden by encapsulation.

The result was that ultimately I had to abandon the test suite. It had provided good initial value, but failed to deliver on-going value because it became a severe maintenance burden.

This is exactly the type of situation that Grenning's course seeks to prevent. During coding exercises, he watches out for cases of inappropriate information exposure. Thus another benefit of this is improved encapsulation and information hiding.
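In hindsight, the brittleness looked something like this (names invented for illustration, not the actual code): the first test pins an internal detail, so any harmless change to the implementation breaks it; the second checks only behavior visible through the public API.

    #include "CppUTest/TestHarness.h"

    // Minimal stand-in for the playback module, invented for illustration.
    class Player
    {
    public:
        void start() { playing_ = true; ++segments_; }
        bool isPlaying() const { return playing_; }
        int internalSegmentCounter() const { return segments_; }  // internal detail leaking out
    private:
        bool playing_ = false;
        int segments_ = 0;
    };

    TEST_GROUP(Playback)
    {
        Player player;
    };

    // Brittle: asserts on an implementation detail; a harmless internal change breaks it.
    TEST(Playback, StartIncrementsInternalSegmentCounter)
    {
        player.start();
        LONGS_EQUAL(1, player.internalSegmentCounter());
    }

    // Better: asserts only on behavior visible through the public API.
    TEST(Playback, StartMakesThePlayerReportPlaying)
    {
        player.start();
        CHECK_TRUE(player.isPlaying());
    }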

My second attempt was in 2013, when I wanted to refactor some of the code in an IP acceleration server as part of improvements to one of its features. I had read Michael Feathers' Working Effectively with Legacy Code, and found that many of the things he covered applied to the codebase I was working on.

This was a revenue-generating service product, so I needed to be sure I didn't break it.

The main strategy the book covers is to use TDD to provide that safety harness I mentioned above, in order to verify that the legacy code behaves the same after modification as it did before.

I began building a set of test fakes that could be used with Google Test. One issue was that the code relied heavily on the singleton pattern, so there always had to be some implementation of each class that would satisfy the linker. And of course there were chains of such dependencies interlocked in a web.

My first task was to replace that bit by bit with dependency injection. I focused just on the parts necessary to allow me to test the area I was modifying. Part of Feathers' strategy is to tackle just enough of the system at a time to be able to make progress, rather than a wholesale break-everything-down-and-rebuild approach.
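The transformation looked roughly like this (hypothetical class names, heavily simplified): instead of the class reaching out to a global singleton, the dependency is handed in through the constructor, so a test can hand in a fake.

    // Hypothetical collaborator types, simplified for illustration.
    struct Request {};
    struct ConnectionPool
    {
        static ConnectionPool& instance() { static ConnectionPool pool; return pool; }
        virtual void send(const Request&) {}
        virtual ~ConnectionPool() = default;
    };

    // Before: hard-wired to the singleton; a test cannot substitute anything.
    class AcceleratorBefore
    {
    public:
        void handleRequest(const Request& request)
        {
            ConnectionPool::instance().send(request);   // reaches out to a global
        }
    };

    // After: the dependency is injected, so a test can pass in a fake pool.
    class AcceleratorAfter
    {
    public:
        explicit AcceleratorAfter(ConnectionPool& pool) : pool_(pool) {}
        void handleRequest(const Request& request) { pool_.send(request); }
    private:
        ConnectionPool& pool_;
    };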

I had enough success with this that once I finished my primary work on the feature changes, I embarked on a background project to put the entire codebase into 100% dependency injection. That would allow me to build unit tests for any arbitrary component, in combination with any set of faked dependencies, with the longer-term goal of building out near-100% unit test coverage incrementally.

However, not too long after starting this, I ended up changing jobs. So once again I got the short-term benefit from TDD, but didn't reap the long-term benefit. It was a useful exercise to go through, though, providing good experience on how to migrate such a codebase to TDD.

This is another area that Grenning's course covers.


Related Links

For the perspective of another class participant, see Phillip Johnston's post What I Learned from James Grenning's Remote TDD Course.

There are things about the TDD process that make people suspicious. Is it just hacking? In this interview with Grenning, embedded systems expert Jack Ganssle raises some of those concerns. Grenning explains how the process works to reach the goal of well-designed, working production code that meets customer requirements.

Elecia and Christopher White have a great interview podcast with Grenning. Best joke: how many Scrum masters does it take to manage one developer? Also good Shakespeare and Bradbury quotes that are much ado about programming.

Tuesday, August 14, 2018

Review: Test Driven Development for Embedded C, James W. Grenning



The TL;DR:
  • Test Driven Development for Embedded C by James W. Grenning is an outstanding book. 
  • The title says C, but if you work in C, C++, C#, Go, Objective-C, Java, Javascript, or anything else, this is worth reading.
  • It says embedded, but if you work in embedded systems, front end web apps, mobile apps, desktop apps, backend servers, or anything else, this is worth reading.
  • And it's not just TDD, it's all the concepts that go into good design.
  • Get it, read it, USE it. You won't regret it.
Background

I first learned about XP (eXtreme Programming) concepts in 2007, when I was introduced to Kent Beck's Test-Driven Development: By Example. I used TDD (Test-Driven Development) to develop a major component on a server system. I learned more in 2013, when I read Michael Feathers' Working Effectively With Legacy Code. I used that to apply TDD to an existing server codebase.

Over the past 3 months, I've been on a reading binge, triggered by reading Robert C. Martin's 2017 book Clean Architecture: A Craftsman's Guide to Software Structure and Design. I have an hour-long commuter rail ride, so I have lots of time to read and work on my laptop, plus a little lunchtime reading, and I always have a book open at home.

I read his Clean Code: A Handbook of Agile Software Craftsmanship, The Clean Coder: A Code of Conduct for Professional Programmers, and am currently in the middle of his Agile Software Development, Principles, Patterns, and Practices.

I read Sandro Mancuso's The Software Craftsman: Professionalism, Pragmatism, Pride, and am in the middle of Mike Cohn's Agile Estimating and Planning, both from Martin's series.

I read Andrew Hunt and David Thomas' The Pragmatic Programmer: From Journeyman to Master, and am halfway through Pete McBreen's Software Craftsmanship: The New Imperative; Martin Fowler's Refactoring: Improving the Design of Existing Code is waiting on the shelf.

I've encountered bits and pieces of this material over the years, but this was a chance to go back to primary sources, get the full details and parts I've missed out on, and really understand them. I highly recommend it.

Review

But maybe you don't have time for all that. Maybe you'd like to cut to the chase and see how to apply their principles in practice.

Test Driven Development for Embedded C by James W. Grenning does that. It draws from many of those sources and more, showing you real-world examples to put them into practice.

Grenning is one of the original authors of the Agile Manifesto (as are Beck, Fowler, Hunt, Martin, and Thomas). He contributed the chapter "Clean Embedded Architecture" to Clean Architecture, and is the inventor of the Agile planning poker estimation method.

The book was published in 2011, so is now 7 years old, but it remains as timely as ever. That's especially true as IoT vastly expands the number of embedded systems that we rely on in our daily lives. Effective testing is critically important. For instance, see Testing Is How You Avoid Looking Stupid.

If you work on embedded systems in C, this is a must read.

If you work in a different language besides C, or on a different type of system than embedded systems, you may not think that a book on embedded C programming applies to you. But it's broadly applicable and worth reading.

The book is organized as an introductory chapter, the remaining chapters grouped into 3 parts, and appendices. I see it as three distinct portions, plus appendices: Chapter 1; Parts I and II (chapters 2-10); and Part III (chapters 11-15).

Throughout, Grenning addresses the common concerns people have with applying TDD to embedded systems. Embedded systems are a particular challenge, with particular target system constraints, so people might be skeptical.

This is a very hands-on, how-to book. I've included a number of lists from it, including those Grenning draws from other sources, because they illustrate the practical, pragmatic, disciplined approach. You can use this as a cheat sheet to remember them after you've read the book.

It might be tempting to think you can get by just with the information I've provided here and skip the book. But I've included it specifically with the hope that you'll realize you must read the book, and that it will be a worthwhile investment.

First Portion

This is the motivational portion, the appetizer. Grenning introduces TDD, its benefits in general, and the specific benefits for embedded systems.

He lists Kent Beck's TDD microcycle:
  1. Add a small test.
  2. Run all the tests and see the new one fail, maybe not even compile.
  3. Make the small changes needed to pass the test.
  4. Run all the tests and see the new one pass.
  5. Refactor to remove duplication and improve expressiveness.
The microcycle is critically important to the technique, so Grenning reminds you of it several times as he works through examples. This is what makes TDD effective, and I know from my own experience is also what makes it fun and extremely satisfying. He has a sidebar titled "Red-Green-Refactor and Pavlov's Programmer", which is very apt. That Pavlovian drive to take the next step in the cycle draws you into the zone and keeps you cranking.

For embedded systems, in addition to all the benefits that apply to other types of software, the primary benefits include being able to develop tested, working code when the target hardware isn't available; being able to test off-target (i.e. not on the target embedded system), where you have all the benefits of a general-purpose system and none of the constraints of an embedded one, including speed of development turnaround cycle; being able to isolate hardware/software interactions; and decoupling software from hardware dependencies.

That last point is part of the Big Lesson (see below) from all this. TDD in general, for any type of software, results in testable and tested software. But more than that, it drives development in a way that improves the design significantly.

That improved design means a much longer and happier life for the software and the systems that use it. They will be able to adapt to changes much more easily. It's not just about getting V1.0 done. It's about getting to V10.0.

In Software Craftsmanship, Pete McBreen starts off with the origin of the term software engineering. It was coined by a 1967 NATO study group working on "the problems of software." A 1968 NATO conference identified a software crisis and suggested that software engineering was the best way out of that crisis. They were concerned with very large defense systems. McBreen gives the example of the SAFEGUARD Ballistic Missile Defense System, developed from 1969 through 1975.

He says, "These really large projects are really systems engineering projects. They are combined hardware and software projects in which the hardware is being developed in conjunction with the software. A defining characteristic of this type of project is that initially the software developers have to wait for the hardware, and then by the end of the project the hardware people are waiting for the software. Software engineering grew up out of this paradox."

McBreen is questioning the value of that style of large-scale software engineering in the development of commercial products, suggesting that a different approach is needed.

But doesn't that situation sound familiar? Doesn't that sound like the problem embedded systems developers face all the time, that Grenning is addressing? This was a situation where TDD and off-target testing could have significantly alleviated the software crisis.

Granted, it was more complicated, since they were also developing the very processors and programming languages they would use, while modern systems rely on COTS (Commercial Off The Shelf) processors and languages. But we see that this has been a pervasive problem for some 50 years.

All types of systems, from embedded to frontend mobile apps to high-scale backend servers, in all those languages, from C to C++, Objective-C, Go, Java, Javascript, etc., can benefit.

All that code can be removed from its normal production environment and run off-target, off-platform, in a unit test environment that allows you to exercise every code path you want easily and quickly. That includes the obscure dark corners of the code trying to handle unusual error cases that are hard to produce on the target system.

For some of my own experience testing off-target, see Off-Target Testing And TDD For Embedded Systems.

Second Portion

This portion is the meat of the book, applying TDD to real-world embedded development and going through the mechanics with practical examples.

Following the lead of Martin's book, Grenning makes restrained use of UML diagrams. While some people dislike UML because they associate it with the heavyweight BDUF (Big Design Up Front) software engineering methodologies that McBreen was talking about, this is a very effective use of it that communicates information quickly. Which is the whole point of UML.

Grenning presents two unit test harnesses, Unity and CppUTest (of which he is one of the authors). All of the material applies just as well to other test harness tools, such as Google Test/Google Mock. It's equally applicable to other languages and their language-specific test harnesses.

He uses Gerard Meszaros' Four-Phase Test pattern to structure tests:
  • Setup: Establish the preconditions to the test.
  • Exercise: Do something to the system.
  • Verify: Check the expected outcome.
  • Cleanup: Return the system under test to its initial state after the test.
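Here's a minimal CppUTest sketch with the four phases labeled; the buffer is a simplified stand-in (no wrap-around), invented just to show the structure:

    #include "CppUTest/TestHarness.h"

    // Minimal stand-in for the code under test, invented for illustration.
    class CircularBuffer
    {
    public:
        explicit CircularBuffer(int capacity) : data_(new int[capacity]) {}
        ~CircularBuffer() { delete[] data_; }
        void put(int value) { data_[writeIndex_++] = value; }
        int get() { return data_[readIndex_++]; }
    private:
        int* data_;
        int writeIndex_ = 0;
        int readIndex_ = 0;
    };

    TEST_GROUP(CircularBufferFourPhase)
    {
        CircularBuffer* buffer;

        void setup()    { buffer = new CircularBuffer(8); }   // Setup: establish preconditions
        void teardown() { delete buffer; }                    // Cleanup: restore initial state
    };

    TEST(CircularBufferFourPhase, PutThenGetReturnsTheSameValue)
    {
        buffer->put(42);                   // Exercise: do something to the system
        LONGS_EQUAL(42, buffer->get());    // Verify: check the expected outcome
    }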
The rubber meets the road in his five examples of using TDD to develop embedded code:
  • LED driver
  • Light scheduler for a home automation system
  • Circular buffer
  • Flash driver for ST Microelectronics 16 Mb flash memory device
  • OS isolation layer (aka OSAL, OS Abstraction Layer) for Linux/POSIX, Micrium uC/OS-III, and Win32 (this is actually an appendix and only covers thread control, but establishes the pattern)
Clearly, these have real hardware dependencies on both the processor I/O interface and the attached devices, as well as the system clock, and real OS dependencies. Those are critical concerns for the embedded developer. The LED driver is very simple behavior, so makes for a gentle introduction. The others are more complex.

Grenning discusses driver requirements, then shows the initial tests and code. Notice I said tests first. That's an important concept in TDD. You always write the test first, that uses the code in the way that you want the code to work. Then you write the code that satisfies that usage. He emphasizes the save-make-run cycle that you do repeatedly during this process. Then you repeat for the next test and bit of code. That's how you make fast progress.

The key concept is faking out portions of the system, so that the Code Under Test (CUT) can run as if it was running on the real system. That's critical for making TDD work off-target and off-platform. There are several strategies for doing this. In the case of the LED driver, he uses virtual registers to simulate memory-mapped I/O. This is simply a variable under the control of the test suite.
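For example, the virtual register can be nothing more than a variable that the test owns and the driver writes to, standing in for the memory-mapped LED register. This is a sketch in the spirit of the book's LED driver example, with simplified names and the production and test code collapsed into one listing:

    #include <stdint.h>
    #include "CppUTest/TestHarness.h"

    /* Production code (normally in its own LedDriver files): the driver is
       given the address of the LED register at initialization time. */
    static uint16_t* ledsAddress;

    void LedDriver_Create(uint16_t* address)
    {
        ledsAddress = address;
        *ledsAddress = 0;                          /* all LEDs off at startup */
    }

    void LedDriver_TurnOn(int ledNumber)
    {
        *ledsAddress |= (uint16_t)(1u << (ledNumber - 1));
    }

    /* Test code: virtualLeds is an ordinary variable under the test's control,
       playing the role of the memory-mapped I/O register. */
    TEST_GROUP(LedDriver)
    {
        uint16_t virtualLeds;

        void setup() { LedDriver_Create(&virtualLeds); }
    };

    TEST(LedDriver, LedsAreOffAfterCreate)
    {
        LONGS_EQUAL(0, virtualLeds);
    }

    TEST(LedDriver, TurnOnLedOne)
    {
        LedDriver_TurnOn(1);
        LONGS_EQUAL(1, virtualLeds);
    }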

He also talks about test-driving the interface before test-driving the internals. That's another critical concept, integral to the whole design process. That's design-for-change. Because things will change. A product with a long, useful life, that represents an ongoing revenue stream for a company, will change over that time to adapt to changes in underlying technologies, user requirements, and usage. TDD means you can make changes without fear of breaking things (because you'll find and fix breakage as a result of performing the microcycle).

He talks about the strategy of incremental progress and refactoring as you go. This is in the heat of development. Final code does not flow directly from your fingertips. It evolves in incremental steps as you work. Did you ever look at someone's code and marvel at how clean and easy to follow it was, despite the complexity of the job it was achieving? You might think you could never do something that easily. This process results in that kind of code. Like a novelist in the heat of writing a scene, the first draft is never the final product, and the story arc evolves over time.

This is where he covers several important guidelines for driving the TDD process effectively. He lists Robert Martin's Three Laws of TDD:
  • Do not write production code unless it is to make a failing unit test pass.
  • Do not write more of a unit test than is sufficient to fail, and failing to compile counts as failing.
  • Do not write more production code than is sufficient to pass the one failing unit test.
He describes Kent Beck's snappy acronym DTSTTCPW: Do The Simplest Thing That Could Possibly Work, which initially means just faking it (for instance, hard code a function to return false in order to get the test that uses it to pass). Then keep tests small and focused, and refactor on green (many unit test setups show a failing result in red, and a passing result in green).

As this evolves, the faked out code turns into real code (the hard coded false is changed to actual code that does something and returns true or false under the appropriate conditions). That builds out a verified test suite as it builds out verified code.
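
A miniature illustration of that evolution, with hypothetical names rather than the book's code:

#include <cassert>

// First version, just enough to make the first test pass:
//   bool LightScheduler_IsEventDue(int now) { return false; }

// Later version, once a test schedules an event and expects true:
static int scheduledMinute = -1;
void LightScheduler_Schedule(int minute) { scheduledMinute = minute; }
bool LightScheduler_IsEventDue(int now)  { return now == scheduledMinute; }

int main()
{
    assert(!LightScheduler_IsEventDue(10));  // the test the hard-coded fake satisfied
    LightScheduler_Schedule(30);
    assert(LightScheduler_IsEventDue(30));   // the test that forced real code
    return 0;
}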

This leads to the TDD State Machine, which tells you what to do next. The guidelines above and the state machine take you through the mechanics of working in the TDD style. They answer the questions:
  • How should you start?
  • What should you do next?
  • How do you know when you're done?
Whenever you write some production code, ask yourself, "Do I have a test for that?" If not, stop, go back, and write the missing test.

He also covers Dave Thomas and Andrew Hunt's DRY principle: Don't Repeat Yourself. This mantra helps drive the refactoring so that you keep the code lean and clean. I'll throw in additionally the DAMP principle: use Descriptive And Meaningful Phrases, a concept the book applies without calling out by name. This favors readable function and variable names that express intent over cryptic abbreviations and syntax. The result is code that reads with a narrative flow.

Keeping your code DRY and DAMP makes it easy for others to understand and modify (which might be you when you come back to it six months or a year later). This is the same as Beck's microcycle step 5.

To some degree this all turns TDD into a very mechanistic process. But that's a good thing. It's not a random, ad hoc process where you're constantly questioning yourself about what to do. Instead it's an orderly stepwise process that makes effective progress. You quickly see and appreciate the value.

It's also very fun and satisfying, because that mechanistic aspect actually drives your creativity. What's the next thing you can add to it? What's the next test, the next bit of functionality? When you finish, you feel like you've accomplished something, and you have the evidence to prove it. It's addicting.

That leads to Grenning's Embedded TDD Cycle, which starts with TDD on the development system, then advances to the target processor and eval hardware, then the actual target hardware:
  • Stage 1: Write a unit test; make it pass; refactor. This is red-green-refactor, the TDD microcycle on the development platform.
  • Stage 2: Compile unit tests for target processor. This is a build check that verifies toolchain compatibility.
  • Stage 3: Run unit tests on the eval hardware or simulator.
  • Stage 4: Run unit tests on target hardware.
  • Stage 5: Run acceptance tests on target hardware. These are automated and manual tests of the integrated system.
This sequence gives you confidence in the code under test quickly; then you can address any hardware-dependent issues that arise, such as compiler, library, or primitive data type differences. Next you start exercising the hardware-dependent code.

Testing separately on eval hardware and actual target hardware helps shake out hardware issues in the actual target, since the eval hardware is presumably known good. One of the challenges in embedded development is always trying to determine if problems are due to the software or due to the hardware, since both are in active development and haven't had much soak time to prove them out.

For the other TDD examples, Grenning goes through a progression of different collaborator strategies. These are the test doubles, the fakes, that are substitutable for real components. They stand in for those components to break the test dependencies and allow you to simulate and monitor interactions. An important point is that they are much lighter weight than full-scale simulators. Full simulators can themselves require significant development. These fakes have only enough behavior to support the tests (part of the DTSTTCPW mindset).

He uses these types of doubles:
  • Spies
  • Stubs
  • Mocks
  • Exploding fakes
He goes through the following substitution methods, showing how to do them and discussing when they are appropriate:
  • Link-time substitution
  • Function pointer substitution
  • Preprocessor substitution
  • Combined link-time and function pointer substitution
These are fully-worked-out examples, although he starts omitting intermediate steps as he progresses in the interest of brevity. All the code is available online.
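
To give a flavor of the simplest of these, link-time substitution, here is a sketch in which the test build links a fake clock function in place of the production one. The OsTime and Watchdog names are my own invention; the book's worked examples are far more complete:

#include <cassert>
#include <cstdint>

// os_time.h (shared interface):  uint32_t OsTime_GetMilliseconds(void);
// os_time.c (production build):  reads the real OS/hardware tick counter.
// fake_os_time.cpp (test build): the double below, linked in place of production.

static uint32_t fakeNow = 0;
uint32_t OsTime_GetMilliseconds(void) { return fakeNow; }  // same signature as production
void FakeOsTime_Set(uint32_t ms)      { fakeNow = ms; }    // test-only back door

// Code under test: a hypothetical timeout check.
static uint32_t startedAt;
void Watchdog_Start(void)             { startedAt = OsTime_GetMilliseconds(); }
bool Watchdog_Expired(uint32_t limit) { return OsTime_GetMilliseconds() - startedAt >= limit; }

int main()
{
    FakeOsTime_Set(1000);
    Watchdog_Start();
    FakeOsTime_Set(1059);
    assert(!Watchdog_Expired(60));   // 59 ms elapsed, not expired
    FakeOsTime_Set(1060);
    assert(Watchdog_Expired(60));    // 60 ms elapsed, expired
    return 0;
}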

Third Portion

This portion completes the meal, complementing the meat in the second portion. It addresses design issues. This is important because design for testability also means design for flexibility and long product life.

Grenning starts out with Martin's SOLID principles:
  • S: Single Responsibility Principle (SRP)
  • O: Open Closed Principle (OCP)
  • L: Liskov Substitution Principle (LSP)
  • I: Interface Segregation Principle (ISP)
  • D: Dependency Inversion Principle (DIP)
He covers both how the previous chapters have incorporated these principles, and how to use them to guide the development process. TDD is closely intertwined with them.

Don't be put off by the apparent difference between non-object-oriented and object-oriented languages. The specific language used is irrelevant. The syntactic mechanics may be different, but the concerns and concepts are all the same. C can be every bit as object-oriented as Java; it just takes a little more developer discipline. That means that all of the concepts of the various principles above apply.

He uses the SOLID principles in four module design models of increasing complexity, applicable in different embedded system design cases:
  • Single-instance module: Encapsulates a module's internal state when only one instance of the module is needed.
  • Multiple-instance module: Encapsulates a module's internal state and lets you create multiple instances of the module's data.
  • Dynamic interface: Allows a module's interface functions to be assigned at runtime.
  • Per-type dynamic interface: Allows multiple types of modules with the same interface to have unique interface functions.
You'll probably recognize more than one of these in the systems you work on. You may also recognize object-oriented concepts, and in fact he shows how to implement, use, and test a C++ virtual function table (vtable) in C.
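
As a bare-bones taste of that technique (my own illustration, not the book's code), a struct of function pointers acts as the vtable, shared by every instance of a type:

// A hand-rolled "vtable" in C-style code (compiles as C or C++).
#include <stdio.h>

struct LightDriverInterface {
    void (*turnOn)(int id);
    void (*turnOff)(int id);
};

struct LightDriver {
    const struct LightDriverInterface *vtable;  // per-type dynamic interface
    int id;
};

static void x10TurnOn(int id)  { printf("X10 light %d on\n", id); }
static void x10TurnOff(int id) { printf("X10 light %d off\n", id); }

static const struct LightDriverInterface x10Interface = { x10TurnOn, x10TurnOff };

static void LightDriver_TurnOn(struct LightDriver *self) { self->vtable->turnOn(self->id); }

int main(void)
{
    struct LightDriver porch = { &x10Interface, 3 };
    LightDriver_TurnOn(&porch);   // dispatches through the vtable
    return 0;
}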

Part of good design is adapting to change. He covers Martin Fowler's concepts of refactoring, both the code smells that point to things that need to be refactored, and the strategies for doing it with TDD. He describes a disciplined stepwise process that avoids burning bridges.

This then leads into Michael Feathers' concepts of working on legacy code (which Feathers defines as "code without tests"). He lists Feathers' legacy code change algorithm:
  1. Identify change points.
  2. Find test points.
  3. Break dependencies.
  4. Write tests.
  5. Make changes and refactor.
He describes how to apply this to embedded systems. Two important types of unit tests during this process are characterization tests that establish how the legacy code behaves, and learning tests that help you learn how to work with third-party code.
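
A characterization test can be as blunt as calling the legacy code and pinning down whatever it does today. In this sketch, legacy_parse_flags is a made-up stand-in for real legacy code:

#include <cassert>

// Imagine this is untested legacy code whose exact behavior nobody remembers.
unsigned legacy_parse_flags(const char *s)
{
    unsigned flags = 0;
    for (; *s; ++s) {
        if (*s == 'r') flags |= 1;
        if (*s == 'w') flags |= 2;
    }
    return flags;
}

int main()
{
    // Not "what it should do" -- just what it observably does today.
    assert(legacy_parse_flags("rw") == 3);
    assert(legacy_parse_flags("wrw") == 3);   // duplicates are ignored (learned by running it)
    assert(legacy_parse_flags("x") == 0);     // unknown characters are silently dropped
    return 0;
}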

The final chapter covers test patterns and antipatterns. This is useful for helping to build good, effective unit tests that are maintainable over the long term.

The Big Lesson

For embedded systems, working with the specific hardware is a critical detail. But as Martin points out in Clean Architecture, it's just a detail. For GUI-based mobile, web, and desktop apps, the GUI is just a detail. For either of these, as well as backend servers, the OS (or lack thereof on a bare metal system) is just a detail. The network is just a detail. The database or the filesystem is just a detail. The frameworks or third-party packages are just details.

All of those details, critical though they may be, can be isolated and segregated from the code that defines what it is your system is about. That code is called the business logic, which sounds a little too dry for me. But it's the stuff that makes your system something that other people want to use. So it's the stuff that makes your system drive a meaningful business.

Your business logic interacts with all those details to make a functioning system. TDD allows you to test that logic, in all its happy, twisty, and unhappy paths, separated from its dependencies on the details. The details are represented by test doubles: dummies, stubs, spies, mocks, and fakes.

This is where the Gang of Four's concept of programming to an interface, not an implementation, stated in their book Design Patterns, comes into play. You write your business logic to work to an interface to accomplish the detail interactions. In the production environment, you use the real detail components, the real implementations, with a thin adaptation layer that conforms to the interfaces.

In the test environment you can substitute test doubles that conform to the interfaces; these are alternate implementations. Since you're in control of the test doubles, you can drive any scenario you need to in order to exercise the business logic.

That isolation also allows you to substitute in other versions of production details, so it's a design strategy, not just a testability strategy. Maybe you want to use some different hardware in your embedded system, or run your app on a different mobile device with a different GUI, or deploy the system on a different OS, or use a different database.

By defining your details as abstract data types or abstract services, you can drop in replacements, with just the effort of implementing the interface layers.

Tuesday, April 17, 2018

More C+-

In The Case For C+-, I talked about writing quick tools in a simple C style, but taking advantage of the C++ standard library, primarily the dynamic data structures. It ends up being C++ without any (or just a few) user-defined classes, so is something of a lightweight object-oriented approach (yes, yes, I'm sure OO purists are barfing at the thought). The main benefit is fast coding.

There I showed as an example the msgresolve tool, which I used to resolve messages logged by an IOT device (the client) and its server. This is a lot of string processing and cross-indexing, with logs containing potentially thousands or tens of thousands of messages.

Shortly after I had completed msgresolve, I needed to have a tool to help me sift through large text files of server logs, logging the TCP connections made by clients and their subsequent activity. I was chasing down a problem where some of the connections were shutting down sooner than expected.

I wasn't sure what was causing the early shutdowns, and wasn't even sure initially which connections had experienced it, so I wanted to be able to gather all the lines for a given connection and list them out for tracing through, for each connection.

That would help me identify the ones that were live at the end of the log sample vs. the ones that had ended early. The log entries for hundreds of connections were all intermixed.

Armed with the methods I had used in msgresolve.cpp, conceptual design was easy. I wanted an ordered list of connections, and associated with each one, the sequential list of log entries associated with the connection.

There were also connections with some internal addresses I wanted to ignore. I could have done this filtering with grep, but it was easy enough to build the capability into the program so that it could stand alone. That also helped me explore some additional string processing functions.

Given that architecture, the data structure I needed was a std::map that mapped a string (the connection identification) to a std::list of strings (the log lines for the connection).

I had the program working in less than an hour. Then I spent at least another hour screwing around with the timestamps in the log entries, figuring out how to process them and deciding what to do with them. Then a little more time on refactoring and cleanup.

Throughout, I used a sample log file that had entries for several connections, including addresses I wanted to skip. I used that as a simple unit test to exercise the code paths.

The resulting code provided the impetus for a simple generalized string processing module, which I'll cover in another post. But you can see some clear patterns emerging in this code.

Doing quick tools like this is fun and very satisfying. It makes your serotonin flow. You have a problem you need to deal with, so you sit down and spew a bunch of code in a short time, refine it, and use the results.

This is actually quite different from long-term product development. That kind of work has its intense coding phases, but once the initial version of the product is out, a lot of the work is much smaller surgical changes.

Even fitting a major new feature in often involves many small bits of code scattered throughout the larger code base, integrating the tendrils. Getting that to work has a different kind of satisfaction.

Design Choices

These tools also give you a chance to think about different approaches. You can balance the variables of memory consumption, CPU consumption, I/O consumption, time, and code complexity (that is, ease of writing and maintaining the code, and compiled code space consumption, not algorithmic complexity) for a given situation.

For instance, the log files I was dealing with had over a million lines of data, some 200MB worth covering hundreds of connections.

That meant I had several choices:
  1. I could load all the data into memory and then print it out in an orderly manner. This is a single-pass solution that consumes large amounts of memory.
  2. I could scan the file once, identifying all the individual connections, then for each connection, scan the file from beginning to end to read and print its lines. This is a multi-pass solution that requires little memory but significant file I/O.
  3. I could scan the file once, and for each identified connection, track the file position of the first and last line, then for each connection, just scan that range of the file. This is a multi-pass solution that reduces the total file I/O for a negligible increase in memory.
  4. I could do the same thing, but instead of tracking just the first and last line file positions, build a list of the file position and length of each line, then on each pass, just skip directly to the locations of the lines. This is still multi-pass, but significantly reduces the total file I/O because it only visits each file position twice, requiring a bit more complexity and a bit more memory.
The decision on which choice to use is system-dependent. If memory is cheap and plentiful, and file I/O is relatively expensive, either in terms of time or charges to transfer data over a data link (maybe the data is remote, accessed over a cellular link), then the single-pass solution is best, choice #1.

On a small-memory system, a multi-pass solution is better, and you just have to live with the extra I/O. In that case, choice #4, which is the most complicated code, has the best compromise of low memory and low I/O consumption.

Although if you're really pressed for code space, the simplest multipass solution that iterates over the entire file for each connection is the better choice, #2.

Realistically though, you don't run tools such as this locally on small systems. Where possible, you offload the data to a larger system and run the tools offline.

In this case, I'm running on a Mac with 16GB of memory. Slurping up a 200MB text file and holding everything in memory is nothing. So the single-pass solution is the way to go.
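
For contrast, had memory been tight, choice #4 would have looked roughly like this: record each connection line's file offset on the first pass, then seek straight back to those offsets on the second. This is only a sketch, not the code I actually wrote; the connection-ID extraction is deliberately loose and line lengths aren't tracked since fgets rereads each line:

#include <cstdio>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Very loose stand-in for the real connection-ID parsing: grab whitespace
// field 5 (the SERVER_CONN position used in logsplit.cpp below), else "".
static std::string connField(const char* line)
{
    char buf[1000];
    std::strncpy(buf, line, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    int field = 0;
    for (char* tok = std::strtok(buf, " \t\n"); tok != NULL;
         tok = std::strtok(NULL, " \t\n"), ++field) {
        if (field == 5) return tok;
    }
    return "";
}

int main(int argc, char* argv[])
{
    if (argc < 2) return 1;
    std::FILE* file = std::fopen(argv[1], "r");
    if (!file) return 1;

    // Pass one: remember where each connection's lines start in the file.
    std::map<std::string, std::vector<long> > offsets;
    char line[1000];
    long pos = std::ftell(file);
    while (std::fgets(line, sizeof(line), file) != NULL) {
        std::string key = connField(line);
        if (!key.empty()) offsets[key].push_back(pos);
        pos = std::ftell(file);
    }

    // Pass two: seek straight back to each connection's lines and print them.
    for (std::map<std::string, std::vector<long> >::iterator conn = offsets.begin();
         conn != offsets.end(); ++conn) {
        std::printf("Connection %s, %lu lines:\n",
                    conn->first.c_str(), (unsigned long)conn->second.size());
        for (size_t i = 0; i < conn->second.size(); ++i) {
            std::fseek(file, conn->second[i], SEEK_SET);
            if (std::fgets(line, sizeof(line), file) != NULL) std::fputs(line, stdout);
        }
    }
    std::fclose(file);
    return 0;
}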

Loading The Data

The log file may just be a chronological sample of all the activity logged by the server, so some connections will already exist at the start of the log, and some will remain at the end. Meanwhile, connection starts and terminations will appear in the log.

The program parses the lines from the log to find connection lines (i.e. some activity for a given connection). It identifies them by looking for a connection ID, which consists of an IPv4 address/port pair (a remote socket ID), and may optionally include a hexadecimal client ID.

It uses just the socket ID as the connection ID, which is the key to the connection map. When it finds a new connection ID, it adds it to the map with an empty list. For each connection line, it finds the connection map entry and appends the line to the list of lines for that connection.

As it loads connection lines, it filters them against the set of IP addresses to skip (these skip addresses are due to logging of other types of connections besides the client connections). That helps reduce the noise from a large log file.

Printing The Data

The program prints the connections by first iterating through the map and printing out a summary of each one: the connection ID, the number of lines, the duration of the connection data found in the log, and how its lifetime relates to the overall log. Since the map is an ordered structure, printing connections is always in sorted order by connection ID (though string sorted, not numeric sorted).

Then it makes a second pass through the connection map, iterating through each line for each connection and printing it out, with separators between connections. It prints a header and summary line before and after the activity lines for each connection, showing where the connection starts and ends relative to the start and end of the log.

As a last quick change, I added a threshold time value to clearly identify connections that ended at least 60 seconds before the end of the log. This would be a good candidate for a command-line parameter to override the default threshold.

All the output lines have unique greppable features or keywords so you can use other tools for additional postprocessing or extraction. For instance, I could grep out the end summary line of each connection, and maybe the last couple of activity lines before it, to see how each connection ended up. I could use the "threshold exceeded" indication to identify the ones that had ended early.

Some Design Evolution

This program adds the isMember() function, which determines whether a string is a member of a set of strings. Since my usage here was intended to deal with a small set and I had other functions that had similar iterative structure, in the heat of battle I quickly coded it as a linear search of a vector of strings.

That worked fine here, but as I pulled a bunch of this code out into a general string processing module, I realized that was a bad choice, because it's an O(N) search.

That became especially bad when I wanted an overload that took a vector of strings and determined if they were all members. That meant an O(M) repetition of O(N) searches: an O(M*N) or effectively O(N^2) algorithm.

That gets out of hand fast as M and N get larger. Meanwhile, the std::unordered_set is perfect for this, an O(1) algorithm for single searches, and an O(M) algorithm when repeated for M items.
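
For comparison, a set-based version is barely more code. This is only a sketch of the direction the string processing module took, using C++11's std::unordered_set:

// Membership checks backed by std::unordered_set: O(1) per lookup on average.
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

typedef std::unordered_set<std::string> StringSet;

bool isMember(const std::string& str, const StringSet& set)
{
    return set.find(str) != set.end();
}

// The overload that checks whether every string is a member: O(M), not O(M*N).
bool isMember(const std::vector<std::string>& strs, const StringSet& set)
{
    for (size_t i = 0; i < strs.size(); ++i) {
        if (!isMember(strs[i], set)) return false;
    }
    return true;
}

int main()
{
    StringSet skipIps;
    skipIps.insert("127.0.0.1");
    skipIps.insert("3.3.3.3");
    assert(isMember("127.0.0.1", skipIps));
    assert(!isMember("10.1.180.206", skipIps));
    return 0;
}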

I've left the original isMember() implementation here as an example of the evolution of a concept as you generalize it for other uses.

I also threw in a few overloads that I didn't end up using, but that set the stage for running with the concept in the string processing module. More discussion of that in the post containing the module.

The Result

This program turned out to be another fun exercise in string processing once I had built the basic support functions and could see the problem in those terms. It felt more like working in Python, and in fact just as the dict structure from msgresolve.cpp was inspired by Python, so are the split() and join() functions here.

The funny thing is that it took doing stuff in Python to make me see this approach. That points out one of the advantages of working in multiple different languages: you start seeing opportunities to apply some of the common idioms from one language in another language.

Here's logsplit.cpp:

// Usage: logsplit <serverLog>
//
// Splits a server log file by IPv4 connection. Prints a
// summary list of the connections, then the log lines for
// each separate connection.
//
// This is an example of a C++ program that is written mostly
// in plain C style, but that makes use of the container and
// composition classes in the C++ standard library. It is a
// lightweight use of C++ with no user-defined classes.
//
// 2018 Steve Branam <sdbranam@gmail.com> learntocode

#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
#include <list>
#include <map>
#include <cctype>   // isdigit, isxdigit
#include <cstdio>   // std::fopen, std::fgets, std::fclose
#include <cstring>  // strcpy, strncpy, std::strtok
#include <cstdlib>  // EXIT_SUCCESS, EXIT_FAILURE
#include <ctime>    // std::tm, mktime, difftime

enum ARGS
{
    ARGS_PROGNAME,
    ARGS_SERVER_LOG,
    ARGS_REQUIRED,
    ARGS_SKIP = ARGS_REQUIRED
};

enum SERVER
{
    SERVER_DATE,
    SERVER_TIME,
    SERVER_THREAD,
    SERVER_SEVERITY,
    SERVER_FUNC,
    SERVER_CONN,
    SERVER_TIME_LEN = 16,
    SERVER_TIMESTAMP_LEN = 28
};

enum CONN
{
    CONN_IP,
    CONN_PORT,
    CONN_CLIENT_ID
};

enum
{
    END_TIME_THRESHOLD = 60
};

typedef std::string String;
typedef std::vector<String> StringVec;
typedef std::list<String> StringList;
typedef std::map<String, StringList> ConnMap;
typedef std::pair<String, StringList> ConnMapEntry;

const char *timeFormat = "%Y-%m-%d %H:%M:%S";
StringVec skipIps;
size_t lines = 0;
size_t skipped = 0;
String firstTimestamp;
String lastTimestamp;
ConnMap connections;

StringVec split(const String& str, const char* delim)
{
    char buffer[str.size() + 1];
    StringVec strings;

    strcpy(buffer, str.c_str());

    char *token = std::strtok(buffer, delim);
    while (token != NULL) {
        strings.push_back(token);
        token = std::strtok(NULL, delim);
    }
    
    return strings;
}

String join(const StringVec& strings, const String& sep,
            size_t start = 0, size_t end = 0)
{
    String str;

    if (!end) {
        end = strings.size();
    }
    for (size_t i = start; i < end; ++i) {
        str.append(strings[i]);
        if (i + 1 < end) {
            str.append(sep);
        }
    }
    return str;
}

bool isMember(const String&str, const StringVec& set)
{
    for (size_t i = 0; i < set.size(); ++i) {
        if (str == set[i]) {
            return true;
        }
    }

    return false;
}

typedef int (*CharMatch)(int c);

bool isToken(const String& token, CharMatch isMatch)
{
    if (token.empty()) {
        return false;
    }
    else {
        for (size_t i = 0; i < token.size(); ++i)
        {
            if (!isMatch(token[i])) {
                return false;
            }
        }
    }
    return true;
}

bool isToken(const StringVec& tokens, CharMatch isMatch,
             size_t start = 0, size_t end = 0)
{
    if (!end) {
        end = tokens.size();
    }
    for (size_t i = start; i < end; ++i) {
        if (!isToken(tokens[i], isMatch)) {
            return false;
        }
    }
    return true;
}

bool isNumeric(const String& token)
{
    return isToken(token, isdigit);
}

bool isHex(const String& token)
{
    return isToken(token, isxdigit);
}

bool isNumeric(const StringVec& tokens,
               size_t start = 0, size_t end = 0)
{
    return isToken(tokens, isdigit, start, end);
}

bool isIpv4Address(const String& str)
{
    StringVec tokens(split(str, "."));

    return ((tokens.size() == 4) &&
            isNumeric(tokens));
}

bool isIpv4Port(const String& str)
{
    return ((str.size() <= 5) &&
            isNumeric(str));
}

bool isIpv4Socket(const StringVec& strings)
{
    return ((strings.size() >= 2) &&
            isIpv4Address(strings[0]) &&
            isIpv4Port(strings[1]));
}

time_t getTime(const String& strTime, const char* format)
{
    std::tm t = {};
    std::istringstream ss(strTime);
    ss >> std::get_time(&t, format);
    return mktime(&t);
}

time_t getTime(const String& field)
{
    // Skip opening and closing brackets.
    return getTime(field.substr(1, SERVER_TIMESTAMP_LEN - 2),
                   timeFormat);
}

size_t getDuration(const time_t& start, const time_t& stop)
{
    size_t seconds(difftime(stop, start));
    return seconds;
}

size_t getDuration(const String& start, const String& stop)
{
    return getDuration(getTime(start), getTime(stop));
}

size_t getDuration(const time_t& start, const String& stop)
{
    return getDuration(start, getTime(stop));
}

size_t getDuration(const String& start, const time_t& stop)
{
    return getDuration(getTime(start), stop);
}

bool isServerTime(const String& str)
{
    if (str.size() == SERVER_TIME_LEN) {
        for (size_t i = 0; i < str.size(); ++i)
        {
            if (!isdigit(str[i]) &&
                (str[i] != ':') &&
                (str[i] != '.') &&
                (str[i] != ']')) {
                return false;
            }
        }
        return true;
    }
    return false;
}

bool isConnId(const String& str)
{
    StringVec fields(split(str, ":"));

    return (isIpv4Socket(fields) &&
            (fields.size() < CONN_CLIENT_ID + 1 ||
             isHex(fields[CONN_CLIENT_ID])));
}

bool isServerConn(const StringVec& fields)
{
    return ((fields.size() > SERVER_CONN) &&
            isServerTime(fields[SERVER_TIME]) &&
            isConnId(fields[SERVER_CONN]));
}

bool loadServer(const char* fileName)
{
    FILE* file = std::fopen(fileName, "r");
    
    if (file) {
        char buffer[1000];
        while (std::fgets(buffer, sizeof(buffer), file) != NULL) {
            String line(buffer);
            StringVec fields = split(buffer, " \t");

            if (isServerConn(fields)) {
                ++lines;
                lastTimestamp = line.substr(0, SERVER_TIMESTAMP_LEN);
                if (firstTimestamp.empty()) {
                    firstTimestamp = lastTimestamp;
                }
                
                strncpy(buffer, fields[SERVER_CONN].c_str(),
                        sizeof(buffer));
                StringVec conn = split(buffer, ":");

                if (isMember(conn[CONN_IP], skipIps)) {
                    ++skipped;
                }
                else {
                    String key(conn[CONN_IP]);
                    key.append(":");
                    key.append(conn[CONN_PORT]);

                    ConnMap::iterator match;
                    match = connections.find(key);
                    if (match == connections.end()) {
                        connections.insert(ConnMapEntry(key,
                                           StringList()));
                        match = connections.find(key);
                    }
                    match->second.push_back(line);
                }
            }
        }
        std::fclose(file);
        if (connections.empty()) {
            std::cout << "No connections found" << std::endl;
            return false;
        }
        return true;
    }
    std::cout << "Failed to open server file"
              << fileName << std::endl;
    return false;
}

void printSeparator()
{
    std::cout << std::endl
              << "=-=-=-=-" << std::endl
              << std::endl;
}

void listConnections()
{
    std::cout << connections.size() << " connections "
              << firstTimestamp << "-" << lastTimestamp << " "
              << lines << " lines, "
              << getDuration(firstTimestamp, lastTimestamp) << " sec:"
              << std::endl;
    if (skipIps.size()) {
        std::cout << "(skipped " << skipped << " connections with "
                  << join(skipIps, ", ") << ")" << std::endl;
    }
    std::cout << std::endl;
    for (ConnMap::iterator curConn = connections.begin();
         curConn != connections.end();
         curConn++) {        
        String conn(curConn->first);
        StringList connLogs(curConn->second);
        std::cout << conn << "\t"
                  << connLogs.front().substr(0, SERVER_TIMESTAMP_LEN)
                  << "-" << connLogs.back().substr(0, SERVER_TIMESTAMP_LEN)
                  << " " << connLogs.size() << " lines, "
                  << getDuration(connLogs.front(), connLogs.back())
                  << " sec" << std::endl;
    }
    printSeparator();
}

void logConnections()
{
    time_t timeFirst = getTime(firstTimestamp);
    time_t timeLast = getTime(lastTimestamp);
    
    for (ConnMap::iterator curConn = connections.begin();
         curConn != connections.end();
         curConn++) {        
        String conn(curConn->first);
        StringList connLogs(curConn->second);
        size_t duration(getDuration(connLogs.front(), connLogs.back()));
        
        std::cout << "Connection " << conn
                  << " " << connLogs.front().substr(0, SERVER_TIMESTAMP_LEN)
                  << "-" << connLogs.back().substr(0, SERVER_TIMESTAMP_LEN)
                  << " " << connLogs.size() << " lines, "
                  << duration << " sec:" << std::endl << std::endl;

        size_t seconds = getDuration(timeFirst, connLogs.front());
        std::cout << firstTimestamp << " Starts " << seconds
                  << " sec after start of log." << std::endl;

        for (StringList::iterator curLog = connLogs.begin();
             curLog != connLogs.end();
             curLog++) {        
            std::cout << *curLog;
        }

        seconds = getDuration(connLogs.back(), timeLast);
        std::cout << lastTimestamp
                  << " " << connLogs.size() << " lines, "
                  << duration << " sec. Ends "
                  << seconds << " sec before end of log.";
        if (seconds > END_TIME_THRESHOLD) {
            std::cout << " Exceeds threshold.";
        }
        std::cout << std::endl;
        printSeparator();
    }
}

int main(int argc, char* argv[])
{
    if (argc < ARGS_REQUIRED ||
        String(argv[1]) == "-h") {
        std::cout << "Usage: " << argv[ARGS_PROGNAME]
                  << " <serverLog> [<skipIps>]" << std::endl;
        std::cout << "Where <skipIps> is comma-separated list "
                  << "of IP addresses to skip." << std::endl;
        return EXIT_FAILURE;
    }
    else {
        if (argc > ARGS_REQUIRED) {
            skipIps = split(argv[ARGS_SKIP], ",");
        }
        
        if (loadServer(argv[ARGS_SERVER_LOG])) {
            listConnections();
            logConnections();
        } else {
            return EXIT_FAILURE;
        }
    }
    return EXIT_SUCCESS;
}

The sample log file, conns.log:

[2018-04-03 13:16:29.469659] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket receive_buffer_size=117708
[2018-04-03 13:16:29.469678] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket send_buffer_size=43520
[2018-04-03 13:16:29.867381] [0x00007fb3ff129700] [debug]   set_idle_send_timeout() 10.1.180.206:30450 set idle send timeout to 60 seconds
[2018-04-03 13:16:29.867394] [0x00007fb3ff129700] [debug]   set_idle_receive_timeout() 10.1.180.206:30450 set idle receive timeout to 120 seconds
[2018-04-03 13:16:29.867450] [0x00007fb3ff92a700] [info]    handle_connected() 10.1.180.206:30450 remote connected [2423/8951]
[2018-04-03 13:16:29.959877] [0x00007fb3ff129700] [debug]   install_connection() 10.1.180.206:30450:00003bd4 linkType 0
[2018-04-03 13:16:29.966599] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x4a2137a6, 231 bytes
[2018-04-03 13:16:29.966935] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xa11f878a, 35 bytes, msg type 3
[2018-04-03 13:17:29.967117] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x1c35386e, 29 bytes, msg type 1
[2018-04-03 13:17:29.967228] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:17:30.086722] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:30.086813] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:40.086722] [0x00007fb3ff129700] [debug]   receive() 10.2.80.206:3050:00000bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:40.086813] [0x00007fb3ff129700] [debug]   send() 10.2.80.206:3050:00000bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:30.139377] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xa0f8a8e1, 78 bytes
[2018-04-03 13:18:29.867494] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 127.0.0.1:32450:00003bd4 60 seconds
[2018-04-03 13:18:29.967315] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:18:30.086988] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x8fcf53f6, 29 bytes, msg type 1
[2018-04-03 13:18:30.087101] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:18:30.197029] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x515f1d4e, 29 bytes
[2018-04-03 13:18:30.197120] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x6827d190, 29 bytes, msg type 17
[2018-04-03 13:18:30.249027] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xda66c5c7, 78 bytes
[2018-04-03 13:19:30.087189] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:19:30.139486] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 10.1.180.206:30450:00003bd4 60 seconds
[2018-04-03 13:19:30.197322] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x812afb16, 29 bytes, msg type 1

Sample run:

$ g++ logsplit.cpp -o logsplit
$ ./logsplit conns.log 127.0.0.1,3.3.3.3
2 connections [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 25 lines, 181 sec:
(skipped 1 connections with 127.0.0.1, 3.3.3.3)

10.1.180.206:30450 [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 22 lines, 181 sec
10.2.80.206:3050 [2018-04-03 13:17:40.086722]-[2018-04-03 13:17:40.086813] 2 lines, 0 sec

=-=-=-=-

Connection 10.1.180.206:30450 [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 22 lines, 181 sec:

[2018-04-03 13:16:29.469659] Starts 0 sec after start of log.
[2018-04-03 13:16:29.469659] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket receive_buffer_size=117708
[2018-04-03 13:16:29.469678] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket send_buffer_size=43520
[2018-04-03 13:16:29.867381] [0x00007fb3ff129700] [debug]   set_idle_send_timeout() 10.1.180.206:30450 set idle send timeout to 60 seconds
[2018-04-03 13:16:29.867394] [0x00007fb3ff129700] [debug]   set_idle_receive_timeout() 10.1.180.206:30450 set idle receive timeout to 120 seconds
[2018-04-03 13:16:29.867450] [0x00007fb3ff92a700] [info]    handle_connected() 10.1.180.206:30450 remote connected [2423/8951]
[2018-04-03 13:16:29.959877] [0x00007fb3ff129700] [debug]   install_connection() 10.1.180.206:30450:00003bd4 linkType 0
[2018-04-03 13:16:29.966599] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x4a2137a6, 231 bytes
[2018-04-03 13:16:29.966935] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xa11f878a, 35 bytes, msg type 3
[2018-04-03 13:17:29.967117] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x1c35386e, 29 bytes, msg type 1
[2018-04-03 13:17:29.967228] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:17:30.086722] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:30.086813] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:30.139377] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xa0f8a8e1, 78 bytes
[2018-04-03 13:18:29.967315] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:18:30.086988] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x8fcf53f6, 29 bytes, msg type 1
[2018-04-03 13:18:30.087101] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:18:30.197029] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x515f1d4e, 29 bytes
[2018-04-03 13:18:30.197120] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x6827d190, 29 bytes, msg type 17
[2018-04-03 13:18:30.249027] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xda66c5c7, 78 bytes
[2018-04-03 13:19:30.087189] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:19:30.139486] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 10.1.180.206:30450:00003bd4 60 seconds
[2018-04-03 13:19:30.197322] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x812afb16, 29 bytes, msg type 1
[2018-04-03 13:19:30.197322] 22 lines, 181 sec. Ends 0 sec before end of log.

=-=-=-=-

Connection 10.2.80.206:3050 [2018-04-03 13:17:40.086722]-[2018-04-03 13:17:40.086813] 2 lines, 0 sec:

[2018-04-03 13:16:29.469659] Starts 71 sec after start of log.
[2018-04-03 13:17:40.086722] [0x00007fb3ff129700] [debug]   receive() 10.2.80.206:3050:00000bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:40.086813] [0x00007fb3ff129700] [debug]   send() 10.2.80.206:3050:00000bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:19:30.197322] 2 lines, 0 sec. Ends 110 sec before end of log. Exceeds threshold.

=-=-=-=-