Flink And Blink

New Location For Blog Posts

2021-08-06T15:36:00.001-07:00

I'm blogging these days at EmbeddedRelated.com. See my latest posts.

Thanks for reading, and I'll see you there!

Steve

Review: Clean Agile, by Robert C. Martin, and More Effective Agile, by Steve McConnell

2019-11-03T17:15:00.001-08:00

This started out as a review of McConnell's book, but, just in time, my pre-order of Uncle Bob's book arrived Friday. Ah, sweet serendipity! I read it yesterday, and it fits right in.

I have no idea what the two authors think of each other. I don't know if they're friends, enemies, or frenemies. I don't know if they shake their fists at each other or high-five. But as a software developer, I do believe they're both worth listening to.

I've read most of the books in Martin's Clean Code series. I'm a big fan. He was one of the original signatories of the Agile Manifesto.

A recent post by Phillip Johnston, CEO of Embedded Artistry, set me off on a path reading some of Steve McConnell's books and related material. I've become a big fan of his as well.

Week before last, I read McConnell's Software Estimation: Demystifying the Black Art, 2006. Last week, I read his new book More Effective Agile: A Roadmap for Software Leaders, which just came out in August and is the one I'm reviewing here.

This week, I'm reading his Code Complete: A Practical Handbook of Software Construction, 2nd edition, 2004, and Software Requirements, 3rd edition, 2013, by Karl Wiegers and Joy Beatty (or maybe over the next few weeks, since they total some 1500 pages; I note that in the Netflix documentary series "Inside Bill's Brain: Decoding Bill Gates", one of his friends says Gates reads 150 pages an hour; that's a superpower, and I am totally jealous!).

These are areas where software engineering practice has continually run into problems.

The Critical Reading List

Martin's and McConnell's new books are excellent, to the point that I can add them as the other half of this absolutely critical reading list:

The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd edition), 1995, by Frederick P. Brooks, Jr.
Peopleware: Productive Projects and Teams, 3rd edition, 2013, by Tom DeMarco and Timothy Lister.
Clean Agile: Back to Basics, 2019, by Robert C. Martin.
More Effective Agile: A Roadmap for Software Leaders, 2019, by Steve McConnell.

In fact, I would be so bold as to say that not reading these once you know about them constitutes professional negligence, whether you are an engineer, a manager, or an executive. If you deal with software development in any way, producer or consumer, you must read these.

Brooks' first edition outlined the problems in software engineering in 1975. Twenty years later, his second edition showed that we were still making the same mistakes.

There are a few items that are extremely dated and quaint. Read those for their historical perspective. But don't for a moment doubt the timely relevance of the rest of the book.

Brooks is the venerated old man of this. Everybody quotes him, particularly Brooks' Law: Adding human resources to a late software project makes it later.

Starting 12 years after Brooks' first edition, and roughly every 12 years since, DeMarco and Lister addressed the theme from a different perspective in their editions of Peopleware.

Forty-four years after, we are still making the same mistakes, just cloaked in the Agile name. So McConnell's new book addresses those issues in modern supposedly Agile organizations, with suggestions about what to do about them.

Meanwhile, Martin's book returns us to the roots of Agile, literally back to the basics to reiterate and re-emphasize them. Because many of them have been lost in what Martin Fowler calls "the Agile Industrial Complex," the industry that has grown out of promoting Agile.

The first three books are easy reading. McConnell's is roughly equivalent to two of them put together. It also forms the root of a study tree of additional resources, outlining a very practical and pragmatic approach.

There are clearly some tensions and disagreements between the authors and the way things have developed. Martin goes so far as to include material with dissenting opinions in his book.

Don't just read these once. Re-read them at least once a year. Each time, different details will feel more relevant as you make progress.

Problems

The problems in the industry that have persisted for decades can be summarized as late projects, over budget, and poor software that doesn't do what it's supposed to do or just plain doesn't work.

Tied up in this are many details. Poor understanding and management of requirements, woefully underestimated work, poor understanding of hidden complexities, poor testing, poor people management.

Much of it is the result of applying the Taylor Scientific Management method to software development. Taylorism may work for a predictable production line of well-defined inputs, steps, and outputs, running at a repeatable rate, but it is a terrible model for software management. Software development is not a production line. There are far too many unknowns.

In general, most problems arise because companies practice the IMH software project management method: Insert Miracle Here. With Agile, they have adopted the IAMH variant: Insert Agile Miracle Here.

But as Brooks writes, there are no silver bullets. Relying on miracles is not an effective project management technique. This is a source of no end of misery for all involved with software.

As Sandro Mancuso, author of the Clean Code series book The Software Craftsman: Professionalism, Pragmatism, Pride (Yes! Read it!) writes in chapter 7 of Clean Agile, "Craftsmanship", "the original Agile ideas got distorted and simplified, arriving at companies as the promise of a process to deliver software faster." I.e. miracles.

A Pet Peeve (Insert Rant Here)

One of the areas of disagreement between various authors is the open-plan office. The original Agile concept was co-locating team members so that they could communicate immediately, directly, and informally, at higher bandwidth than through emails or heavy formal documents. It was meant to foster collaboration and remove impediments to effective communication.

Peopleware is extremely critical of the open-plan office, and I couldn't agree more. The prevailing implementation of it is clearly based more on the idea of cutting real-estate and office overhead costs than on encouraging productive communication. The result has all the charm of a cattle concentration feedlot, everyone getting their four square feet to exist in.

It's another distortion of the Agile concepts, embraced by management at the cost of actual effective development. That might make the CFO happy, but it's a false economy that should horrify the CTO.

Those capex savings can incur significant non-recurring engineering costs and create technical problems that will incur further downstream development and support costs. And that just means more opex for facilities where the engineering gets done, because the project takes longer.

You're paying me all this money to be productive and concentrate on complex problems, then you deliberately destroy my concentration to save on furniture and floorspace? It's like a real-life version of Kurt Vonnegut's short story Harrison Bergeron. What does that do to the product design and quality? What customer problems does it create, with attendant opportunity costs?

I turned down an excellent job offer in 2012 after the on-site interviews because of this. I was bludgeoned by my impression of the office environment: sweatshop. They probably thought of me as a prima donna.

McConnell also recommends against this, referencing the 2018 article It's Official: Open-Plan Offices Are Now the Dumbest Management Fad of All Time, which summarized the findings of a Harvard study on the topic. The practice appears to me to be the office-space equivalent of Taylorism.

Ok, now that I have all that off my chest, on to the actual reviews.

Clean Agile, Robert C. Martin

Martin's premise is that Agile has gotten muddled. He says it has gotten blurred through misinterpretation and usurpation.

His purpose is to set the record straight, "to be as pragmatic as possible, describing Agile without nonsense and in no uncertain terms."

He starts out with the history of Agile, how it came about, and provides an overview of what it does. He then goes on to cover the reasons for using it, the business practices, the team practices, the technical practices, and becoming Agile.

An important concept is the Iron Cross of project management: good, fast, cheap, done: pick any three. He says that in reality, each of these four attributes has a coefficient, and good management is about managing those coefficients rather than demanding they all be at 100%; that is the kind of management Agile strives to enable, by providing data.

The next concept is Ron Jeffries' Circle of Life: the diagram describing the practices of XP (eXtreme Programming). Martin chose XP for this book because he says it is the best defined, the most complete, and the least muddled of the Agile processes. He references Kent Beck's Extreme Programming Explained: Embrace Change (he prefers the original 2000 edition; my copy is due to arrive week after next).

The enumeration and description of the various practices surprised me, reinforcing his point that things have gotten muddled. While I was aware of them, I was not aware of their original meanings and intent.

The most mind-blowing moment was reading about acceptance tests, under the business practices. Acceptance tests have become a real hand-waver, "one of the least understood, least used, and most confused of all the Agile practices."

But as he describes them, they have the power to be amazing:

The business analysts specify the happy paths.
QA writes the tests for those cases early in the sprint, along with the unhappy paths (QA engineer walks into a bar; orders a beer; orders 9999 beers; orders NaN beers; orders a soda for Little Bobby Tables; etc.). Because you want your QA people to be devious and creative in showing how your code can be abused, so that you can prevent anyone else from doing it. You want Machiavelli running your QA group.
The tests define the targets that developers need to hit.
Developers work on their code, running the tests repeatedly, until the code passes them.

Holy crap! Holy crap! This ties actual business-defined requirements end-to-end through to the running code. It is a fractal-zoom-out-one-level application of Test Driven Development (and we all thought TDD was just for the developer-written unit tests!).

It completely changes the QA model. Then the unit and acceptance tests get incorporated into Continuous Build, under the team practices.
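
As a minimal sketch of the idea, assuming a hypothetical `order_beers` function as the code under test (neither the function nor its limits come from Martin's book), the happy and unhappy paths from the bar joke might be written as executable checks like these:

```python
import math


def order_beers(quantity):
    """Hypothetical code under test: return the number of beers to pour.

    The accepted range (1 to 100) is invented purely for illustration.
    """
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(quantity, bool) or not isinstance(quantity, (int, float)):
        raise ValueError("quantity must be a number")
    # NaN, infinities, and fractional values are not whole numbers.
    if isinstance(quantity, float) and (
        math.isnan(quantity) or not quantity.is_integer()
    ):
        raise ValueError("quantity must be a whole number")
    if quantity < 1 or quantity > 100:
        raise ValueError("quantity out of range")
    return int(quantity)


def rejects(value):
    """Return True if order_beers refuses the value."""
    try:
        order_beers(value)
    except ValueError:
        return True
    return False


def run_acceptance_tests():
    """Run the happy path and the unhappy paths; return names of failures."""
    checks = {
        "orders a beer": order_beers(1) == 1,
        "orders 9999 beers": rejects(9999),
        "orders NaN beers": rejects(float("nan")),
        "Little Bobby Tables": rejects("Robert'); DROP TABLE Students;--"),
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    failures = run_acceptance_tests()
    print("all acceptance tests pass" if not failures else failures)
```

The point of the sketch is the order of events: the checks exist before the code does, and the developer keeps running them until the list of failures is empty.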

There are other important business practices that I believe are poorly understood, such as splitting and spikes. Splitting means splitting a complex story into smaller stories, as long as you maintain the INVEST guidelines:

Independent
Negotiable
Valuable
Estimable
Small
Testable

Splitting is important when you realize a story is more complex than originally thought, a common problem. Rather than trying to beat it into submission (or be beaten into submission by the attempt), break it apart and expose the complexity in manageable chunks.

I never knew just what a spike was. It's a meta-story, a story for estimating a story. It's called that because it's a long, thin slice through all the layers of the system. When you don't know how to estimate a story, you create a spike for the sole purpose of figuring that out.

Almost as mind-blowing is his discussion of the technical practices. Mind-blowing because much of this whole area has been all but ignored by most Agile implementations. Reintroducing them is one of the strengths of this book.

Martin has been talking about this for a while. He gave the talk in this video, Robert C. Martin - The Land that Scrum Forgot, at a 2011 conference (very watchable at 2x speed). The main gist is that Scrum covered the Agile management practices, but left out the Agile technical practices, yet they are fundamental to making the methodology succeed.

These are the XP practices:

Test-Driven Development (TDD), the double-entry bookkeeping of software development.
Refactoring.
Simple Design.
Pair Programming.

Of these, I would say TDD is perhaps the most-practiced. But all of these have been largely relegated to a dismissive labeling as something only the extremos do. Refactoring is seen as something you do separately when things get so bad that you're forced into it. Pair programming in particular is viewed as a non-starter.

I got my Scrum training in a group class taught by Jeff Sutherland, so pretty much from the horse's mouth. That was 5 years ago, so my memory is a bit faded, but I don't remember any of these practices being covered. I learned about sprints and stories and points, but not about these.

As Martin describes them, they are the individual daily practices that developers should incorporate into every story as they do them. Every story makes use of them in real-time, not in some kind of separate step.

TDD follows the outline I listed in Review: Test Driven Development for Embedded C, James W. Grenning.

Refactoring builds on the TDD cycle, recognizing that writing code that works is a separate dimension from writing code that is clean:

Create a test that fails.
Make the test pass.
Clean up the code.
Return to step 1.
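
Here is a minimal sketch of one pass through that loop, using a hypothetical `is_leap_year` function that does not appear in the book. The test is written first, the most direct passing code second, and the cleanup third, with the same test re-run after each change.

```python
import unittest


# Step 1: write a test that fails (is_leap_year does not exist yet).
class LeapYearTest(unittest.TestCase):
    def test_leap_year_rules(self):
        self.assertTrue(is_leap_year(2000))
        self.assertFalse(is_leap_year(1900))
        self.assertTrue(is_leap_year(2020))
        self.assertFalse(is_leap_year(2019))


# Step 2: make the test pass with the most direct code that works.
# (Kept here under a different name only so both versions can be compared.)
def is_leap_year_first_pass(year):
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    if year % 4 == 0:
        return True
    return False


# Step 3: clean up the code; the unchanged test guards the behavior.
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Step 4: return to step 1 with the next failing test.
if __name__ == "__main__":
    unittest.main()
```

The cleanup in step 3 changes the structure but not the behavior, which is exactly what the passing test lets you verify.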

Simple Design means "writing only the code that is required with a structure that keeps it simplest, smallest, and most expressive." It follows Kent Beck's rules:

Pass all the tests.
Reveal the intent (i.e. readability).
Remove duplication.
Decrease elements.

Pair programming is the one people find most radical and alarming. But as Martin points out, it's not an all-the-time 100% thing. It's an on-demand, as-needed practice that can take a variety of forms as the situation requires.

Who hasn't asked a coworker to look over some code with them to figure something out? Now expand that concept. It's the power of two-heads-are-better-than-one. Maybe trading the keyboard back and forth, maybe one person driving while the other talks. Sharing information, knowledge, and ideas in both directions, as well as reviewing code in real-time. There's some bang for the buck!

The final chapters cover becoming Agile, including some of the danger areas that get in the way, tools, coaching (pro and con), and Mancuso's chapter on craftsmanship, which reminds us that we do this kind of work because we love it. We are constantly striving to be better at it. I am a software developer. I want to be professional about it. This hearkens back to the roots of Agile.

More Effective Agile, Steve McConnell

McConnell has a very direct, pragmatic writing style. He is brutally honest about what works and what doesn't, and the practical realities and difficulties that organizations run into.

His main goal is addressing practical topics that businesses care about, but that are often neglected by Agile purists:

Common challenges in Agile implementation.
How to implement Agile in only part of the organization (because virtually every company will have parts that simply don't work that way, or will interact with external entities that don't).
Agile's support for predictability.
Use of Agile on geographically distributed teams.
Using Agile in regulated industries.
Using Agile on a variety of different types of software projects.

He focuses on techniques that have been proven to work over the past two decades. He generalizes non-Agile approaches as Sequential development, typically in some sort of phased form.

The book contains 23 chapters, organized into these 4 parts:

INTRODUCTION TO MORE EFFECTIVE AGILE
MORE EFFECTIVE TEAMS
MORE EFFECTIVE WORK
MORE EFFECTIVE ORGANIZATIONS

It includes a full bibliography and index.

Throughout, he uses the key principle of "Inspect and Adapt": inspect your organization for particular attributes, then adapt your process as necessary to improve those attributes.

Another important concept is that Agile is not one monolithic model that works identically for all organizations. It's not one-size-fits-all, because the full range of software projects covers a variety of situations. So the book covers the various ways organizations can tailor the practices to their needs. Probably to the horror of Agile purists.

Each chapter is organized as follows:

Discussion of key principles and details that support them. This includes problem areas and various options for dealing with them.
Suggested Leadership Actions
Additional Resources

The Suggested Leadership Actions are divided into recommended Inspect and Adapt lists. The Inspect items are specific things to examine in your organization. I suspect they would reveal some rude surprises. The Adapt items cover actions to take based on the issues revealed by inspection.

The Additional Resources list additional reading if you need to delve further into the topics covered.

One of the very useful concepts in the book is the "Agile Boundary". This draws the line between the portion of the organization that uses Agile techniques, and the portion that doesn't. Even if the software process is 100% Agile, the rest of the company may not be.

Misunderstanding the boundary can cause a variety of problems. But understanding it creates opportunities for selecting an appropriate set of practices. This is helpful for ensuring successful Agile implementation across a diverse range of projects.

A significant topic of discussion is the tension between "pure Agile" and the more Sequential methods that might be appropriate for a given organization at a given point in a project.

The Agile Boundary defines the interface where the methods meet, and which methods are appropriate on each side of it under given circumstances. Again, Agile is not a single monolithic method that can be applied identically to every single project. As he says, it's not a matter of "go full Agile or go home".

There's a lot of information to digest here, because it all needs to be taken in the context of your specific environment. The chapters that stand out to me based on my personal experience:

More Effective Agile Projects: keeping projects small and sprints short; using velocity-based planning (which means you need accurate velocity measurement), delivering in vertical slices, and managing technical debt; and structuring work to avoid burnout.
More Effective Agile Quality: minimizing the defect detection gap (i.e. finding and removing defects quickly, before they get out); creating and using a definition of done (DoD); maintaining a releasable level of quality at all times; reducing rework, which is typically not well accounted for.
More Effective Agile Testing: using automated tests created by the development team, including unit and acceptance tests, and monitoring code coverage.
More Effective Agile Requirements Creation: stories, product backlog, refining the backlog, creating and using a definition of ready (DoR).
More Effective Agile Requirements Prioritization: having an effective product owner, classifying stories by combined business value and development cost.
More Effective Agile Predictability: strict and loose predictability of cost, schedule, and feature set; dealing with the Agile Boundary.
More Effective Agile Adoptions.
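
To make the velocity-based planning item concrete, here is a rough sketch of the arithmetic. The backlog size, the velocity history, and the function name `forecast_sprints` are all invented for illustration and are not taken from the book. It forecasts a range rather than a single number, which is why accurate velocity measurement matters.

```python
import math


def forecast_sprints(backlog_points, recent_velocities):
    """Estimate how many sprints remain, as (best, expected, worst).

    Uses the fastest, average, and slowest recent sprint velocities.
    """
    if not recent_velocities or min(recent_velocities) <= 0:
        raise ValueError("need at least one positive velocity")
    average = sum(recent_velocities) / len(recent_velocities)
    best = math.ceil(backlog_points / max(recent_velocities))
    expected = math.ceil(backlog_points / average)
    worst = math.ceil(backlog_points / min(recent_velocities))
    return best, expected, worst


if __name__ == "__main__":
    # Hypothetical data: 120 story points left, velocities of the last 4 sprints.
    print(forecast_sprints(120, [18, 22, 20, 24]))  # prints (5, 6, 7)
```

If the velocity history is noisy or inflated, the spread between best and worst widens, and any downstream commitment based on the middle number gets correspondingly shakier.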

Requirements make an interesting area, because that is often a source of problems. The Agile approach is to elicit just enough requirements up front to be able to size a story, then rely on more detailed elicitation and emergent design when working on the story.

But the problem I've seen with that is one of the classic issues in estimation. Management tends to treat those very rough initial estimates as commitments, not taking into account the fact that further refinement has been deferred. So downstream dependent commitments get made based on them.

The risk comes when further examination of the story reveals that there is more work hidden underneath than originally believed. I've seen this repeatedly. Then the whole chain of dependent commitments gets disrupted, creating chaos as the organization tries to cope.

For example, consumer-product embedded systems are very sensitive to this. The downstream dependent commitments involve hardware manufacturing and the retail pipeline, where products need to be pre-positioned to prepare for major sales cycles such as holidays.

The Christmas sales period means consumer products need to be in warehouses by mid-November at the latest. Both the hardware manufacturing facilities (and their supply chains) and the sales channels are Taylor-style systems, relying on bulk delivery and just-in-time techniques. They need predictability. That's your Agile Boundary right there, on two sides of the software project.

IoT products have fallen into the habit of relying on a day 1 OTA update after the consumer unboxes them, but that's risky. If the massive high-scale OTA of all the fielded devices runs into problems, it creates havoc for consumers, who are not going to be happy. That can have significant opportunity costs if it causes stalled revenue or returns, or some horribly expensive solution to work around a failed OTA, not to mention the reputation effect on future sales.

What about commercial/industrial embedded systems? Cars, planes, factory equipment, where sales, installation, and operation are dependent on having the software ready. These can have huge ripple effects.

Online portal rollouts that gate real-world services are also sensitive to it. Martin uses the example of healthcare.gov. People need to have used the system successfully by a certain date in order to access real-world services, with life-safety consequences.

Both of these highlight the real-world deadlines that make business sense for picking software schedule dates. As software engineers, we can't just whine about arbitrary, unreasonable dates. There's a whole chain of dependencies that needs to be managed.

Schedule issues need to be surfaced and addressed as soon as possible, just like software bugs. The later in the process a software bug is identified, the more expensive it is to fix, sometimes by orders of magnitude. Dealing with schedule bugs is no different.

In his book on estimation, McConnell talks about the Cone of Uncertainty, the greater uncertainty about estimates early in the project, that narrows to better certainty over time as more information is available. Absolute certainty only comes after the completion. But everybody behaves as if the certainty is much better much earlier.
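
A small sketch of what the cone means in numbers. The multipliers below are the commonly cited values for the Cone of Uncertainty; treat them as illustrative, and note that the function name `estimate_range` and the 12-month nominal estimate are assumptions made up for this example.

```python
# Commonly cited Cone of Uncertainty multipliers: (milestone, low, high).
CONE = [
    ("initial concept", 0.25, 4.0),
    ("approved product definition", 0.5, 2.0),
    ("requirements complete", 0.67, 1.5),
    ("user interface design complete", 0.8, 1.25),
    ("detailed design complete", 0.9, 1.1),
]


def estimate_range(nominal, milestone):
    """Return (low, high) bounds for a nominal estimate at a given milestone."""
    for name, low, high in CONE:
        if name == milestone:
            return nominal * low, nominal * high
    raise KeyError(milestone)


if __name__ == "__main__":
    nominal_months = 12  # hypothetical single-point estimate
    for name, _, _ in CONE:
        low, high = estimate_range(nominal_months, name)
        print(f"{name}: {low:.1f} to {high:.1f} months")
```

With those multipliers, a 12-month estimate made at the initial concept stage really means somewhere between 3 and 48 months, which is the range that gets forgotten when an early estimate is treated as a commitment.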

It's clear from the variety of information in this book that Agile is not simply a template that can be laid down across any organization and be successful. It takes work to adapt it to the realities of each organization. There is no simple recipe for success. No silver bullets.

That's why it's necessary to re-read this periodically, because each time you'll be viewing it in the context of your organization's current realities. That's continuing the Inspect and Adapt concept.

Update Nov 10, 2019

My copy of Beck's Extreme Programming Explained arrived yesterday, and I've been reading through it. Here we see the benefits of going back to original sources, in this case on open plan offices. In Chapter 13, "Facilities Strategy", he says:

The best setup is an open bullpen, with little cubbies around the outside of the space. The team members can keep their personal items in these cubbies, go to them to make phone calls, and spend time at them when they don't want to be interrupted. The rest of the team needs to respect the "virtual" privacy of someone sitting in their cubby. Put the biggest, fastest development machines on tables in the middle of the space (cubbies might or might not contain machines).

So it appears what caught on was the group open bullpen part, and what has been left out was the personal space part (and its attendant value).

There's a continuous spectrum on which to interpret Beck's recommendation, with the typical modern open office representing one end (all open space, no private space), and individual offices representing the other (no open space, all private space).

There's a point on the spectrum where I would shift to liking it, if I had a private place to make my own where I could concentrate in relative quiet, with enough space to bring in a pairing partner.

Where I find the open office breaks down is the overall noise level from multiple separate conversations. It can be a near-constant distraction when I'm trying to work (hence the rampant proliferation of headphones in open offices).

Meanwhile, when I need to have a conversation with someone, I want to be able to do it without competing with all those others, and without disturbing those around me.

What seems to me to have the most practical benefit is optimizing space for two-person interactions, acoustically isolated from other two-person interactions. So individual workspaces with room for two to work together. That allows for individual time as well as the pairing method, from simple rubber-duck debugging to full keyboard and mouse back-and-forth.

Those are both high-value, high-quality times. That's the real value proposition for the company.

And in fact, that's precisely the kind of setup Beck says Ward Cunningham told him about.

Given that most developers now work on dedicated individual machines, through which they might be accessing virtualized cloud computing resources, the argument for a centralized bullpen with machines seems less compelling.

The open bullpen space seems to be less optimal, but still useful for times when more than two people might be involved.

This is clearly a philosophical difference from Beck's intent, but I think the costs of open plan offices as he experienced them, tempered by the reality of how they've been adopted, outweigh their benefits.

Meanwhile, his followup discussion in that chapter is fully in harmony with Peopleware's Part II: "The Office Environment".

Review: Engineering A Safer World, by Nancy Leveson

2019-07-08T04:34:00.000-07:00

This is a 6-year-old post cross-posted from my woodworking blog (written before I had this blog available). It remains as timely and important as ever. I'm reposting it motivated by the discussion of the Boeing 737 MAX, such as at EmbeddedArtistry.com (and mentioned at Embedded.fm).

As a software engineer I've been a dedicated reader of RISKS DIGEST for over 20 years. Formally, RISKS is the Forum On Risks To The Public In Computers And Related Systems, ACM Committee on Computers and Public Policy, moderated by Peter G. Neumann (affectionately known to all as PGN).

RISKS is an online news and discussion group covering various mishaps and potential mishaps in computer-related systems, everything from data breaches and privacy concerns to catastrophic failures of automated systems that killed people. It's an extremely valuable resource, exposing people to many concerns they might otherwise not know about.

All back issues are archived and available online. It's fascinating to see the evolution of computer-related risks over time. It's also disheartening to see the same things pop up year after year as sad history repeatedly repeats itself.

Nancy Leveson's work on safety engineering has been mentioned in RISKS ever since volume 1, issue 1. She's currently Professor of Aeronautics and Astronautics and Professor of Engineering Systems at MIT. Her 2011 book Engineering A Safer World, Systems Thinking Applied to Safety, was noted in RISKS 26.71, but has not yet been reviewed there. I offer this informal review.

This book should be required reading for anyone who wishes to avoid having their work show up as a RISKS news item. There's no excuse for not reading it: Leveson and MIT Press have made it available as a free downloadable PDF (555 pages), which is how I read it. The download link is available on the book's webpage at http://mitpress.mit.edu/books/engineering-safer-world.

This was my first introduction to formal safety engineering, so yes, I speak with the enthusiasm of the newly evangelized.

The topic is the application of systems theory to the design of safer systems and the analysis of accidents in order to prevent future accidents (not, notably, to assign blame). Systems theory originated in the 1930s and 1940s to cope with the increasing complexity of systems starting to be built at that time.

This theory holds that systems are designed, built, and operated in a larger sociotechnical context. Control exists at multiple hierarchical levels, with new properties emerging at higher levels ("emergent properties"). Leveson says safety is an emergent property arising not from the individual components, but from the system as a whole. When analyzing an accident, you must identify and examine each level of control to see where it failed to prevent the accident.

So while an operator may have been the person who took the action that caused an accident, you must ask why that action seemed a reasonable one to the operator, why the system allowed the operator to take that action, why the regulatory environment allowed the system to be run that way, etc. Each of these levels may have been an opportunity to prevent the accident. Learning how they failed to do so is an opportunity to prevent future accidents.

Furthermore, systems and their contexts are dynamic, changing over time. What used to be safe may no longer be. Consider that most systems are in use for decades, with many people coming and going over time to maintain and operate them, while much in the world around them changes. Leveson says most systems migrate to states of higher risk over time. If safety is not actively managed to adapt to this change, accidents become inevitable.

Another important point is the distinction between reliability and safety. Components may operate reliably at various levels, yet still result in an accident, frequently due to the interactions between components and subsystems.

Much of Leveson's view can be summarized in two salient quotes. First is a brief comment on the human factor: "Depending on humans not to make mistakes is an almost certain way to guarantee that accidents will happen."

The second is more involved:

"Stopping after identifying inadequate control actions by the lower levels of the safety control structure is common in accident investigation. The result is that the cause is attributed to "operator error," which does not provide enough information to prevent accidents in the future. It also does not overcome the problem of hindsight bias. In hindsight, it is always possible to see that a different behavior would have been safer. But the information necessary to identify that safer behavior is usually only available after the fact. To improve safety, we need to understand the reasons people acted the way they did. Then we can determine if and how to change conditions so that better decisions can be made in the future.

"The analyst should start from the assumption that most people have good intentions and do not purposely cause accidents. The goal then is to understand why people did not or could not act differently. People acted the way they did for very good reasons: we need to understand why the behavior of the people involved made sense to them at the time."

The book is organized into three parts. Part I, "Foundations," covers traditional safety engineering (specifically, why it is inadequate) and introduces systems theory. Part II, "STAMP: An Accident Model Based On Systems Theory," introduces System-Theoretic Accident Model and Processes, covering safety constraints, hierarchical safety control structures, and process models. Part III, "Using STAMP," covers how to apply it, including the STPA (System-Theoretic Process Analysis) approach to hazard analysis and the CAST (Causal Analysis based on STAMP) accident analysis method.

Throughout, Leveson illustrates her points with accidents from various domains. These cover a military helicopter friendly-fire shootdown, chemical and nuclear plant accidents, pharmaceutical issues, the Challenger and Columbia space shuttle losses, air and rail travel accidents, the loss of a satellite, and contamination of a public water supply. They resulted in deaths, injuries with prolonged suffering, destruction, and significant financial losses. There's also one fictional case used for training purposes.

The satellite loss was an example where there was no death, injury, or ground damage, but an $800 million satellite was wasted, along with a $433 million launch vehicle (all due to a single misplaced decimal point in a software configuration file). Financial losses in all cases included secondary costs due to litigation and loss of business. Accidents are expensive in both humanity and money.

Several accidents are examined in great detail to expose the complexity of the event and glean lessons, identifying the levels of control, the system hazards they faced, and the safety constraints they violated. They show that the answer to further prevention is not simply to punish the operator on duty at the time. What's to prevent another accident from occurring under a different operator? What systemic factors exist that increase the likelihood of accidents?

These systems affect us every day. During the time I was reading the book, there was an airliner crash at San Francisco, a fiery oil train derailment in Canada, and a major passenger train derailment in Spain. I started reading it while a passenger on an aircraft model mentioned 14 times in the book, and read the remainder while traveling to and from work on the Boston commuter rail.

The book can be read on several levels. At a minimum, the case studies and analyses are horribly fascinating for the lessons they impart. Fans of The Andromeda Strain will be riveted.

As I read the account of two US Black Hawk helicopters shot down by friendly fire in Iraq, I could visualize a split screen showing the helicopters flying low in the valleys of the no-fly zone to avoid Iraqi air defense radar, the traces going inactive on the AWACS radar scopes, the F-15s picking up unidentified contacts that did not respond to IFF, and the mission controllers back in Turkey, as events ground to their inexorable conclusion. It made my hair stand on end.

All the case studies are equally jaw-dropping, down to the final example of a contaminated water supply in Ontario. Further shades of Andromeda, since that was a biological accident resulting in deaths.

They're all examples of systems that migrated to very high risk states, where they became accidents waiting to happen. It was just a matter of which particular event out of the many possible triggered the accident.

Part of what's so shocking about these cases is the enormously elaborate multilayered safety systems that were in place. The military goes to great lengths in its air operations control to avoid friendly fire incidents, the satellite software development process had numerous checkpoints, and NASA had a significant safety infrastructure.

Yet it seems that this very elaborateness contributed to a false sense of safety, with uncoordinated control leaving gaps in coverage. In some cases this led to complacency that resulted in scaling back safety programs.

The other shocking cases were at the opposite end of the spectrum, where plants were operated more fast and loose.

The one bright spot in the case studies is the description of the US Navy's SUBSAFE program, instituted after the loss of the USS Thresher in 1963. It flooded during deep dive testing; despite emergency recovery attempts by the crew, they were unable to surface. Just pause and think about that for a moment.

SUBSAFE is an example of a tightly focused and rigorously executed safety program. The result is that no submarine has been lost in 50 years, with the exception of the USS Scorpion in 1968, where the program requirements were waived. The result of that tragic lesson was that the requirements were never again waived.

The book can be read at an academic level, as a study of the application of systems theory to the creation of safer systems and analysis of accidents. It can be read at an engineering level, as a guide on how to apply the methodology in the development and operation of such systems. It's not a cookbook, but it points you in the right direction. It includes an extensive bibliography for follow-up.

Even those who work on systems that don't present life safety or property damage risks can benefit, because any system behaving poorly can make people's lives miserable. Such systems frequently pose significant business risks, affecting the life and death of a company.

This book paired with PGN's book Computer-Related Risks would make an excellent junior or senior level college survey course for all engineering fields, along the lines of "with great power comes great responsibility". While some might feel it's a text more suited to a graduate-level practicum, I think it's worth conveying at the undergraduate level for broader distribution.

Review: Web-Based Course Test-Driven Development For Embedded C/C++, James W. Grenning

2018-11-23T12:36:00.000-08:00

Full disclosure: I was given a seat in this course by James Grenning.

I took James Grenning's 3-day web-based course Test-Driven Development for Embedded C/C++ September 4-6, 2018. It was organized as a live online delivery, 5 hours each day. The schedule worked out perfectly for me in Boston, starting at 9AM each morning, but he had attendees from as far west as California and as far east as Germany, spanning 9 time zones.

The participants ranged from junior level embedded developers to those with more than 20 years of experience. One worked in a fully MISRA-compliant environment. This was the first introduction to TDD for some of them.

The course was organized as blocks consisting of presentation and discussion, coding demo, then live coding exercises. It used CppUTest as the TDD framework.

The short answer: this is an outstanding course. It will change the way you work. I highly recommend it, well worth the investment in time and money. The remote delivery method worked great.

I had previously read Grenning's book, Test Driven Development for Embedded C, which I reviewed in August. I covered a lot of his technical information on TDD in the review, so I'll only touch on that briefly here. He covers the same material in the course presentation portions.

The course naturally has a lot of overlap with the book, so each can serve as a standalone resource. But I found the combination of the two to be extremely valuable. They complement each other well because the book provides room to delve more deeply into background information, while the course provides guided practice time with an expert.

Reading the book first meant I was fully grounded in the motivations and technical concepts of TDD, so I was ready for them when he covered them in the course. I was also already convinced of its value. What the live course brings to that part is the opportunity to ask questions and discuss things.

You can certainly take the course without first reading the book, which was the case for several of the participants.

Presentations

For the presentation portions, Grenning covered the issues with development using non-TDD practices, what he calls "debug-later programming" (DLP). This consists of a long phase of debug fire-fighting at the end of development, which often leaves bugs behind.

He introduced the TDD microcycle, comparing the physics of DLP to the physics of TDD. By physics he means the time domain, the time taken from injection of a bug (repeat after me: "I are a ingenuer, I make misteaks, I write bugs") to its removal. This is one of the most compelling arguments for adopting TDD. TDD significantly compresses that time frame.

He covered how to apply the process to embedded code and some of the design considerations. He also talked about the common concerns people have about TDD.

One quote from Kent Beck that I really liked:

TDD is a great excuse to think about the problem before you think about the solution.

He covered the concept of test fakes and the use of "spies" to capture information. He covered mocks as well, including mocking the actual hardware so that you can run your tests off-target.
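As a rough illustration of the spy idea, here is a minimal sketch. It is written in Python for brevity rather than the C and CppUTest used in the course, and every name in it (LightControllerSpy, LightScheduler, wake_up) is hypothetical, not taken from Grenning's materials. The spy stands in for the hardware-facing module and simply records how it was called, so the code under test can be exercised off-target.

```python
class LightControllerSpy:
    """Test double for a hardware light controller: records calls, touches no hardware."""

    def __init__(self):
        self.calls = []  # list of (action, light_id) in the order received

    def turn_on(self, light_id):
        self.calls.append(("on", light_id))

    def turn_off(self, light_id):
        self.calls.append(("off", light_id))


class LightScheduler:
    """Hypothetical production code: fires scheduled light events at a given minute."""

    def __init__(self, controller):
        self._controller = controller  # real driver in production, spy in tests
        self._events = []

    def schedule(self, minute_of_day, action, light_id):
        self._events.append((minute_of_day, action, light_id))

    def wake_up(self, minute_of_day):
        for minute, action, light_id in self._events:
            if minute != minute_of_day:
                continue
            if action == "on":
                self._controller.turn_on(light_id)
            else:
                self._controller.turn_off(light_id)


# The test asks the spy what the scheduler did to the "hardware".
spy = LightControllerSpy()
scheduler = LightScheduler(spy)
scheduler.schedule(20 * 60, "on", 3)

scheduler.wake_up(19 * 60 + 59)   # one minute early: nothing should happen
assert spy.calls == []

scheduler.wake_up(20 * 60)        # scheduled minute: light 3 turns on
assert spy.calls == [("on", 3)]
```

The scheduler never knows whether it is talking to real hardware or to the spy, which is what lets the test run on a development machine.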

He covered refactoring to keep tests easy to follow and maintain. He also covered refactoring of "legacy code" (i.e. any production code that was not built using TDD), including "code smells" and "code rot", using TDD to provide a safety harness. This included a great quote from Donald Knuth (bold emphasis mine):

Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

Coding Demos

Grenning performed two primary live coding demos. First, he used TDD to build a circular buffer, a common data structure in embedded systems. He used this to demonstrate the stepwise process of applying the TDD microcycle.

Second, he performed a refactoring demo on a set of tests. He used this to show how to apply the refactoring steps to simplify the tests and make the test suite more readable and maintainable.

This was just as valuable as the TDD microcycle, because a clean test suite means it will live a long and useful life. Failing to refactor and keep it clean risks making it a one-off throwaway after getting its initial value.
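To give a feel for where a test-driven circular buffer might end up, here is a small sketch. It is in Python with plain asserts rather than C with CppUTest, it is not Grenning's demo code, and the class and method names are my own assumptions. The tests are listed in the order they might be written, each one forcing the next small piece of behavior.

```python
class CircularBuffer:
    """Fixed-capacity FIFO; put() reports failure when the buffer is full."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._capacity = capacity
        self._head = 0    # index of the oldest stored item
        self._count = 0   # number of items currently stored

    def is_empty(self):
        return self._count == 0

    def is_full(self):
        return self._count == self._capacity

    def put(self, value):
        if self.is_full():
            return False
        tail = (self._head + self._count) % self._capacity
        self._items[tail] = value
        self._count += 1
        return True

    def get(self):
        if self.is_empty():
            raise IndexError("get from empty buffer")
        value = self._items[self._head]
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return value


def test_new_buffer_is_empty():
    assert CircularBuffer(3).is_empty()


def test_get_returns_what_was_put():
    buf = CircularBuffer(3)
    buf.put(42)
    assert buf.get() == 42
    assert buf.is_empty()


def test_items_come_out_in_fifo_order():
    buf = CircularBuffer(3)
    for value in (1, 2, 3):
        buf.put(value)
    assert [buf.get(), buf.get(), buf.get()] == [1, 2, 3]


def test_put_fails_when_full():
    buf = CircularBuffer(2)
    assert buf.put(1) and buf.put(2)
    assert buf.is_full()
    assert buf.put(3) is False


def test_indices_wrap_around():
    buf = CircularBuffer(2)
    buf.put(1)
    buf.put(2)
    assert buf.get() == 1
    buf.put(3)                      # lands in the slot freed by the get
    assert [buf.get(), buf.get()] == [2, 3]


for test in (test_new_buffer_is_empty, test_get_returns_what_was_put,
             test_items_come_out_in_fifo_order, test_put_fails_when_full,
             test_indices_wrap_around):
    test()
```

Each test is short and checks one behavior through the public interface, which is the kind of readability the refactoring demo was aiming for.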

Coding Exercises

Grenning uses Cyber-Dojo to conduct exercises (as well as his demos). This is a cloud-based, Linux VM, ready-to-use, code-build-run environment that allows each student to work individually, but he can monitor everyone's code as they work. This turned out to be one of the most valuable aspects of the course.

I should also mention that I had read Jeff Langr's book Modern C++ Programming with Test-Driven Development: Code Better, Sleep Better in between reading Grenning's book and taking this course. Langr puts a lot of emphasis on short, easily-readable tests, and that's something that also comes out in Grenning's class.

What was so valuable about doing these exercises in Cyber-Dojo is that Grenning was able to stop someone who was heading off in the wrong direction and quickly bring them back on track, or help them if they weren't sure what to do next. That fast feedback cycle is very much in tune with TDD itself. It works just as well as a teaching method.

So if someone started writing code without a test, or wrote too much code for what the test covered, or had too much duplication in tests, or had too much setup that could be factored out, he let them know and guided them back. In some cases he interrupted the exercise to go through someone's code on the screen with everybody watching, not to put them on the spot, but to cover the issues that we all run into.

That's critical because learning to truly work effectively in TDD style requires a reorientation of your thinking. We all have the coding habits of years that need to be overcome.

That doesn't happen automatically just because you read a book and have decided to follow it. It takes effort; half the effort is just noticing that you're straying from the intended path. That's the value of having a live instructor who can watch over your shoulder. It's like being an apprentice under the watchful eye of a master craftsman.

For me, this was ultimately the greatest value in the class. Having Grenning provide real-time guidance had an immediate effect on my coding, for both the test code and the production code. Whether it was talking about my mistakes or someone else's, I was able to immediately improve my work.

That made a huge difference between the test code I wrote before the class and the test code I wrote by the end of the class.

The coding exercises were building our own circular buffer, building a light controller spy, using TDD with the spy to implement a light scheduler, and implementing a flash chip driver. Note that these exercises are also covered in his book.

I also found that Cyber-Dojo made for an interesting example of pair programming, something I've never done before. Grenning provided initial files to work on, like a pair partner guiding you in the next step, then provided active feedback, like a partner asking questions and making suggestions: "Are you missing something there? What if you tried this? Wait, before you do that...".

The Big Lesson

The big lesson for me from this course was that it finally drove home that TDD is ALL ABOUT DEVELOPMENT! Sometimes I have to be clubbed over the head for something to really sink in, and that's what happened here.

We get so focused on the word "test" in TDD that we jump to the conclusion that it's just a test methodology. We emphasize test, as in TEST-Driven Development.

But really, the emphasis should be reversed: it's Test-Driven DEVELOPMENT. That means you apply design concepts and address the requirements of the product as you engage in a very active development thought process that is driven forward by tests.

Did you ever write some throwaway test code just so you could see how something worked, or to explore some design ideas? Hmmm, well TDD formalizes that.

The fact that you do end up with useful unit tests is almost a side effect of the process. An extremely valuable side effect, but a side effect nonetheless.

The real output of the process is working production code. That's what really matters. That's the real goal.

At some point on the last day of the course, I recognized the change in emphasis deep in my being. Maybe the difference is subtle, but it is critical.

That recognition first started to dawn after I read the book and applied it at work. I was amazed at the cleanliness of the resulting code. It was DRY and DAMP and SOLID, with no further refinement or debugging required.

Yes, I had a unit test suite. But look at the production code! It was breathtaking, right out of the chute. That was motivating.

It was in that receptive frame of mind that I did the coding exercises in the course. That was when the club hit. It was one of those moments of realization where you divide time into what came before, and what came after, the physical moment of grok, providing a whole new lens through which to perceive the work.

Savor that consideration for a moment.

People have been saying for years that TDD is about development, but we tend to focus on the test. Grenning emphasizes development when he talks about developing while "test-driving", meaning he is doing his development driven by tests. I guess it just takes time for the real implications to sink in.

One of Grenning's slides quotes Edsger Dijkstra:

Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with, and as a result, the programming process will become cheaper. If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.

While we all aspire to be like Dijkstra, this seems like a pipe dream. Until you realize that TDD does exactly that. It provides the shortest path to working software. I think he would have liked that.

Now that I've relegated the test aspect of this to second-class citizenship, let me bring it back to prominence.

The testing aspect approaches Dijkstra's ideal, because it finds bugs immediately as part of the code, build, test cycle. So the bugs are squashed before they've had time to scatter and hide in the dark corners. That reduces the dreaded unbounded post-development debug cycle to near zero.

If you don't let bugs get into the code, you won't have to spend time later removing them. Yeah, what Dijkstra said.

This doesn't guarantee bug-free code. There might still be bugs that occur during the integration of parts that are working (for example, one module uses feet, while another uses inches), or the code may not have addressed the requirements properly (the customer wanted a moving average of the last 5 data points, while the code uses the average of all data points), but as a functional unit, each module is internally consistent and working according to its API.
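The moving-average case is easy to make concrete. In this small sketch (Python, with made-up data and function names of my own), both functions are internally correct and would pass their own unit tests, yet they give different answers, so only one of them meets the stated requirement.

```python
def average_of_all(points):
    """What the code did: average every data point seen so far."""
    return sum(points) / len(points)


def moving_average(points, window=5):
    """What the customer wanted: average only the most recent `window` points."""
    recent = points[-window:]
    return sum(recent) / len(recent)


data = [10, 20, 30, 40, 50, 60, 70]   # illustrative values only

print(average_of_all(data))   # 40.0
print(moving_average(data))   # 50.0
```

Unit tests written against the wrong understanding of the requirement would happily confirm the first function, which is why this class of bug survives TDD.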

The resulting unit test suite is an extremely valuable resource, just as valuable as the production code. What makes it so valuable? Two things: safety harness, and example usage.

It provides a safety harness to allow you to do additional work on the code, then run the suite as a regression test to prove you haven't broken anything. Or to detect breakage so you can fix it immediately.

Using and extending the suite liberates you to make changes to the code safely. Need to add some functionality? Fix one of those integration or requirements bugs? Refactor for better performance or maintainability? Clean up some tech debt? Have at it.

You can instantly prove to yourself that you haven't screwed anything up, or show that you have, so that you can fix it before it ever gets committed to the codebase. No one will ever see that dirty laundry.

It provides example usage, showing how to use the API: how to call the various functions, in what order, how to set up and drive various behavioral scenarios, how to exercise the interfaces for different functional behaviors, how different parameters affect things, how to interpret return values.

This is real, live code, showing how to use the production code as a client. You can even get creative and add exploratory tests that push the production code in odd directions to see what happens. Grenning calls these characterization tests and learning tests.

The test suite is actually something quite magical: self-updating documentation! Since you need to invest the time to maintain the tests in order to get the development done, you are also automatically updating the example usage documentation for free.

You might argue that tools like Doxygen offer similar self-updating capability, but they still require updating textual explanations along with the code. They are subject to the same staleness that can happen with any comments, where the comments (or Doxygen annotations) aren't kept up to date with code changes (see Tim Ottinger's Rules for Commenting for advice to help avoid stale comments).

But if you want to really know how to use the production code, go read the tests! If you've truly followed the TDD process as Grenning shows you in this course, they will tell you how to produce every bit of behavior that it is capable of, because every bit of behavior implemented will have been driven by the tests.

That's the full-circle, closed-loop feedback power of test-driven DEVELOPMENT.

Doxygen still has its place. I think of the Doxygen docs as API reference, while the test suite is API tutorial, showing actual usage.

Another Lesson

I've already alluded to the other interesting lesson that I drew from this course: it takes practice! We're not used to working like this, so it takes practice and self-awareness to learn how to do it.

That was particularly driven home by the coding exercises. Even though I had just read his book and followed through the exact same exercises, and read Langr's book, and applied the knowledge at work, I still had trouble getting rolling on the first couple of exercises. It was a matter of instilling the new habits.

It took a few times having Grenning redirect me (or listening to the advice he gave someone else). By the final exercise, after the benefit of his live feedback, I was able to catch myself in time and start applying the habits on my own.

It's still going to take some time. I'll know I've gotten there when I start thinking of the tests automatically as the first step of coding.

Third Time's A Charm

At one point in the discussion I mentioned that Grenning's book and this course represented my third attempt at using TDD, and one of the participants said he would be interested in hearing about my previous attempts.

My first attempt was in 2007, when I was introduced to TDD by a coworker. I read Kent Beck's Test Driven Development: By Example and used it to develop the playback control module for a large video-on-demand server intended for use in cable provider head ends.

This was both a great success and a classic failure. It was a great success in that it accelerated my work on the module, avoiding many bugs and shortening the debug cycle. In that respect it lived up to the promise of TDD completely.

It was a classic failure in that I made the tests far too brittle. I put too much internal knowledge of the module in them, with many internal details that were useful when I was first developing the module, but that became a severe impediment to ongoing maintenance.

The classic symptom of this problem was that a minor change in implementation would cause a cascade of test failures. The production code was fine, but some internal detail such as a counter value that was being checked by the tests had changed. Otherwise the test code itself was also fine. But I had overburdened it with details that should have been hidden by encapsulation.

The result was that ultimately I had to abandon the test suite. It had provided good initial value, but failed to deliver ongoing value because it became a severe maintenance burden.

This is exactly the type of situation that Grenning's course seeks to prevent. During coding exercises, he watches out for cases of inappropriate information exposure. Thus another benefit of this is improved encapsulation and information hiding.
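The brittle-test failure mode can be shown in miniature. This is a hypothetical sketch in Python, not the actual playback module or its tests; the Playback class and its internal counter are invented for illustration. The first test reaches into an internal counter and breaks whenever the bookkeeping changes. The second checks only behavior visible through the public interface.

```python
class Playback:
    """Hypothetical playback control with some internal bookkeeping."""

    def __init__(self):
        self._state = "stopped"
        self._transitions = 0   # internal detail, not part of the API

    def play(self):
        if self._state != "playing":
            self._state = "playing"
            self._transitions += 1

    def pause(self):
        if self._state == "playing":
            self._state = "paused"
            self._transitions += 1

    def is_playing(self):
        return self._state == "playing"


def brittle_test():
    playback = Playback()
    playback.play()
    playback.pause()
    playback.play()
    # Depends on how transitions happen to be counted internally. If the
    # counter is removed or counted differently, this fails even though
    # the module still behaves correctly.
    assert playback._transitions == 3


def robust_test():
    playback = Playback()
    playback.play()
    playback.pause()
    playback.play()
    # Depends only on observable behavior, so it survives refactoring.
    assert playback.is_playing()


brittle_test()
robust_test()
```

Both tests pass today. The difference only shows up later, when a harmless internal change makes the first one fail.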

My second attempt was in 2013, when I wanted to refactor some of the code in an IP acceleration server as part of improvements to one of its features. I had read Michael Feathers' Working Effectively with Legacy Code, and found that many of the things he covered applied to the codebase I was working on.

This was a revenue-generating service product, so I needed to be sure I didn't break it.

The main strategy the book covers is to use TDD to provide that safety harness I mentioned above, in order to verify that the legacy code behaves the same after modification as it did before.

I began building a set of test fakes that could be used with Google Test. One issue was that the code relied heavily on the singleton pattern, so there always had to be some implementation of each class that would satisfy the linker. And of course there were chains of such dependencies interlocked in a web.

My first task was to replace that bit by bit with dependency injection. I focused just on the parts necessary to allow me to test the area I was modifying. Part of Feathers' strategy is to tackle just enough of the system at a time to be able to make progress, rather than a wholesale break-everything-down-and-rebuild approach.
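For readers who haven't made this change before, here is a minimal sketch of what moving from a global singleton to dependency injection buys you in a test. It is in Python rather than the C++ and Google Test used on that project, and all the names (SystemClock, FakeClock, SessionTimeout) are hypothetical examples, not code from that codebase.

```python
import time


class SystemClock:
    """Production dependency: reads the real monotonic clock."""

    def now_ms(self):
        return int(time.monotonic() * 1000)


class FakeClock:
    """Test fake: time only moves when the test says so."""

    def __init__(self, start_ms=0):
        self.current_ms = start_ms

    def now_ms(self):
        return self.current_ms


class SessionTimeout:
    """Code under test. The clock is passed in rather than fetched from a
    global singleton, so a test can substitute a fake."""

    def __init__(self, clock, limit_ms):
        self._clock = clock
        self._limit_ms = limit_ms
        self._started_ms = clock.now_ms()

    def expired(self):
        return self._clock.now_ms() - self._started_ms >= self._limit_ms


# In a test: inject the fake and control time directly.
clock = FakeClock(start_ms=1000)
timeout = SessionTimeout(clock, limit_ms=500)
assert not timeout.expired()
clock.current_ms = 1500
assert timeout.expired()

# In production: inject the real clock.
real_timeout = SessionTimeout(SystemClock(), limit_ms=60_000)
assert not real_timeout.expired()
```

With a singleton, the test would have no seam for swapping in the fake, which is why the dependencies had to be untangled before the unit tests could be written.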

I had enough success with this that once I finished my primary work on the feature changes, I embarked on a background project to put the entire codebase into 100% dependency injection. That would allow me to build unit tests for any arbitrary component, in combination with any set of faked dependencies, with the longer-term goal of building out near-100% unit test coverage incrementally.

However, not too long after starting this, I ended up changing jobs. So once again I got the short-term benefit from TDD, but didn't reap the long-term benefit. It was a useful exercise to go through, though, providing good experience on how to migrate such a codebase to TDD.

This is another area that Grenning's course covers.

Related Links

For the perspective of another class participant, see Phillip Johnston's post What I Learned from James Grenning's Remote TDD Course.

There are things about the TDD process that make people suspicious. Is it just hacking? In this interview with Grenning, embedded systems expert Jack Ganssle raises some of those concerns. Grenning explains how the process works to reach the goal of well-designed, working production code that meets customer requirements.

Elecia and Christopher White have a great interview podcast with Grenning. Best joke: how many Scrum masters does it take to manage one developer? Also good Shakespeare and Bradbury quotes that are much ado about programming.

Accuracy Vs. Precision

2018-11-16T04:57:00.000-08:00

This is nothing new, but it's something that needs to be constantly hammered home. It's an important point that can make a critical difference in the behavior of embedded systems interacting with the sloppiness of physics in the real world.

I was reminded of the topic by Elecia White's excellent video Intro to Inertial Sensors: From Taps to Gestures to Location. The inertial sensors that are now common in smartphones and embedded systems are accelerometers, gyroscopes, and magnetometers, possibly integrated into a single Inertial Measurement Unit (IMU).

These amazing devices are MicroElectroMechanical Systems (MEMS), explained in Dejan Nedelkovski's equally excellent video How MEMS Accelerometer Gyroscope Magnetometer Work & Arduino Tutorial.

But working in the digital world with sensor data converted from the analog world poses interesting problems. Some of these are addressed in Jack W. Crenshaw's amazing book Math Toolkit for Real-Time Programming. There is always error in the system to some degree, so you have to be prepared to handle it.

Accuracy and precision are two of those problems, and have been since the dawn of measurement. It's important to understand the distinction between them. They are often confused in informal usage.

A common analogy for understanding them is taken from riflery, showing a shooting target. White includes a version of it in her video. As I learned while earning the Boy Scout riflery merit badge at Resica Falls summer camp lo these many years ago, you want your shots to be tightly grouped together (precision), and you want that group to be on-target, centered around the bull's-eye (accuracy).

The following image is taken from the NOAA article Accuracy Versus Precision, which does a nice job of explaining the difference. I'll briefly restate it here should NOAA scientific information mysteriously disappear from the Web.

Accuracy is how close a measurement is to the true value, how close it is to the bull's-eye. Precision is how closely repeated measurements come to duplicating measured values, how tightly they are grouped.

Not Accurate Not Precise: these are not close to the bull's-eye, so the measurements are not close to the true value, and they are not tightly grouped, so repeated measurements have a lot of difference.

Accurate Not Precise: these are close to the bull's-eye, so the measurements are centered around the true value, but they are not tightly grouped, so repeated measurements range all over the place.

Not Accurate Precise: these are not close to the bull's-eye, so the measurements are not close to the true value, but they are tightly grouped, so repeated measurements are close to each other. From a riflery perspective, this is good, because it means you have control, you just need to adjust your sight to compensate.

Accurate Precise: these are both close to the bull's-eye and tightly grouped. The measurements are on-target, close to the true value, and repeated measurements give close to the same result.
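The same distinction can be put into numbers. In this small sketch (Python, with invented readings and function names of my own), accuracy is summarized as how far the mean of the readings sits from the true value, and precision as the sample standard deviation of the readings. Those are common summaries, not the only possible definitions.

```python
from statistics import mean, stdev


def accuracy_error(readings, true_value):
    """Offset of the average reading from the true value (0 means on-target)."""
    return mean(readings) - true_value


def precision_spread(readings):
    """Sample standard deviation: how tightly repeated readings group."""
    return stdev(readings)


TRUE_VALUE = 100.0

# Illustrative data only.
accurate_not_precise = [90.0, 110.0, 95.0, 105.0, 100.0]
precise_not_accurate = [109.0, 110.0, 111.0, 110.0, 110.0]

for name, readings in (("accurate, not precise", accurate_not_precise),
                       ("precise, not accurate", precise_not_accurate)):
    print(name,
          "error:", round(accuracy_error(readings, TRUE_VALUE), 3),
          "spread:", round(precision_spread(readings), 3))
```

The first set averages out to the true value but scatters widely. The second set is tightly grouped about 10 units high, which is the "adjust your sight" case.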

In an embedded system, you need to characterize and calibrate things. Characterization means understanding how much variation a sensor has in its measurements, how precise it is (which, as White explains, can vary with temperature and barometric pressure, plus humidity, external vibration, external electrical and magnetic fields, external sources of Radio Frequency Interference (RFI), and other factors; man, the real world is a sloppy place!).

Calibration determines how far off the measurements are from the true value and adjusts the values to compensate for the difference.

Meanwhile, the calculations that use the values must be able to handle the accuracy and precision appropriately, along with odd cases such as a true zero value being measured as a small negative value (because the measurement is centered around zero, but may range between small negative and positive limits). Treating values as if they are more accurate or precise than they really are is downright dangerous.
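A minimal sketch of these two ideas follows, in Python for brevity. It assumes the simplest possible calibration, a constant offset measured against a known reference, and one possible policy for the near-zero case: treat small negative readings inside an assumed noise band as zero and reject anything further out as a fault. The function names, the noise band, and the policy itself are illustrative assumptions; a real sensor may need gain, temperature, or nonlinearity compensation as well.

```python
from statistics import mean


def offset_from_reference(readings, true_value):
    """Calibration: average error of the sensor against a known reference."""
    return mean(readings) - true_value


def corrected(raw, offset):
    """Apply the calibration offset to a raw reading."""
    return raw - offset


def clamp_non_negative(value, noise_band):
    """For a quantity that cannot physically be negative: accept small
    negative readings within the noise band as zero, and flag anything
    beyond that as a fault instead of silently using it."""
    if value >= 0.0:
        return value
    if value >= -noise_band:
        return 0.0
    raise ValueError(f"reading {value} is outside the noise band")


# Reference is known to be 10.0; the sensor reads about 0.5 high.
offset = offset_from_reference([10.4, 10.6, 10.5], true_value=10.0)

print(round(corrected(10.5, offset), 3))          # 10.0
print(clamp_non_negative(-0.02, noise_band=0.05)) # 0.0
```

The point of raising an error for large negative values is that downstream calculations never get to treat a bad reading as if it were trustworthy.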

That can lead the embedded system to crash or take inappropriate actions. If it happens to be controlling the flight of an airplane or the operation of a chemical plant, people can be killed and tremendous damage can result. If it happens to be controlling a consumer device, the consequences may be less dire, but can be equally damaging to the company.

There is always error and noise in the system. You have to understand it and how to manage it.

So You Want To Be An Embedded Systems Developer

2018-09-22T06:35:00.000-07:00

Then listen now to what I say.
Just get an electric guitar
and take some time and learn how to play.

Oh, wait, that's a song by the Byrds. But the strategy is the same. Get some information and tools and learn how to use them. No need to sell your soul to the company.

The items I've listed below are sufficient to get you started on a career as an embedded systems developer. There are of course many additional resources out there. But these will arm you with enough knowledge to begin.

I own or have watched every resource and piece of hardware listed on this page. I've either gone through them entirely, or am in the process of doing so. I can vouch for their usefulness in getting you up to speed. It's a firehose of learning.

My personal learning method is to bounce around between multiple books and videos in progress, while spending time hands-on with the hardware. This is similar to a college student juggling multiple classes with labs (without tests, term papers, or due dates!).

Your method may be different. Feel free to approach things in a different order. I offer this in the spirit of sodoto.

What's An Embedded System?

It's a computer that's embedded inside another product, like a car, a microwave, a robot, an aircraft, or a giant industrial machine in a factory; or an IoT device like an Amazon Echo, a Sonos speaker, or a SimpliSafe home security system. You think of the thing as the end product, not as a computer. The computer happens to be one of the things inside that makes it work.

The fascinating thing about embedded systems is that you get to have your hands in the guts of things. The code you write makes a physical object interact with the real world. It's a direct control feedback loop. Working on them is incredibly seductive.

Embedded systems are a multi-disciplinary endeavor. At a minimum they require a mix of electronics and software knowledge. Depending on the particular application (the end product you're building), they may also require some combination of mechanical, materials science, physics, chemical, biological, medical, or advanced mathematical knowledge.

Hobbyist vs. Professional Hardware

There's a wide range of hardware available to learn on, at very reasonable prices. Most of the microcontrollers and boards were originally aimed at the professional community, but advances in technology and falling prices have made them accessible to the hobbyist and educational communities.

Meanwhile, those same advances have enabled hardware aimed directly at the hobbyist and educational communities. Some of that hardware has advanced to the point that it is used in the professional community. So the lines have been blurred.

All of the boards covered here have a variety of hardware interfaces and connectors that allow you to connect up other hardware devices. These are the various sensors, indicators, and actuators that allow an embedded system to interact with the real world.

Two hobbyist/educational platforms are Arduino and Raspberry Pi. For a beginner, these offer a great way to start out. There's an enormous amount of information available on using them from the hobbyist, educational, and maker communities.

I've listed a few books on them below in the Primary Resources, and there are a great many more, as well as free videos and websites. These books tend to be written at a more beginner level than books aimed at professionals.

Arduino is a bare-metal platform, meaning it doesn't run an operating system. An IDE (Integrated Development Environment) is available for free, for writing and running programs on it. You program it with the C and C++ programming languages.

Many of the low-level details are taken care of for you. That's both the strength and the weakness of Arduino.

It's a strength because it offers a quick streamlined path to getting something running. That makes it a great platform for exploring new concepts and new hardware.

It's a weakness because it isolates you too much from the critical low-level details that you need to understand in order to progress beyond the level of beginner.

Those low-level details are the difference between success in the real world and dangerous mediocrity. Dangerous as in you can actually get people killed, so if you want to do this professionally, you need to understand the responsibility you're taking on.

My attitude is to take advantage of that streamlined path whenever needed, and use it to boost yourself into the more demanding work. There are always going to be new pieces of hardware to hook up to an Arduino. I'll always start out at the beginner level learning about them.

In that context, Arduino makes a great prototyping and experimentation platform, without having to worry so much about the low-level details. Then, every bit of knowledge I pick up that way can be carried over to more complex platforms. Meanwhile, Arduino is a perfectly capable platform in its own right.
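To give a flavor of the low-level details that Arduino takes care of for you, here's a minimal sketch in C (C11) of turning an LED on and off by setting and clearing a single bit in a port register. Everything named here is hypothetical: fake_port, LED_BIT, and the function names are made up for illustration, and the register is simulated with an ordinary variable so the code runs on a desktop. On a real microcontroller, the pointer would be a memory-mapped address taken from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped output port register.
   On real hardware this would be a fixed address from the datasheet. */
static volatile uint8_t fake_port = 0x00;

/* Assumption for illustration: the LED is wired to bit 5 of the port. */
#define LED_BIT 5u

/* Compile-time check (C11) that the bit number fits in an 8-bit port. */
static_assert(LED_BIT < 8u, "LED_BIT must fit in an 8-bit port");

/* Set one bit, leaving the other seven pins untouched. */
void led_on(volatile uint8_t *port)
{
    *port |= (uint8_t)(1u << LED_BIT);
}

/* Clear one bit, leaving the other seven pins untouched. */
void led_off(volatile uint8_t *port)
{
    *port &= (uint8_t)~(1u << LED_BIT);
}

/* Read back whether the LED bit is currently set. */
int led_is_on(const volatile uint8_t *port)
{
    return (*port & (1u << LED_BIT)) != 0;
}
```

Calling led_on(&fake_port) and then led_off(&fake_port) changes only bit 5. The read-modify-write pattern (|= to set, &= with an inverted mask to clear) is the kind of detail that a call like Arduino's digitalWrite() hides from you.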

Raspberry Pi is a Linux platform, meaning it is a single-board computer running the Linux operating system. In some ways it is similar to Arduino, in that many low-level details are taken care of for you.

But it is more capable due to more hardware interfaces and the Linux environment. It can operate as a full desktop computer in the palm of your hand. You program it with the Python, C, and C++ programming languages, as well as others. The Linux capability opens up lots of possibilities.

Many of the same arguments for and against Arduino apply to Raspberry Pi. It also offers a great way to learn Linux and its application to embedded systems. It can be used at the beginner level, but also offers greater range to go beyond that.

Professional hardware, aimed at commercial and industrial use, offers the classic embedded systems development experience. This is where you need to be able to dig down to the low levels. These platforms run both bare-metal and with operating systems.

The operating systems tend to be specialized, especially when the application requires true hard real-time behavior, but also include embedded Linux.

Hard real-time means the system must respond to real-world stimulus on short, fixed deadlines, reliably, every time, or the system fails. For instance, an aircraft flight control system that must respond to a sensor input within 100ms, or the plane crashes. Or a chemical plant process control system that must respond to a sensor within 100ms, or the plant blows up and spews a cloud of toxic chemicals over the neighboring city. Or a rocket nozzle control system that must respond to guidance computer input within 50ms or it goes off course and has to be destroyed, obliterating $800 million worth of satellite and launch vehicle.

That's what system failure can mean, and it shows the level of responsibility involved. There are hard real-time systems with less severe consequences of failure, as well as soft real-time systems with looser deadlines and allowable failure cases (such as a smart speaker losing its input stream after 200ms and failing to play music), but it's important to keep in mind what can be at stake.
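As a rough illustration of what a deadline means in code, here's a minimal sketch in C (C11) that measures response time against a deadline using a free-running millisecond tick counter. The names and the 100ms value are assumptions chosen to match the examples above; they're not from any particular system. Note that this only detects a missed deadline after the fact. A real hard real-time design has to guarantee the deadline is met in the first place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical deadline, matching the 100ms figure in the examples above. */
#define RESPONSE_DEADLINE_MS 100u

/* The deadline must be far smaller than the range of the tick counter,
   or the wraparound arithmetic below can't tell late from early. */
static_assert(RESPONSE_DEADLINE_MS < UINT32_MAX / 2u,
              "deadline too large for a 32-bit tick counter");

/* Milliseconds elapsed between two readings of a free-running 32-bit
   tick counter. Unsigned subtraction gives the right answer even if
   the counter wrapped around between the two readings. */
uint32_t elapsed_ms(uint32_t start_tick, uint32_t now_tick)
{
    return now_tick - start_tick;
}

/* True if the response arrived within the deadline. */
bool deadline_met(uint32_t stimulus_tick, uint32_t response_tick)
{
    return elapsed_ms(stimulus_tick, response_tick) <= RESPONSE_DEADLINE_MS;
}
```

The unsigned subtraction is the subtle part. If the stimulus arrives at tick 0xFFFFFFF0 and the response at tick 0x00000010, the counter has wrapped, but the subtraction still yields 0x20 (32 ms).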

If your goal is to work professionally as an embedded systems developer, you need to be able to work with the professional hardware. But don't hesitate to use the hobbyist hardware to give you a leg up learning new things. The broad range of experience from working with all of them will give you great versatility and adaptability.

The Primary Resources

The items listed below are all excellent resources that provide the minimum required knowledge for a beginner, progressing up to more advanced levels. If you already have some knowledge and experience, they'll fill in the gaps.

These are well-written, very practical guides. There's some overlap and duplication among them, but each author has a different perspective and presentation, helping to build a more complete picture.

They also have links and recommendations for further study. Once you've gone through them, you'll have the background knowledge to tackle more advanced resources.

The most important thing you can do is to practice the things covered. This material requires hands-on work to really get it down, understand it, and be able to put it to use, especially if you're using it to get a job.

Whether you practice as you read along or read through a whole book first, invest the time and effort to actually do what it says. That's how you build the skills and experience that will help you in the real world.

Expect to spend a few days to a few weeks on each of these resources, plus a few months additional. While they're mostly introductory, some assume more background knowledge than others, such as information on binary and hexadecimal numbers. You can find additional information on these topics online by searching on some of the keywords.
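If binary and hexadecimal are new to you, here's a small sketch in C that shows how they relate. The function names are my own, made up for illustration. The key idea is that each hexadecimal digit stands for exactly four bits (a nibble), so one byte is always two hex digits.

```c
#include <assert.h>
#include <stdint.h>

/* Write the 8 bits of value into out as '0'/'1' characters, most
   significant bit first, followed by a terminating '\0'.
   out must have room for at least 9 characters. */
void byte_to_binary(uint8_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        /* Test bits from bit 7 down to bit 0. */
        out[i] = (value & (0x80u >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Extract the upper and lower 4-bit halves (nibbles) of a byte. */
uint8_t high_nibble(uint8_t value) { return (uint8_t)(value >> 4); }
uint8_t low_nibble(uint8_t value)  { return (uint8_t)(value & 0x0Fu); }

/* Convert a 4-bit value (0..15) to its hexadecimal digit character. */
char nibble_to_hex(uint8_t nibble)
{
    assert(nibble < 16u);  /* precondition: only 4 bits are valid */
    return "0123456789ABCDEF"[nibble];
}
```

For example, the byte 0xA5 is 1010 0101 in binary: the high nibble 1010 is the hex digit A, and the low nibble 0101 is the hex digit 5.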

Some of the material can be very dense at first, so don't be afraid to go through it more than once. Also, coming back to something after having gone through other things helps break through difficulties.

Looking at this list, it may seem like a lot. Indeed, it is an investment in time and money, some items more than others. But if you think of each one as roughly equivalent to half a semester of a college course once you put in the time to practice the material, this adds up to about two years' worth of focused college education.

That's on par with an Associate degree, or half of a Bachelor's degree. And it will leave you with practical skills that you can put to use on a real job.

These are in a roughly recommended order, but you can go through the software and electronics materials in parallel. You might also find it useful to jump around between different sections of different books based on your knowledge level at the time. Note that inexpensive hardware is listed in the next part of this post, including some of the boards these use.

If you find some of the material too difficult, never fear, back off to the beginner resources. If you find some too simple, never fear, it gets deep. Eventually, it all starts to coalesce, like a star forming deep in space, until it ignites and burns brightly in your mind.

The resources:

You can learn Arduino in 15 minutes.: This is a nice short video that talks about the basics of Arduino microcontroller systems. It helps to start breaking down the terminology and show some of the things involved. That makes it a good introduction to more involved topics. You can also dive down the rabbit hole of endless videos on Arduino, microcontrollers, and electronics from here. This guy's channel alone offers lots of good information.

Hacking Electronics: Learning Electronics with Arduino and Raspberry Pi, 2nd Edition, 2017, by Simon Monk. This is a great beginner-level hands-on book that covers just enough of a wide range of hardware and software topics to allow you to get things up and running, starting from zero knowledge.

Programming the Raspberry Pi: Getting Started with Python, 2nd Edition, 2016, by Simon Monk. This is a nice practical guide to Python on the Raspberry Pi, with much more detail on programming than his Hacking Electronics above. Meanwhile it has less beginner information on hardware. So the two books complement each other nicely.

Programming Arduino: Getting Started with Sketches, 2nd Edition, 2016, by Simon Monk. Similar to his book on Python, but for C on Arduino, also a nice complement to his Hacking Electronics.

Embedded Software Engineering 101: This is a fantastic blog series by Christopher Svec, Senior Principal Software Engineer at iRobot. What I really like about it is that he goes through things at very fine beginner steps, including a spectacular introduction to microcontroller assembly language.

Modern Embedded Systems Programming: This is a breathtakingly spectacular series of short videos by Miro Samek that take you from the ground up programming embedded systems. They're fast paced, covering lots of material at once, including the C programming language, but he does a great job of breaking things down. He uses an inexpensive microcontroller evaluation kit (see hardware below) and the free size-limited evaluation version of the IAR development software suite. He also has a page of additional resource notes. What I really like about this is that in addition to covering a comprehensive set of information with many subtle details, he shows exactly how the C code translates to data and assembly instructions in microcontroller memory and registers. In contrast to Arduino, this is all the low-level details. You will know how things work under the hood after this course (currently 27 videos). Along the way you'll pick up all kinds of practical design, coding, and debugging skills that would normally take years to acquire. Did I mention this course is freakin' awesome?

RoboGrok: This is an amazing complete online 2-semester college robotics video course by Angela Sodemann at Arizona State University, available to the public. Start with the preliminaries page. In addition to some of the basics of embedded systems, it covers kinematics and machine vision, doing hands-on motor and sensor control through a PSoC (Programmable System on a Chip) board. She sells a parts kit, listed below. This is a great example of applied embedded systems.

C Programming Language, 2nd Edition, 1988, by Brian W. Kernighan and Dennis M. Ritchie: C is the primary language used for embedded systems software, though C++ is starting to become common. This is the seminal book on C, extremely well-written, that influenced a generation of programming style and other programming books. The resources listed above all include some basics of C, and this will complete the coverage.

Embedded C Coding Standard, 2018 (BARR-C:2018), by Michael Barr: This will put you on the right track to writing clean, readable, maintainable code with fewer bugs. It's a free downloadable PDF, which you can also order as an inexpensive paperback. Coding standards are an important part of being a disciplined developer. When you see ugly, hard to read code, you'll appreciate this.

Programming Embedded Systems: in C and C++, 1999, by Michael Barr: Even though this is now 20 years old, it's a great technical introduction and remains very relevant. Similar in many respects to Samek's video series, it takes you through the process of familiarizing yourself with the processor and its peripherals, and introduces embedded operating system concepts. There is a later edition available, but this one is available used at reasonable prices.

Programming Arduino Next Steps: Going Further with Sketches, 2nd Edition, 2019, by Simon Monk. This goes deeper into Arduino, covering more advanced programming and interfacing topics. It also includes information on the wide array of third-party non-Arduino boards that you can program with the IDE. This starts to get past the argument that Arduino is just for beginners doing little toy projects.

Making Embedded Systems: Design Patterns for Great Software, 2011, by Elecia White. This is an excellent book on the software for small embedded systems that don't use operating systems (known as bare-metal, hard-loop, or superloop systems), introducing a broad range of topics essential to all types of embedded systems. And yes, the topic of design patterns is applicable to embedded systems in C. It's not just for non-embedded systems in object-oriented languages. The details of implementation are just different.

Exploring Raspberry Pi: Interfacing to the Real World with Embedded Linux, 2016, by Derek Molloy. This goes into significantly more depth on the Raspberry Pi and embedded Linux. It's quite extensive, so is best approached by dividing it into beginner, intermediate, and advanced topics, based on your knowledge level at the moment. Spread out your reading accordingly. It has great information on hardware as well as software, including many details of the Linux environment. Two particularly fascinating areas are using other microcontrollers such as Arduino as slave real-time controllers, and creating Linux Kernel Modules (LKMs).

Make: Electronics: Learning Through Discovery, 2nd Edition, 2015, by Charles Platt. This is hands down the best book on introductory electronics I've ever seen. Platt focuses primarily on other components rather than microcontrollers, covering what all those other random parts on a board do. See Review: Make: Electronics and Make:More Electronics for more information on this and the next book, and Learning About Electronics And Microcontrollers for additional resources.

Make: More Electronics: Journey Deep Into the World of Logic Chips, Amplifiers, Sensors, and Randomicity, 2014, by Charles Platt. More components that appear in embedded systems.

Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems, 2006, by David J. Agans. By now you've found many ways to get into trouble with code and hardware. This is a fantastic book for learning how to get out of trouble. It's a simple read that outlines a set of very practical rules that are universally applicable to many situations, then elaborates on them with real-life examples.

Real-Time Concepts for Embedded Systems, 2003, by Qing Li and Caroline Yao. This is an introduction to the general concurrency control mechanisms in embedded operating systems (and larger-scale systems).

Reusable Firmware Development: A Practical Approach to APIs, HALs, and Drivers, 2017, by Jacob Beningo. This covers how to write well-structured low-level device driver code in a way that you can use on multiple projects. Embedded systems are notorious for having non-reusable low-level code, written in a way that's very specific to a specific hardware design, which can often ripple up to higher levels. That means you have to rewrite everything every time for every project. Good Hardware Abstraction Layers (HALs) and Application Programming Interfaces (APIs) provide a disciplined, coherent approach that allows you to reuse code across projects, saving you enormous amounts of time in development and testing. This also helps you become a better designer, because it encourages you to think in a modular way, starting to think in terms of broader architecture in a strategic manner, not just how to deal with the immediate problem at hand in a tactical manner.

Embedded Systems Architecture, 2018, by Daniele Lacamera. This is a very up-to-date book that uses the popular ARM Cortex-M microcontroller family as its reference platform. That makes it a great complement to Samek's video series, since the TI Tiva C that he uses is an ARM Cortex-M processor. This also goes into more detail on areas such as the toolchain (including debugging with OpenOCD), bootloading, and memory management. It briefly uses the ST STM32F746 Discovery board as an example.

Embedded Systems Fundamentals with Arm Cortex-M based Microcontrollers: A Practical Approach, 2017, by Alexander G. Dean. As the name indicates, this is another detailed book on ARM Cortex-M, intended as a college-level textbook. Among other good practical details, it includes a nice chapter on analog interfacing. It uses the inexpensive NXP FRDM-KL25Z development board for hands-on examples.

TI Tiva ARM Programming For Embedded Systems: Programming ARM Cortex-M4 TM4C123G with C, 2016, by Muhammad Ali Mazidi, Shujen Chen, Sarmad Naimi, and Sepehr Naimi. This is a detailed book that uses the exact same Tiva C board as Samek's video series.

Designing Embedded Hardware: Create New Computers and Devices, 2nd Edition, 2005, by John Catsoulis. This covers the hardware side of things, an excellent complement to White's book. It provides the microcontroller information to complement Platt's books.

Test Driven Development for Embedded C, 2011, by James Grenning. This is a spectacular book on designing and writing high quality code for embedded systems. See Review: Test Driven Development for Embedded C, James W. Grenning for full details. Just as White's book applies concepts from the OO world to embedded systems, Grenning applies Robert C. Martin's "Clean Code" concepts that are typically associated with OO to embedded systems. We'll all be better off for it.

Modern C++ Programming with Test-Driven Development: Code Better, Sleep Better, 2013, by Jeff Langr. This is an equally spectacular book on software development. It reinforces and goes into additional detail on the topics covered in Grenning's book, so the two complement each other well. Even if you don't know C++, it's generally easy enough to follow and the material still applies.

Taming Embedded C (part 1), 2016, by Joe Drzewiecki. This YouTube video is part of the Microchip MASTERs conference series. It covers some of the things that can be risky in embedded code and some methods for avoiding them. This gets into the characteristics that make embedded systems more challenging. I like to watch videos like this at 2X speed initially. Then I go back through sections at normal speed if I need to watch them more carefully.

Interrupt and Task Scheduling - No RTOS Required, 2016, by Chris Tucker. Another MASTERs video, this covers a critical set of topics for working in embedded systems.
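Several of the resources above deal with bare-metal superloop systems and with scheduling tasks without an RTOS. To make that concrete, here's a minimal sketch in C (C11) of one common pattern: a timer interrupt sets a flag, and the main loop uses it to run periodic tasks. This is my own illustration, not code from any of the books or videos, and all the names, periods, and the 1ms tick are assumptions. It's written so it can run on a desktop, with the "interrupt" called as an ordinary function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set by the timer interrupt, cleared by the main loop.
   volatile because it is shared between interrupt and main-loop code. */
static volatile bool tick_flag = false;

/* Hypothetical timer interrupt handler, assumed to fire once per
   millisecond. Keep the ISR short: record the event and return. */
void timer_isr(void)
{
    tick_flag = true;
}

/* Example task bodies; here they just count how often they ran. */
static unsigned fast_task_runs = 0;
static unsigned slow_task_runs = 0;
static void fast_task(void) { fast_task_runs++; }
static void slow_task(void) { slow_task_runs++; }

/* One entry per periodic task. */
typedef struct {
    void (*run)(void);      /* task function */
    uint32_t period_ticks;  /* run every this many ticks */
    uint32_t countdown;     /* ticks remaining until the next run */
} task_t;

#define FAST_PERIOD_TICKS 10u
#define SLOW_PERIOD_TICKS 100u

/* A zero period would make the countdown wrap around. */
static_assert(FAST_PERIOD_TICKS > 0u && SLOW_PERIOD_TICKS > 0u,
              "task periods must be nonzero");

static task_t tasks[] = {
    { fast_task, FAST_PERIOD_TICKS, FAST_PERIOD_TICKS },
    { slow_task, SLOW_PERIOD_TICKS, SLOW_PERIOD_TICKS },
};

/* One pass of the superloop body: if a tick has occurred, advance every
   task's countdown and run the tasks that are due. Returns true if a
   tick was consumed. On a target, main() would call this forever. */
bool scheduler_poll(void)
{
    if (!tick_flag) {
        return false;   /* nothing to do yet */
    }
    tick_flag = false;

    for (unsigned i = 0; i < sizeof tasks / sizeof tasks[0]; i++) {
        if (--tasks[i].countdown == 0u) {
            tasks[i].countdown = tasks[i].period_ticks;
            tasks[i].run();
        }
    }
    return true;
}
```

On real hardware, main() would be little more than for (;;) { scheduler_poll(); }. One limitation of this simple version: if a task runs longer than one tick period, ticks are silently lost, since a flag can only remember one event. Counting ticks in the ISR instead of setting a flag is one way around that.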

Some Advanced Resources

Ready to dig in further and deeper?

µC/OS, The Real-Time Kernel, 1992, by Jean Labrosse. Labrosse decided to write his own real-time operating system when he had trouble getting support for a commercial one he was using. The rest is history. You can hear some of that history in this podcast interview with him, "How Hard Could It Be?". The book not only explains how things work under the hood, it gives you the source code.

µC/OS-III, The Real-Time Kernel for the Texas Instruments Stellaris MCUs, 2010, by Jean Labrosse. This covers the 3rd generation of µC/OS, as well as details on the Stellaris microcontroller covered in Samek's video series. You can also download a free PDF version of this, as well as companion software. The µC/OS-II and other books are also available there. The value in getting multiple versions is to see how the software evolved over time.

Software Engineering for Embedded Systems: Methods, Practical Techniques, and Applications, 2nd edition, 2019, edited by Robert Oshana and Mark Kraeling. This is a broad survey of topics by various authors (Labrosse wrote the chapter on real-time operating systems).

Some Hardware

The items listed below include some of the inexpensive boards and evaluation kits used in the resources above. There are a bazillion microcontroller boards out there that are useful for learning how to work on embedded systems. It's worth getting some from different vendors so you can learn their different microcontrollers, different capabilities, and different toolchains.

That also helps you appreciate the importance of abstracting low-level hardware differences in writing your code. Each vendor provides a range of support tools as part of the package.

Note that large vendor websites can be a pain, because they want you to create an account with profile, asking questions like your company name (call it "Independent"), what your application is, how many zillion parts you expect to order, when you expect to ship your product, etc. They're set up for industrial use, not hobbyist/individual use. They also may work through distributors like Mouser or Digi-Key for shipping and orders. Just roll with it!

The hardware:

Arduino Uno - R3, $22.95. This is the board used in the Arduino video listed above. There's also a wide array of "shields" available, external devices that connect directly to the board. Exploring these is one of the great educational values that Arduino offers. Remember that because Arduino takes care of many of the details for you, you can be up and learning about new devices faster. Then you can take that knowledge and apply it to other boards. You can also download the Arduino IDE there.

Raspberry Pi 3 - Model B+, $35. This is an updated version of the boards used in Simon Monk's books above. You will also need the 5V 2.5A Switching Power Supply with 20AWG MicroUSB Cable, $7.50, and the 8GB Card With full PIXEL desktop NOOBS - v2.8. You may also want the Mini HDMI to HDMI Cable - 5 Feet, $5.95, and the Ethernet Hub and USB Hub w/ Micro USB OTG Connector, $14.95. These are sufficient to connect it to a monitor, keyboard, and mouse, and use it as a desktop Linux computer.

Texas Instruments MSP430F5529 USB LaunchPad Evaluation Kit, $12.99 (16-bit microcontroller). An evaluation kit is a complete ready-to-use microcontroller board for general experimentation. Christopher Svec uses this kit in his blog series above, where he also covers using the free downloadable development software. If you buy directly from the TI site, register as an "Independent Designer".

Texas Instruments Stellaris LaunchPad Evaluation Kit, was $12.99 (32-bit microcontroller). This is the kit Miro Samek started out with in lesson 0 of his video series above. However, as he points out at the start of lesson 10, TI no longer sells it, and has replaced it with the Tiva C LaunchPad, which is an acceptable replacement (see next item below).

You might be able to find the Stellaris offered by third-party suppliers. But you have to be careful that you're actually going to get that, and not the Tiva C kit, even though they list it as Stellaris. I now have two Tiva C boards because of that, one that I ordered directly from TI, and one that was shipped when I specifically ordered a Stellaris from another vendor.

Fortunately, that doesn't matter for this course, but it highlights one of the problems you run into with embedded systems: vendors change their product lines and substitute products (sometimes it's just rebranding existing products with new names, which appears to be what TI did here). That can be confusing and annoying at the least, and panic-inducing at the worst, if something you did in your project absolutely depends on a hardware feature of the original product that's not available on the replacement.

One of the design lessons you should learn is to future-proof your projects and try to isolate hardware-specific features so that you can adapt to the newer product when necessary.
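Here's a minimal sketch in C of what isolating hardware-specific features can look like. It's my own illustration under stated assumptions, not taken from any of the books above: board_hal_t, signal_error, and the fake board are hypothetical names. The idea is that application code talks only to a small interface, and each board supplies its own implementation behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hardware abstraction: the application sees only this
   interface, never a specific board's registers. */
typedef struct {
    void (*led_write)(bool on);
} board_hal_t;

/* Application logic, written once against the interface. */
void signal_error(const board_hal_t *hal, unsigned blink_count)
{
    assert(hal != NULL && hal->led_write != NULL);
    for (unsigned i = 0; i < blink_count; i++) {
        hal->led_write(true);
        hal->led_write(false);
    }
}

/* A fake "board" that runs on a desktop and records what happened.
   A real port would instead supply a function that writes to that
   board's GPIO registers. */
static unsigned fake_led_on_count = 0;
static bool fake_led_state = false;

static void fake_led_write(bool on)
{
    if (on && !fake_led_state) {
        fake_led_on_count++;   /* count each off-to-on transition */
    }
    fake_led_state = on;
}

const board_hal_t fake_board = { fake_led_write };
```

With this structure, moving to a replacement board means writing a new board_hal_t for it, while signal_error() and the rest of the application stay untouched. The fake board has a second benefit: you can exercise the application logic on your desktop before the hardware ever arrives, which is the approach that books like Beningo's and Grenning's develop in much more depth.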

Texas Instruments Tiva C TM4C123G LaunchPad Evaluation Kit, $12.99 (32-bit microcontroller). This is TI's replacement for the Stellaris LaunchPad, which you can use with Miro Samek's video series. Samek addresses the replacement issue at the beginning of lesson 10. The good news is that he says the Tiva C is equivalent to the Stellaris (apparently all TI did was rename the product), so it's usable for the course. You'll notice that some parts of the toolchain (the software you use to develop the software for the board, in this case the IAR EWARM size-limited evaluation version) still refer to it as TI Stellaris.

The specific TI device on the board is the TM4C123GH6PM, so when you set the EWARM Project->Options->General Options->Device, you can select TexasInstruments->TM4C->TexasInstruments TM4C123GH6PM, not the LM4F120H5QR that's on the Stellaris board. However, Samek shows that you can continue to use the toolchain configured for Stellaris.

That's one of those details that can be maddening when vendors swap parts around on you. Getting it wrong can produce subtle problems, because some things may work fine (you selected a device variant very similar to the one you need), but others won't. Welcome to the world of embedded development! Small details matter. The alphabet soup and sea of numbers in the product names can also drive you batty and be a source of mistakes. PAY CLOSE ATTENTION!

A related detail: the file lm4f120h5qr.h that Samek supplies in his projects for working with the Stellaris board's processor also works with the Tiva C board's processor. However, there is also a TM4C123GH6PM.h file for the Tiva processor. Both files are in the directory C:\Program Files (x86)\IAR Systems\Embedded Workbench 8.2\arm\inc\TexasInstruments (or whichever version of EWARM you have).

You can copy them to your project directory, or have the compiler use that directory as an additional include directory by selecting Project->Options->C/C++ Compiler and clicking the ... button next to the "Additional include directories:" box.

STMicroelectronics STM32F746 Discovery Board, $54 (ARM Cortex-M7 microcontroller). This is used briefly in Daniele Lacamera's book above. It's relatively expensive compared to the other evaluation kits here, but includes a 4.3" LCD capacitive touch screen and other hardware elements, making it a much more capable platform, and still an outstanding value.

NXP Semiconductor FRDM-KL25Z Freedom Development Board, $15 (ARM Cortex-M0+ microcontroller). This is the board that Alexander Dean uses in his book above.

uC32: Arduino-programmable PIC32 Microcontroller Board, $34 (Microchip PIC32 32-bit processor). This isn't covered specifically by any of the resources above, but the PIC32 microcontroller is a popular family that offers a different hardware environment. This is programmable using the Arduino IDE, and can also be programmed using Microchip's MPLAB IDE.

Adafruit Parts Pal, $19.95. This is a small general parts kit for working with the various boards above. It includes LEDs, switches, resistors, capacitors, simple sensors, a small breadboard, and jumper wires for interconnecting things, plus a few other interesting items.

RoboGrok parts kit, $395. This is the parts kit for Angela Sodemann's course above. While you can gather the parts yourself for less, she saves you all the work of doing that, and buying her kit is a nice way of compensating her.

Extech EX330 Autoranging Mini Multimeter, $58. There are also a bazillion multimeters out there. This is one reasonable mid-range model. The multimeter is a vital tool for checking things on boards.

One of the following logic analyzers. A logic analyzer is an incredibly valuable tool that allows you to see complex signals in action on a board. They used to cost thousands of dollars and need a cart to roll them around. These are miraculously miniaturized versions that fit in your pocket, at a price that makes them a practical, must-have personal tool. They plug into your USB port and are controlled via free downloadable software you run on your computer:

Saleae Logic 8 Logic Analyzer, 8 D/A Inputs, 100 MS/s, $199 after the awesome "enthusiast/student" discount of $200, using a discount code that you can request and apply to your cart when checking out, thanks guys! This is also covered briefly in Svec's blog series. You can play with the software in simulation mode if you don't have an analyzer yet.

Digilent Analog Discovery 2, 100MS/s USB Oscilloscope, Logic Analyzer and Variable Power Supply, Pro Bundle, $299. As amazing as the Saleae is, this one adds oscilloscope, power supply, and signal generator functions, combining a number of pieces of equipment into one tiny package. They also have academic discounts for those who qualify (36% discount on the base unit).

For a full shopping list to equip a personal electronics lab, see the Shopping List heading at Limor Fried Is My New Hero. That page also has many links to resources on how to use the tools.

Glossaries

It can be a bit maddening as you learn the vocabulary, with lots of terms, jargon, and acronyms being thrown around as if you completely understood them. As you get through the resources, the accumulation of knowledge starts to clarify things. Sometimes you'll need to go back and reread something once you get a little more information.

Barr Group Embedded Systems Glossary.
Maxim Integrated Glossary Of Electrical Engineering Terms.
Byte Craft Glossary Of Embedded Systems Terminology.
Embedded Artistry Glossary.

Other Links

These sites have articles and links useful for beginners through advanced developers.

Embedded.fm: podcast and blogs by Elecia White, Christopher White, Andrei Chichak, and Chris Svec.
Embedded Artistry Resources For Beginners.
Quantum Leaps: Miro Samek's website, including his free downloadable book Practical UML Statecharts in C/C++, 2nd Edition: Event-Driven Programming for Embedded Systems, which is just as spectacular as his video series.
The Ganssle Group: Jack Ganssle is a legendary pioneer in the embedded systems field.
Barr Group: Barr is another expert in the field.
Better Embedded System SW: Prof. Philip Koopman, Carnegie Mellon University, course notes and blog posts. His book Better Embedded System Software is another good resource; you can get it half-off at this site.
Wingman Software: James Grenning's website.
Langr Software Solutions: Jeff Langr's website.

Final Thought

Our society is becoming more and more dependent on embedded systems and the various backend and support systems they interact with. It's our responsibility as developers to build security in to make sure that we're not creating a house of cards ready to collapse at any moment. Because people's lives can depend on it.

If you think I'm overstating that, see Bruce Schneier's new book. We are the ones on the front lines.

Learning About Electronics And Microcontrollers

2018-09-22T02:26:00.000-07:00

Tutorial books: to read beginning to end. "Make: Electronics" is the best introduction to electronics I've ever seen in any form, a must read!

Reference books: to jump around and get more details as necessary.

If you're new to electronics and microcontrollers and want to learn about them, here's my recommended reading list, in order (Amazon links)

Tutorial books:

Reference books:

There's been an evolutionary leap forward in the affordability and ease of use of microcontrollers as a result of inexpensive open-source hardware and free open-source software.

These have removed many of the traditional roadblocks that made learning to use microcontrollers daunting. Working with them is now within reach of everyone from elementary school children to adults.

Microcontrollers are excellent platforms for learning to code, because they allow you to interact directly with the physical world. Making the hardware do what your code told it to do is very satisfying.

Tutorial Books

First Two Books

These cover basic electronics, and make an outstanding starting point. You can read my review of them to see why I like them so much (as well as information on components kits for the experiments in the first book).

They don't focus on microcontrollers. Instead, they focus on the other parts that surround microcontrollers, as well as projects that don't need microcontrollers.

If you're impatient to get on to the microcontroller information, save Make: More Electronics until later. But you should definitely start with Make: Electronics no matter what.

Third Book

This briefly covers some of the same basics as the first two, then covers some other useful basics. The first two and this one complement each other very well.

It then gets into working with Arduino and Raspberry Pi boards. It includes a number of simple projects for working with external modules and sensors.

Arduino is programmed in C/C++ on "bare metal", i.e. without an operating system, and Raspberry Pi is programmed in Python on embedded Linux, so the two illustrate the variety of programming and runtime environments for embedded work.

I found this to be a nice gentle introduction to the practicalities of working with microcontrollers and the huge array of third-party modules available. While it only skims the surface of a vast topic, it makes an excellent jumping off point for learning about embedded systems.

Reference Books

First Three Books

These gather in one place information from a wide array of resources on how to use a wide array of electronic components. They focus on practical concerns rather than theory, and are illustrated with the same excellent color diagrams and photos as Make: Electronics.

For each component, they contain the following sections: What It Does, How It Works, Variants, Values, How To Use It, and What Can Go Wrong.

Fourth Book

This is like multiple smaller books bound into one. It starts with an extensive chapter on theory and related math. The authors point out that much of the math throughout the book is simply to prove the theory, so if you're not interested in that level of detail, you can skip over it.

The remainder of the book covers a broad range of devices, providing both theory and practical material. It has a chapter on microcontrollers that makes a good follow-up to Hacking Electronics.

Using The Books

The tutorial books are meant to be read from beginning to end as you tinker with their projects. They are easy reading with hands-on experiments, where each chapter builds on previous material.

The reference books are meant to be read here and there, jumping around as you need more details on a specific topic.

Even though some of the topics are duplicated between all the books, each author and each book has a different perspective. Each has a different emphasis and presentation.

They complement each other to give a more complete picture because one author may delve deeper into details that another glosses over. You may prefer one author's explanation over another's. No single resource is ever able to give the whole story, so it helps to have multiple perspectives.

These books will give you a good foundation so that you'll be able to understand other books and resources.

If you're interested in doing embedded systems software development, see So You Want To Be An Embedded Systems Developer.

Electronics Suppliers

There are two outstanding suppliers of discrete electronics, microcontrollers, tools, modules, sensors, and breakout boards that cater to the small-scale needs of hobbyists, students, and experimenters:

You can read my paean to Adafruit at Limor Fried Is My New Hero, which includes the shopping list for setting up my small-scale electronics lab. For a simple example of using this equipment, see First Use Of New Tools.

Several suppliers work at industrial scale but also supply at small scale (do you need 10 pieces, or 10 million?):

Digi-Key Electronics
Mouser Electronics
Newark element14 (for the curious, element 14 is Silicon (chemical symbol Si), a major element in electronics)
McMaster-Carr: not for electronics, but for all other mechanical parts, supplies, raw materials, and tools.

Supplier Learning Resources

All these suppliers have extensive online learning resources. However, the industrial suppliers don't have much for the absolute beginner; they're good once you've built up some background knowledge.

Adafruit Learn and SparkFun Learn resources are in both written and video form. You can scan through videos for a quick overview pass by setting the speed in the YouTube window settings (the gear icon) to 2x, then come back and watch at normal speed for a second pass.

There's a lot of duplication between them (and between these resources and the books above), but it's useful to see how different people approach the same topics. Just like reading books by different authors, they provide additional perspectives to help fill in the gaps.

Both sites can be a bit overwhelming to dig through, so I've selected a number of beginner resources below, organized by supplier and then type of resource. Many of them have links to additional information.

These Adafruit videos by Collin Cunningham cover basic electronics lab skills:

Soldering and Desoldering: how to solder components together properly, and how to pull them apart for salvage and rework.
Surface Mount Soldering: how to solder surface-mount components.
Multimeters: how to use a meter for basic measurements.
Oscilloscopes: how to use an oscilloscope for advanced measurements and waveforms.
Hand Tools: the basic hand tools used for assembling and disassembling electronics.
Schematics: how to read schematics (no, they're not Greek!).
Breadboards and Perfboards: how to combine the parts on a schematic into a functioning circuit.
Ohm's Law: understanding the relationship between voltage, current, and resistance.

He also has these videos on the basics of various components:

Batteries: the basics of using batteries to supply DC power to projects.
Solar Cells: using solar cells to keep the batteries charged.
Power Supplies: using an AC power supply to supply DC power to projects.
Pulse Width Modulation: using a PWM converter to change a DC input voltage to a lower effective DC voltage, or as a simple digital-to-analog converter (DAC).
Switches: understanding the different types of switches for manually controlling projects.
The Transistor
The Capacitor
The Diode
The Inductor
The Resistor
The LED
The Integrated Circuit (IC)
The Arduino

These are Adafruit written guides by various contributors:

These SparkFun videos by Shawn Hymel cover basic electronics lab skills:

How to Use a Multimeter: how to use a digital multimeter (DMM) to make basic measurements.
How to Use a Power Supply: how to use a bench power supply unit (PSU) to power a project instead of batteries.
How to Use an Oscilloscope: how to use an oscilloscope to look into circuit operation.

He also has these videos on electronics basics:

These are SparkFun written guides by various contributors:

Another great YouTube resource is Dave Jones' EEVblog.

Review: Test Driven Development for Embedded C, James W. Grenning

2018-08-14T14:49:00.000-07:00

The TL;DR:

Test Driven Development for Embedded C by James W. Grenning is an outstanding book.
The title says C, but if you work in C, C++, C#, Go, Objective-C, Java, JavaScript, or anything else, this is worth reading.
It says embedded, but if you work in embedded systems, front end web apps, mobile apps, desktop apps, backend servers, or anything else, this is worth reading.
And it's not just TDD, it's all the concepts that go into good design.
Get it, read it, USE it. You won't regret it.

Background

I first learned about XP (eXtreme Programming) concepts in 2007, when I was introduced to Kent Beck's Test-Driven Development: By Example. I used TDD (Test-Driven Development) to develop a major component on a server system. I learned more in 2013, when I read Michael Feathers' Working Effectively With Legacy Code. I used that to apply TDD to an existing server codebase.

Over the past 3 months, I've been on a reading binge, triggered by reading Robert C. Martin's 2017 book Clean Architecture: A Craftsman's Guide to Software Structure and Design. I have an hour-long commuter rail ride, so I have lots of time to read and work on my laptop, plus a little lunchtime reading, and I always have a book open at home.

I read his Clean Code: A Handbook of Agile Software Craftsmanship, The Clean Coder: A Code of Conduct for Professional Programmers, and am currently in the middle of his Agile Software Development, Principles, Patterns, and Practices.

I read Sandro Mancuso's The Software Craftsman: Professionalism, Pragmatism, Pride, and am in the middle of Mike Cohn's Agile Estimating and Planning, both from Martin's series.

I read Andrew Hunt and David Thomas' The Pragmatic Programmer: From Journeyman to Master, and am halfway through Pete McBreen's Software Craftsmanship: The New Imperative; Martin Fowler's Refactoring: Improving the Design of Existing Code is waiting on the shelf.

I've encountered bits and pieces of this material over the years, but this was a chance to go back to primary sources, get the full details and parts I've missed out on, and really understand them. I highly recommend it.

Review

But maybe you don't have time for all that. Maybe you'd like to cut to the chase and see how to apply their principles in practice.

Test Driven Development for Embedded C by James W. Grenning does that. It draws from many of those sources and more, showing you real-world examples to put them into practice.

Grenning is one of the original authors of the Agile Manifesto (as are Beck, Fowler, Hunt, Martin, and Thomas). He contributed the chapter "Clean Embedded Architecture" to Clean Architecture, and is the inventor of the Agile planning poker estimation method.

The book was published in 2011, so is now 7 years old, but it remains as timely as ever. That's especially true as IoT vastly expands the number of embedded systems that we rely on in our daily lives. Effective testing is critically important. For instance, see Testing Is How You Avoid Looking Stupid.

If you work on embedded systems in C, this is a must read.

If you work in a different language besides C, or on a different type of system than embedded systems, you may not think that a book on embedded C programming applies to you. But it's broadly applicable and worth reading.

The book is organized as an introductory chapter, the remaining chapters grouped into 3 parts, and appendices. I see it as three distinct portions, plus appendices: Chapter 1; Parts I and II (chapters 2-10); and Part III (chapters 11-15).

Throughout, Grenning addresses the common concerns people have with applying TDD to embedded systems. Embedded systems are a particular challenge, with particular target system constraints, so people might be skeptical.

This is a very hands-on, how-to book. I've included a number of lists from it, including those Grenning draws from other sources, because they illustrate the practical, pragmatic, disciplined approach. You can use this as a cheat sheet to remember them after you've read the book.

It might be tempting to think you can get by just with the information I've provided here and skip the book. But I've included it specifically with the hope that you'll realize you must read the book, and that it will be a worthwhile investment.

First Portion

This is the motivational portion, the appetizer. Grenning introduces TDD, its benefits in general, and the specific benefits for embedded systems.

He lists Kent Beck's TDD microcycle:

Add a small test.
Run all the tests and see the new one fail, maybe not even compile.
Make the small changes needed to pass the test.
Run all the tests and see the new one pass.
Refactor to remove duplication and improve expressiveness.

The microcycle is critically important to the technique, so Grenning reminds you of it several times as he works through examples. This is what makes TDD effective, and I know from my own experience that it's also what makes it fun and extremely satisfying. He has a sidebar titled "Red-Green-Refactor and Pavlov's Programmer", which is very apt. That Pavlovian drive to take the next step in the cycle draws you into the zone and keeps you cranking.

For embedded systems, in addition to all the benefits that apply to other types of software, the primary benefits include being able to develop tested, working code when the target hardware isn't available; being able to test off-target (i.e. not on the target embedded system), where you have all the benefits of a general-purpose system and none of the constraints of an embedded one, including speed of development turnaround cycle; being able to isolate hardware/software interactions; and decoupling software from hardware dependencies.

That last point is part of the Big Lesson (see below) from all this. TDD in general, for any type of software, results in testable and tested software. But more than that, it drives development in a way that improves the design significantly.

That improved design means a much longer and happier life for the software and the systems that use it. They will be able to adapt to changes much more easily. It's not just about getting V1.0 done. It's about getting to V10.0.

In Software Craftsmanship, Pete McBreen starts off with the origin of the term software engineering. It was coined by a 1967 NATO study group working on "the problems of software." A 1968 NATO conference identified a software crisis and suggested that software engineering was the best way out of that crisis. They were concerned with very large defense systems. McBreen gives the example of the SAFEGUARD Ballistic Missile Defense System, developed from 1969 through 1975.

He says, "These really large projects are really systems engineering projects. They are combined hardware and software projects in which the hardware is being developed in conjunction with the software. A defining characteristic of this type of project is that initially the software developers have to wait for the hardware, and then by the end of the project the hardware people are waiting for the software. Software engineering grew up out of this paradox."

McBreen is questioning the value of that style of large-scale software engineering in the development of commercial products, suggesting that a different approach is needed.

But doesn't that situation sound familiar? Doesn't that sound like the problem embedded systems developers face all the time, that Grenning is addressing? This was a situation where TDD and off-target testing could have significantly alleviated the software crisis.

Granted, it was more complicated, since they were also developing the very processors and programming languages they would use, while modern systems rely on COTS (Commercial Off The Shelf) processors and languages. But we see that this has been a pervasive problem for some 50 years.

All types of systems, from embedded to frontend mobile apps to high-scale backend servers, in all those languages, from C to C++, Objective-C, Go, Java, JavaScript, etc., can benefit.

All that code can be removed from its normal production environment and run off-target, off-platform, in a unit test environment that allows you to exercise every code path you want easily and quickly. That includes the obscure dark corners of the code trying to handle unusual error cases that are hard to produce on the target system.

For some of my own experience testing off-target, see Off-Target Testing And TDD For Embedded Systems.

Second Portion

This portion is the meat of the book, applying TDD to real-world embedded development and going through the mechanics with practical examples.

Following the lead of Martin's book, Grenning makes restrained use of UML diagrams. While some people dislike UML because they associate it with the heavyweight BDUF (Big Design Up Front) software engineering methodologies that McBreen was talking about, this is a very effective use of it that communicates information quickly. Which is the whole point of UML.

Grenning presents two unit test harnesses, Unity and CppUTest (of which he is one of the authors). All of the material applies just as well to other test harness tools, such as Google Test/Google Mock. It's equally applicable to other languages and their language-specific test harnesses.

He uses Gerard Meszaros' Four-Phase Test pattern to structure tests:

Setup: Establish the preconditions to the test.
Exercise: Do something to the system.
Verify: Check the expected outcome.
Cleanup: Return the system under test to its initial state after the test.
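
To make the four phases concrete, here is a minimal sketch in plain C. It uses the standard assert macro rather than Unity or CppUTest, and the saturating counter module (Counter_Reset, Counter_Increment, Counter_Get, COUNTER_MAX) is a hypothetical stand-in I made up for illustration, not code from the book.

```c
#include <assert.h>

/* Hypothetical module under test: a counter that saturates at COUNTER_MAX. */
#define COUNTER_MAX 3u

static unsigned counter_value;

void Counter_Reset(void) { counter_value = 0u; }

void Counter_Increment(void)
{
    if (counter_value < COUNTER_MAX) {
        counter_value++;
    }
}

unsigned Counter_Get(void) { return counter_value; }

/* One test, laid out in the four phases. */
void test_counter_saturates_at_max(void)
{
    unsigned i;

    /* Setup: establish the preconditions. */
    Counter_Reset();

    /* Exercise: do something to the system. */
    for (i = 0u; i < COUNTER_MAX + 2u; i++) {
        Counter_Increment();
    }

    /* Verify: check the expected outcome. */
    assert(Counter_Get() == COUNTER_MAX);

    /* Cleanup: return the system under test to its initial state. */
    Counter_Reset();
}
```

Call test_counter_saturates_at_max() from main() or from your test runner. In a real harness the Setup and Cleanup phases usually move into the harness's per-test setup and teardown hooks, so each test body is left with just Exercise and Verify.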

The rubber meets the road in his five examples of using TDD to develop embedded code:

LED driver
Light scheduler for a home automation system
Circular buffer
Flash driver for ST Microelectronics 16 Mb flash memory device
OS isolation layer (aka OSAL, OS Abstraction Layer) for Linux/POSIX, Micrium uC/OS-III, and Win32 (this is actually an appendix and only covers thread control, but establishes the pattern)

Clearly, these have real hardware dependencies on both the processor I/O interface and the attached devices, as well as the system clock, and real OS dependencies. Those are critical concerns for the embedded developer. The LED driver is very simple behavior, so makes for a gentle introduction. The others are more complex.

Grenning discusses driver requirements, then shows the initial tests and code. Notice I said tests first. That's an important concept in TDD. You always write the test first, one that uses the code in the way that you want the code to work. Then you write the code that satisfies that usage, and repeat for the next test and bit of code. He emphasizes the save-make-run cycle that you go through repeatedly during this process. That's how you make fast progress.

The key concept is faking out portions of the system, so that the Code Under Test (CUT) can run as if it were running on the real system. That's critical for making TDD work off-target and off-platform. There are several strategies for doing this. In the case of the LED driver, he uses virtual registers to simulate memory-mapped I/O. A virtual register is simply a variable under the control of the test suite.
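
Here is a rough sketch of the virtual register idea, under some assumptions of my own: 16 LEDs numbered 1 to 16, one per bit of a 16-bit register, with a set bit meaning "on". The function names are illustrative and the code is not taken from the book. The driver only ever sees a pointer, so it can't tell whether that pointer refers to real memory-mapped hardware or to a variable in the test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver: 16 LEDs, one per bit of a 16-bit register.
 * LED numbers run from 1 to 16; a set bit means the LED is on. */
static volatile uint16_t *leds_register;

void LedDriver_Create(volatile uint16_t *address)
{
    leds_register = address;
    *leds_register = 0x0000u;          /* start with every LED off */
}

void LedDriver_TurnOn(unsigned led)
{
    *leds_register |= (uint16_t)(1u << (led - 1u));
}

void LedDriver_TurnOff(unsigned led)
{
    *leds_register &= (uint16_t)~(1u << (led - 1u));
}

/* The "virtual register": an ordinary variable owned by the test. */
static uint16_t virtual_leds;

void test_led_driver_with_virtual_register(void)
{
    virtual_leds = 0xFFFFu;            /* pretend the hardware powers up dirty */

    LedDriver_Create(&virtual_leds);
    assert(virtual_leds == 0x0000u);   /* Create must turn everything off */

    LedDriver_TurnOn(1u);
    LedDriver_TurnOn(9u);
    assert(virtual_leds == 0x0101u);   /* bits 0 and 8 set */

    LedDriver_TurnOff(1u);
    assert(virtual_leds == 0x0100u);   /* only bit 8 remains */
}
```

On the target, the same driver would be handed the real register address from the board's memory map instead of &virtual_leds. Nothing in the driver changes, which is what lets the test run off-target.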

He also talks about test-driving the interface before test-driving the internals. That's another critical concept, integral to the whole design process. That's design-for-change. Because things will change. A product with a long, useful life, that represents an ongoing revenue stream for a company, will change over that time to adapt to changes in underlying technologies, user requirements, and usage. TDD means you can make changes without fear of breaking things (because you'll find and fix breakage as a result of performing the microcycle).

He talks about the strategy of incremental progress and refactoring as you go. This is in the heat of development. Final code does not flow directly from your fingertips. It evolves in incremental steps as you work. Did you ever look at someone's code and marvel at how clean and easy to follow it was, despite the complexity of the job it was achieving? You might think you could never do something that easily. This process results in that kind of code. Like a novelist in the heat of writing a scene, the first draft is never the final product, and the story arc evolves over time.

This is where he covers several important guidelines for driving the TDD process effectively. He lists Robert Martin's Three Laws of TDD:

Do not write production code unless it is to make a failing unit test pass.
Do not write more of a unit test than is sufficient to fail, and build failures are failures.
Do not write more production code than is sufficient to pass the one failing unit test.

He describes Kent Beck's snappy acronym DTSTTCPW: Do The Simplest Thing That Could Possibly Work, which initially means just faking it (for instance, hard code a function to return false in order to get the test that uses it to pass). Then keep tests small and focused, and refactor on green (many unit test setups show a failing result in red, and a passing result in green).

As this evolves, the faked out code turns into real code (the hard coded false is changed to actual code that does something and returns true or false under the appropriate conditions). That builds out a verified test suite as it builds out verified code.
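
As a small illustration of that progression, here is a sketch using a hypothetical fixed-capacity buffer (the Buffer_ names and the capacity of 2 are my own invention). The comment shows the hard-coded first step, and the code shows what it might look like once a second test has forced real logic into place.

```c
#include <assert.h>
#include <stdbool.h>

/* Step 1 (fake it). The first test only asks whether a new buffer is full,
 * so the simplest thing that could possibly work is a constant:
 *
 *     bool Buffer_IsFull(void) { return false; }
 *
 * Step 2 (make it real). A second test fills the buffer, the constant makes
 * that test fail, and only then does real logic replace it, as below. */

#define BUFFER_CAPACITY 2u

static unsigned buffer_count;

void Buffer_Reset(void) { buffer_count = 0u; }

bool Buffer_IsFull(void) { return buffer_count == BUFFER_CAPACITY; }

bool Buffer_Put(int value)
{
    (void)value;                /* storage left out: no test demands it yet */
    if (Buffer_IsFull()) {
        return false;
    }
    buffer_count++;
    return true;
}

/* The first test: passes with the fake and still passes with the real code. */
void test_new_buffer_is_not_full(void)
{
    Buffer_Reset();
    assert(!Buffer_IsFull());
}

/* The second test: this is the one that forces the fake to become real. */
void test_buffer_is_full_after_capacity_puts(void)
{
    bool first, second, full, third;

    Buffer_Reset();
    first = Buffer_Put(10);
    second = Buffer_Put(20);
    full = Buffer_IsFull();
    third = Buffer_Put(30);

    assert(first && second);
    assert(full);
    assert(!third);             /* a full buffer rejects further puts */
}
```

Note that Buffer_Put still doesn't store anything. In the same spirit, that stays faked until a test reads a value back and demands it.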

This leads to the TDD State Machine, which tells you what to do next. The guidelines above and the state machine take you through the mechanics of working in the TDD style. They answer the questions:

How should you start?
What should you do next?
How do you know when you're done?

Whenever you write some production code, ask yourself, "Do you have a test for that?". If not, stop, go back, and write the missing test.

He also covers Dave Thomas and Andrew Hunt's DRY principle: Don't Repeat Yourself. This mantra helps drive the refactoring so that you keep the code lean and clean. I'll throw in additionally the DAMP principle: use Descriptive And Meaningful Phrases, a concept the book applies without calling out by name. This favors readable function and variable names that express intent over cryptic abbreviations and syntax. The result is code that reads with a narrative flow.

Keeping your code DRY and DAMP makes it easy for others to understand and modify (which might be you when you come back to it six months or a year later). This is the same as Beck's microcycle step 5.

To some degree this all turns TDD into a very mechanistic process. But that's a good thing. It's not a random, ad hoc process where you're constantly questioning yourself about what to do. Instead it's an orderly stepwise process that makes effective progress. You quickly see and appreciate the value.

It's also very fun and satisfying, because that mechanistic aspect actually drives your creativity. What's the next thing you can add to it? What's the next test, the next bit of functionality? When you finish, you feel like you've accomplished something, and you have the evidence to prove it. It's addicting.

That leads to Grenning's Embedded TDD Cycle, which starts with TDD on the development system, then advances to the target processor and eval hardware, then the actual target hardware:

Stage 1: Write a unit test; make it pass; refactor. This is red-green-refactor, the TDD microcycle on the development platform.
Stage 2: Compile unit tests for target processor. This is a build check that verifies toolchain compatibility.
Stage 3: Run unit tests on the eval hardware or simulator.
Stage 4: Run unit tests on target hardware.
Stage 5: Run acceptance tests on target hardware. These are automated and manual tests of the integrated system.

This sequence gives you confidence in the code under test quickly; then you can address any hardware-dependent issues that start to arise, such as compiler, library, or primitive data type differences. Next you start exercising the hardware-dependent code.

Testing separately on eval hardware and actual target hardware helps shake out hardware issues in the actual target, since the eval hardware is presumably known good. One of the challenges in embedded development is always trying to determine if problems are due to the software or due to the hardware, since both are in active development and haven't had much soak time to prove them out.

For the other TDD examples, Grenning goes through a progression of different collaborator strategies. These are the test doubles, the fakes, that are substitutable for real components. They stand in for those components to break the test dependencies and allow you to simulate and monitor interactions. An important point is that they are much lighter weight than full-scale simulators. Full simulators can themselves require significant development. These fakes have only enough behavior to support the tests (part of the DTSTTCPW mindset).

He uses these types of doubles:

Spies
Stubs
Mocks
Exploding fakes
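
To give a feel for the first of these, here is a minimal sketch of a spy. It assumes a hypothetical scheduler that calls LightController_On(id) to switch a light on. The spy supplies that function in the test build, records what it was asked to do, and lets the test query the record afterwards. All names and signatures here are illustrative assumptions, not the book's code.

```c
#include <assert.h>

/* Values the spy reports; all names here are hypothetical. */
enum { LIGHT_ID_UNKNOWN = -1, LIGHT_STATE_UNKNOWN = -1, LIGHT_STATE_ON = 1 };

static int spy_last_id = LIGHT_ID_UNKNOWN;
static int spy_last_state = LIGHT_STATE_UNKNOWN;

/* Spy implementation of the interface the scheduler depends on.
 * In the test build this replaces the real light controller. */
void LightController_On(int id)
{
    spy_last_id = id;
    spy_last_state = LIGHT_STATE_ON;
}

void LightControllerSpy_Reset(void)
{
    spy_last_id = LIGHT_ID_UNKNOWN;
    spy_last_state = LIGHT_STATE_UNKNOWN;
}

int LightControllerSpy_LastId(void)    { return spy_last_id; }
int LightControllerSpy_LastState(void) { return spy_last_state; }

/* Code under test (hypothetical): turn a light on at its scheduled minute. */
void Scheduler_WakeUp(int minute_now, int scheduled_minute, int light_id)
{
    if (minute_now == scheduled_minute) {
        LightController_On(light_id);
    }
}

void test_light_turns_on_at_scheduled_minute(void)
{
    LightControllerSpy_Reset();
    Scheduler_WakeUp(1200, 1200, 3);
    assert(LightControllerSpy_LastId() == 3);
    assert(LightControllerSpy_LastState() == LIGHT_STATE_ON);
}

void test_light_untouched_before_scheduled_minute(void)
{
    LightControllerSpy_Reset();
    Scheduler_WakeUp(1199, 1200, 3);
    assert(LightControllerSpy_LastId() == LIGHT_ID_UNKNOWN);
    assert(LightControllerSpy_LastState() == LIGHT_STATE_UNKNOWN);
}
```

The spy has just enough behavior to support these two tests. It knows nothing about real lights, which is what keeps it so much cheaper than a simulator.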

He goes through the following substitution methods, showing how to do them and discussing when they are appropriate:

Link-time substitution
Function pointer substitution
Preprocessor substitution
Combined link-time and function pointer substitution
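
Function pointer substitution is the easiest of these to show in a single file. The sketch below assumes a hypothetical time service reached through a global function pointer. The test swaps in a fake clock, exercises the code, and restores the original pointer. The names and the night-time rule (before 06:00 or from 22:00) are made up for illustration.

```c
#include <assert.h>

/* Production default (hypothetical). */
static int RealGetMinuteOfDay(void)
{
    return 0;   /* placeholder: production code would read a real-time clock */
}

/* The substitution point: callers go through this pointer. */
int (*TimeService_GetMinuteOfDay)(void) = RealGetMinuteOfDay;

/* Code under test: night is before 06:00 or from 22:00 onward. */
int Scheduler_IsNight(void)
{
    int minute = TimeService_GetMinuteOfDay();
    return (minute < 6 * 60) || (minute >= 22 * 60);
}

/* Test double: a clock the test can set to any time it likes. */
static int fake_minute;
static int FakeGetMinuteOfDay(void) { return fake_minute; }

void test_is_night_with_fake_clock(void)
{
    /* Save the production function and substitute the fake. */
    int (*saved)(void) = TimeService_GetMinuteOfDay;
    TimeService_GetMinuteOfDay = FakeGetMinuteOfDay;

    fake_minute = 23 * 60;
    assert(Scheduler_IsNight());

    fake_minute = 12 * 60;
    assert(!Scheduler_IsNight());

    /* Restore, so later tests see the production function again. */
    TimeService_GetMinuteOfDay = saved;
}
```

The trade-off is a small runtime cost and a mutable global pointer in production code. Link-time substitution avoids both by swapping whole object files in the test build, but that needs multiple files and build support, so it doesn't fit in a short listing.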

These are fully-worked-out examples, although he starts omitting intermediate steps as he progresses in the interest of brevity. All the code is available online.

Third Portion

This portion completes the meal, complementing the meat in the second portion. It addresses design issues. This is important because design for testability also means design for flexibility and long product life.

Grenning starts out with Martin's SOLID principles:

S: Single Responsibility Principle (SRP)
O: Open Closed Principle (OCP)
L: Liskov Substitution Principle (LSP)
I: Interface Segregation Principle (ISP)
D: Dependency Inversion Principle (DIP)

He covers both how the previous chapters have incorporated these principles, and how to use them to guide the development process. TDD is closely intertwined with them.

Don't be put off by the apparent difference between non-object-oriented and object-oriented languages. The specific language used is irrelevant. The syntactic mechanics may be different, but the concerns and concepts are all the same. C can be every bit as object-oriented as Java; it just takes a little more developer discipline. That means that all of the concepts of the various principles above apply.

He uses the SOLID principles in four module design models of increasing complexity, applicable in different embedded system design cases:

Single-instance module: Encapsulates a module's internal state when only one instance of the module is needed.
Multiple-instance module: Encapsulates a module's internal state and lets you create multiple instances of the module's data.
Dynamic interface: Allows a module's interface functions to be assigned at runtime.
Per-type dynamic interface: Allows multiple types of modules with the same interface to have unique interface functions.

You'll probably recognize more than one of these in the systems you work on. You may also recognize object-oriented concepts, and in fact he shows how to implement, use, and test a C++-style virtual function table (vtable) in C.
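
For readers who haven't seen a vtable built by hand, here is one possible shape for a per-type dynamic interface in C. The light driver types and all of the names are hypothetical, and this is a sketch of the general technique rather than Grenning's implementation. Each driver type shares one constant table of function pointers, and callers go through the table without knowing the concrete type.

```c
#include <assert.h>

typedef struct LightDriver LightDriver;

/* The "vtable": one set of function pointers shared by every instance of a type. */
typedef struct {
    void (*turn_on)(LightDriver *self);
    void (*turn_off)(LightDriver *self);
} LightDriverInterface;

struct LightDriver {
    const LightDriverInterface *vtable;
    int is_on;
};

/* Callers use these and never need to know the concrete type. */
void LightDriver_TurnOn(LightDriver *self)  { self->vtable->turn_on(self); }
void LightDriver_TurnOff(LightDriver *self) { self->vtable->turn_off(self); }

/* Type 1: a simple on/off driver. */
static void Simple_TurnOn(LightDriver *self)  { self->is_on = 1; }
static void Simple_TurnOff(LightDriver *self) { self->is_on = 0; }
static const LightDriverInterface simple_interface = { Simple_TurnOn, Simple_TurnOff };

void SimpleDriver_Init(LightDriver *self)
{
    self->vtable = &simple_interface;
    self->is_on = 0;
}

/* Type 2: same interface, but it also counts how often the light is switched on. */
typedef struct {
    LightDriver base;       /* must be the first member so the cast below is valid */
    int on_count;
} CountingDriver;

static void Counting_TurnOn(LightDriver *self)
{
    CountingDriver *driver = (CountingDriver *)self;
    driver->on_count++;
    self->is_on = 1;
}
static const LightDriverInterface counting_interface = { Counting_TurnOn, Simple_TurnOff };

void CountingDriver_Init(CountingDriver *self)
{
    self->base.vtable = &counting_interface;
    self->base.is_on = 0;
    self->on_count = 0;
}

void test_two_driver_types_share_one_interface(void)
{
    LightDriver simple;
    CountingDriver counting;
    LightDriver *drivers[2];
    int i;

    SimpleDriver_Init(&simple);
    CountingDriver_Init(&counting);
    drivers[0] = &simple;
    drivers[1] = &counting.base;

    for (i = 0; i < 2; i++) {
        LightDriver_TurnOn(drivers[i]);     /* dispatches through the vtable */
        assert(drivers[i]->is_on == 1);
    }
    assert(counting.on_count == 1);         /* only the counting type counted */

    for (i = 0; i < 2; i++) {
        LightDriver_TurnOff(drivers[i]);
        assert(drivers[i]->is_on == 0);
    }
}
```

A test double is then just one more type with its own table, which is how this design model and the substitution techniques from the second portion fit together.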

Part of good design is adapting to change. He covers Martin Fowler's concepts of refactoring, both the code smells that point to things that need to be refactored, and the strategies for doing it with TDD. He describes a disciplined stepwise process that avoids burning bridges.

This then leads into Michael Feathers' concepts of working on legacy code (which Feathers defines as "code without tests"). He lists Feathers' legacy code change algorithm:

Identify change points.
Find test points.
Break dependencies.
Write tests.
Make changes and refactor.

He describes how to apply this to embedded systems. Two important types of unit tests during this process are characterization tests that establish how the legacy code behaves, and learning tests that help you learn how to work with third-party code.
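
A characterization test can be as simple as the sketch below. The legacy routine is invented for illustration. The point is that the expected values come from running the existing code and writing down what it actually does, including behavior that looks wrong, so that any later change in behavior shows up as a test failure.

```c
#include <assert.h>

/* Hypothetical legacy routine with no tests and no documentation. */
int Legacy_PercentToDuty(int percent)
{
    if (percent > 100) {
        percent = 100;
    }
    return percent * 255 / 100;     /* note: negative input is not clamped */
}

/* Characterization test: pins down current behavior, right or wrong. */
void test_characterize_percent_to_duty(void)
{
    assert(Legacy_PercentToDuty(0) == 0);
    assert(Legacy_PercentToDuty(50) == 127);    /* truncates rather than rounds */
    assert(Legacy_PercentToDuty(100) == 255);
    assert(Legacy_PercentToDuty(150) == 255);   /* clamped at the top */
    assert(Legacy_PercentToDuty(-10) == -25);   /* surprising, but it's what the code does today */
}
```

Whether the negative case is a bug is a separate question. The characterization test just makes sure that if you fix it, you do so on purpose.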

The final chapter covers test patterns and antipatterns. This is useful for helping to build good, effective unit tests that are maintainable over the long term.

The Big Lesson

For embedded systems, working with the specific hardware is a critical detail. But as Martin points out in Clean Architecture, it's just a detail. For GUI-based mobile, web, and desktop apps, the GUI is just a detail. For either of these, as well as backend servers, the OS (or lack thereof on a bare metal system) is just a detail. The network is just a detail. The database or the filesystem is just a detail. The frameworks or third-party packages are just details.

All of those details, critical though they may be, can be isolated and segregated from the code that defines what it is your system is about. That code is called the business logic, which sounds a little too dry for me. But it's the stuff that makes your system something that other people want to use. So it's the stuff that makes your system drive a meaningful business.

Your business logic interacts with all those details to make a functioning system. TDD allows you to test that logic, in all its happy, twisty, and unhappy paths, separated from its dependencies on the details. The details are represented by test doubles: dummies, stubs, spies, mocks, and fakes.

This is where the Gang of Four's concept of programming to an interface, not an implementation, stated in their book Design Patterns, comes into play. You write your business logic to work to an interface to accomplish the detail interactions. In the production environment, you use the real detail components, the real implementations, with a thin adaptation layer that conforms to the interfaces.

In the test environment you can substitute test doubles that conform to the interfaces; these are alternate implementations. Since you're in control of the test doubles, you can drive any scenario you need to in order to exercise the business logic.

That isolation also allows you to substitute in other versions of production details, so it's a design strategy, not just a testability strategy. Maybe you want to use some different hardware in your embedded system, or run your app on a different mobile device with a different GUI, or deploy the system on a different OS, or use a different database.

By defining your details as abstract data types or abstract services, you can drop in replacements, with just the effort of implementing the interface layers.

Off-Target Testing And TDD For Embedded Systems

2018-07-21T13:38:00.000-07:00

I've recently started reading things by James Grenning (Wingman-sw.com), one of the authors of the Agile Manifesto. My interest in his work relates to Test-Driven Development (TDD) for embedded systems.

A copy of his book Test Driven Development for Embedded C is currently winging its way to me. His site links to a webinar he gave last summer, Test-Driven Development for Embedded Software, that makes a great introduction to the topic.

I found one of his answers on Quora interesting. The question was: Can I perform a unit test when creating C firmware for ARM Cortex-M MCUs? The answer, of course, is yes. Specifically, testing can be done off-target (i.e. not on the target embedded system).

I wrote a long comment on the answer, and decided it might make an interesting blog post. So the remainder of this post reproduces it substantially as it appears there, with some cleanup. He very kindly asked if I would be interested in adding it to his Stories From The Field.

My Three Stories

I can offer three anecdotes that show why I give a big thumbs up to off-target testing. Off-target testing puts you on target!

The first case was back in 1995. I had recently transferred to the DEChub group at Digital Equipment Corporation to work on networking equipment.

They had a problem with their popular DECbridge 90 product, an office- or departmental-scale stackable Ethernet bridge running an in-house custom RTOS on Motorola 68K, all written in C. It would run for weeks or months at a customer site, then suddenly crash. That would throw the LAN into a tizzy as it went through a Spanning Tree Protocol (STP) reconfiguration event. Then the LAN would do it again once the bridge came back up and advertised its links.

So it could be very disruptive to the local network and everyone running on it, completely unpredictable. No one had been able to reproduce the problem in the lab.

I was tasked with finding and fixing it. This platform had very little in the way of crash dump and debug support, and software update was done by burning and replacing an EPROM. It did have an emulator pod, so that was how most debugging was done.

The problem here was the long run time between failures. That made trying to collect useful information from repeated test runs, either real or via emulator, impractical to the point of impossibility.

The one clue we knew from the crash log was that it was an OOM condition (Out Of Memory). The question was why. Other than supporting STP, which is a bit of complex behavior, a bridge is a pretty simple device, just L2 forwarding. Packet comes in, look it up in the bridge tables, forward it out the appropriate interfaces.

The key dynamic structure was the MAC address table. A bridge is a learning device in that it learns which MAC addresses are attached to which links. It builds up the table as it runs, learning the local network topology and participating in STP. So this table was certainly a prime suspect, but it had capacity for thousands of entries, yet it was crashing in LANs with only tens or hundreds of nodes.

The table used a B-tree implementation that was public-domain software from some university. We speculated that it was a memory leak in either the B-tree itself, or our interfacing to it.

So I pulled out the B-tree code and built a test program for it that would go through tens of thousands of adds and deletes in various simple patterns. This is similar to the type of test fixture that Brian Kernighan and Rob Pike later talked about in their book The Practice Of Programming.

I ran this off-target, on a VAX/VMS. VMS supported a simple Ctrl-T key in the terminal interface that would show the process memory consumption, similar to what the Linux ps command shows. The turnaround time on playing with this setup was minutes, build and run, with the full support of an OS to help me chase things down, including good old printf logging and customized dumping of data structures (VMS also had a debugger similar to gdb).

With this I could see that under some patterns, memory consumption was monotonically increasing. So yeah, a memory leak. Further exploration allowed me to home in on the right area of the code.

It was right there in the B-tree memory release code: it would free the main B-tree nodes (the main large data element being managed), but not the associated pointer nodes that it used for bookkeeping. So on every B-Tree node release, it would leak 8 bytes of memory.
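To make the shape of that test concrete, here's a minimal sketch in C++ of the same idea. It is not the original B-tree code (that was C, and isn't reproduced here). CountingHeap, Table, run_leak_test, and the 64-byte main node size are all hypothetical stand-ins; only the 8-byte pointer node comes from the story. The point is just to run many add/delete cycles and see whether the bytes in use come back to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-in for the heap: hands out block IDs and tracks bytes in use,
// playing the role that Ctrl-T on VMS played in the story.
class CountingHeap {
public:
    int alloc(std::size_t bytes) {
        int id = next_id_++;
        blocks_[id] = bytes;
        in_use_ += bytes;
        return id;
    }
    void release(int id) {
        auto it = blocks_.find(id);
        assert(it != blocks_.end());  // catch double frees
        in_use_ -= it->second;
        blocks_.erase(it);
    }
    std::size_t in_use() const { return in_use_; }
private:
    int next_id_ = 0;
    std::size_t in_use_ = 0;
    std::map<int, std::size_t> blocks_;
};

// Hypothetical table: each entry owns a main node plus a pointer node.
class Table {
public:
    Table(CountingHeap& heap, bool free_pointer_nodes)
        : heap_(heap), free_pointer_nodes_(free_pointer_nodes) {}
    void add(int key) {
        if (entries_.count(key)) return;
        entries_[key] = Blocks{heap_.alloc(kNodeBytes), heap_.alloc(kPointerBytes)};
    }
    void remove(int key) {
        auto it = entries_.find(key);
        if (it == entries_.end()) return;
        heap_.release(it->second.node);
        if (free_pointer_nodes_) heap_.release(it->second.pointer);  // the fix
        entries_.erase(it);
    }
private:
    struct Blocks { int node; int pointer; };
    static constexpr std::size_t kNodeBytes = 64;    // assumed size
    static constexpr std::size_t kPointerBytes = 8;  // leak size from the story
    CountingHeap& heap_;
    bool free_pointer_nodes_;
    std::map<int, Blocks> entries_;
};

// Runs add/delete cycles and returns the bytes still in use afterwards.
std::size_t run_leak_test(bool free_pointer_nodes, int cycles, int keys) {
    CountingHeap heap;
    Table table(heap, free_pointer_nodes);
    for (int c = 0; c < cycles; ++c) {
        for (int k = 0; k < keys; ++k) table.add(k);
        for (int k = 0; k < keys; ++k) table.remove(k);
    }
    return heap.in_use();
}
```

With the pointer-node release switched off, run_leak_test(false, 1000, 10) reports 8 bytes lost for each of the 10,000 deletes. Switched on, it reports zero.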

This was a case of a very slow memory leak that only manifested with lots of table changes. In a customer environment, it could take a long time to chew through the memory. In the lab running on-target, it was even slower, since we didn't know what the cause was, so we didn't know how to trigger and reproduce it.

Off-target, it took less than an hour to find. Code change: add two lines to free the pointer nodes. This was after many man-weeks of effort for all the people involved in trying to reproduce and chase down the problem, plus all the aggravation caused at customer sites. Ten minutes to code, build, and verify the fix.

The second case was just recently. I implemented an FSM based directly on the book Models to Code: With No Mysterious Gaps, by Leon Starr, Andrew Mangogna, and Stephen J. Mellor (backed up by Executable UML: A Foundation for Model Driven Architecture, by Stephen J. Mellor and Marc J. Balcer, and Executable UML: How to Build Class Models by Leon Starr; I highly recommend the trio of books). Thank you, Leon, Andrew, Stephen, and Marc!

I was also reading Robert C. Martin's Clean Architecture: A Craftsman's Guide to Software Structure and Design and Clean Code: A Handbook of Agile Software Craftsmanship at the time, which heavily influenced the work (and finally motivated me to fully commit to TDD; Grenning contributed the chapter "Clean Embedded Architecture" to Clean Architecture). Mellor and Martin are both additional Agile Manifesto authors.

A product of all this reading, the FSM was a hand-built and -translated version of the MDA (Model-Driven Architecture) approach, in C on a PIC32 running a bare-metal superloop.

The FSM performed polymorphic control of cellular communication modules connected via a UART. The modules use the old Hayes modem "AT" command set to connect to the cell network and perform TCP/IP communications.

It was polymorphic because it had to support 4 different modules from 2 different vendors, each with their own variation of AT commands and patterns of asynchronous notifications (URC's, Unsolicited Result Codes).

If you think LANs and WANs are squirrelly, just wait till you try cellular networks. I could hardly get two test runs to repeat the same path through the FSM. Worse, there were corner cases that the network would only trigger occasionally.

It was horribly non-deterministic. How can I be sure I've built the right thing when I can't stimulate the system to produce the behavior I want to exercise?

The solution: build a nearly-full-coverage test suite to run off-target. I built a trivial simulator with fake system clock and UART ISR that I ran on an Ubuntu VM on my Mac. That gave me full support for logging and gdb.

This wasn't quite TDD, but it was one step away: it was Test-After Development, and instead of Google Test/Mock or some other framework, I built my own ad-hoc fakes and EXPECT handling.

With this I was able to create scenarios to test every path in the FSM, for all the module variants. Since I had control of the fake clock and ISR, I could drive all kinds of timing conditions and module responses. It did help that the superloop environment was pure RTC (Run To Completion, which coincidentally is required for Executable UML state machines), rather than preemptive multitasking/multithreading. But I could have faked that as well if necessary.
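Here's a rough sketch of what that kind of simulator can look like, written in C++ for brevity even though the real code was C. It is not the actual FSM. FakeClock, FakeUart, ModemFsm, the single AT/OK exchange, and the 1000 ms timeout are all assumptions made up for illustration. The test plays the role of the ISR by injecting received lines, and it owns the clock, so a timeout that takes a second on-target takes no time at all here.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>

// Fake system clock: time moves only when the test says so.
struct FakeClock {
    std::uint32_t now_ms = 0;
    void advance(std::uint32_t ms) { now_ms += ms; }
};

// Fake UART: the test stands in for the ISR by queuing received lines.
struct FakeUart {
    std::deque<std::string> rx;  // lines "received" from the module
    std::deque<std::string> tx;  // commands the FSM sent
    void inject(const std::string& line) { rx.push_back(line); }
};

// Toy run-to-completion FSM: send "AT", then wait for "OK" or time out.
class ModemFsm {
public:
    enum class State { Idle, WaitingForOk, Ready, Failed };
    ModemFsm(FakeClock& clock, FakeUart& uart) : clock_(clock), uart_(uart) {}
    State state() const { return state_; }

    // One superloop pass; each call runs to completion without blocking.
    void step() {
        switch (state_) {
        case State::Idle:
            uart_.tx.push_back("AT");
            deadline_ms_ = clock_.now_ms + kTimeoutMs;
            state_ = State::WaitingForOk;
            break;
        case State::WaitingForOk:
            if (!uart_.rx.empty()) {
                std::string line = uart_.rx.front();
                uart_.rx.pop_front();
                if (line == "OK") state_ = State::Ready;
                else if (line == "ERROR") state_ = State::Failed;
                // Anything else is treated as a URC and ignored.
            } else if (clock_.now_ms >= deadline_ms_) {
                state_ = State::Failed;
            }
            break;
        case State::Ready:
        case State::Failed:
            break;
        }
    }
private:
    static constexpr std::uint32_t kTimeoutMs = 1000;  // assumed value
    FakeClock& clock_;
    FakeUart& uart_;
    State state_ = State::Idle;
    std::uint32_t deadline_ms_ = 0;
};

// Scenario: a URC arrives ahead of the OK and must not derail the FSM.
void scenarioOkAfterUrc() {
    FakeClock clock; FakeUart uart; ModemFsm fsm(clock, uart);
    fsm.step();
    assert(uart.tx.front() == "AT");
    uart.inject("+CREG: 1");
    fsm.step();
    assert(fsm.state() == ModemFsm::State::WaitingForOk);
    uart.inject("OK");
    fsm.step();
    assert(fsm.state() == ModemFsm::State::Ready);
}

// Scenario: the module never answers; step the clock up to the deadline.
void scenarioTimeout() {
    FakeClock clock; FakeUart uart; ModemFsm fsm(clock, uart);
    fsm.step();
    clock.advance(999);
    fsm.step();
    assert(fsm.state() == ModemFsm::State::WaitingForOk);
    clock.advance(1);
    fsm.step();
    assert(fsm.state() == ModemFsm::State::Failed);
}
```

Each scenario repeats exactly, every run, which is the whole attraction compared to waiting for the cell network to misbehave on cue.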

I was able to fix several bugs that would have been hell to reproduce and debug on-target. In all cases, just as with the B-tree, the code changes were trivial. The time-consuming and hard part is always the debug phase to figure out what's going wrong and what needs to be changed. Doing the actual changes is usually simple.

That debug phase is where non-TDD approaches run into trouble, especially when they have to be done on-target. It can consume unbounded amounts of development time. The time required to do TDD is far shorter, and for a significant number of problems can either completely eliminate the debug phase, or narrowly direct it to the specific code path of a failing test.

The third case was this past week, when I did my first true TDD off-target thing for some embedded code. The platform is embedded Linux on ARM, so full OS, with cross-compiled C++.

I built the code in my Ubuntu VM and used Google Test/Mock, mocking out the file I/O (standard input stream and file output) and system clock. The code wasn’t particularly complex, but it did have a corner case dealing with a full buffer that represented the greatest bug risk.

I used very thin InputStreamInterface, OutputFileInterface, and ClockInterface classes as the OSAL (Operating System Abstraction Layer) to provide testability (thank you, Robert and James!).
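As a sketch of what such a thin layer can look like: the three interface names below are the ones I used, but the method signatures, the BufferedRecorder component, and the hand-rolled fakes are assumptions invented for this example (the real tests used Google Test/Mock, which is left out here so the sketch stands alone). The component only ever sees the interfaces, so a test can hand it fakes and poke at the full-buffer corner case directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Interface names are from the post; the method signatures are assumptions.
class InputStreamInterface {
public:
    virtual ~InputStreamInterface() = default;
    virtual bool readLine(std::string& line) = 0;  // false when input is exhausted
};
class OutputFileInterface {
public:
    virtual ~OutputFileInterface() = default;
    virtual void write(const std::string& text) = 0;
};
class ClockInterface {
public:
    virtual ~ClockInterface() = default;
    virtual std::uint64_t nowSeconds() = 0;
};

// Hypothetical component under test: timestamps lines, flushes when full.
class BufferedRecorder {
public:
    BufferedRecorder(InputStreamInterface& in, OutputFileInterface& out,
                     ClockInterface& clock, std::size_t capacity)
        : in_(in), out_(out), clock_(clock), capacity_(capacity) {}

    void run() {
        std::string line;
        while (in_.readLine(line)) {
            buffer_.push_back(std::to_string(clock_.nowSeconds()) + " " + line);
            if (buffer_.size() == capacity_) flush();  // full-buffer corner case
        }
        flush();  // whatever is left at end of input
    }
private:
    void flush() {
        if (buffer_.empty()) return;  // nothing pending: no empty write
        std::string batch;
        for (const auto& entry : buffer_) batch += entry + "\n";
        out_.write(batch);
        buffer_.clear();
    }
    InputStreamInterface& in_;
    OutputFileInterface& out_;
    ClockInterface& clock_;
    std::size_t capacity_;
    std::vector<std::string> buffer_;
};

// Hand-rolled fakes standing in for the real stream, file, and clock.
struct FakeInput : InputStreamInterface {
    std::vector<std::string> lines;
    std::size_t next = 0;
    bool readLine(std::string& line) override {
        if (next >= lines.size()) return false;
        line = lines[next++];
        return true;
    }
};
struct FakeOutput : OutputFileInterface {
    std::string written;
    int writes = 0;
    void write(const std::string& text) override { written += text; ++writes; }
};
struct FakeClock : ClockInterface {
    std::uint64_t now = 100;
    std::uint64_t nowSeconds() override { return now++; }
};

// Input that fills the buffer exactly should produce one write, not two.
void testExactlyFullBuffer() {
    FakeInput in; FakeOutput out; FakeClock clock;
    in.lines = {"a", "b"};
    BufferedRecorder recorder(in, out, clock, 2);
    recorder.run();
    assert(out.writes == 1);
    assert(out.written == "100 a\n101 b\n");
}
```

On the target, the same component would get real implementations of the three interfaces wrapping the OS calls.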

It was gloriously wonderful and liberating to build it TDD style, red-green-refactor, and I knew I had all the paths in the code covered, including the unusual ones. That instills great confidence in what I did. No more worrying whether I got everything right. I was able to demonstrate that I did.

Did it take a little extra time? Sure, mostly because I’m still on the learning curve getting into the TDD flow. But if I hadn’t used TDD and this code had produced failures, it would take me longer after the fact to chase down the bug. Plus I was able to avoid the impact on all the other people in the organization affected by the development turnaround cycle.

And just today, I added more behavior to that component using the TDD method. I was able to work fully confident that I wasn't breaking any of the stuff I had already done, and just as confident in the new code.

So I'm definitely a believer in off-target testing, and from now on I'll be doing it TDD.

Another benefit of this off-target TDD model? Working out of that Ubuntu VM on my Mac, I'm totally portable. I can work anywhere, at a coffee shop, on the train commuting, at the airport, on a plane, at home. I can be just as productive as if I had my full embedded development environment in front of me. Then once I'm back at my full environment, I have tested, running code ready to go.

For reference, these are the books that taught me TDD while in different jobs, both highly recommended:

Test Driven Development: By Example, by Kent Beck (yet another Agile Manifesto author). I was introduced to the book and TDD in general by new coworker Steve Vinoski in 2007, whose cred in my eyes went way up when I noticed his name in the acknowledgements of James O. Coplien's Advanced C++ Programming Styles and Idioms.
Working Effectively With Legacy Code, by Michael C. Feathers. Amazon tells me I bought this in 2013. At the time I used it to start adding unit test coverage to our codebase at work. What makes this book particularly useful is the fact that nearly all software development requires working with legacy code to some degree, even on brand new projects. It also helps you avoid creating a legacy of code that future developers will curse.

Review: Make: Electronics and Make: More Electronics

2018-05-22T18:28:00.000-07:00

Charles Platt's Make: Electronics and Make: More Electronics.

If you're interested in learning electronics, I highly recommend these two books by Charles Platt. They are hands down the best books I have ever seen on the subject, spectacular resources for the beginner.

Rather than focusing on theory, Platt jumps right into hands-on experimentation. The books are organized as a series of experiments and circuit-building projects that build knowledge incrementally.

He calls it "learning by discovery". He then follows up with just enough theory to explain what's going on. This is an extremely effective method that avoids getting bogged down.

I first learned electric theory in high school science. I learned Ohm's Law and basic circuit layout, including the equations for computing series and parallel resistance. I learned further details in college physics.

But these didn't really cover the practical details of electronics. They didn't address detailed circuit design, combining components into useful projects.

I started to learn some of those details from the books of Forrest M. Mims III and George Young's book Digital Electronics: A Hands-on Learning Approach. The latter introduced me to integrated circuit chips (IC's) and digital logic, as well as breadboard experimentation.

That constituted the bulk of my electronics knowledge for the past 35 years. But there was still a lot missing, particularly an intuitive understanding of electronics and all those other random parts surrounding the IC's.

Then I found Platt's books. Platt has a real gift for explaining things at an intuitive level in just a few concise paragraphs and clear diagrams.

He delves deep into the practical details. No detail is too small. For a beginner trying to learn from a book, this is critical.

He explains the most basic things so that you know how to wire up a breadboard and check things with a meter. He shows how things work internally, both mechanically and electrically, so a component isn't just an opaque black box.

The color diagrams are outstanding. One thing I really like is the way he steps from a circuit schematic diagram, to a breadboard-friendly schematic, to a breadboard component and wiring diagram, to a component value diagram, to an under-the-covers diagram illustrating all the electrical paths in the wiring and the breadboard connections hidden beneath.

He takes several projects from breadboard to final soldered board built into a simple enclosure. This shows how you can turn your experiment into a completed useful or fun gadget.

The diagrams and project builds really show where other books fall short. Most books show a schematic, and maybe a completed breadboard or a completed wired-up project. But they don't show the stepwise process to get from the start to the end.

That process is not always obvious and is full of opportunities for mistakes, so having it laid out in detail is a huge benefit. He also covers some of the things that can go wrong and how to diagnose and fix them.

The clarity of the diagrams and overall layout make the books very readable. This is another improvement over other books.

If you send a registration email to Platt, he'll add you to his email list for a bonus project and book updates.

Microcontrollers

Platt doesn't emphasize microcontrollers in these books. In an age where Arduinos, Raspberry Pi's, and other microcontrollers allow you to solve nearly any problem with a little embedded software, he mostly shows you what you can do without them. One experiment does cover using an Arduino.

He also discusses the pros and cons of replacing the discrete components with microcontrollers in several experiments. This is actually very useful from an engineering standpoint, giving you choices in how to implement things.

That also ensures that if you do incorporate microcontrollers into your projects, you understand how to integrate them with external components. There are many books on getting started with microcontrollers, but they tend to gloss over the details of those other components, assuming you already understand them. Which you will if you read these books!

Component Kits

Component kits for all the experiments in the first book are available online. You can certainly gather parts on your own, but the kits offer you one-stop shopping of the correct parts.

I used ProTechTrader, the supplier he recommends in his email, and I recommend them highly based on my experience. Make sure you get the kits for the 2nd edition.

The kits are available for the best price directly from the ProTechTrader website. They offer 3 kits, covering experiments 1-11, 12-24, and 25-34. Each is available in regular and deluxe versions. The deluxe versions add things like a digital multimeter, soldering iron, 9V power supply, and upgraded magnet.

I purchased the regular version of each kit, since I already had most of the deluxe items, with free economy 3-10-day shipping. The kits arrived in 3 days.

While you pay a little extra per part for convenience vs. buying everything separately, it was well worth it. The parts are extremely well organized. They're bagged and labeled by value, stored in compartmentalized containers, and identified by experiment.

Don't underestimate the value of the labor that went into that. Platt dedicates several pages in his book to organization of workspace and parts. That's key to efficient work. Rummaging around in a box of loose parts will make you tear your hair out.

Additional Books

Platt's Make: Tools and 3-volume Encyclopedia of Electronic Components.

Platt has several additional books that make useful companions to this pair.

The first book (no, it's not about how to make tools, it's just part of the Make: series) covers the basic hand and small power tools you'll find at home centers and hardware stores, showing how to use them to build small projects. It features Platt's usual deep attention to practical details.

The book contains a number of simple projects in wood and plastic. The methods for working with plastic are particularly noteworthy, because while there are many books about woodworking, there aren't many about plastic.

These are the skills you need to build different styles of enclosures and stands for your electronics projects, and can also be applied to other mechanical aspects such as robotics.

The remaining books are a 3-volume encyclopedia of electronic components. This is all the information that he didn't have room for in the other books, plus more. Where those books were written as tutorials, this is a reference set.

He's compiled a vast trove of information culled from manufacturer data sheets, tutorials, reference books, and other sources to create a centralized, practical one-stop resource.

Need to know pinouts, sample circuits, voltage levels, alternative packages? You can find them here, in Platt's signature level of detail.

How To Ace Calculus

2018-04-29T09:16:00.004-07:00

This is the method I used to ace 3 semesters of calculus. Also linear algebra, differential equations, and a semester of physics. It should work for any math or science class, at an earlier or later level.

When I say ace, I mean getting a grade of 100 on most homework, quizzes, tests, midterms, and finals. In all cases, the top grade in the class. Yes, I was the one breaking the curve.

Sounds arrogant? Well, I didn't start off in that lofty position. So before I give you the recipe for success, let me give you the recipe for failure.

The Recipe For Failure

In 1978, I entered Northwestern University, in Evanston, IL, as a mechanical engineering major. My dream was to work for NASA.

This was a major in the Technical Institute, requiring calculus, physics, and mechanics (statics and dynamics).

The recipe:

Show up for all classes and pay attention.
Complete all reading assignments on time.
Complete all assigned homework on time.
Study for tests, reviewing homework.

This was the recipe that had gotten me through high school, where I was usually able to do most of the homework in the last 5 minutes of class allocated for that purpose. Then just a few minutes in "study hall" period or at home to complete it, and another 10 or 15 to read the next section.

Sounds like a pretty good plan, right? Sounds like a good student, right?

The problem was that the material in college was more difficult and faster paced. Doing just the assigned problems was barely enough to keep your head above water, sometimes not even enough for that. It didn't give you enough practice thinking through and performing the work.

Back in algebra, problems were simple: they had one procedure to follow to the solution. Calculus wasn't like that. There were multiple procedures depending on the style of the equation. A good portion of the battle was classifying the equation to determine what approach to bring to it.

Physics was similar. Both topics required a more analytical approach. That meant building up a problem database in your mind so you could pick the approach. That meant experience doing lots of problems.

The result of following that recipe? C's, D's, and finally, an F in physics. Where I had prided myself on my math and science abilities, my favorite subjects, I had failed. Distraught, I dropped out of Northwestern.

The Recipe For Success

About 5 years later, I started part-time classes at Richland College, part of the Dallas County Community College District.

Oh sure, you may say, community college. That's easy, it's not a real college.

Negative. Richland used exactly the same textbooks as Northwestern, just the next editions. So it was exactly the same material. And I had Ralph Esparza as instructor for calculus I and III. Ralph was feared among students as a tough math teacher. Who cares if it's community college or Ivy League?

I was determined to repeat all those classes in my favorite subjects, and do well in them. Somewhere in hindsight, I had realized the need to do more than the minimum.

The recipe:

Show up for all classes and pay attention.
Complete all reading assignments on time.
Complete all assigned homework on time.
Complete all remaining odd-numbered problems in the section and check against answers in the back.
Complete all remaining even-numbered problems in the section.
Study for tests (regular tests, midterms, and finals), redoing all problems that the tests cover.

This is a great way to work in study groups, too. When someone in the group has difficulty, everyone can contribute to helping them understand it. Or maybe the one person in the group who understands it is able to help everyone else.

This boils down to doing every problem in every section of the textbook at least twice, more like three or four times.

You may say, that's a lot of work. Yes, it is.

In Nike ads, athletes show how tough they are. Just do it. Be tough.

The result? Redemption.

Sodoto: See One, Do One, Teach One

2018-04-21T11:06:00.003-07:00

Here's a useful strategy on this learning path: see one, do one, teach one. Sodoto.

Sodoto is a learning method and a teaching method rolled into one. It's the cycle of knowledge.

I'm familiar with it from medicine. My wife is a surgical nurse, and this is the traditional method of teaching in surgery. Obviously, safety concerns mean that you don't just watch a brain surgeon at work and then go try it yourself.

But this forms a useful pattern of mentoring and learning and passing knowledge along. It applies to any kind of knowledge- or skill-based activity.

It works with a single student at a time, or a whole group. You don't have to be a formal teacher.

See one: watch someone do a procedure.

Do one: do what you saw.

Teach one: show it to someone else.

Once you learn a procedure, you're primed to teach it. That's how knowledge spreads.

Here's the real kicker: the teaching step is actually a powerful learning step for you as the teacher. It locks the knowledge into your brain.

You have to have sorted out what you're talking about in order to teach it. You can't just vaguely know it and wave your hands in the air glossing over details. Your students will be annoyed and you'll feel stupid.

The process of getting ready to teach and then doing the teaching forces you to organize your thoughts and chase down details, because you don't want to look stupid, and you want to be prepared for questions.

That motivates you to dig deeper. As a result, you end up learning more yourself.

There are two keys to making this work: background knowledge, and the experience of doing it.

Background Knowledge

Background knowledge applies at each stage of see, do, and teach. Note that "see" can mean live and in person, or on video.

Whatever the subject, medicine, coding, climbing, sailing, scuba diving, physical fitness, martial arts, building anything from woodworking to electronics, any knowledge you have before seeing the procedure will help you understand it.

You can bet that surgeon learning how to do brain surgery brought a huge amount of background knowledge.

Some things take minimal background, just the random skills and knowledge you already have from life. But more difficult subjects benefit from whatever time you can invest beforehand. Videos, books, blogs, and articles, in print and online, are all good resources, as well as online forums.

That establishes the background knowledge you'll bring to seeing the procedure.

Once you've seen the procedure, as you prepare to do it yourself, it's useful to go back to your resources. Now that you know better what to look for, you can get more details. You can reinforce what you saw.

That expands the background knowledge you'll bring to doing the procedure.

Once you've done the procedure, as you prepare to teach it, go back to your resources again. As a result of doing, there will be details you want to fill in, and you may understand the material better. You may have run into some things that you wished you knew more about. You may anticipate additional questions from your students.

That further expands the background knowledge you'll bring to teaching the procedure.

Experience Of Doing

The experience of doing the procedure is critical. That's where you have the opportunity to work through mistakes and see what works and doesn't work for you. That's where you start to lock it into your brain.

Don't be afraid to make mistakes! Mistakes are great learning opportunities. As long as there's no injury and no damage, there's no harm done. And a little blood on the deck isn't an injury.

This is also where you can work out your own changes to the procedure. Just because you saw it done one way doesn't mean that's the only way to do it. That was one way. You can use it as your starting point, and add your own tweaks.

Or maybe you'll realize what you saw really was a good way and you shouldn't mess with it.

You might need to do the procedure more than once before teaching it. Some procedures take practice before you feel confident teaching them to someone else.

The experience of teaching the procedure will be different from the experience of doing it for yourself. Your students may have questions or difficulties that force you to think about things in different ways.

Plus there's the pressure of performing for an audience. But as you gain experience teaching, that will get easier. It's just a different kind of doing.

The experience of teaching is where you finish locking it into your brain.

Teamwork

Sodoto is a great method for dividing up a project. Whether at work, at school, or with your friends, you can divide up the project and have each person take on a part.

They go off and see how it's done, do it themselves until they feel ready, then bring it back to the team to teach everyone else.

What if you can't agree how to divide it up because multiple people want to do the same thing? Fine! Let them!

Each person will have their own take on the experience and teach it slightly differently. That helps explore all the possibilities in the procedure.

More C+-

2018-04-17T08:58:00.000-07:00

In The Case For C+-, I talked about writing quick tools in a simple C style, but taking advantage of the C++ standard library, primarily the dynamic data structures. It ends up being C++ without any (or just a few) user-defined classes, so is something of a lightweight object-oriented approach (yes, yes, I'm sure OO purists are barfing at the thought). The main benefit is fast coding.

There I showed as an example the msgresolve tool, which I used to resolve messages logged by an IOT device (the client) and its server. This is a lot of string processing and cross-indexing, with logs containing potentially thousands or tens of thousands of messages.

Shortly after I had completed msgresolve, I needed to have a tool to help me sift through large text files of server logs, logging the TCP connections made by clients and their subsequent activity. I was chasing down a problem where some of the connections were shutting down sooner than expected.

I wasn't sure what was causing the early shutdowns, and wasn't even sure initially which connections had experienced it, so I wanted to be able to gather all the lines for a given connection and list them out for tracing through, for each connection.

That would help me identify the ones that were live at the end of the log sample vs. the ones that had ended early. The log entries for hundreds of connections were all intermixed.

Armed with the methods I had used in msgresolve.cpp, conceptual design was easy. I wanted an ordered list of connections, and for each one, the sequential list of its log entries.

There were also connections with some internal addresses I wanted to ignore. I could have done this filtering with grep, but it was easy enough to build the capability into the program so that it could stand alone. That also helped me explore some additional string processing functions.

Given that architecture, the data structure I needed was a std::map that mapped a string (the connection identification) to a std::list of strings (the log lines for the connection).
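As a minimal sketch of that structure (the type aliases, the addLine() helper, and the sample IDs are hypothetical names for illustration, not lifted from the tool):

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

using LineList = std::list<std::string>;
using ConnectionMap = std::map<std::string, LineList>;

// Appends a log line to its connection. operator[] creates the entry
// with an empty list the first time a connection ID is seen.
void addLine(ConnectionMap& connections, const std::string& connectionId,
             const std::string& line) {
    connections[connectionId].push_back(line);
}

// Small usage check: two connections with intermixed lines.
void demoConnectionMap() {
    ConnectionMap connections;
    addLine(connections, "10.0.0.5:4000", "connect");
    addLine(connections, "10.0.0.9:4100", "connect");
    addLine(connections, "10.0.0.5:4000", "data 42 bytes");
    assert(connections.size() == 2);
    assert(connections["10.0.0.5:4000"].size() == 2);
    assert(connections["10.0.0.5:4000"].back() == "data 42 bytes");
}
```

The list keeps each connection's lines in the order they appeared in the log, and the map keeps the connections themselves in key order.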

I had the program working in less than an hour. Then I spent at least another hour screwing around with the timestamps in the log entries, figuring out how to process them and deciding what to do with them. Then a little more time on refactoring and cleanup.

Throughout, I used a sample log file that had entries for several connections, including addresses I wanted to skip. I used that as a simple unit test to exercise the code paths.

The resulting code provided the impetus for a simple generalized string processing module, which I'll cover in another post. But you can see some clear patterns emerging in this code.

Doing quick tools like this is fun and very satisfying. It makes your serotonin flow. You have a problem you need to deal with, so you sit down and spew a bunch of code in a short time, refine it, and use the results.

This is actually quite different from long-term product development. That kind of work has its intense coding phases, but once the initial version of the product is out, a lot of the work is much smaller surgical changes.

Even fitting a major new feature in often involves many small bits of code scattered throughout the larger code base, integrating the tendrils. Getting that to work has a different kind of satisfaction.

Design Choices

These tools also give you a chance to think about different approaches. You can balance the variables of memory consumption, CPU consumption, I/O consumption, time, and code complexity (that is, ease of writing and maintaining the code, and compiled code space consumption, not algorithmic complexity) for a given situation.

For instance, the log files I was dealing with had over a million lines of data, some 200MB worth covering hundreds of connections.

That meant I had several choices:

I could load all the data into memory and then print it out in an orderly manner. This is a single-pass solution that consumes large amounts of memory.
I could scan the file once, identifying all the individual connections, then for each connection, scan the file from beginning to end to read and print their lines. This is a multi-pass solution that requires little memory but significant file I/O.
I could scan the file once, and for each identified connection, track the file position of the first and last line, then for each connection, just scan that range of the file. This is a multi-pass solution that reduces the total file I/O for a negligible increase in memory.
I could do the same thing, but instead of tracking just the first and last line file positions, build a list of the file position and length of each line, then on each pass, just skip directly to the locations of the lines. This is still multi-pass, but significantly reduces the total file I/O because it only visits each file position twice, requiring a bit more complexity and a bit more memory.
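For comparison, here's a rough sketch of the last of those choices, the per-line position index. I didn't build this one, so treat it as illustration only. It assumes, purely to keep the example short, that the connection ID is the first whitespace-delimited token on each line, and buildIndex() and readConnection() are made-up names. It stores only the starting position of each line and lets getline find the end, rather than storing the length as well.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using OffsetIndex = std::map<std::string, std::vector<std::streampos>>;

// Pass 1: record where each line starts, keyed by connection ID.
// Assumes (for illustration only) the ID is the first token on the line.
OffsetIndex buildIndex(std::istream& log) {
    OffsetIndex index;
    std::string line;
    std::streampos pos = log.tellg();
    while (std::getline(log, line)) {
        std::istringstream fields(line);
        std::string id;
        if (fields >> id) index[id].push_back(pos);
        pos = log.tellg();  // start of the next line
    }
    return index;
}

// Pass 2: seek straight to each recorded line of one connection.
std::vector<std::string> readConnection(std::istream& log,
                                        const std::vector<std::streampos>& offsets) {
    std::vector<std::string> lines;
    std::string line;
    log.clear();  // clear the EOF/fail state left over from pass 1
    for (std::streampos pos : offsets) {
        log.seekg(pos);
        if (std::getline(log, line)) lines.push_back(line);
    }
    return lines;
}

// Small usage check against an in-memory "file".
void demoOffsetIndex() {
    std::istringstream log("A one\nB two\nA three\n");
    OffsetIndex index = buildIndex(log);
    std::vector<std::string> a = readConnection(log, index["A"]);
    assert(a.size() == 2);
    assert(a[0] == "A one");
    assert(a[1] == "A three");
}
```

The memory cost is one stream position per log line instead of the line itself, which is the trade being made.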

The decision on which choice to use is system-dependent. If memory is cheap and plentiful, and file I/O is relatively expensive, either in terms of time or charges to transfer data over a data link (maybe the data is remote, accessed over a cellular link), then the single-pass solution is best, choice #1.

On a small-memory system, a multi-pass solution is better, and you just have to live with the extra I/O. In that case, choice #4, which is the most complicated code, has the best compromise of low memory and low I/O consumption.

Although if you're really pressed for code space, the simplest multi-pass solution that iterates over the entire file for each connection is the better choice, #2.

Realistically though, you don't run tools such as this locally on small systems. Where possible, you offload the data to a larger system and run the tools offline.

In this case, I'm running on a Mac with 16GB of memory. Slurping up a 200MB text file and holding everything in memory is nothing. So the single-pass solution is the way to go.

Loading The Data

The log file may just be a chronological sample of all the activity logged by the server, so some connections will already exist at the start of the log, and some will remain at the end. Meanwhile, connection starts and terminations will appear in the log.

The program parses the lines from the log to find connection lines (i.e. some activity for a given connection). It identifies them by looking for a connection ID, which consists of an IPv4 address/port pair (a remote socket ID), and may optionally include a hexadecimal client ID.

It uses just the socket ID as the connection ID, which is the key to the connection map. When it finds a new connection ID, it adds it to the map with an empty list. For each connection line, it finds the connection map entry and appends the line to the list of lines for that connection.

As it loads connection lines, it filters them against the set of IP addresses to skip (these skip addresses are due to logging of other types of connections besides the client connections). That helps reduce the noise from a large log file.
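A sketch of that loading step follows. The actual log format isn't shown here, so the regular expression, the sample log lines, and the findSocketId() and loadLine() names are all assumptions. The tool itself may well use plain string functions rather than std::regex, which is swapped in here for brevity.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <regex>
#include <set>
#include <string>

using ConnectionMap = std::map<std::string, std::list<std::string>>;

// Finds the first "a.b.c.d:port" token in a line. The pattern is an
// assumption about the log format, not taken from the real logs.
bool findSocketId(const std::string& line, std::string& address,
                  std::string& socketId) {
    static const std::regex pattern(R"((\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5}))");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) return false;
    address = match[1].str();   // IPv4 address only
    socketId = match[0].str();  // address:port, used as the map key
    return true;
}

// Stores a line under its connection unless its address is in the skip set.
// Returns true if the line was stored.
bool loadLine(ConnectionMap& connections,
              const std::set<std::string>& skipAddresses,
              const std::string& line) {
    std::string address, socketId;
    if (!findSocketId(line, address, socketId)) return false;  // not a connection line
    if (skipAddresses.count(address)) return false;            // filtered out
    connections[socketId].push_back(line);
    return true;
}

// Small usage check with made-up log lines.
void demoLoadLine() {
    ConnectionMap connections;
    std::set<std::string> skip = {"192.168.0.1"};
    assert(loadLine(connections, skip, "08:58:00 conn 10.1.2.3:5000 id=0x1f open"));
    assert(!loadLine(connections, skip, "08:58:01 conn 192.168.0.1:22 open"));
    assert(!loadLine(connections, skip, "server started"));
    assert(connections.count("10.1.2.3:5000") == 1);
}
```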

Printing The Data

The program prints the connections by first iterating through the map and printing out a summary of each one: the connection ID, the number of lines, the duration of the connection data found in the log, and how its lifetime relates to the overall log. Since the map is an ordered structure, printing connections is always in sorted order by connection ID (though string sorted, not numeric sorted).

Then it makes a second pass through the connection map, iterating through each line for each connection and printing it out, with separators between connections. It prints a header and summary line before and after the activity lines for each connection, showing where the connection starts and ends relative to the start and end of the log.

As a last quick change, I added a threshold time value to clearly identify connections that ended at least 60 seconds before the end of the log. This would be a good candidate for a command-line parameter to override the default threshold.
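Here's one way that command-line override might look. This is only a sketch: parseThreshold() is a hypothetical helper that is not part of the program below, and where the extra argument would go on the command line is left open.

```cpp
#include <cctype>
#include <cstdlib>

enum
{
    END_TIME_THRESHOLD = 60
};

// Hypothetical helper: parse an optional threshold argument in seconds,
// falling back to the default when it is absent or not a plain
// unsigned decimal number.
unsigned long parseThreshold(const char* arg)
{
    if (arg == NULL || !std::isdigit(static_cast<unsigned char>(*arg))) {
        return END_TIME_THRESHOLD;
    }
    char* end = NULL;
    unsigned long value = std::strtoul(arg, &end, 10);
    // Reject trailing junk such as "90x".
    return (*end == '\0') ? value : END_TIME_THRESHOLD;
}
```

Falling back to the default on bad input keeps the tool forgiving; a stricter version could print the usage message instead.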

All the output lines have unique greppable features or keywords so you can use other tools for additional postprocessing or extraction. For instance, I could grep out the end summary line of each connection, and maybe the last couple of activity lines before it, to see how each connection ended up. I could use the "threshold exceeded" indication to identify the ones that had ended early.

Some Design Evolution

This program adds the isMember() function, which determines whether a string is a member of a set of strings. Since my usage here was intended to deal with a small set and I had other functions that had similar iterative structure, in the heat of battle I quickly coded it as a linear search of a vector of strings.

That worked fine here, but as I pulled a bunch of this code out into a general string processing module, I realized that was a bad choice, because it's an O(N) search.

That became especially bad when I wanted an overload that took a vector of strings and determined if they were all members. That meant an O(M) repetition of O(N) searches: an O(M*N) or effectively O(N^2) algorithm.

That gets out of hand fast as M and N get larger. Meanwhile, std::unordered_set is perfect for this: an average O(1) lookup for single searches, and O(M) on average when repeated for M items.
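A minimal sketch of the std::unordered_set version follows, assuming C++11. The StringSet typedef is a name introduced here for illustration, and this is not necessarily how the string processing module ended up implementing it.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

typedef std::unordered_set<std::string> StringSet;

// Average O(1): hash lookup of a single string.
bool isMember(const std::string& str, const StringSet& set)
{
    return set.find(str) != set.end();
}

// Average O(M): true only if all M strings are members.
// An empty vector is trivially "all members".
bool isMember(const std::vector<std::string>& strs, const StringSet& set)
{
    for (std::size_t i = 0; i < strs.size(); ++i) {
        if (!isMember(strs[i], set)) {
            return false;
        }
    }
    return true;
}
```

The tradeoff is that the set has to be built once up front, which is O(N), but after that every lookup avoids walking the whole collection.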

I've left the original isMember() implementation here as an example of the evolution of a concept as you generalize it for other uses.

I also threw in a few overloads that I didn't end up using, but that set the stage for running with the concept in the string processing module. More discussion of that in the post containing the module.

The Result

This program turned out to be another fun exercise in string processing once I had built the basic support functions and could see the problem in those terms. It felt more like working in Python, and in fact just as the dict structure from msgresolve.cpp was inspired by Python, so are the split() and join() functions here.

The funny thing is that it took doing stuff in Python to make me see this approach. That points out one of the advantages of working in multiple different languages: you start seeing opportunities to apply some of the common idioms from one language in another language.

Here's logsplit.cpp:

// Usage: logsplit <serverLog> [<skipIps>]
//
// Splits a server log file by IPv4 connection. Prints a
// summary list of the connections, then the log lines for
// each separate connection.
//
// This is an example of a C++ program that is written mostly
// in plain C style, but that makes use of the container and
// composition classes in the C++ standard library. It is a
// lightweight use of C++ with no user-defined classes.
//
// 2018 Steve Branam <sdbranam@gmail.com> learntocode

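#include <cctype>    // isdigit, isxdigit
#include <cstdio>    // fopen, fgets, fclose
#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE
#include <cstring>   // strtok
#include <ctime>     // time_t, tm, mktime, difftime
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
#include <list>
#include <map>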

enum ARGS
{
    ARGS_PROGNAME,
    ARGS_SERVER_LOG,
    ARGS_REQUIRED,
    ARGS_SKIP = ARGS_REQUIRED
};

enum SERVER
{
    SERVER_DATE,
    SERVER_TIME,
    SERVER_THREAD,
    SERVER_SEVERITY,
    SERVER_FUNC,
    SERVER_CONN,
    SERVER_TIME_LEN = 16,
    SERVER_TIMESTAMP_LEN = 28
};

enum CONN
{
    CONN_IP,
    CONN_PORT,
    CONN_CLIENT_ID
};

enum
{
    END_TIME_THRESHOLD = 60
};

typedef std::string String;
typedef std::vector<String> StringVec;
typedef std::list<String> StringList;
typedef std::map<String, StringList> ConnMap;
typedef std::pair<String, StringList> ConnMapEntry;

const char *timeFormat = "%Y-%m-%d %H:%M:%S";
StringVec skipIps;
size_t lines = 0;
size_t skipped = 0;
String firstTimestamp;
String lastTimestamp;
ConnMap connections;

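StringVec split(const String& str, const char* delim)
{
    // strtok() modifies its input, so work on a writable,
    // null-terminated copy. A vector avoids the non-standard
    // variable-length array.
    std::vector<char> buffer(str.begin(), str.end());
    buffer.push_back('\0');
    StringVec strings;

    char *token = std::strtok(&buffer[0], delim);
    while (token != NULL) {
        strings.push_back(token);
        token = std::strtok(NULL, delim);
    }

    return strings;
}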

String join(const StringVec& strings, const String& sep,
            size_t start = 0, size_t end = 0)
{
    String str;

    if (!end) {
        end = strings.size();
    }
    for (size_t i = start; i < end; ++i) {
        str.append(strings[i]);
        if (i + 1 < end) {
            str.append(sep);
        }
    }
    return str;
}

bool isMember(const String& str, const StringVec& set)
{
    for (size_t i = 0; i < set.size(); ++i) {
        if (str == set[i]) {
            return true;
        }
    }

    return false;
}

typedef int (*CharMatch)(int c);

bool isToken(const String& token, CharMatch isMatch)
{
    if (token.empty()) {
        return false;
    }
    else {
        for (size_t i = 0; i < token.size(); ++i)
        {
            if (!isMatch(token[i])) {
                return false;
            }
        }
    }
    return true;
}

bool isToken(const StringVec& tokens, CharMatch isMatch,
             size_t start = 0, size_t end = 0)
{
    if (!end) {
        end = tokens.size();
    }
    for (size_t i = start; i < end; ++i) {
        if (!isToken(tokens[i], isMatch)) {
            return false;
        }
    }
    return true;
}

bool isNumeric(const String& token)
{
    return isToken(token, isdigit);
}

bool isHex(const String& token)
{
    return isToken(token, isxdigit);
}

bool isNumeric(const StringVec& tokens,
               size_t start = 0, size_t end = 0)
{
    return isToken(tokens, isdigit, start, end);
}

bool isIpv4Address(const String& str)
{
    StringVec tokens(split(str, "."));

    return ((tokens.size() == 4) &&
            isNumeric(tokens));
}

bool isIpv4Port(const String& str)
{
    return ((str.size() <= 5) &&
            isNumeric(str));
}

bool isIpv4Socket(const StringVec& strings)
{
    return ((strings.size() >= 2) &&
            isIpv4Address(strings[0]) &&
            isIpv4Port(strings[1]));
}

time_t getTime(const String& strTime, const char* format)
{
    std::tm t = {};
    std::istringstream ss(strTime);
    ss >> std::get_time(&t, format);
    return mktime(&t);
}

time_t getTime(const String& field)
{
    // Skip opening and closing brackets.
    return getTime(field.substr(1, SERVER_TIMESTAMP_LEN - 2),
                   timeFormat);
}

size_t getDuration(const time_t& start, const time_t& stop)
{
    size_t seconds(difftime(stop, start));
    return seconds;
}

size_t getDuration(const String& start, const String& stop)
{
    return getDuration(getTime(start), getTime(stop));
}

size_t getDuration(const time_t& start, const String& stop)
{
    return getDuration(start, getTime(stop));
}

size_t getDuration(const String& start, const time_t& stop)
{
    return getDuration(getTime(start), stop);
}

bool isServerTime(const String& str)
{
    if (str.size() == SERVER_TIME_LEN) {
        for (size_t i = 0; i < str.size(); ++i)
        {
            if (!isdigit(str[i]) &&
                (str[i] != ':') &&
                (str[i] != '.') &&
                (str[i] != ']')) {
                return false;
            }
        }
        return true;
    }
    return false;
}

bool isConnId(const String& str)
{
    StringVec fields(split(str, ":"));

    return (isIpv4Socket(fields) &&
            (fields.size() < CONN_CLIENT_ID + 1 ||
             isHex(fields[CONN_CLIENT_ID])));
}

bool isServerConn(const StringVec& fields)
{
    return ((fields.size() > SERVER_CONN) &&
            isServerTime(fields[SERVER_TIME]) &&
            isConnId(fields[SERVER_CONN]));
}

bool loadServer(const char* fileName)
{
    FILE* file = std::fopen(fileName, "r");
    
    if (file) {
        char buffer[1000];
        while (std::fgets(buffer, sizeof(buffer), file) != NULL) {
            String line(buffer);
            StringVec fields = split(buffer, " \t");

            if (isServerConn(fields)) {
                ++lines;
                lastTimestamp = line.substr(0, SERVER_TIMESTAMP_LEN);
                if (firstTimestamp.empty()) {
                    firstTimestamp = lastTimestamp;
                }
                
                // split() makes its own copy, so no need to reuse buffer.
                StringVec conn = split(fields[SERVER_CONN], ":");

                if (isMember(conn[CONN_IP], skipIps)) {
                    ++skipped;
                }
                else {
                    String key(conn[CONN_IP]);
                    key.append(":");
                    key.append(conn[CONN_PORT]);

                    ConnMap::iterator match;
                    match = connections.find(key);
                    if (match == connections.end()) {
                        connections.insert(ConnMapEntry(key,
                                           StringList()));
                        match = connections.find(key);
                    }
                    match->second.push_back(line);
                }
            }
        }
        std::fclose(file);
        if (connections.empty()) {
            std::cout << "No connections found" << std::endl;
            return false;
        }
        return true;
    }
    std::cout << "Failed to open server file "
              << fileName << std::endl;
    return false;
}

void printSeparator()
{
    std::cout << std::endl
              << "=-=-=-=-" << std::endl
              << std::endl;
}

void listConnections()
{
    std::cout << connections.size() << " connections "
              << firstTimestamp << "-" << lastTimestamp << " "
              << lines << " lines, "
              << getDuration(firstTimestamp, lastTimestamp) << " sec:"
              << std::endl;
    if (skipIps.size()) {
        std::cout << "(skipped " << skipped << " connections with "
                  << join(skipIps, ", ") << ")" << std::endl;
    }
    std::cout << std::endl;
    for (ConnMap::iterator curConn = connections.begin();
         curConn != connections.end();
         curConn++) {        
        String conn(curConn->first);
        StringList connLogs(curConn->second);
        std::cout << conn << "\t"
                  << connLogs.front().substr(0, SERVER_TIMESTAMP_LEN)
                  << "-" << connLogs.back().substr(0, SERVER_TIMESTAMP_LEN)
                  << " " << connLogs.size() << " lines, "
                  << getDuration(connLogs.front(), connLogs.back())
                  << " sec" << std::endl;
    }
    printSeparator();
}

void logConnections()
{
    time_t timeFirst = getTime(firstTimestamp);
    time_t timeLast = getTime(lastTimestamp);
    
    for (ConnMap::iterator curConn = connections.begin();
         curConn != connections.end();
         curConn++) {        
        String conn(curConn->first);
        StringList connLogs(curConn->second);
        size_t duration(getDuration(connLogs.front(), connLogs.back()));
        
        std::cout << "Connection " << conn
                  << " " << connLogs.front().substr(0, SERVER_TIMESTAMP_LEN)
                  << "-" << connLogs.back().substr(0, SERVER_TIMESTAMP_LEN)
                  << " " << connLogs.size() << " lines, "
                  << duration << " sec:" << std::endl << std::endl;

        size_t seconds = getDuration(timeFirst, connLogs.front());
        std::cout << firstTimestamp << " Starts " << seconds
                  << " sec after start of log." << std::endl;

        for (StringList::iterator curLog = connLogs.begin();
             curLog != connLogs.end();
             curLog++) {        
            std::cout << *curLog;
        }

        seconds = getDuration(connLogs.back(), timeLast);
        std::cout << lastTimestamp
                  << " " << connLogs.size() << " lines, "
                  << duration << " sec. Ends "
                  << seconds << " sec before end of log.";
        if (seconds > END_TIME_THRESHOLD) {
            std::cout << " Exceeds threshold.";
        }
        std::cout << std::endl;
        printSeparator();
    }
}

int main(int argc, char* argv[])
{
    if (argc < ARGS_REQUIRED ||
        String(argv[1]) == "-h") {
        std::cout << "Usage: " << argv[ARGS_PROGNAME]
                  << " <serverLog> [<skipIps>]" << std::endl;
        std::cout << "Where <skipIps> is comma-separated list "
                  << "of IP addresses to skip." << std::endl;
        return EXIT_FAILURE;
    }
    else {
        if (argc > ARGS_REQUIRED) {
            skipIps = split(argv[ARGS_SKIP], ",");
        }
        
        if (loadServer(argv[ARGS_SERVER_LOG])) {
            listConnections();
            logConnections();
        } else {
            return EXIT_FAILURE;
        }
    }
    return EXIT_SUCCESS;
}

The sample log file, conns.log:

[2018-04-03 13:16:29.469659] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket receive_buffer_size=117708
[2018-04-03 13:16:29.469678] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket send_buffer_size=43520
[2018-04-03 13:16:29.867381] [0x00007fb3ff129700] [debug]   set_idle_send_timeout() 10.1.180.206:30450 set idle send timeout to 60 seconds
[2018-04-03 13:16:29.867394] [0x00007fb3ff129700] [debug]   set_idle_receive_timeout() 10.1.180.206:30450 set idle receive timeout to 120 seconds
[2018-04-03 13:16:29.867450] [0x00007fb3ff92a700] [info]    handle_connected() 10.1.180.206:30450 remote connected [2423/8951]
[2018-04-03 13:16:29.959877] [0x00007fb3ff129700] [debug]   install_connection() 10.1.180.206:30450:00003bd4 linkType 0
[2018-04-03 13:16:29.966599] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x4a2137a6, 231 bytes
[2018-04-03 13:16:29.966935] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xa11f878a, 35 bytes, msg type 3
[2018-04-03 13:17:29.967117] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x1c35386e, 29 bytes, msg type 1
[2018-04-03 13:17:29.967228] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:17:30.086722] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:30.086813] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:40.086722] [0x00007fb3ff129700] [debug]   receive() 10.2.80.206:3050:00000bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:40.086813] [0x00007fb3ff129700] [debug]   send() 10.2.80.206:3050:00000bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:30.139377] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xa0f8a8e1, 78 bytes
[2018-04-03 13:18:29.867494] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 127.0.0.1:32450:00003bd4 60 seconds
[2018-04-03 13:18:29.967315] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:18:30.086988] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x8fcf53f6, 29 bytes, msg type 1
[2018-04-03 13:18:30.087101] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:18:30.197029] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x515f1d4e, 29 bytes
[2018-04-03 13:18:30.197120] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x6827d190, 29 bytes, msg type 17
[2018-04-03 13:18:30.249027] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xda66c5c7, 78 bytes
[2018-04-03 13:19:30.087189] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:19:30.139486] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 10.1.180.206:30450:00003bd4 60 seconds
[2018-04-03 13:19:30.197322] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x812afb16, 29 bytes, msg type 1

Sample run:

$ g++ logsplit.cpp -o logsplit
$ ./logsplit conns.log 127.0.0.1,3.3.3.3
2 connections [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 25 lines, 181 sec:
(skipped 1 connections with 127.0.0.1, 3.3.3.3)

10.1.180.206:30450 [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 22 lines, 181 sec
10.2.80.206:3050 [2018-04-03 13:17:40.086722]-[2018-04-03 13:17:40.086813] 2 lines, 0 sec

=-=-=-=-

Connection 10.1.180.206:30450 [2018-04-03 13:16:29.469659]-[2018-04-03 13:19:30.197322] 22 lines, 181 sec:

[2018-04-03 13:16:29.469659] Starts 0 sec after start of log.
[2018-04-03 13:16:29.469659] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket receive_buffer_size=117708
[2018-04-03 13:16:29.469678] [0x00007fb3ff129700] [debug]   start() 10.1.180.206:30450 TCP socket send_buffer_size=43520
[2018-04-03 13:16:29.867381] [0x00007fb3ff129700] [debug]   set_idle_send_timeout() 10.1.180.206:30450 set idle send timeout to 60 seconds
[2018-04-03 13:16:29.867394] [0x00007fb3ff129700] [debug]   set_idle_receive_timeout() 10.1.180.206:30450 set idle receive timeout to 120 seconds
[2018-04-03 13:16:29.867450] [0x00007fb3ff92a700] [info]    handle_connected() 10.1.180.206:30450 remote connected [2423/8951]
[2018-04-03 13:16:29.959877] [0x00007fb3ff129700] [debug]   install_connection() 10.1.180.206:30450:00003bd4 linkType 0
[2018-04-03 13:16:29.966599] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x4a2137a6, 231 bytes
[2018-04-03 13:16:29.966935] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xa11f878a, 35 bytes, msg type 3
[2018-04-03 13:17:29.967117] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x1c35386e, 29 bytes, msg type 1
[2018-04-03 13:17:29.967228] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:17:30.086722] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:30.086813] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:17:30.139377] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xa0f8a8e1, 78 bytes
[2018-04-03 13:18:29.967315] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:18:30.086988] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x8fcf53f6, 29 bytes, msg type 1
[2018-04-03 13:18:30.087101] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 59 seconds
[2018-04-03 13:18:30.197029] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0x515f1d4e, 29 bytes
[2018-04-03 13:18:30.197120] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x6827d190, 29 bytes, msg type 17
[2018-04-03 13:18:30.249027] [0x00007fb3ff129700] [debug]   receive() 10.1.180.206:30450:00003bd4 RX sum 0xda66c5c7, 78 bytes
[2018-04-03 13:19:30.087189] [0x00007fb3ff129700] [debug]   handle_idle_send_timeout() 10.1.180.206:30450:00003bd4 0 seconds
[2018-04-03 13:19:30.139486] [0x00007fb3ff129700] [debug]   handle_idle_receive_timeout() 10.1.180.206:30450:00003bd4 60 seconds
[2018-04-03 13:19:30.197322] [0x00007fb3ff129700] [debug]   send() 10.1.180.206:30450:00003bd4 TX sum 0x812afb16, 29 bytes, msg type 1
[2018-04-03 13:19:30.197322] 22 lines, 181 sec. Ends 0 sec before end of log.

=-=-=-=-

Connection 10.2.80.206:3050 [2018-04-03 13:17:40.086722]-[2018-04-03 13:17:40.086813] 2 lines, 0 sec:

[2018-04-03 13:16:29.469659] Starts 71 sec after start of log.
[2018-04-03 13:17:40.086722] [0x00007fb3ff129700] [debug]   receive() 10.2.80.206:3050:00000bd4 RX sum 0xf6e3f3bf, 29 bytes
[2018-04-03 13:17:40.086813] [0x00007fb3ff129700] [debug]   send() 10.2.80.206:3050:00000bd4 TX sum 0xfb208d9f, 29 bytes, msg type 17
[2018-04-03 13:19:30.197322] 2 lines, 0 sec. Ends 110 sec before end of log. Exceeds threshold.

=-=-=-=-

First Use Of New Tools

2018-04-14T17:16:00.000-07:00

Using my new tools to fix my Droid MAXX, lying open front and center.

Sometimes life just amazes me. Three days after receiving my Adafruit tools, I needed them. Twice.

This isn't the first time that I've learned something completely new out of the blue, and almost immediately needed it. It's just incredibly serendipitous.

First Problem

Yesterday, Friday, I worked from home. The project I'm working on is embedded system firmware to control a daughter board. The main smart chip on the board was shutting down unexpectedly, which I could tell initially from the timeouts reading responses from it, and then from its status output LED going off.

There's a power enable line for the chip in the cable pigtail that I control via software, and scouring through my code assured me that no code path should be shutting it off during the scenario I was exercising.

Yet the LED was going off and the chip was unresponsive. Maybe it was a bug in the chip, and my commands to it were triggering the shutdown.

Or maybe something besides my code was affecting the power enable. So I pulled out my new meter, opened up the cable to the daughter card, clipped the meter onto the ground and enable lines, and switched it to VDC.

With the code in the state where power was enabled, the LED was on, the chip was responsive, and the meter showed 3.2V. With the code in the state where power was disabled, the LED was off, the chip was unresponsive, and the meter showed 0.0V. Ok, good, correct setup.

Running my tests, with additional debug logging verifying the code was NOT running through the parts that shut off power, I saw the LED go off and the meter drop from 3.2 to 0. So something unrelated was indeed dropping the power enable.

After more testing, I noticed a pattern in the logging, unrelated to my code, that correlated with the shutoff. Now, correlation does not imply causation. But it's a hint about where to look.

So running a couple more debug commands that didn't exercise my code path at all, I was able to confirm that the correlated activity did indeed result in disabling power to the chip.

Yes! Phew, it's not the chip crashing on me, and it's not a problem in the code I just wrote.

Monday, I'll get together with the hardware guys and chase the problem down. First, I'll try it with another unit.

For all I know, my unit is damaged, since it's all pulled apart on my desk to connect the daughter board. Open frame hardware is always vulnerable.

That's an example of diagnosing a problem and ruling out two large potential root cause areas.

Monday update: Another unit showed the same problem. So it wasn't damage in my unit.

I discussed the evidence in the logging with the hardware engineer. We identified some register accesses used to control GPIO (General Purpose Input/Output) pins on the processor that were affecting more pins than intended. These were causing the GPIO pin controlling the power enable to be cleared.
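I can't show the actual code, but here's a sketch of one common way this kind of bug arises. Everything in it is hypothetical: the pin assignments, the function names, and the register, which is modeled as a plain variable so the example runs on any host.

```cpp
#include <stdint.h>

// Hypothetical pin assignments, for illustration only.
const uint32_t PIN_LED          = 1u << 3;
const uint32_t PIN_POWER_ENABLE = 1u << 7;

// Buggy pattern: writes the whole register, so every other pin,
// including the power enable, is cleared as a side effect.
void setLedOnly(volatile uint32_t& reg)
{
    reg = PIN_LED;
}

// Safer pattern: read-modify-write changes only the intended bit.
void setLedMasked(volatile uint32_t& reg)
{
    reg = reg | PIN_LED;
}

void clearLedMasked(volatile uint32_t& reg)
{
    reg = reg & ~PIN_LED;
}
```

Starting with only the power enable bit set, setLedOnly() leaves it cleared, while setLedMasked() leaves it alone. On real hardware, a read-modify-write sequence may also need protection against interrupts or other writers touching the same register.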

This shows one of the challenges inherent in embedded systems. You're working at a much lower, more direct level in the hardware, with lots of opportunities to create strange conflicts between different parts of the system.

Second Problem

Today, Saturday, I was down in the basement, and as I put my phone in my pocket, it slipped and fell flat on its face on the concrete floor with a sharp thud. DOHH!

It's a Droid Maxx in a case that goes in a holster. The case protected the screen and phone from overall damage, but the power had shut off (yeah, another power problem). I held the power button in, but it wouldn't power on, even after several attempts.

I plugged the charger cord in. The display came on, showing 0% power. It tried to boot, then quickly shut off. It just repeated this cycle continuously. So there was some level of system function, but no power.

The shock of impact had apparently knocked something loose inside, which I figured was most probably the battery connection. Dead phone. Stupid reason. Words were said.

And this is a sealed phone, not a removable battery like older models. More words were said.

I emailed my wife, who was visiting some friends, that my phone was dead. Ugh. I look forward to going to the cell phone store the way I look forward to going to a used car lot (although I'm being unfair, my last several experiences with them have been excellent).

But Internet to the rescue, I Googled "access battery in droid maxx" and found a page showing how to do it, plus someone in the comments had pointed to a YouTube video they liked better.

I read the page and watched the video. Yep, I can do that. Worst case, the phone is already dead, I can't make it any worse.

Since I didn't have a spudger or plastic pry tool as shown in the tutorials, I improvised one from a piece of 1/8" thick padauk, which is a very hard tropical wood. I split off a strip about 1/2" wide and beveled the end of it with a block plane.

Between that and a small flat screwdriver, I was able to get the case open. The metal driver tip did break a couple bits of plastic off the edge of the back, but no major damage.

This also shows why I ordered multiple tool sets (I've added an iFixit set to the shopping list on my Adafruit post, because it has pry tools). I used the smallest flat screwdriver from one set for lifting tabs, the T5 from another set to remove most of the screws, and the even tinier flat tip from a third set to get into the points of a T4 screw.

I used the compartments in the base of my Panavise to hold all the loose parts.

With all the screws removed, I separated the board from the display, and sure enough, I could see immediately that the tiny connector on the flat battery cable was loose on the back of the board. This wasn't the first time I had dropped it, so it probably loosened up progressively. Everything inside the phone is secured in place pretty tightly.

I pressed on it and it clicked positively into place. I pulled it back off to look at it, then clicked it back in. So I was confident that it was secure.

As I was handling the phone with everything folded together loosely, I accidentally held the power button in. I noticed the display come on, so I let it continue. It booted up (SIM removed), so power appeared to be restored.

I shut it down and reassembled the phone. The tweezers were a big help dropping the tiny screws back into their holes.

Popped the case back together, reinserted the SIM, and held the power button in. Voila! It powered on, connected to the network, and I was back in business, with a couple of quick tests to verify that things were working right and it was staying on. I texted my wife that I had fixed it.

That's an example of digging in and not being afraid to play around and see what's going on. It's all a learning experience!

Limor Fried Is My New Hero

2018-04-08T17:17:00.003-07:00

Meet Limor Fried, founder of Adafruit.

I'm cross-posting this to both my woodworking blog www.CloseGrain.com and my software engineering blog FlinkAndBlink.blogspot.com (under the LearnToCode label), because even though there's no woodworking in it, this is all about building stuff, so it bridges the worlds. It's the maker ethos.

If you're interested in learning to code, and actually building the stuff that you're coding on, this is for you. This is all about working on embedded systems, from the hobby level to the professional.

In Which I Find Out About Limor Fried

Allow me a moment of self-indulgent gushing admiration here. Or you can skip down to the real information that starts at the Electronics Learning Resources heading.

I admit to instant and total nerd-crush. Limor Fried, who goes by the name Ladyada online (for Lady Ada Lovelace, The First Programmer) is the founder of Adafruit.

Adafruit is a small electronics manufacturing company in Manhattan, NY, that focuses on teaching electronics to makers of all ages. You can read about them here.

Electronics is another of those hobbies that I wanted to pursue as a teenager, but never could due to lack of funds. Fortunately I've advanced beyond that impecunious stage of life, and seeing this has fired instant obsession (hence the shopping list below!).

I'm familiar with that feeling of obsession settling on my shoulders. It propelled me into hand tool woodworking, turning into a book. It propelled me into violinmaking. It propelled me into boatbuilding.

Each time, the pattern is the same. I buy a bunch of books, watch a bunch of videos, dig through a bunch of blogs and forums, then buy a bunch of tools and start playing. Last year it propelled me into small engine repair and oxy-acetylene welding after I found Taryl Dactyl (yes, blog posts will be forthcoming).

Now, in my copious free time (that's a joke, son), I'll finally be realizing that dream to get my hands dirty with electronics.

I owe this to Matt Pandina, whom we recently hired at work. It quickly turned out that Matt is a maker and likes sharing information. He has some nice stuff on Google Groups under the moniker artcfox (in fact, one of his articles was coincidentally the answer to the embedded systems programming problem I use when interviewing candidates!).

He made a comment about how Adafruit is doing manufacturing in Manhattan, and I asked, "Who's Adafruit?". That was all it took. Thanks, Matt!

I was tickled to read Fried's favorite quote in the Entrepreneur Magazine article about her:

“We are what we celebrate.” —entrepreneur and inventor Dean Kamen.

Kamen is one of my other heroes. She whose hero is my hero is my hero!

I managed to score his autograph at the 2015 MassMEDIC conference. I was at the 2015 Embedded Systems Conference (ESC Boston), which was being held concurrently at the Boston Convention Center.

When I saw Kamen listed as keynote speaker, I scooted down early and got a chance to talk to him and tell him I wanted to work for him (he probably gets a lot of stalker geeks like that!). Came close the following year, but logistics didn't work out.

Electronics Learning Resources

On the business side, Adafruit sells kits, parts, tools, and books. That's pretty cool (along with being able to pull off a manufacturing operation in Manhattan). But what's truly spectacular about them is their online learning resources.

Fried is a big proponent of open source, sharing the knowledge. So the Adafruit website is chock full of information. There's also an extensive YouTube channel.

You'll also find lots of cross-pollination with others in the maker community. There are magazines, blogs, and videos by the score, by independent makers like Matt, and by larger organizations.

I've just barely begun to scratch the surface. This is great, because I know how to program embedded systems, but I don't know much about the components that go into them and connect to them. It's the combination of hardware and software that really makes something work.

Pretty much everything I know about digital electronics I owe to Forrest M. Mims and George Young 35 years ago. Now, after that brief hiatus, I can take the next step.

Basic Electronics Lab Skills

Step into Collin's lab!

Among the resources is a series of very accessible quick guides and videos by Collin Cunningham. Of particular interest to the electronics beginner such as myself is this set of basic electronics lab skills. You can scan through all of these for a quick grok of the big picture by setting the speed in the YouTube window settings (the gear icon) to 2x, then come back and watch at normal speed for a second pass:

Soldering and Desoldering: how to solder components together properly, and how to pull them apart for salvage and rework.
Surface Mount Soldering: how to solder surface-mount components.
Multimeters: how to use a meter for basic measurements.
Oscilloscopes: how to use an oscilloscope for advanced measurements and waveforms.
Hand Tools: the basic hand tools used for assembling and disassembling electronics.
Schematics: how to read schematics (no, they're not Greek!).
Breadboards and Perfboards: how to combine the parts on a schematic into a functioning circuit.
Ohm's Law: understanding the relationship between voltage, current, and resistance.

Once you have these skills, you are unleashed. Just like hand tool woodworking, it takes a little investment in tools and equipment, and a little time practicing with them.

These form the basis of the shopping list below. And of course they lead to lots of other interesting videos, like Collin's videos on the basics of various components:

Batteries: the basics of using batteries to supply DC power to projects.
Solar Cells: using solar cells to keep the batteries charged.
Power Supplies: using an AC power supply to supply DC power to projects.
Pulse Width Modulation: using a PWM converter to change DC input voltage to lower effective DC voltage, or as a simple digital-to-analog converter (DAC).
Switches: understanding the different types of switches for manually controlling projects.
The Transistor
The Capacitor
The Diode
The Inductor
The Resistor
The LED
The Integrated Circuit (IC)
The Arduino

There are also a number of other introductory Adafruit written guides by various contributors (as well as oceans of more specialized and advanced guides, check them out!):

Shopping List

These are the tools, equipment, supplies, and books to do the work. With the exception of the oscilloscope and logic analyzer, these are all links to the Adafruit shopping pages. Prices as of April 8, 2018.

Tools and equipment:

Hakko FX-888D Digital Soldering Iron, $109.95
Hakko Soldering Tip: T18-D24 Screwdriver, $7.95
Hakko Soldering Tip: T18-C2 Hoof, $7.95
Hakko Soldering Tip: T18-S4 Fine SMD, $9.95
Panavise Multi-Purpose Work Center, $99.95
Third Hand Pana Hand Workstation Add-On, $54.95
Helping Third Hand Magnifier W/Magnifying Glass, $6.00
Flush diagonal cutters, $7.25
Simple pliers, $3.00
Hakko Professional Quality 20-30 AWG Wire Strippers, $14.95
Adafruit Pocket Screwdriver, $1.50
Precision screwdriver set (6 pieces), $7.95
Precision Torx Screwdriver Set (6 pieces), $6.95
65 Piece Ratchet Screwdriver and Tool Bit Set, $29.95
iFixit Essential Electronics Toolkit, $19.95
Super Scissors, $14.95
Solar Digital Calipers, $14.95
Fine tip straight tweezers - ESD safe, $3.95
Fine tip curved tweezers - ESD safe, $3.95
ESD-Safe PCB Cleaning Brush, $2.95
Solder sucker, $5.00
Professional IC Extraction Tool, $14.95
Full sized breadboard, $5.95
Breadboarding wire bundle, $4.95
Small Alligator Clip Test Lead (set of 12), $3.95
2.1mm DC Barrel Jack to Alligator Clips, $1.95
In-line power switch for 2.1mm barrel jack, $2.50
5V 2A (2000mA) switching power supply, $7.95
9 VDC 1000mA regulated switching power adapter, $6.95
Extech EX330 12-function autoranging multimeter, $59.95
Rigol DS1054Z Digital Oscilloscope - Bandwidth: 50 MHz, Channels: 4, $349.00 (this is an Amazon link, since the 4 channel scope costs less than the Adafruit 2 channel Rigol. Sorry, Limor!)
Saleae Logic 8 Logic Analyzer, 8 D/A Inputs, 100 MS/s, $199 with discount code that you can request and apply to your cart when checking out (this is a direct Saleae link, since they offer an awesome "enthusiast/student" discount of $200 off through their site; thanks, guys!)

Consumable supplies:

Bakelite Universal Perfboard Plates - Pack of 10, $4.95
Hook-up Wire Spool Set - 22AWG Solid Core - 10 x 25ft, $27.50
Hook-up Wire Spool Set - 22AWG Stranded-Core - 10 x 25ft, $27.50
Mini Solder spool - 60/40 lead rosin-core solder 0.031" diameter - 100g, $7.95
Solder Wire - 60/40 Rosin Core - 0.5mm/0.02" diameter - 50 grams, $5.95
Solder Wire - RoHS Lead Free - 0.5mm/.02" diameter - 50g, $11.95
Solder Wire - SAC305 RoHS Lead Free - 0.5mm/.02" diameter - 50g, $14.95
Chip Quik SMD Removal Kit, $16.00
Chip Quik SMD Removal Kit with Lead-Free Alloy, $17.00
Solder wick - 3S 5ft., $3.00
Heat Shrink Pack (24 pieces), $4.95
Multi-Colored Heat Shrink Pack (30 pieces), $4.95
Breadboard-friendly 2.1mm DC barrel jack, $0.95
Adafruit Parts Pal, $19.95

Books:

Make: Electronics (Charles Platt) - 2nd Edition, $34.95 (spoiler alert: this is a fantastic book for beginners and those with a little knowledge!)
Make: More Electronics by Charles Platt, $39.95
Getting Started in Electronics by Forrest M. Mims III, $19.95 (remember him?)
Practical Electronics for Inventors, Fourth Edition, by Paul Scherz and Simon Monk, $40.00
Hacking Electronics by Simon Monk, $29.95
Learn Electronics with Arduino - by Jody Culkin and Eric Hagan, $24.95
Python for Microcontrollers: Getting Started with MicroPython by Donald Norris, $19.95
Programming the Raspberry Pi: Getting Started with Python - 2nd Edition by Simon Monk, $15.00

Finally, here are some additional random useful items that they don't carry, all via Amazon:

StarTech.com 24x27.5-Inch Desktop Anti-Static Mat, $17.44. Anti-static mats are important for ESD safety, to avoid damaging sensitive components. Use this larger one as your primary work surface.
Velleman AS4 Anti-Static Mat with Ground Cable - Desktop static dissipative mat - 11.8" x 22", $11.18. Smaller secondary mat for second work area.
Rosewill ESD Anti-Static Wrist Strap Components, $5.99, one per mat.
Elenco Electronics LP-560 Logic Probe, $18.00. For checking logic levels on IC pins, including catching quick pulses that you would miss using a meter.
Silvertronic 501784CS Solid Copper Alligator Clip w/Barrel (10 pieces), $19.99. Because copper absorbs heat well, these are used as heat sinks on component leads when soldering to avoid heat damage.
Elenco Electronics TL-21 Minigrabber to Minigrabber 5 pc Test Lead Set, $8.95. Minigrabbers are good for grabbing onto closely-spaced IC and header pins without shorting to adjacent pins.
3M Scotch #35 Electrical Tape Value Pack (5 colors), $10.10, because how can you build electrical stuff without electrical tape, color coded?
Scotch Super 33+ Vinyl Electrical Tape (black), $3.98
Sharpie Fine Point Asst Colors (8 colors), $6.30, for color coded marking.
Permatex 80050 Clear RTV Silicone Adhesive Sealant, 3 oz, $11.18. RTV is a universal technician's friend. Gobs of it serve as adhesive, sealers, hole plugs, gaskets, wire holders, vibration dampers, etc. There are a number of different formulations.

Total cost: $1567 for everything (I ordered 2 spools leaded solder and 1 leaded Chip Quik, no lead-free items, 10 DC barrel jacks, and all the screwdriver/tool sets, since you never know which tips and shanks will fit, and some cases need special access tools to open), with free shipping from both Adafruit and Amazon, $10 for Saleae. Plus Adafruit threw in a free half-size breadboard and a Circuit Playground Express.

Back in my teenage days, $10 was a major expenditure, and $100 was simply inconceivable. This is starting to add up to some real money, but it will leave you armed with the tools, knowledge, and skills sufficient to launch a career.

The really nice thing is that Adafruit provides a curated list of things to choose from, so you're getting the benefit of their experience and recommendations, all guided by that maker ethos. That was a big plus for me.

Bridging three centuries of maker technology in my workshop.

You can read about my first use of these tools, since I needed them almost immediately.

For a review of the outstanding Charles Platt books listed above, see Review: Make: Electronics and Make:More Electronics.

For a useful set of resources to help you learn electronics, see Learning About Electronics And Microcontrollers.

The Case For C+-

2018-03-28T20:32:00.000-07:00

No, that's not a typo. I really do mean C+-.

When using C++, there's a tendency for C programmers to think they have to use all the facilities of the language at once. Particularly user-defined classes, since C++ takes C into object-oriented programming. "If I'm using C++, I have to define some classes."

The danger in that is a risk of over-engineering, over-complicating things, by forcibly looking for ways to use classes when there may not be a need for that.

In a large, complex piece of software, there are many places that benefit from user-defined classes. In a bigger beast like that, the user-defined object-oriented approach helps abstract the problem.

But in small, quick tools, that's not necessarily the case, so plain C with a few simple structs is probably sufficient. However, there's still a lot of value to be found in the C++ standard library.

Specifically, the managed string and container classes. One of the big complaints about C is the need for explicit memory management. Because of that need, the C language and runtime library don't offer any native containers other than the statically-sized array, with the simple character array as an implementation of strings. If you need any dynamic structures, you have to implement your own, with explicit memory management on top of malloc() and free().

There are no native or standard library lists, hash tables, trees, or other dynamic structures. There are no dynamically-sized arrays or strings. There's no automatic deallocation of heap when you're finished using it.

As a result, C usage has been plagued by decades of buffer overflows and memory leaks. It also means a lot of time required to roll your own basic dynamic structures (and iron out the buffer overflows and memory leaks in their implementation).

But the C++ standard library provides all of those things. It also provides building blocks that can be used to layer more complex structure on them. That provides a lot of opportunities to build things without having to define any classes of your own.

I'm not saying there's anything wrong with classes. I'm just saying there's a whole class of programs that don't need any extra classes. The C++ standard library already provides a rich set of resources to choose from, often sufficient on their own to build useful programs that are faster to implement and debug than if you did everything in C.

You probably already treat templates that way. Even though C++ offers the ability to define templates, you may write tons of code without ever defining your own templates. Just because the language offers a feature doesn't mean you have to define any of your own things with it. Yet you still probably make extensive use of templates through the library.

And so it is with classes. You can write tons of code without ever defining your own classes. Doing lots of string processing, common in software tools? The C++ standard library provides a whole host of classes that will help, starting with std::string.

Need to keep lists of those strings, in the order you got them? How about a std::list<std::string>? Need fast associative storage keyed off the string, or a portion of it? How about a std::unordered_map<std::string, yourThingHere>? Need to keep a set of strings sorted by some other key? How about a std::map<yourThingHere, std::string>? Or something sorted by the strings? How about a std::map<std::string, yourThingHere>?

Then if you need something a little more complex than simple strings and structs, you can use std::pair<thingA, thingB> or std::bind(callableThing, args).
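Here's a minimal sketch of that style. The function and typedef names are made up for illustration; the point is that a list, a hash table, a sorted map, and a pair cover a lot of ground without a single user-defined class.

```cpp
#include <list>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical names; only standard library types are used.
typedef std::list<std::string> WordList;               // arrival order
typedef std::unordered_map<std::string, int> WordCounts; // fast lookup
typedef std::map<std::string, int> SortedCounts;       // sorted by word

// Count how often each word occurs, using a hash table keyed by word.
WordCounts countWords(const WordList& words)
{
    WordCounts counts;
    for (WordList::const_iterator it = words.begin();
         it != words.end(); ++it) {
        ++counts[*it];
    }
    return counts;
}

// Copy the counts into a std::map, which iterates in sorted key order.
SortedCounts sortByWord(const WordCounts& counts)
{
    return SortedCounts(counts.begin(), counts.end());
}

// The most frequent word and its count, composed as a std::pair.
// Returns an empty string and 0 if there are no words.
std::pair<std::string, int> mostFrequent(const WordCounts& counts)
{
    std::pair<std::string, int> best("", 0);
    for (WordCounts::const_iterator it = counts.begin();
         it != counts.end(); ++it) {
        if (it->second > best.second) {
            best.first = it->first;
            best.second = it->second;
        }
    }
    return best;
}
```

Swapping the std::map for a std::unordered_map, or the reverse, is a one-line typedef change, which is a big part of the appeal for quick tools.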

The other benefit to this is that at some point you may realize that perhaps there are some user-defined classes that would make sense in your program after all; it's not just strings and structs and pairs and binds. The infrastructure you've already built into the program is OO-ready. And you have std::shared_ptr<yourClassHere> to automate memory management and support RAII, avoiding memory leaks.
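A minimal sketch of that step up, assuming a hypothetical Message struct and container names that are not part of any real tool: one heap object is shared by a time-ordered vector and a lookup table, and is freed automatically when its last owner goes away.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical user-defined type, for illustration only.
struct Message
{
    std::string signature;
    std::string line;
};

typedef std::shared_ptr<Message> MessagePtr;
typedef std::vector<MessagePtr> MessageVec;                       // time order
typedef std::unordered_map<std::string, MessagePtr> MessageDict;  // by signature

// Create one Message on the heap and register it in both containers.
// No explicit delete is needed; the Message is destroyed when the
// last shared_ptr that refers to it is destroyed.
MessagePtr addMessage(MessageVec& ordered, MessageDict& bySignature,
                      const std::string& signature,
                      const std::string& line)
{
    MessagePtr msg = std::make_shared<Message>();
    msg->signature = signature;
    msg->line = line;
    ordered.push_back(msg);
    bySignature[signature] = msg;
    return msg;
}
```

The containers are the same ones you would already be using for plain strings; only the element type changed.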

So making the switch to a more heavily object-oriented program is a small step, a refinement, rather than throwing it all out and starting over again.

Meanwhile, you're already in the mindset of using just the minimum of appropriate user-defined classes, and not going overboard trying to beat everything into the shape of an OO nail just because you have an OO hammer.

That's adding just the amount of design and implementation complexity necessary to help abstract the appropriate parts of the problem, while maintaining a simple, pared-down elegance. Make things only as complex as you need to, and no more (as well as following Einstein's advice to make things as simple as possible, but no simpler).

Meanwhile, you're relying on a large body of fully-implemented and debugged composable, modular elements to speed the job to completion. In many ways, that right there is going a long way to meeting the promise of the "software IC".

So that's what I'm calling C+-. It's C++ minus the user-defined classes. Which is more than just writing plain C that you compile with the C++ compiler. It's simply object-oriented code that relies entirely on someone else's classes.

You can argue about whether that's a good thing or a bad thing in the grand scheme of things, but I see it as just another practical tool in your toolbox.

There are three situations where this approach is useful:

Quick tools where you need to get it done as fast as possible so you can use it to help you get on with your main work.
Competitive programming, where you're working under the gun.
Coding interviews, which are essentially competitive programming under a time limit, whether on a whiteboard, in a shared editing session, or in an automated coding assessment system.

As an example of this, here's a tool I've been wanting to have for a while. I work on IOT projects, distributed systems where small embedded system client devices communicate with large backend servers.

Debugging these can be challenging as you try to sift through the logs each side produces. Because many IOT systems lack real-time clocks, they may not know what actual time it is, so it's hard to match up activity in the client log with the activity in the server, especially when there are communication errors and voluminous logs.

The tool below, msgresolve, resolves the messages logged by a client IOT device and its server. The client tracks time since booted, in msec, and logs that timestamp on each line. The server tracks real time GMT to msec resolution, logging the date and time on each line.

The example logs here contain a very small amount of data, but it's not unreasonable for a log to have hundreds or thousands of messages.

In order for this to work, the message logging must have a way of identifying each message uniquely. This is known as the message signature, a short string that summarizes the message contents. The signature may be a cryptographic hash or message digest such as MD5, or a checksum or polynomial such as Fletcher or CRC.

The messages must have some degree of randomization in the contents so that no two messages in the same direction ever produce the same signature (at least for the duration of the logging). This randomization might be due to encryption, some incrementing field such as a timestamp or counter, or a randomized nonce.

Users of git will be familiar with this concept. The commit hash acts as the identifier for changes to file content, and is affected by even a single-byte change in the file contents.

Here the signature is formed from the message hash and the message length. Appending the length adds a little insurance against a hash collision, where messages with different contents hash to the same value: if the colliding messages differ in length, their signatures still differ. Two different messages of the same length can still collide in principle, but with a reasonable hash that is rare enough that the hash conditioned by the length gives a practically unique signature.
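As a rough sketch of forming such a signature, assume the hash and length arrive as separate text fields pulled from a log line, with the hash field possibly carrying a trailing comma. The function name makeSignature is hypothetical.

```cpp
#include <string>

// Build a message signature by appending the length field to the
// hash field, e.g. "0x47e21fdd," and "185" give "0x47e21fdd185".
// Assumes a fixed-width hash, so the boundary between hash and
// length is unambiguous. A trailing comma on the hash is dropped.
std::string makeSignature(std::string hashField,
                          const std::string& lenField)
{
    if (!hashField.empty() &&
        hashField[hashField.size() - 1] == ',') {
        hashField.erase(hashField.size() - 1);
    }
    return hashField + lenField;
}
```

Two messages that happen to share a hash but differ in length, say 28 bytes versus 68 bytes, end up with different signatures and so are not treated as a match.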

I had a couple thoughts on how to approach the algorithm. One was to treat it as a difference-matching problem, like the one the Unix diff utility solves. The other was a kind of match-and-merge approach. However, that seemed like it might head toward an O(N^2) algorithm (for each client message, run down the list of server messages to find a match), which would rapidly get too slow for large logs.

But that made me think about an indexed lookup method, where a faster lookup method would make that approach manageable.

Part of what made it tricky is the fact that even though the two logs have parallel, time-ordered sets of messages, there might be lost or corrupted messages, and the two logs might not cover the exact same range of time. So just because a message appeared in one log, there was no guarantee that its opposite appeared exactly as is in the other log.

The other thing that helped crystallize it was the realization that matching up a set of parallel ordered log entries could be viewed as three parts from the perspective of the client messages:

Handle any messages in the server log that preceded the messages matching the client messages.
Handle all the messages in the client log, which may or may not have matching server messages (along with intervening server messages that didn't have any matching client messages).
Handle any messages in the server log that followed the messages matching the client messages.

So this algorithm uses a hash table (std::unordered_map) to index a list (std::list) of log entries. The hash table (which I call a dict, as in a Python dict) is indexed by message signature. Ideally, for every transmitted message, there is a received message with matching signature. That's the basis of the lookup. Iterating linearly through the time-ordered list deals with the unmatched server messages. Reordered messages can produce some interesting results.

For every message, look up the signature in the other side's dict to find its matching message. That makes it an O(N) algorithm (the hash lookup done for each message is O(1) on average).

I did have to separate transmit from receive messages for each side, since it's possible for a received message to have the same signature as a transmitted message if all the randomizing factors are the same in both directions. Thus the signature on a client TX message would be used to look up the corresponding message in the server RX message dict.

The actual string storage for the log lines for each side is in the list, which is a time-ordered list. The dict entries contain references to those strings, so a dict is simply the index, by signature, of the list of strings.
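The list-plus-index arrangement can be sketched on its own like this. It is a simplified variation, not the code from the tool: the names are hypothetical, and the index stores pointers to the list's strings rather than references.

```cpp
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

typedef std::pair<std::string, std::string> SigLine;  // signature, log line
typedef std::list<SigLine> LineList;                  // time-ordered storage
typedef std::unordered_map<std::string, const std::string*> LineIndex;

// Append a line to the time-ordered list and index it by signature.
// std::list never relocates its elements, so the stored pointer
// stays valid as the list grows.
void addLine(LineList& lines, LineIndex& index,
             const std::string& signature, const std::string& line)
{
    lines.push_back(SigLine(signature, line));
    index.insert(LineIndex::value_type(signature,
                                       &lines.back().second));
}

// Average O(1) lookup by signature; returns nullptr when there is
// no line with that signature.
const std::string* findLine(const LineIndex& index,
                            const std::string& signature)
{
    LineIndex::const_iterator match = index.find(signature);
    return (match == index.end()) ? nullptr : match->second;
}
```

The strings live in exactly one place, the list. The index only points at them, so walking the list gives time order and probing the index gives the match, without copying any log lines.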

All of this can be managed with standard library objects, using std::pairs to bind cross reference information with the log strings. For simple composition, this works well. As you need to compose more complex objects, navigating pairs of pairs rapidly gets out of hand, so that's when to define some structs, or maybe some simple data classes.

The other thing that was very useful was to define a split() function, equivalent to the split() function in Python. I use split() and join() quite a bit in Python for similar text processing tools. They really speed up string processing, allowing you to tear apart and reassemble strings easily. That also crosses the C/C++ string boundary: split() takes a C-style character array and splits it into a vector of strings (std::vector<std::string>).
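For comparison, here's a sketch of a split() and join() pair that works purely on std::string and leaves the input untouched. The names splitString and joinStrings are hypothetical, chosen to keep them distinct from the split() in the tool.

```cpp
#include <string>
#include <vector>

typedef std::vector<std::string> StringVec;

// Split str on any character in delims, skipping empty fields.
// The input string is not modified.
StringVec splitString(const std::string& str, const std::string& delims)
{
    StringVec fields;
    std::string::size_type start = str.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = str.find_first_of(delims, start);
        fields.push_back(str.substr(start, end - start));
        start = str.find_first_not_of(delims, end);
    }
    return fields;
}

// Reassemble fields into one string with sep between each pair.
std::string joinStrings(const StringVec& fields, const std::string& sep)
{
    std::string result;
    for (StringVec::size_type i = 0; i < fields.size(); ++i) {
        if (i > 0) {
            result += sep;
        }
        result += fields[i];
    }
    return result;
}
```

Like strtok(), this treats runs of delimiters as a single separator, so the column padding in a log line doesn't produce empty fields.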

I used a number of typedefs of the standard objects as syntactic sugar. That's a big help when declaring an iterator for an unordered map of composed pairs.
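To show how much the typedefs help, here's a small sketch with hypothetical names. Without them, the iterator declaration in the loop would need the full nested template type spelled out.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Without typedefs, the iterator below would be declared as:
//   std::unordered_map<std::string,
//       std::pair<std::string, int> >::const_iterator it;
typedef std::pair<std::string, int> LineInfo;   // log line, byte count
typedef std::unordered_map<std::string, LineInfo> InfoDict;

// Sum the byte counts of every entry in the dict.
int totalBytes(const InfoDict& dict)
{
    int total = 0;
    for (InfoDict::const_iterator it = dict.begin();
         it != dict.end(); ++it) {
        total += it->second.second;
    }
    return total;
}
```

The typedef names also document intent: LineInfo says what the pair means, where std::pair<std::string, int> only says what it contains.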

With the split() function and the typedefs of the standard objects acting as power tools, the code was straightforward.

The resulting output of the tool makes it easy to navigate the logs and correlate activity. One useful modification would be to have it group all the other non-message log lines with the nearest message (though that brings up the problem of deciding whether the lines should be grouped with the nearest subsequent message, or the nearest previous message). That would be especially useful behind a GUI like tkdiff (see, there's that diff thinking again...).

For another example of code like this, see More C+-.

The source, msgresolve.cpp (I had to do a little odd line-folding to make it fit in the width below):

// Usage: msgresolve <clientLog> <serverLog>
//
// Resolve client/server logs from the client perspective. That
// treats the total sequence of messages as 3 sections:
//   1) Initial unmatched server messages.
//   2) Client messages that may be matched or unmatched,
//      interspersed with unmatched server messages.
//   3) Remaining unmatched server messages.
//
// This is an example of a C++ program that is written mostly
// in plain C style, but that makes use of the container and
// composition classes in the C++ standard library. It is a
// lightweight use of C++ with no user-defined classes.
//
// 2018 Steve Branam <sdbranam@gmail.com> learntocode

#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

#define SERVER_PREFIX "    "

enum ARGS
{
    ARGS_PROGNAME,
    ARGS_CLIENT_LOG,
    ARGS_SERVER_LOG,
    ARGS_REQUIRED
};

enum CLIENT
{
    CLIENT_TIMESTAMP,
    CLIENT_FILE,
    CLIENT_LINE,
    CLIENT_SEVERITY,
    CLIENT_DIRECTION,
    CLIENT_HASH_KEYWORD,
    CLIENT_HASH,
    CLIENT_LEN,
    CLIENT_BYTES_KEYWORD,
    CLIENT_TIMESTAMP_LEN = 10
};

enum SERVER
{
    SERVER_DATE,
    SERVER_TIME,
    SERVER_THREAD,
    SERVER_SEVERITY,
    SERVER_FUNC,
    SERVER_CLIENT,
    SERVER_DIRECTION,
    SERVER_HASH_KEYWORD,
    SERVER_HASH,
    SERVER_LEN,
    SERVER_BYTES_KEYWORD,
    SERVER_TIME_LEN = 16
};

typedef std::string String;
typedef std::vector<String> StringVec;
typedef std::pair<String, String> StringPair;
typedef std::list<StringPair> MsgList;
typedef std::unordered_map<String, String&> MsgDict;
typedef std::pair<String, String&> MsgDictEntry;

MsgList clientTimestamps;
MsgDict clientReceives;
MsgDict clientTransmits;

MsgList serverTimestamps;
MsgDict serverReceives;
MsgDict serverTransmits;

StringVec split(char* str, const char* delim)
{
    StringVec strings;

    char *token = std::strtok(str, delim);
    while (token != NULL) {
        strings.push_back(token);
        token = std::strtok(NULL, delim);
    }
    
    return strings;
}

bool isClientTimestamp(const String& str)
{
    if (str.size() == CLIENT_TIMESTAMP_LEN) {
        for (String::size_type x = 0; x < str.size(); ++x)
        {
            if (!isdigit(str[x])) {
                return false;
            }
        }
        return true;
    }
    return false;
}

bool isServerTime(const String& str)
{
    if (str.size() == SERVER_TIME_LEN) {
        for (String::size_type x = 0; x < str.size(); ++x)
        {
            if (!isdigit(str[x]) &&
                (str[x] != ':') &&
                (str[x] != '.') &&
                (str[x] != ']')) {
                return false;
            }
        }
        return true;
    }
    return false;
}

bool isClientRxTx(const StringVec& fields)
{
    return ((fields.size() > CLIENT_BYTES_KEYWORD) &&
            isClientTimestamp(fields[CLIENT_TIMESTAMP]) &&
            (fields[CLIENT_DIRECTION] == "RX" ||
             fields[CLIENT_DIRECTION] == "TX") &&
            (fields[CLIENT_HASH_KEYWORD] == "hash") &&
            (fields[CLIENT_BYTES_KEYWORD] == "bytes\n" ||
             fields[CLIENT_BYTES_KEYWORD] == "bytes,"));
}

bool isServerRxTx(const StringVec& fields)
{
    return ((fields.size() > SERVER_BYTES_KEYWORD) &&
            isServerTime(fields[SERVER_TIME]) &&
            (fields[SERVER_DIRECTION] == "RX" ||
             fields[SERVER_DIRECTION] == "TX") &&
            (fields[SERVER_HASH_KEYWORD] == "hash") &&
            (fields[SERVER_BYTES_KEYWORD] == "bytes\n" ||
             fields[SERVER_BYTES_KEYWORD] == "bytes,"));
}

bool loadClient(const char* fileName)
{
    FILE* file = std::fopen(fileName, "r");
    
    if (file) {
        char buffer[1000];
        while (std::fgets(buffer, sizeof(buffer), file) != NULL) {
            String line(buffer);
            StringVec fields = split(buffer, " ");

            if (isClientRxTx(fields)) {
                // Remove trailing comma.
                fields[CLIENT_HASH].pop_back();

                String key(fields[CLIENT_HASH]);
                key.append(fields[CLIENT_LEN]);

                String xref(fields[CLIENT_DIRECTION]);
                xref.append(key);

                clientTimestamps.push_back(StringPair(xref, line));
                if (fields[CLIENT_DIRECTION] == "RX") {
                    clientReceives.insert(MsgDictEntry(key,
                                   clientTimestamps.back().second));
                } else {
                    clientTransmits.insert(MsgDictEntry(key,
                                   clientTimestamps.back().second));
                }
            }
        }
        std::fclose(file);
        return true;
    }
    std::cout << "Failed to open client file "
              << fileName << std::endl;
    return false;
}

bool loadServer(const char* fileName)
{
    FILE* file = std::fopen(fileName, "r");
    
    if (file) {
        char buffer[1000];
        while (std::fgets(buffer, sizeof(buffer), file) != NULL) {
            String line(buffer);
            StringVec fields = split(buffer, " ");

            if (isServerRxTx(fields)) {
                // Remove trailing comma.
                fields[SERVER_HASH].pop_back();

                String key(fields[SERVER_HASH]);
                key.append(fields[SERVER_LEN]);

                String xref(fields[SERVER_DIRECTION]);
                xref.append(key);

                serverTimestamps.push_back(StringPair(xref, line));
                if (fields[SERVER_DIRECTION] == "RX") {
                    serverReceives.insert(MsgDictEntry(key,
                                   serverTimestamps.back().second));
                } else {
                    serverTransmits.insert(MsgDictEntry(key,
                                   serverTimestamps.back().second));
                }
            }
        }
        std::fclose(file);
        return true;
    }
    std::cout << "Failed to open server file "
              << fileName << std::endl;
    return false;
}

void printRxSeparator()
{
    std::cout << "   /" << std::endl
              << "  <" << std::endl;
}

void printTxSeparator()
{
    std::cout << "  \\" << std::endl
              << "   >" << std::endl;
}

void printTransactionSeparator()
{
    std::cout << std::endl
              << "---------" << std::endl
              << std::endl;
}

// Find next server match for client processing, processing any
// unmatched server messages along the way.
void findNextServerMatch(MsgList::iterator& curServer)
{
    for (bool found = false;
         !found && curServer != serverTimestamps.end();) {
        std::string& xref(curServer->first);
        std::string key(xref.substr(2));
        std::string& server(curServer->second);
        
        if (xref[0] == 'R') {
            found = (clientTransmits.find(key) !=
                     clientTransmits.end());
            if (!found) {
                std::cout << "Client transmit not found"
                          << std::endl;
                printTxSeparator();
                std::cout << SERVER_PREFIX << server;
            }
        } else {
            found = (clientReceives.find(key) !=
                     clientReceives.end());
            if (!found) {
                std::cout << SERVER_PREFIX << server;
                printRxSeparator();
                std::cout << "Client receive not found"
                          << std::endl;
            }
        }
        
        if (!found) {
            printTransactionSeparator();
            curServer++;
        }
    }
}

// Process all client messages, checking for unmatched server
// messages along the way.
void processClient(MsgList::iterator& curServer)
{
    for (MsgList::iterator curClient = clientTimestamps.begin();
         curClient != clientTimestamps.end();
         curClient++) {
        std::string& xref(curClient->first);
        std::string key(xref.substr(2));
        std::string& client(curClient->second);
        MsgDict::iterator match;
        bool matched = false;

        if (xref[0] == 'R') {
            match = serverTransmits.find(key);
            matched = (match != serverTransmits.end());
            if (!matched) {
                std::cout << SERVER_PREFIX
                          << "Server transmit not found" << std::endl;
            } else {
                std::cout << SERVER_PREFIX << match->second;
            }

            printRxSeparator();
            std::cout << client;
        } else {
            std::cout << client;
            printTxSeparator();
            
            match = serverReceives.find(key);
            matched = (match != serverReceives.end());
            if (!matched) {
                std::cout << SERVER_PREFIX
                          << "Server receive not found" << std::endl;
            } else {
                std::cout << SERVER_PREFIX << match->second;
            }       
        }
        printTransactionSeparator();
        
        // Track the match per dict: comparing an iterator from
        // serverTransmits against serverReceives.end() is undefined.
        if (matched && curServer != serverTimestamps.end()) {
            // Matched, advance server iterator and find next
            // matching server msg.
            findNextServerMatch(++curServer);
        }
    }
}

void resolve()
{
    MsgList::iterator curServer = serverTimestamps.begin();

    // Handle any initial unmatched server messages.
    findNextServerMatch(curServer);
    
    // Handle client messages interspersed with any unmatched
    // server messages.
    processClient(curServer);

    // Handle any remaining unmatched server messages.
    findNextServerMatch(curServer);
}

int main(int argc, char* argv[])
{
    if (argc < ARGS_REQUIRED ||
        String(argv[1]) == "-h") {
        std::cout << "Usage: " << argv[ARGS_PROGNAME]
                  << " <clientLog> <serverLog>" << std::endl;
        return EXIT_FAILURE;
    }
    else {
        if (loadClient(argv[ARGS_CLIENT_LOG]) &&
            loadServer(argv[ARGS_SERVER_LOG])) {
            resolve();
        } else {
            return EXIT_FAILURE;
        }
    }
    return EXIT_SUCCESS;
}

Sample client log (the 0x11111111 hashes are ones I deliberately changed to break the match):

0345604820          comm.c, 1529, D: TX hash 0x47e21fdd, 185 bytes, msg type 3
0345605799          comm.c, 1426, D: RX hash 0xd331bb95, 35 bytes
0345605916          comm.c, 1529, D: TX hash 0x2f66bbd6, 180 bytes, msg type 15
0345606875          comm.c, 1426, D: RX hash 0x11111111, 28 bytes
0345607011          comm.c, 1529, D: TX hash 0x6924ebfd, 69 bytes, msg type 16
0345607146          comm.c, 1426, D: RX hash 0x183d710c, 33 bytes
0345607215          comm.c, 1529, D: TX hash 0x5c4b78f4, 504 bytes, msg type 18

Sample server log:

[2018-02-05 20:50:04.093798] [0x00007f3412dc8700] [debug]   send_msg()  00000062 TX hash 0xfcf3f009, 33 bytes, msg type 19
[2018-02-05 20:50:04.101101] [0x00007f3412dc8700] [debug]   send_msg()  00000062 TX hash 0xca5c8aea, 53 bytes, msg type 15
[2018-02-05 20:51:45.796547] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x47e21fdd, 185 bytes
[2018-02-05 20:51:45.812284] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0xd331bb95, 35 bytes, msg type 3
[2018-02-05 20:51:46.894310] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x2f66bbd6, 180 bytes
[2018-02-05 20:51:46.894661] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0x7495ff13, 29 bytes, msg type 17
[2018-02-05 20:51:46.894829] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0x183d710c, 33 bytes, msg type 19
[2018-02-05 20:51:46.903009] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0xc1575ef6, 53 bytes, msg type 15
[2018-02-05 20:51:47.894246] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x11111111, 68 bytes
[2018-02-05 20:51:48.732482] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x5c4b78f4, 504 bytes
[2018-02-05 20:52:39.990683] [0x00007f34125c7700] [debug]   handle_msg()  :00000062 RX hash 0x15667979, 185 bytes
[2018-02-05 20:52:39.999387] [0x00007f34125c7700] [debug]   send_msg()  :00000062 TX hash 0x3b1bf5ec, 35 bytes, msg type 3

Sample output:

$ ./msgresolve client.log server.log
    [2018-02-05 20:50:04.093798] [0x00007f3412dc8700] [debug]   send_msg()  00000062 TX hash 0xfcf3f009, 33 bytes, msg type 19
   /
  <
Client receive not found

---------

    [2018-02-05 20:50:04.101101] [0x00007f3412dc8700] [debug]   send_msg()  00000062 TX hash 0xca5c8aea, 53 bytes, msg type 15
   /
  <
Client receive not found

---------

0345604820          comm.c, 1529, D: TX hash 0x47e21fdd, 185 bytes, msg type 3
  \
   >
    [2018-02-05 20:51:45.796547] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x47e21fdd, 185 bytes

---------

    [2018-02-05 20:51:45.812284] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0xd331bb95, 35 bytes, msg type 3
   /
  <
0345605799          comm.c, 1426, D: RX hash 0xd331bb95, 35 bytes

---------

0345605916          comm.c, 1529, D: TX hash 0x2f66bbd6, 180 bytes, msg type 15
  \
   >
    [2018-02-05 20:51:46.894310] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x2f66bbd6, 180 bytes

---------

    [2018-02-05 20:51:46.894661] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0x7495ff13, 29 bytes, msg type 17
   /
  <
Client receive not found

---------

    Server transmit not found
   /
  <
0345606875          comm.c, 1426, D: RX hash 0x11111111, 28 bytes

---------

0345607011          comm.c, 1529, D: TX hash 0x6924ebfd, 69 bytes, msg type 16
  \
   >
    Server receive not found

---------

    [2018-02-05 20:51:46.894829] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0x183d710c, 33 bytes, msg type 19
   /
  <
0345607146          comm.c, 1426, D: RX hash 0x183d710c, 33 bytes

---------

    [2018-02-05 20:51:46.903009] [0x00007f34135c9700] [debug]   send_msg()  :00000062 TX hash 0xc1575ef6, 53 bytes, msg type 15
   /
  <
Client receive not found

---------

Client transmit not found
  \
   >
    [2018-02-05 20:51:47.894246] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x11111111, 68 bytes

---------

0345607215          comm.c, 1529, D: TX hash 0x5c4b78f4, 504 bytes, msg type 18
  \
   >
    [2018-02-05 20:51:48.732482] [0x00007f34135c9700] [debug]   handle_msg()  :00000062 RX hash 0x5c4b78f4, 504 bytes

---------

Client transmit not found
  \
   >
    [2018-02-05 20:52:39.990683] [0x00007f34125c7700] [debug]   handle_msg()  :00000062 RX hash 0x15667979, 185 bytes

---------

    [2018-02-05 20:52:39.999387] [0x00007f34125c7700] [debug]   send_msg()  :00000062 TX hash 0x3b1bf5ec, 35 bytes, msg type 3
   /
  <
Client receive not found

---------

We Need To Build Security In

2018-03-17T10:18:00.000-07:00

The Old Priorities

For a long time, the basic priorities for software were:

Functional: does it work right?
Performance: is it fast enough?

The first is obvious. If software doesn't work right, it isn't going to be useful. It has to be a correct design, correctly implemented.

Once the first has been achieved, the second becomes critically important. As systems scale up, performance has become an enormous driver. Do everything you can to make it fast, as long as it still works right (and there's an argument to be made for putting performance first, then making it work right subject to maintaining performance).

This is often expressed as "first make it work, then make it fast".

Failure to achieve either of these can mean failure in the marketplace.

In a few cases, security was also a requirement. But often, it wasn't. Or it was a distant third priority or an afterthought, always on the losing side of compromises for the first two.

The result is that we've sacrificed security on the three-legged altar of time to market, convenience, and performance. We've built everything with the assumption that everyone out there is well-behaved, using things only as they were intended to be used.

I've got some bad news for you, sunshine: there are bad people out there. They're all too happy to slip into our insecure systems and have their way. These are people who actively search out ways to abuse, confuse, and misuse systems for their own purposes.

They have a variety of motivations and goals, with a variety of resources at their disposal, from the single script kiddie just wanting to impress his friends to criminals, terrorists, and state-sponsored cyberespionage and cyberwarfare groups.

The potential consequences of these attacks range from minor annoyances to financial disaster to service outages to outright physical destruction, ranging in scope from personal to national. They can ruin lives.

We see real cases of this daily in the news, in data breaches; botnet recruiting (taking over legitimate machines for use as bots); DDOS attacks; identity theft; account takeovers; social media fake news, fake accounts, and fake followers; ATM and POS skimming, siphoning, and jackpotting; all manner of large and small financial attacks and scams; ransomware of critical data systems; industrial espionage; and other attacks and disruptions.

Just Google any of those terms if you want some depressing reading. Every new technology just seems to bring a whole new raft of attack opportunities.

At the risk of sounding overly alarmist, we've built an incredibly fragile house of cards, completely permeable to bad actors. The Big Bad Wolf doesn't even need to huff or puff. All he has to do is inhale to bring it down.

It's equivalent to doing all your banking by storing your money in grocery bags outside your front door.

And yet our lives increasingly depend on these systems. We've made ourselves completely vulnerable. We've left ourselves completely exposed.

Security Needs To Be The New Top Priority

Especially with the adoption of ubiquitous network connectivity over the past decade, that needs to change. Security needs to be the primary requirement, and the other two need to compromise to support it:

Security: is it secure?
Functional: does it work right?
Performance: is it fast enough?

Now correct function and performance both need to be subject to security. Does it work right, and still maintain security? Do everything you can to make it fast, as long as it's still secure and still works right.

First make it secure, then make it work, then make it fast. And make sure it stays secure.

That means design and implementation decisions need to be made in a way that favors security. There are choices and ways of doing things that lead to insecure software. Make the choices that lead to secure software.

Security has really been a wholly overlooked critical segment of software engineering. In retrospect, that's irresponsible.

In other types of engineering, safety is the analogous property. In automobile or aircraft design, safety is a critical area. Imagine what would happen to a car company that ignored safety.

We need to add a fourth leg to that altar: security, time to market, convenience, and performance.

Build Security In

Here I'm adopting Gary McGraw's mantra: build security in. That means you address security first, then achieve proper functioning and performance while maintaining it.

I'll temper that with Bruce Schneier's key point: security is a trade-off. That means there's no such thing as absolute security, and you get security by giving something up.

I look at the combination of the two like this: we must focus on security from the start, but we have to realize that it can only get us so far within the context of the larger environment, and we're going to have to give up something in functional convenience and performance.

I'm not a security expert. I'm a student of security, so that I can become a practitioner. That's what we all need to do, become students of security so that we can become practitioners, looking to experts like McGraw and Schneier to guide us in the appropriate practices.

Real security engineering requires you to think from both sides of the fence. You need to think like a good guy defender ("white hat") and a bad guy attacker ("black hat").

On the white hat side, you need to know the proper security practices to follow. On the black hat side, you need to know what attacks will be arrayed against you; otherwise you end up creating the software version of the Maginot line, an ineffective defense against the actual attack.

Security isn't something you bolt on after the fact. There is no "security layer". It has to be built in from the beginning. It has to be interwoven throughout, part of the raw fabric.

And just because one part is secure doesn't mean that all the rest is safe. It's all too easy to undermine the security by not maintaining vigilance system-wide, throughout all uses of the system and the data it produces, in all environments and contexts, over its entire life.

Security is easy to get wrong and hard to get right, and easy to get wrong again once you get it right. There are a lot of details. Understanding those details and how they all fit together takes effort. That's why you have to study the literature and learn how to apply the techniques properly.

Some of the recommendations may seem arbitrary. For instance, a recommendation not to use a particular library function, because it's been the source of many security vulnerabilities in the past. You can say, well, I'm going to use it correctly in my code so that doesn't happen.

But what about a year from now, when you've moved on to another project, or you've left the company, and someone else comes in and has to make some changes to add a new feature? Or they lift your code out to a different context. They may not notice the potential for a problem and end up making your formerly safe code unsafe.

Borrowing a line from the top 10 security design flaws document in the reading list below, designing for security should take into account that code typically evolves over time, resulting in the risk that gaps in security are introduced in later stages of the software life-cycle.

What Causes Vulnerabilities?

Vulnerabilities are problems that can be exploited by attackers. They are the unlocked doors that allow entry. Not all software problems result in security vulnerabilities. But software problems are a rich ground for finding vulnerabilities. What causes them?

We can look at software correctness in two dimensions, design and implementation. Each can be either correct or incorrect. Adopting McGraw's terminology, "flaws" are problems in design. "Bugs" are problems in implementation.

Note that I'm lumping requirements in with design, so incorrect requirements imply incorrect design. You could treat requirements as a third independent dimension that can be correct or incorrect, but the results are really the same for this discussion.

This gives us four quadrants into which software may fall:

Software is problem-free in only one quadrant: correct design (free of flaws), and correct implementation of that design (free of bugs).

It's very important to realize that in two of the quadrants where one dimension is correct, you are still doomed to have software problems. You can have a correct design, but incorrect implementation. Or, you can have a perfect, bug-free implementation, but of an incorrect design.

If you treat requirements as a third dimension, that produces a cube of eight octants. You can see that this discussion generalizes to the same thing. Software is problem-free in only one octant: correct requirements, correct design to meet those requirements, and correct implementation of that design. If the requirements are incorrect, no matter how perfect the design and implementation, the software has problems.

So for simplicity, we can collapse it down to the two-dimensional discussion. Just be aware that if you get the requirements wrong, the design is by definition incorrect (since it is designed for the wrong thing, no matter how perfectly done).

What all this means is that there are many opportunities to create a problem, and a potential vulnerability.

That's part of what I mean when I say security is hard to get right, and easy to get wrong. The other part is that there are lots of subtle details, and getting any single one wrong risks undermining all the rest.

That's what real engineering is about, dealing with all that, being rigorous and thorough and getting it all right top to bottom, beginning to end. That's what it means to be a responsible professional. Yeah, it's complicated. Yeah, it's hard work.

Are the odds really as bad as just a 1 in 4 chance of getting it right, or even 1 in 8? That may be abusing probability and statistics to overstate the situation, but it does show that the odds are against you.

And if you aren't testing for security vulnerabilities, you can bet that those bad people are. They're out there actively searching for your systems and probing them for vulnerabilities. They will find them. Then all you've done is added to the problem.

The tools for evaluating and implementing security are useful to both defenders and attackers. Regardless of how you use those tools to improve security (if at all), adversaries are using them to pick your systems apart.

That's why you need to learn how to use them, and why you have to put on the black hat and think that way. Where attackers will use the results to attack your system, you can use those same results to feed back into the development process to improve the design and implementation of the system from a security standpoint.

Next Steps

The first step is awareness. That's what this post is about. The second step is learning. The third step is putting the knowledge into practice. The fourth step is maintaining continuous vigilance.

It starts with us, the developers. It also ends with us, because no one else is going to do it.

Reading List

This is the reading list I've accumulated for the second step, learning, that I'm working my way through. There's some overlap here with my reading list from Testing Is How You Avoid Looking Stupid. Once again, the market in used books helps keep the cost down.

Interestingly, most of these are over 10 years old. Yet they remain as timely as ever. The same vulnerabilities still show up repeatedly. But their potential impact on our real daily lives has grown significantly. These are no longer abstract problems.

There are three nice starting points. They help set the background necessary to appreciate the others:

First, because it's free, relatively brief, and a nice overview of security design flaws, is the IEEE Computer Society Center for Secure Design paper AVOIDING THE TOP 10 SOFTWARE SECURITY DESIGN FLAWS.
Second, another free overview of common security risks, the Open Web Application Security Project paper OWASP Top 10 Application Security Risks - 2017.
Third, McGraw's book Software Security: Building Security In, 2006, is a nice overview of the various considerations and methods for incorporating security, including the white hat/black hat duality.

Here's the remainder of the list, in no particular order, which will no doubt lead to many others:

Building Secure Software: How to Avoid Security Problems the Right Way, 2001, John Viega, Gary McGraw.
Exploiting Software: How to Break Code, 2004, Greg Hoglund, Gary McGraw.
Schneier on Security, 2008, Bruce Schneier.
Update Oct 1, 2018: Click Here to Kill Everybody: Security and Survival in a Hyper-connected World, 2018, Bruce Schneier. This is Schneier's newest book, focused on IOT-related security, a must-read given how pervasive Internet-connected systems have become to our daily lives.
Security Engineering: A Guide to Building Dependable Distributed Systems, 2008, Ross Anderson.
Writing Secure Code: Practical Strategies and Proven Techniques for Building Secure Applications in a Networked World (Developer Best Practices), 2004, Michael Howard, David LeBlanc.
SEI CERT C Secure Coding Standard: Rules for Developing Safe, Reliable, and Secure Systems (free downloadable PDF), 2016, multiple authors.
Secure Coding in C and C++ (2nd Edition) (SEI Series in Software Engineering), 2013, Robert C. Seacord. Seacord is one of the authors of the SEI CERT C Secure Coding Standard above.
The Shellcoder's Handbook: Discovering and Exploiting Security Holes, 2nd Ed. 2007, Chris Anley, John Heasman, Felix Lindner, Gerardo Richarte.
Forensic Discovery, 2005, Dan Farmer, Wietse Venema.

For a little perspective on the nature of vulnerabilities, see C.A.R. "Tony" Hoare's presentation on null references, what he calls his "billion dollar mistake" (though perhaps karma and cost balance out, since he also invented the quicksort algorithm, among many other brilliant contributions to computer science).

In addition to McGraw's and Schneier's websites, several good sources for security-related news and information:

Risks Digest, Forum on Risks to the Public in Computers and Related Systems, ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator. This is where it all starts for me, fascinating reading (in the way watching a train wreck is fascinating).
CMU SEI Cybersecurity, Carnegie Mellon University Software Engineering Institute cybersecurity main page.
CMU SEI CERT Division, CMU SEI Computer Emergency Response Team.
Krebs On Security, Brian Krebs.
Threatpost.
Open Web Application Security Project (OWASP)
Others? Probably, but also be aware that this is a topic ripe for abuse, so CHECK YOUR SOURCES AND CORROBORATE YOUR INFORMATION. 'Nuff said.

First Code

2017-12-10T13:04:00.001-08:00

(Go back to Learn To Code Introduction)

Let's jump into some code. I'll be spewing terms right and left here. I'll deal with some quickly, just enough detail to get by, but defer others for later discussion.

I'm also going to be a bit wordy, saying the same things in different ways so that you pick up the terminology and the different ways people express these things. Forgive me if I beat a concept to death. As you'll see, many terms also get reused for different things in a mix of formal and informal usage. That's why terminology can be so confusing.

And for everything I say, there are exceptions, arguments, counter-arguments, and many more details. I'll address some of those in later posts. For now just bear with me so we don't get too far off into the weeds.

For every characteristic of a given language, there are those who think it's great, and those who think it's terrible. Good or bad, some things are certainly a source of confusion. I take a neutral approach. A language is a tool, it is what it is. Understand the pros and cons and bear them in mind when using it.

An important point to remember is that there are always multiple ways to do something. From the formatting of source code to data structures, design, and organization of a program, you have virtually infinite choices. These rapidly escalate into religious wars. Some choices are worth arguing over. Some aren't.

The C Language

C is a high-level language (as opposed to a low-level language), which means it is a human-readable text language where source code specifies the instructions for how a program runs. However, computers don't execute high-level code, they execute binary machine instructions, also known as machine code. At some point, high-level source code needs to be translated to machine code so that it can be executed.

This brings up the critically important concept of abstraction, which will show up in many ways in this series. It takes multiple machine instructions to carry out each source instruction.

High-level languages elevate your thinking to a higher level of abstraction, allowing you to abstract the low-level details. This is a huge benefit, because instead of having to think about all the tiny details of machine instructions, you can focus on the higher-level concepts of your program logic.

Contrast this to assembly language, which is a human-readable low-level language. Assembly language statements translate one for one to machine code, so you have to deal with all those low-level details.

C is a compiled language (as opposed to an interpreted language), which means a software tool called a compiler translates source code into machine code, also called object code. The compiler compiles source code instructions and generates object code corresponding to their logic.

Each source file produces an object file. These files are often referred to simply as sources and objects (but don't confuse this use of the term object with its use in object-oriented programming (OOP)).

A tool called a linker then links your objects with objects from pre-built runtime libraries to produce the final complete program. The result is an executable, also known simply as a binary.

This set of tools and libraries comprises the toolchain, and is specific to the type of system where you'll be running the binary. That's why there are separate versions of programs for Mac and for Windows.

Once you've built the binary with this build process, you can run it any number of times without having to run it through the toolchain again. You only have to rebuild if you make a change to the source.

You can distribute the binary to other people who have the same type of system without having to give them your source code. They don't need to have the toolchain to be able to use the binary.

C is a statically-typed language (as opposed to a dynamically-typed language, static meaning constant, fixed, unchanging, and dynamic meaning varying, changing), which means you have to declare an item of data in a type declaration to tell the compiler what type it is (integer numeric, floating point numeric, character string, etc.) before you can use it, and you can only store that type of data in it (that's the static part, the fact that the type is fixed ahead of time).

C has rules about what words are reserved as part of the language, known as keywords, how you can name your own things as user-defined names, punctuation, and how to form statements. This is the syntax of the language, just as grammar and spelling rules are the syntax of spoken languages.

Those statements have some particular meaning and cause the program to behave in a particular way. This is the semantics of the language, just as the meaning and implications of sentences are the semantics of spoken languages.

And like spoken languages, you can construct statements that are syntactically correct, but semantically incorrect, such as saying, "The sky is fast." That's a perfectly legal sentence, but it doesn't make any sense. Similarly, you can write code that is legal, but doesn't do what you want.

To make a program that does what you want, you have to write code that is both syntactically correct, and semantically correct. The compiler will tell you if you make syntax errors, so you correct the code and try again, but once you have correct syntax, it can't tell you anything about the semantics. You have to run the program and test it to tell if you got the semantics right.

That's the real challenge of software development. The compiler will quickly help you find and correct syntax errors. But a complex piece of software can have many behaviors, and testing and verifying them can be as much work as writing it in the first place.

Further complicating things, while there's only one set of correct behaviors that you want it to do, there's an infinite variety of random incorrect things it can do if the semantics are wrong. When it does strange and unexpected things, you have to figure it out so you can correct the semantics. This can be time-consuming and frustrating. "Well, I know what I wanted it to do, but what did it actually do? And why?"

The classic book The C Programming Language by Kernighan and Ritchie (known as K&R) established the tradition of the "hello, world" program before getting into slightly more complex examples. I'll buck that tradition by skipping right to the latter. Like their examples, this provides the framework to start presenting lots of details.

K&R described the initial version of the C language, defining the original syntax and semantics. Over the years, the language has been changed to improve and standardize it. The current version is known as C11, for the 2011 standard. The previous version was C99.

You need to know which standard your compiler supports so that you write the code it will understand. Some language version differences are minor, making the syntax a little more convenient, and some are major, changing the way you do things. I'll use C11 here, but I have a tendency to use older style when I don't think about it, and you'll run into code written that way out in the real world.

Source Control

You can find all the source code for this series in a public GitHub repository at https://github.com/sdbranam/learntocode. GitHub is an online version control system (VCS), a place that stores source code, also known as a source control system. It can store multiple versions of the code. There are other VCSs besides GitHub, which is actually an online hosting service built on git.

A VCS has two main purposes: protecting code against loss, and sharing code among multiple developers. Source code can be lost two ways: by deleting a file (losing the entire file), or by changing its contents (losing that particular version of the code).

By storing multiple versions, a VCS allows any version to be recovered. It also allows tracing specific changes to specific developers, so you can tell who did what to the code, known as the annotate or blame function.

Sharing allows multiple people to work on code, or distribution of code from the authors to others, as I'm doing here. Settings on the repository, known as a repo, control who is allowed to see its contents (you might want it to be private within your company), and who is allowed to make changes to it (you might only want authorized developers to be able to change it).

This is my repo, that I've made publicly accessible. Anyone can create their own public repo for free, and I'll show you how to do that later, since you'll want to have one for working on this series. A public repo is also a good way to showcase your work to others.

The Program

This is file printargs.c, containing the entire program:

/* 
 * Printargs prints the list of arguments from the command line.
 * It returns EXIT_SUCCESS if at least one argument was specified,
 * or EXIT_FAILURE if no arguments were specified.
 *
 * 2017 Steve Branam <sdbranam@gmail.com> learntocode
 */

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int i;

    if (argc < 2) {
        printf("Usage: %s <arguments>\n"
               "Prints command line arguments.\n",
               argv[0]);
        return EXIT_FAILURE;
    }
    else {
        for (i = 0; i < argc; ++i) {
            printf("%d: %s\n", i, argv[i]);
        }
    }
    return EXIT_SUCCESS;
}

Note the different colors. This is called syntax highlighting, which helps in visually navigating through code. Many source code editors do this, along with other convenience features that help speed up working on code.

I used the online source code formatter http://hilite.me/ to produce this listing, with CSS: "border:solid gray;border-width:1px 1px 1px 1px;padding:.1em .2em;" and Style: "emacs" (although it appears the blog settings override the 1px border). I ran the output of that through a crude Python script to extract specific lines below.

The way to read code is to scan visually looking for the blocks, to get a feel for the overall shape. And when I say shape, I mean that literally. The way it's spaced out and indented is a visual guide to its logic.

Spacing and indentation help you follow that structure. Code without spacing or indentation is very hard to follow. It's like trying to read a book where all the text is jammed together. Paragraphs help break up the page. Spacing and indentation help break up the source code.

The C compiler ignores spacing and indentation. It simply reads through the file, skipping over them as it parses the statements. With a couple of exceptions that I'll cover below, line boundaries are irrelevant, so you can arrange the code any way you want.

Identify the boundaries between blocks, then pick the ones to dig into further. That's more of that abstraction, allowing you to focus on the bigger things before you get into the finer details, like seeing the forest before you see the trees.

The first blocks to look for are the functions. These are the modular building blocks of code. They provide the overall separation of logic into manageable chunks.

Deciding how to divide things up into those chunks is a major part of the art of coding. It's one of those things that can be difficult to describe, but you know it when you see it. When you do see it, look for more by that person. Good code, like good art or good music, is something to be appreciated and emulated.

There are many guidelines for what constitutes good structuring of functions. For now, think in terms of division of labor. Rather than one big chunk that does it all from start to finish, divide up the work, like delegating a big job to a team of workers, each with their own responsibility, with some of them providing helper services that the others can use.

And just as it's a bad idea to overload an individual worker with too much, it's a bad idea to make a function too long. Break up the work into smaller functions that the larger function can call.

Just like a large team of workers that needs to be organized into a hierarchy of different levels that depend on each other to carry out their responsibilities, organize functions into a hierarchy where they depend on each other to get their work done.

A top level function depends on the next level of functions for the overall program, and those functions depend on a third level of functions for their responsibility, and so on, as deeply nested as necessary. That's how you manage complexity in real software.

Now I'll break down the different sections in the file, known as snippets. This is how I'll go through code throughout this series. I'll go into excruciating gory detail on this one because there are so many concepts to introduce.

/* 
 * Printargs prints the list of arguments from the command line.
 * It returns EXIT_SUCCESS if at least one argument was specified,
 * or EXIT_FAILURE if no arguments were specified.
 *
 * 2017 Steve Branam <sdbranam@gmail.com> learntocode
 */

Lines 1-7: a comment, free-form text that helps the reader understand the code. The comment is delimited by the /* and */ markers. Everything between these comment delimiters is ignored by the compiler.

Another style of comment delimiter is the double slash //. Everything from the double slash to the end of the line is a comment. This is one place where line boundaries mean something to the compiler.

This particular comment describes the program. It also lists author information. A brief header comment like this at the top of a file is a big help to readers sifting through files.

#include <stdio.h>
#include <stdlib.h>

Lines 9-10: preprocessor directives that direct the preprocessor to include two system header files at this point in the file. These are also known as system headers, header files, or simply headers. File inclusion is a way to pull other code into the file, breaking things up into modular parts.

These headers are from the standard library, which contains predefined code required to make your source a complete program. The headers themselves contain various declarations, including forward declarations that tell the compiler about the functions in the library.

File stdio.h contains declarations for the standard input/output (I/O) functions. File stdlib.h contains declarations for various constants, fixed data values that don't change as the program runs (i.e. they remain constant).

Preprocessor directives are the other place where line boundaries are significant. The preprocessor is actually an initial stage of the compiler that processes the source code text before compiling, executing directives as it finds them.

Each preprocessor directive takes one line, although that can be extended by putting a backslash \ at the end of the line to form a multi-line directive.

int main(int argc, char *argv[])
{
    int i;

    if (argc < 2) {
        printf("Usage: %s <arguments>\n"
               "Prints command line arguments.\n",
               argv[0]);
        return EXIT_FAILURE;
    }
    else {
        for (i = 0; i < argc; ++i) {
            printf("%d: %s\n", i, argv[i]);
        }
    }
    return EXIT_SUCCESS;
}

Lines 12-28: the main function, the top level function in the function call hierarchy. C requires one function in the program to be named main(), which defines the program entry point. This is where the program starts when you run it. Note that I use the function name with empty parentheses when I refer to it informally.

You can define other functions with any name you want as long as they conform to the C naming syntax.

C encloses things in braces {}. They delimit the function body itself, and blocks within the function. Any number of lines may appear in a block.

There are places where the braces aren't required, when a block contains only one line, but I put them in anyway because a common source of bugs is expanding a one-line block into a multi-line block and forgetting to add the braces.

Look at the shape of the function. You can see its outline and the shape of the blocks inside, hinting at its logical flow. After identifying the blocks, look at the details inside them. This one is pretty simple, but others can get more complex, with blocks nested within blocks.

Just as the functions work in layers, the code inside them does. As with the hierarchical layers of functions, these are layers of logic, like peeling back the layers of an onion.

But unlike layers of functions that can nest to any depth, you don't want to have too many layers within a single function, or it can be difficult to follow. Code that's difficult to follow has a higher likelihood of bugs (or you end up adding bugs when you try to change it).

Now let's dig down a level into the function.

12	int main(int argc, char *argv[])

Line 12: the function declaration, telling the compiler what the function's call interface is. This is how other code must call the function to invoke it.

The function main() has an interface that's predefined by C, but other functions allow you to specify the interface yourself.

The items separated by commas in the parentheses () are the function parameters. These are a type of variable that holds the arguments passed into the function when it is called.

Each parameter is identified by its data type, shown in bold green, and its name, shown in black. The asterisk * and square brackets [] indicate characteristics of the arguments. The asterisk means pointer. The brackets mean array. I'll cover pointers later; they're often a source of confusion, but once you understand how to visualize them, they're easy.

A data type specifies what kind of values are used. The int type means integer numbers, such as 0, 1, and -2. The char type means a character, as in a letter, digit, or punctuation mark.

An array is a contiguous block of elements of the same data type. An array of characters forms a character string, or simply string.

Because these types are predefined by the language, they are known as primitive types. You can use these as building blocks to define your own user-defined types.

In the case of main(), the parameters are the command line arguments that are passed to the program itself when you run it. This is one way to get information from the outside world into the program.

The argc parameter is the argument count, and argv is the argument vector that contains the list of all the arguments, including the program name. Vector is another term for an array. Thus argv is an array of pointers to characters.

Because of the way strings work in C, a pointer to a character is often interpreted as a pointer to a whole string of characters, not just a single character. I'll cover that more when I talk about pointers. But that means argv is an array of pointers to strings.

Once you've built the program, typing "printargs hello world" on the command line results in running the program and calling main() with argc containing the value 3, and argv containing the strings "printargs", "hello", and "world". Notice that the first string is the name of the program as it appeared on the command line.

What about that very first int on line 12? That's the function return type, indicating the type of value the function returns to whatever called it. So just as the parameters had a data type and name, the function has a data type and name.

Since main() is called by the operating system (OS), through some extra layers that we won't worry about now, the int return type means that main() returns an integer value to the OS. Since this is the value the program returns when it exits, this is called the exit code.

The term caller refers to whatever code is calling the function, and the term callee refers to the function being called. The caller calls the function with specific arguments. The function, as callee, receives those specific argument values in its parameters, and returns a value to the caller. The caller may use the return value in some way.

Some functions exist purely to produce a value to return to the caller, such as square(x), which computes the mathematical square of x and returns it. The value produced is the primary purpose of the function, and any work performed producing it is a consequence of achieving that goal.

Other functions are intended to do some sort of work, and then return a result indicating the status of the work. The work performed is the primary purpose of the function, and the value returned from it is merely a report of what happened.

The first type of function is closer to the mathematical concept of functions. In its purest form, such a function has no side effects, meaning it does not affect anything else. The only output of the function is the return value, which is based solely on its input parameters. There is an entire field of functional programming based on this.

The second type of function is intended to produce side effects. Such a function is intended to affect other things in the program or the outside world. In this case, main() has the side effect of printing something out. The value returned by main() indicates the success or failure of the program.

A return value that is intended as a status indicator is often referred to as a status code. Some status codes simply indicate a binary status, "true" or "false", "yes" or "no", "success" or "failure". Other status codes may convey more detailed information, often used to discriminate various errors, such as "success", "failure, bad filename", "failure, full disk", or "failure, not authorized".

It's also possible to have a function that doesn't return a value. The side effects of running the function are the only thing that happens. In this case, the return type is declared as void, so this is known as a void function.

What line 12 means is "Main is a function that accepts an integer argument count and an array of character pointers to argument strings, and returns an integer." Everything in line 12, except for the parameter names, forms the function signature. That is, the signature consists of the function name, its parameter types, and its return type.

Saying a signature out loud is a mouthful, so in informal usage you just use the name. But when you write code that calls the function, you have to know the precise signature so that you call the function the right way.

    int i;

    if (argc < 2) {
        printf("Usage: %s <arguments>\n"
               "Prints command line arguments.\n",
               argv[0]);
        return EXIT_FAILURE;
    }
    else {
        for (i = 0; i < argc; ++i) {
            printf("%d: %s\n", i, argv[i]);
        }
    }

Lines 14-26: the function body, everything enclosed in the braces that follow the signature. This defines the function. It's where the work of the function gets done.

This function uses two control structures, an if-else decision in lines 16-26, and a for-loop in lines 23-25, nested in the else block of the decision. Control structures direct the flow of control of execution.

An if-else decision checks some condition, in this case whether the argument count is less than 2, and does something based on the result. It goes one way if the condition is true, and the other way if the condition is false.

If there's nothing to do when the condition is false, you can omit the else portion. You then use a simple if decision, which only does something if the condition is true.

A for-loop repeats the block it contains for some number of times. Each repetition cycle is known as an iteration. It is therefore often used for iterating through something, cycling through it. Iterating through the elements of an array is a common use of for-loops.

Line 14 is another variable of type int, named simply i. This one is a local variable, a variable that is local to the function; it exists only within the scope of the function. After the function returns, it no longer exists and no longer has a value.

This line is both a variable definition and a variable declaration. It declares the type and name of the variable, and defines the memory for it.

The function parameters are also local variables, the difference being that their values are set by the arguments that are passed in.

What exactly is a variable? It's a small portion of memory that contains a value that can change over time based on what the program does. The fact that it can change is what makes it a variable.

C uses call by value, meaning that it passes the values of things into function parameters. But pointers provide a way to call by reference.

The name "i" is very simple and doesn't convey much meaning. However, it's common to use single-letter names for local variables used as simple for-loop controls. For other variables, used in more complex ways, it's better to use more descriptive names.

Line 16 tests the value of argc to see if it's less than 2. If so, it calls library function printf() to print a formatted message. This function was forward-declared in stdio.h, so the compiler knows its signature and can check that I used it correctly (syntactically, not necessarily semantically).

The arguments to printf() are a string that describes the format of the message, and the data values to be formatted.

In this case, the string is a hard-coded constant, meaning the actual string is coded right there where it's used. It's delimited by double-quotes. The \n at the end of each line is an escape sequence that contains a control character called newline. Newline causes a new line to be started in the program output. The %s is a conversion specification that shows where the value of another string should be substituted into the output; the process of substituting values for markers in a string is called string substitution.

There's another subtle thing going on with this function call. Notice that the arguments in a function call are separated by commas. But the comma is missing after the first string in line 17. This is a syntactic convenience called string concatenation, where the compiler joins adjacent string literals, those separated only by whitespace (including line breaks), into a single string. This allows you to break up long strings in the source code for readability. So lines 17 and 18 only contain a single string, the first argument to printf().

The second argument, argv[0], is the first element (i.e. the first entry) of the argv array. The square brackets [] contain the index of the element, which is 0. You might think that the first one would be 1, but C uses 0-based indexing. It's like the years in a century; the first year of a century is the 0 year, such as 1900 or 2000.

Recall that argv was declared to be an array of pointers to strings. The first element is therefore a single pointer to a string, so it matches up with the %s conversion specification.

If you run the program with just the name on the command line, no arguments, argc will be 1. When line 16 checks that argc is less than 2, the condition will be true, and the function will execute the block in lines 17-21. The printf() will print out:

Usage: printargs <arguments>
Prints command line arguments.

A usage message like this is a common way to inform the user that they didn't supply all the command line arguments expected, or that the arguments were in some way unacceptable.

After printing that message, line 20 will return from the function, with the value EXIT_FAILURE. This is a symbolic constant that was defined in stdlib.h. It's symbolic because we don't know its actual value here, all we know is a symbolic name that's been given to it. This indicates the program completed with some kind of error.

If you run the program with additional arguments on the command line, the condition in line 16 will be false, and the function will execute the else block in lines 22-26. This consists of the for-loop in lines 23-25.

The for-loop iterates through the items in array argv, printing each one with printf().

The for-loop uses i as the control variable, which it also uses as the index into the array. The for statement has three control expressions in the parentheses, separated by semicolons, that control how it runs:

The loop initialization, executed once before starting the loop, here initializing i to 0.
The loop condition, executed at the beginning of each cycle, here checking that i is less than argc.
The loop update, executed at the end of each cycle, here pre-incrementing i by 1.

As long as the condition is true, the loop keeps executing. Here, with i starting at 0 and incrementing on each iteration, it will execute until i reaches whatever count is in argc.

You can have an empty initialization expression, if the condition that is being checked is already initialized before the for-loop. You can have an empty update expression, if the condition that is being checked is updated within the loop.

The format string for the printf() in line 24 has a %d conversion specification, which means to substitute a decimal integer value, and a %s for a string. The remaining printf() arguments are the array index, and the array value at that index. So the printf() prints out a number and a string.

It's important to be aware of how the 0-based indexing relates to the specific check in the for-loop condition. Otherwise the loop may not execute enough times, or may execute one time too many. This is a common source of off-by-one bugs.

Incorrect control expressions can also cause dead loops, that never run through any iterations, or infinite loops, that never end.

The easiest way to figure this out is to step through the iterations yourself, remembering that this update expression increments i after every cycle. If the command line is "printargs hello world", argc will be 3. Therefore:

On the first iteration, i will be 0, so the condition is true, and it will print "0: printargs".
On the second iteration, i will be 1, so the condition is true, and it will print "1: hello".
On the third iteration, i will be 2, so the condition is true, and it will print "2: world".
On the fourth iteration, i will be 3, so the condition is false, and the loop terminates.

A simple way to model this on paper is with a table that steps through the index values I, the actual values used in the condition C, the condition result R (t for true, f for false), and the resulting value V represented by that iteration:

I	C	R	V
0	0 < 3	t	printargs
1	1 < 3	t	hello
2	2 < 3	t	world
3	3 < 3	f

Drawing things out like this and stepping through the code yourself is a great way to work out the details, even on a simple example, so that you get the initial conditions and termination conditions right. It's even more helpful when the initialization is something other than 0, or the condition or resulting value is more complex.

Notice also that i isn't used anywhere except in the for-loop, yet I declared it at the top of the function, where any parts of the function could access it (maybe when they shouldn't). A reader might reasonably wonder why I did it that way.

This is one of those cases where my old-version C habits take over when I'm not thinking. That was a requirement of old C. New hotness (C99 and later) allows i to be declared where used. So I could have put it right in the for statement:

for (int i = 0; i < argc; ++i) {

That limits the scope of code that can access it, and also makes it clear that this simply-named variable is just the loop control, not used for anything else. That's just one of the subtleties of coding to minimize the potential for errors and maximize understanding, especially as a function gets complex.

The moral here is not only to keep up to date on language versions, but to remember to take advantage of them!

27	return EXIT_SUCCESS;

Line 27: if the function reaches this point, it returns EXIT_SUCCESS, indicating it successfully printed out the arguments.

It's important to remember that since you declared the function as returning a value, you must make sure that every possible return from the function actually does return a value. It's possible for the function to "run off the end" and return without explicitly returning a value. The result is that the caller will get back some random value.

This can be a nasty type of bug, because sometimes that random value might be acceptable to the caller as a valid return value, even though it has no relation to what the function actually did. This can cause very mystifying behavior.

For a function that returns a value, always put an explicit return statement at the end of the function. For a void function, which doesn't have a return value, you can simply let the function run off the end and return implicitly; you can also use a return statement with no value at the end of the function, but that's considered redundant and unnecessary.

C also allows you to have different return points in a function (for a void function, these would be return statements without values). Some people like to code that way, as I did here, with two return statements. Others prefer to have only one return statement at the end of a function, using a local variable to keep track of the value to return.

That's an awful lot to talk about a mere 28 lines of code. But now you're armed with a lot of terminology that will make getting through subsequent code faster. Some of the concepts may be a little shaky, but they'll firm up as we proceed.

Building And Running The Program

The toolchain I'm using is GCC, the GNU Compiler Collection (originally the GNU C Compiler). It both compiles and links the program. It comes with Linux and Mac OS systems.

There are also free online tools that allow you to build and run C code (and other languages) on a server. These are sandboxed environments that allow you to play around with code without any risk of affecting anyone else. These are useful if you're using a Chromebook or OS that's not set up for software development.

One example of such an online IDE (Integrated Development Environment) is CodingGround. Some include complete online courses, and some include integration with GitHub. My experience with them shows that they're a great way to practice coding, but some can be a bit buggy, resulting in lost coding sessions or other problems. So don't rely on them to save your code.

That's another good reason to create an account on GitHub, which is free to use for public and open source projects, and create your own public repo like my learntocode repo. You can upload and download your files, or even edit them directly on GitHub. If you use an online tool that doesn't have GitHub integration, you can copy-and-paste from a GitHub window into the tool window.

$ gcc -v
Configured with: --prefix=/Library/Developer/CommandLineTools/usr 
 --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix

$ gcc printargs.c -o printargs

Lines 1-6: show the version of gcc. I'm building and running on a Mac, where the gcc command actually runs Apple's LLVM (clang) compiler, as the version line shows. The compiler and standard library themselves are built to run under Mac OS X on an x86 processor, generating code that will run under Mac OS X on an x86 processor.

Line 8: the build command. This is about the simplest possible build command, directing gcc to compile source file printargs.c and output an executable in file printargs. Builds can get quite complex, allowing you to construct software from a number of parts.

If gcc finds a syntax error, it prints it out along with the line number, and won't produce an executable. It may also print warnings, which indicate things that are at risk of being an error. If there are warnings but no errors, it will go ahead and produce an executable.

Depending on the severity of an error, the compilation may end prematurely. But the compiler will try to get as far as it can, reporting as many errors as it can.

That's both good and bad. It's nice to get all the errors at once so you can fix them all. But some errors can have a cascading effect that causes the compiler to report many other things as errors because incorrect syntax has thrown it off. With a little experience, you'll learn to pick out the real errors quickly.

Sometimes error messages are obscure. The compiler may be trying to report a very technical issue with your code, but it's not clear what the error means, so you don't know how to fix it. Googling error messages is a useful way to get help interpreting them. You're probably not the first person who had that problem.

Once you have a successful build, indicated by the absence of error messages, you can run it. Here's where it's useful to start thinking about test cases. Because you want to make sure your code works, right? You'll feel stupid if you let someone else run it and it doesn't work properly.

In fact, you should think about test cases before you even write the code, so that you write the code in a way that makes it easy to test. Testability affects how you design the code.

How many possible paths are there through the code? Ideally, you should run the program in a way that exercises each one to test it. That's easy for a simple program like this.

For more complex software, however, that becomes a big job, and it can be difficult to achieve that ideal. Just identifying all the paths can be tricky. Then coming up with inputs that guarantee you cover all those paths is even trickier.

It's further complicated by the fact that some paths are meant to handle error conditions that are hard to produce. And what if the code has poor testability? These issues get off into the whole art of testing.

For this program there are two test cases, going through the two possible paths of the if-else decision:

You run the program with no additional arguments.
You run the program with additional arguments.

There's no need to worry about differentiating between 1 additional argument, 2 additional arguments, 3, etc. They all generalize to the single case "with additional arguments". That simplifies testing so you don't have to keep going with 98 additional arguments, 99 additional arguments, 100...

You can identify test cases by the different control structures in the code, the various decisions and loops that it contains. These create a combinatorial set of possible execution paths. That set gets large quickly, known as combinatorial explosion, because real code has lots of control structures to deal with the various combinations of inputs the many functions may handle.

What about the for-loop in this program? Well, we can see logically that the loop gets executed only if there are additional arguments. Testing loops often includes test cases where the loop has nothing to do. In this program, we can see that such a scenario is impossible. So all the possible loop test cases are already covered by the if-else test cases.

Coming up with a suitable set of test cases is very much an art form. An incomplete set of cases risks missing some bug, that then shows up when someone else uses the code. But excess test cases that are redundant just waste time without telling you any further useful information.

I'll cover more about testing later, because it's an important part of being an RPSD. If you do a poor job of testing, it can have consequences for your career. It can also have consequences for the people who depend on your code. It becomes a matter of being ethical and responsible.

If you think that's overblown, think about the problems caused by software failures you've experienced or heard about in the news. The security breaches, the software crashes, the system failures, and the inconvenience, aggravation, frustration, and misery heaped on people's lives as a result.

I'll be covering more about testing in later posts, but for a separate presentation on it, see Testing Is How You Avoid Looking Stupid.

I can exercise the two test cases for this program by running it with and without additional arguments:

$ printargs
Usage: printargs <arguments>
Prints command line arguments.

$ printargs hello world everywhere
0: printargs
1: hello
2: world
3: everywhere

Lines 1-3: test case 1. As expected, the program prints the usage message, referring to the program name correctly.

Lines 5-9: test case 2. As expected, the program prints the program name followed by each of the three additional arguments that were on the command line, with the correct 0-based indices. It would have been sufficient to use just one extra argument.

The process of manually running each test case like this and examining the results is called manual testing. Manually testing a simple program is pretty easy, but even that can get awfully tedious if something goes wrong and you keep having to repeat the tests as you chase down and correct the bugs. That's especially true if you have to carefully scrutinize every line being printed out to check for an unexpected result.

An RPSD quickly moves from manual testing to automated testing, using some form of test automation. That creates an efficient workflow and gives you a way to repeat the testing later without having to remember all the cases. But that's a topic for another post.

For now, I've proven to myself that the code works as expected.

Learn To Code Introduction

2017-12-03T15:48:00.000-08:00

This will be an ongoing series of posts on learning to code, tagged with the label LearnToCode. You can always get to the series home page by clicking on the Learn To Code button in the bar at the top. It has the complete outline and map of the series.

For beginner posts, I'll assume you have no background in computers other than general use. Therefore I'll briefly explain a lot of terms that may be familiar to those with more experience. For intermediate and advanced posts, that won't be necessary.

There are lots of other "learn to code" resources available already, and I encourage you to use them. It's always good to learn from multiple sources, because each one has a slightly different emphasis and take on the material.

The main difference here is that I'm going to teach multiple different languages at once. In fact, I'm going to cover 8 languages. That's my particular take on the material.

What?!? 8 Languages? Isn't that crazy? Won't that be confusing?

Yes, that's certainly ambitious and risks confusion. But I'll take advantage of the fact that there are a lot of common concepts between languages. I'll show you how to apply those concepts in the different languages, taking pains to point out the commonality and the differences.

Why multiple languages? The real secret to a long career is versatility. If you only know how to do one thing, the hot technology of the day, that's great until it's no longer the hot technology of the day. But if you know how to do multiple things, in multiple different ways, you can roll with it and adapt.

That will also help teach you how to acquire a new language. As I said in my welcome post to this blog, the most important skill you can develop is the ability to learn new skills.

There's another very practical reason. As a software developer, you often need to work in multiple different languages at once. Different parts of what you do will be in different environments, where different languages are in use.

Here are the languages:

C
C++
Java
Javascript
Go
Python
Bash
Pseudocode (not a real language, useful for sketching out designs)

On a daily basis in my job, I may work on some C or C++ for one of our embedded devices, some C++ for one of our backend servers, some Python to analyze frames in a video file or automate data retrieval from AWS, or some Bash to automate build steps or file management activities.

I may have to read someone else's Javascript for another backend server or Java for a mobile app. I'm constantly scribbling stuff in pseudocode as I work through my thoughts.

It doesn't stop there. I may have to update some Ruby to handle cloud deployment, or work on some Lisp for Emacs macros.

My facility with each language varies. I'm barely functional with Ruby or Lisp. But a little Googling around and experimentation allows me to get the job done. My experience with other languages helps me out, despite significant differences in them. That's how versatility increases my ability to tackle different problems.

In the process of teaching you, I'll also learn Java, Javascript, and Go myself.

Hang on, didn't I just say I may have to read someone else's Java or Javascript code at work? Right, I can read it, based on the common characteristics it shares with the other languages, but I don't know it well enough to write it. Learning to do that means I'll be able to contribute more at work. And I want to learn Go because it's one of the hot technologies of the day, that I believe has a long life ahead of it.

But if I don't know those languages, how can I teach them? That's another point from my welcome post: teaching is a great way to learn something, because I have to work it out myself well enough to explain it to others, augmented by my experience. See one, do one, teach one.

How will I do all this? I'll jump right in and start showing you example code, then break those examples down and go through them. Realize that in some cases, the code is written in a particular way to illustrate a specific point. So some of it will be very simplistic. You have to walk before you can run.

But I'll move on from there into more advanced topics, including:

Socket programming
Synchronous vs. asynchronous programming
Multithreaded programming
Algorithms

I'll cover applying these in a number of real-world systems, ranging from personal computing to big iron to miniaturized devices:

Desktop applications
Web apps
Mobile apps
Server systems
Embedded systems

I'll also touch on engineering process topics, including:

Software development life cycle (SDLC)
Requirements
Architecture
Design
Testing (see Testing Is How You Avoid Looking Stupid for a discussion of testing)
Source control

Why so much? Because I want to show you how to be a Real Practical Software Developer (RPSD), someone who can really get a job done, not just a Superficial Crappy Software Developer (SCSD).

An RPSD has depth. An SCSD just scratches the surface. An RPSD seeks understanding. An SCSD just seeks a quick fix.

These things are all important to know as an RPSD. Knowing how to code in the languages is just the first step. Making that code do real, useful things, working in a real software development environment, is where it really gets interesting.

I'll be introducing lots of terminology, indicated in italics as above. I may throw some terms out there before actually getting around to explaining them. Just remember that terminology can be slippery, with different meanings and usage for different people and in different contexts.

I'll also have some posts related to general programming knowledge and skills, like working with binary and hexadecimal numbers.
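As a small taste of that, here's a minimal sketch in Python of moving between decimal, binary, and hexadecimal views of the same number. The function names (`to_hex`, `to_binary`, `split_nibbles`) are just ones I made up for illustration; the built-in `format` and `int` calls do the real work.

```python
def to_hex(value):
    """Return the value as hex text with a 0x prefix, at least two digits."""
    return format(value, "#04x")


def to_binary(value):
    """Return the value as binary text with a 0b prefix, at least eight digits."""
    return format(value, "#010b")


def split_nibbles(byte):
    """Split a byte into its high and low 4-bit halves (nibbles)."""
    high = (byte >> 4) & 0xF  # shift the top four bits down, then mask
    low = byte & 0xF          # mask off everything but the bottom four bits
    return high, low


value = 0b10101100            # the same number as 172 decimal or 0xAC hex
print(to_hex(value))          # hex view of the value
print(to_binary(value))       # binary view of the value
print(int("ac", 16))          # parse hex text back into a number
print(split_nibbles(value))   # each nibble corresponds to one hex digit
```

The thing to notice is that each hex digit maps to exactly four bits, which is why hex is such a convenient shorthand for binary.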

I'll use or refer to a number of online resources and books. The online resources are free, and include tutorials, programming tools, test platforms, and articles. Some of the books are also available as authorized downloadable PDFs that authors and publishers have generously made available for free (some PDFs are unauthorized pirated copies, so be careful what you download).

And let's make something perfectly clear from the start: there's nothing wrong with using Google or Stack Overflow to find out how to do something, or as a reminder. As a student learning to code and later as a working RPSD, you'll get to know these as invaluable helpers, along with a few favorite tutorial sites.

I do that all the time, because I can't possibly know or remember everything. Why should I? I have computers to help me do that. They have a vastly larger store of knowledge than any single human.

When you look up something in a book or online, the key is not to just copy code blindly, but to take the time to understand what it's telling you, then adapt it to your situation. That's how you stand on the shoulders of others and increase your own store of knowledge.

I've been doing this for 35 years, and I love it. Building things and seeing other people put them to use gives me the greatest satisfaction. I hope you'll find this series useful as I share what I've learned.

Comments and Questions

I welcome comments here on the blog and emails at sdbranam at gmail dot com. Those might turn into good topics for further posts.

Note that I actively filter out comment spam. If you put a link in a comment, the comment must be directly relevant to the post, and the link must either refer to your personal online resources or to something directly relevant to the post. Otherwise I'll delete it.

(Continue to First Code)

Testing Is How You Avoid Looking Stupid

2017-11-23T11:44:00.000-08:00

This is a presentation I gave at the IOT With The Best online conference on October 14, 2017: Testing Is How You Avoid Looking Stupid.

The SlideShare includes an embedded YouTube video recording of my original presentation (I typically watch things like this at 1.5 or 2x speed, selectable from the Settings menu in the YouTube window, which helps me maintain focus).

The abstract:

As IOT products become more pervasive, they have an increasing ability to adversely affect the lives of their users and those around them. Testing is the due diligence that closes the engineering loop to verify proper behavior. Steve will present an introductory overview to testing for IOT products, covering the IOT triad: embedded IOT devices, backend servers, and frontend apps. He'll talk about the consequences of inadequate testing for companies and individual contributors, and levels and types of testing.

Testing is not an absolute guarantor of quality, and you need to have worked out requirements and design to test against, but without doing it, you'll look stupid.

Skimping on testing also means you'll make life miserable for someone. Maybe even kill them.

Books

Doing this presentation turned out to be a bit expensive, because it set me off on a book-buying binge. Fortunately, there's a robust online market in used books.

This went down three paths. First, I wanted to reference the Toyota unintended acceleration problem as a case study. I was familiar with it from reading Risks Digest (my source for all things safety, reliability, security, and usability).

What I found was Professor Philip Koopman at Carnegie Mellon University. He was a plaintiff's expert witness in one of the lawsuits, and had put together a nice presentation on the problem.

But it also turned out he had written a book on embedded systems entitled Better Embedded System Software (available from his site at half off). I ordered the book and read it immediately. It turned out to be a great overview of a broad range of topics on improving embedded system software.

It also listed a number of other books as recommended reading at the end of each chapter. The thing I like about that is these are curated recommendations, helping select which books to read from the vast ocean of books available and raising awareness of obscure areas.

Off to Amazon! And then of course those books had additional recommended reading as I started working my way through them, so more books...

He also has some good videos at his company website, Edge Case Research (he uses Vimeo for his video; I use the Vimeo Repeat And Speed Chrome extension for watching on Vimeo at 2x speed).

Second, a name that leapt out at me on the speakers list for the conference was Stephen Mellor. Learning the Ward-Mellor method back in the late '80s was an absolute watershed moment for my career. I've applied parts of it informally ever since.

Three minutes into watching his recorded presentation he mentions that he has a new book out on how to take models directly into code for embedded systems. Stop! Google! Book ordered!

And of course as I started reading that one, it referenced others... These books cover Executable UML, which looks like an excellent follow-on to the Ward-Mellor method (unfortunately, I completely missed the boat on Shlaer-Mellor, but xUML also builds on that). One of the benefits I see in xUML is that it imposes rules and discipline on general UML, giving simplifying structure to what is already an extremely complex endeavor.

Third, as I placed orders, several titles among the many Amazon recommendations looked interesting, especially since the other books had sensitized me to some of the topics.

It'll take me a while to complete all these, but so far they've been well worth reading, an excellent addition to my bookshelf and another watershed for my career. There will probably be more.

Here's the full list if you're interested in further reading, organized by reference source:

Philip Koopman:

Better Embedded System Software, 2010, Philip Koopman.

Security Engineering: A Guide to Building Dependable Distributed Systems, 2008, Ross Anderson.

Software Security: Building Security In, 2006, Gary McGraw.
Writing Secure Code: Practical Strategies and Proven Techniques for Building Secure Applications in a Networked World (Developer Best Practices), 2004, Michael Howard, David LeBlanc.

Software Architecture: Perspectives on an Emerging Discipline, 1996, Mary Shaw, David Garlan.
Systems Architecting: Creating & Building Complex Systems, 1991, Eberhardt Rechtin.

Stephen Mellor:

Models To Code: With No Mysterious Gaps, 2017, Leon Starr, Andrew Mangogna, Stephen Mellor.

Executable UML: A Foundation for Model-Driven Architecture, 2002, Stephen J. Mellor, Marc J. Balcer.
Executable UML: How to Build Class Models, 2002, Leon Starr.

Amazon recommendations:

Embedded Software Development for Safety-Critical Systems, 2016, Chris Hobbs.
Real-Time Software Design for Embedded Systems, 2016, Hassan Gomaa.
Secure Coding in C and C++ (2nd Edition) (SEI Series in Software Engineering), 2013, Robert C. Seacord. This is particularly interesting because Seacord is the author of the CERT C Secure Coding Standard.
Software Fundamentals: Collected Papers by David L. Parnas, 2001, Daniel M. Hoffman, David M. Weiss, editors.
Software Architecture in Practice (2nd Edition), Len Bass, Paul Clements, Rick Kazman. I chose this over the 3rd edition due to the case studies listed.
Documenting Software Architectures: Views and Beyond (2nd Edition), 2011, Paul Clements, et al.

While I'm here, four other relevant books that I already had and highly recommend:

Computer-Related Risks, 1994, Peter G. Neumann. A compendium of people and companies looking stupid, from the first decade of Risks Digest.
Engineering a Safer World: Systems Thinking Applied to Safety, 2012, Nancy Leveson. The page includes a link to a free PDF download of the book under the "Open Access Title" heading, so there's no excuse for not reading it. This has some absolutely hair-raising case studies, and gives a pragmatic approach to understanding how and why systems fail. You'll never again blame it on "human error".
Real-Time Concepts for Embedded Systems, 2003, Qing Li, Caroline Yao. This is an excellent broad introduction for anyone new to embedded systems, as well as operating systems concepts for multithreaded systems.
Practical UML Statecharts in C/C++: Event-Driven Programming for Embedded Systems, 2008, Miro Samek. This book is just freakin' brilliant, applying the concepts of UML statecharts in the context of different classes of real-time systems, using the concepts outlined in Li and Yao's book. This could serve as the manual model compiler for xUML.

The book of Parnas' writings deserves special mention because Parnas is one of the greats of the field, like Knuth and Dijkstra. Many important concepts in software engineering can be traced to him; references to these papers pop up throughout decades of texts (such as in Gomaa's book!).

Welcome!

2017-11-23T07:59:00.001-08:00

Welcome to Flink And Blink, my software engineering blog. I've been developing software since 1982, primarily with a background in networking, such as routers, video streaming servers, and now IOT (Internet Of Things). I work primarily in the embedded and backend spaces, with some dabbling in frontend apps.

I'm mostly self-taught, which really means I've had many teachers: the many authors of books and articles, and the many designers and developers who have built the things I've studied and worked on.

The most important thing this has taught me is that learning is a never-ending experience. The most important skill you can develop is the ability to learn new skills.

I love to learn new things, and I'm not afraid to make mistakes doing it. As long as there's no damage and no injury, it's all a learning experience. And a little blood on the deck isn't an injury.

I've always found that if I have technical information and a system to play on, I can learn how to make it go. Experimentation, both the successes and the failures, is a great learning tool.

Thomas Edison said, "Genius is one per cent inspiration, ninety-nine per cent perspiration." I temper that with Nikola Tesla's response: "...a little theory and calculation would have saved him ninety per cent of his labor." While it's Edison we all remember, it's Tesla's AC outlets that we plug everything into.

I like to combine the two approaches into informed experimentation. Try it and see, but think about it first. The analytical and empirical methods make a powerful combination.

Teaching is also a great way to learn. I have to be able to figure things out if I'm going to explain them.

The discipline of writing things down and drawing up diagrams forces me to order my thoughts. That then leaves me with something I can pass on to others to share the knowledge. That's part of the see one, do one, teach one methodology.

I've previously posted software-related things to my woodworking blog, CloseGrain. I'll cross-post some of those here.

Why "flink and blink"? Those who are wise in the ways of computer science will recognize these as the forward link and backward link of a doubly linked list, one of the fundamental data structures.
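For anyone who hasn't met one yet, here's a minimal sketch of the idea in Python. The class and method names (`Node`, `DoublyLinkedList`, `append`, `remove`, `forward`, `backward`) are my own choices for illustration, not from any particular library; the only things taken from the name of the blog are the `flink` and `blink` fields.

```python
class Node:
    """One element of a doubly linked list."""

    def __init__(self, value):
        self.value = value
        self.flink = None  # forward link: the next node, or None at the tail
        self.blink = None  # backward link: the previous node, or None at the head


class DoublyLinkedList:
    """A list that can be walked in either direction."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        """Add a new node at the tail and return it."""
        node = Node(value)
        if self.tail is None:
            # Empty list: the new node is both head and tail.
            self.head = self.tail = node
        else:
            node.blink = self.tail
            self.tail.flink = node
            self.tail = node
        return node

    def remove(self, node):
        """Unlink a node by rejoining its two neighbors."""
        if node.blink is None:
            self.head = node.flink
        else:
            node.blink.flink = node.flink
        if node.flink is None:
            self.tail = node.blink
        else:
            node.flink.blink = node.blink
        node.flink = node.blink = None

    def forward(self):
        """Yield values from head to tail by following flink."""
        node = self.head
        while node is not None:
            yield node.value
            node = node.flink

    def backward(self):
        """Yield values from tail to head by following blink."""
        node = self.tail
        while node is not None:
            yield node.value
            node = node.blink


items = DoublyLinkedList()
first = items.append("first")
middle = items.append("middle")
last = items.append("last")
items.remove(middle)
print(list(items.forward()))
print(list(items.backward()))
```

Having both links is what makes removal cheap: given a node, you can reach both of its neighbors directly and join them together, with no need to walk the list from the head to find the one before it.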