Embedded Systems Software, Computer Networking and Geeky Fun

nerd1951.com

August 5, 2009

The problem is not in our tools but in ourselves

Filed under: Tools, Rants, Programming — Harvey @ 10:44 pm

Let’s face it.  Software design falls into two extremes in the majority of software development organizations. Either it is hardly done at all or it is done to the point of diminishing returns. You know I’m talking about you, most of you anyway.

The “hardly done at all” camp usually sees little value in doing design. Our managers want to see code! So, we will do the minimum documentation to satisfy our customers, clients or the process police. We even have a name for this approach: Agile Programming. To quote a Dilbert comic, “We just start coding and complaining.”

The design to death crowd has usually been sold some expensive tool or trained in some complex process. These days it’s almost always done in UML. Use cases must be reviewed by all of the stake-holders in painfully long meetings. Then all the varieties of UML diagrams are produced and painstakingly reviewed before a line of code is written. And this is still supposed to be an iterative process.

In the dozen or so years since I abandoned Structured Analysis and Design, I’ve learned two things: Object Oriented Programming really works well and the Waterfall Method of software development does not. Agile Development is a direct response to the old Waterfall model but most “agile” development organizations neither understand nor follow the Agile Process. What I haven’t learned though is a process better than Structured Analysis and Design for getting from the user’s requirements to code.

UML has a host of problems. My biggest problem with UML is that it is mistaken for a development process. There are no well defined processes for Object Oriented design. The analysis phase is well defined with Use Cases and Interaction Diagrams. But beyond this I find most of the process gets muddled. Where is the step-wise refinement that allows you to drill down from the high level architecture to the individual classes and objects?

I don’t even think UML is good notation. It’s just not intuitive. How can any modeling language that professes to be “Universal” address software development in a concise fashion? When I learned how to draw data flow diagrams in the 1980s, I knew intuitively what I was doing. The hardest concept was separating control flow from data flow. Real-time Structured Analysis even solved this problem by introducing control flow diagrams.

But I’m just ranting. The root cause is that we as front line developers have let this happen. We have allowed these problems to exist for so long that the quality of our work and our productivity have slipped.

When Object Oriented Analysis and Design were introduced we convinced ourselves and our managers that it would produce big productivity gains. When it didn’t live up to the hype we were told it would take a while for us to “get it.” We had been brought up on a diet of Structured Analysis and procedural programming and would need to unlearn those habits. Another problem early on was the lack of standard notation and processes. UML was supposed to fix that. So we struggled and blamed ourselves for not getting it and waited for better tools.

But now I have to say, I get Object Oriented Programming. I understand UML even though I find it hard to use. Yet I’m still dissatisfied with the current state of software design. Now development processes are often defined by managers or process specialists who don’t actually develop software. Or, if management is sold on Agile Development, we skip most of the design phase. We have capitulated and gone along with the situation and have been afraid to admit that “The emperor has no clothes.”

As engineers we need to take control of the process again. We need to step up and own the process and put in the effort to make the process and the tools useful again. To paraphrase Shakespeare: “The problem is not in our tools but in ourselves.”

• • •
 

June 9, 2009

Zipcar’s technology

Filed under: Geeky Fun, Rants, Car Free Insanity — harvey.sugar @ 6:50 pm

  Zipcar sign at the Brookland-CUA Metro parking lot in Washington, D.C.

When a system is well designed you can use it without even realizing the complexity of the technology behind it. This is true of the Zipcar car sharing service.

For those of you who are not familiar with the service, Zipcar allows you to rent cars by the hour or the day. In the cities where Zipcar operates, their cars are strategically located around the area, especially around public transportation hubs. You can search for available cars on the Zipcar web site then reserve the car you want. If you need to extend your reservation at the last minute you can do so using your cell phone.

Cars are accessed using a membership card that uses RFID technology to unlock the car. When the car has been returned and your reservation is over, Zipcar automatically bills your credit card.

Zipcar employs a range of technologies that must be integrated seamlessly in order to provide an easy-to-use service. First there is the RFID technology employed in the membership card. The car must communicate with the back office to determine if you have a reservation. I assume that Zipcar uses satellite communications for that purpose. The back office must track which cars are available and which cars have been reserved. They also track when the cars have been picked up and returned. Finally the back office operation has to take care of the billing to your credit card.

The Zipcar website does an excellent job of presenting the member with all of the information that they need. You can see where available cars are located and the cost of using them. The web site also displays your future reservations and billing history.

All of these technologies spanning from the RFID on the membership card to the communications with the back office and the web site have been expertly woven together. The most impressive part is that the user never experiences the complexity of these technologies interacting. I use the service often without even thinking about it. This is the best indication of a well designed system.

• • •
 

June 6, 2009

Intel buys WindRiver

Filed under: Tools, Rants — Harvey @ 10:36 am

I’m sure you’ve heard the news by now that Intel has purchased WindRiver, one of the largest providers of Real Time Operating Systems and embedded systems development tools.

Intel’s purchase of WindRiver for 884 million dollars is the end of VxWorks as a viable embedded RTOS.  Actually VxWorks has been losing ground in the embedded systems market for years.  Back in the late ’90s they changed their sales strategy to focus on selling to upper management rather than engineers.  They had to talk to upper management at their price.  As my manager quipped, “Merger?  I thought that the $884 million was for a developer’s license and that’s only valid in the state of California.”  WindRiver’s prices and license policies limited their appeal to everyone except large corporations that develop high quantity devices or companies in very high margin markets such as defense or heavy industrial machinery.  Admittedly, that’s where the money is.  The smaller players in the embedded systems market are not a lucrative customer base even though they employ the largest number of developers.

Intel’s motive is to penetrate the portable consumer device market.  This market is growing much faster than the desktop PC, laptop PC and server markets that Intel dominates.  These markets have all matured and will never experience the explosive growth of a new portable gadget.  Intel’s processors have not been as successful in the portable market and Intel has abandoned its embedded processor products more than once.  Their hope is that by providing the whole package, processor and operating system, they can capture this growth market that has slipped away from them.

Ironically, Intel had one of the first RTOS’s on the market and it was reasonably successful in the 1980s and early 1990s.  Intel’s RMX operating system was the first RTOS that I ever used on a microprocessor.  But by 2000, Intel was so focused on WinTel products that they no longer wanted the cost of supporting RMX86.

Technically, VxWorks is a good product and WindRiver has a history of innovation.  VxWorks was one of the first RTOS’s to provide Internet Protocol support and remote debugging.  They were also an early adopter of the GNU tool chain though they bent the rules quite a bit in redistributing the GNU development tools.  But their corporate policies alienated a lot of engineers.  In the late 1990’s WindRiver went on a shopping spree, purchasing several companies in the embedded systems developer’s market.  They purchased and marginalized pSOS.  They bought a couple of cross compiler companies and then jacked up the prices on their products. Finally there was the new sales model which tried to bypass the engineers that had to use their products.

So, why do I think this is the end of VxWorks as an embedded systems platform?  Intel will focus on portable consumer devices because they want to sell large quantities of processors into that market.  The rest of the embedded developer’s community, those of us who build communications, industrial and other embedded systems products, will become second class citizens.  VxWorks is also a popular OS for Freescale’s embedded PowerPC processors.  Will an Intel subsidiary continue to support a competitor’s products?  But the landscape has changed quite a bit in the last ten years and now we have plenty of alternatives for embedded real time operating systems.

• • •
 

February 2, 2009

From the Bovine Resources Department

Filed under: Geeky Fun, Rants — Harvey @ 11:12 pm

OK – this is not technical in any way – just a pure rant.

When did we go from being people to being resources? Was it when everyone started using Microsoft Project? I hate seeing my name listed under “resources”. I never see anything but people listed on a project as resources. I’ve been on projects where test-beds or prototype hardware were the critical resources during integration. Yet they were never resources on the Microsoft Project schedule.

The other thing I hate is the Human Resources Department; nowadays just H.R. When I first started working this was called the Personnel Department as in “person.” If we have a Human Resources Department should metropolitan police departments have Canine Resources Departments and Equine Resources Departments? Actually I’ve worked for some tech companies that should have Equine Resources Departments for some of the managers, marketing folks and even some engineers but that’s a different story.

I guess another part of it is that you never lay-off resources. You just downsize. Somehow downsizing your resource head count sounds better than laying-off people. And no one gets fired anymore; just terminated. Well that’s all for now. I have to go see to my terminated bovine resources.

• • •
 

January 20, 2009

Processor price vs. cost

Filed under: Rants — harvey.sugar @ 1:45 pm

There can be a big difference between the cost of choosing a processor or micro-controller and its price. That might sound counterintuitive but the total cost of using a particular processor includes much more than its piece-price. These additional costs include the cost of development tools, the RTOS if you are using one, and software, especially development time. The software development time can also have a significant impact on a product’s time-to-market. I’ve worked on many products where the cost of time-to-market in terms of lost sales far outweighed component costs. The product quantities, tool costs, time-to-market, and software complexity must all be considered in determining the most cost-effective processor.
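The arithmetic behind that comparison can be sketched in a few lines of C. This is only an illustration: the struct, the function name and every figure in the usage note are made up, and the lost sales from a late launch are deliberately left out because they are so product-specific.

```c
/* Hypothetical cost model: every field and figure is an assumption. */
struct processor_option {
    double piece_price;   /* component price per unit */
    double tool_cost;     /* compilers, debuggers, RTOS licences */
    double dev_months;    /* estimated software effort */
};

/* Total cost of building `units` devices with a given processor.
   Lost sales from a slipped time-to-market are not modelled here. */
double total_cost(const struct processor_option *p,
                  unsigned units,
                  double cost_per_dev_month)
{
    return p->piece_price * units
         + p->tool_cost
         + p->dev_months * cost_per_dev_month;
}
```

With invented numbers, a 2.00 part that needs 5,000 of tools and nine months of work comes to 141,000 for a 500-unit run at 15,000 per developer month, while a 6.00 part with free tools and six months of work comes to 93,000. The cheaper chip is the more expensive choice at that volume.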

The software team needs to be involved in processor selection as much as the hardware engineers. Some hardware engineers have a narrow view of processor costs; limited to the component cost for the processor and support components. I have seen projects where choosing the wrong micro-controller cost man-months of development time because the processor had a small address space and the memory had to be bank-switched, or the internal RAM was too small and variables had to be copied in and out of the internal RAM as needed.

Sometimes the hidden costs are in the development tools. Some processors are only supported by expensive proprietary tool sets. In other cases the only tools available provide a nonstandard subset of the C language. One well known micro-controller family is supported by a C compiler that can’t even link multiple object files. If you want to write modular code you have to divide your code into several include files that are basically compiled and linked as one single source file.

Another risk is thinking too small. You should be very careful about using small eight bit micro-controllers unless you are developing a product that will be produced in quantities of thousands per year and every penny of component costs is critical. There are so many low cost low power 16 bit and 32 bit processors out there that there’s no excuse for using under-powered processors for short-run or one-off embedded systems. Some examples are the ARM families, TI’s MSP430, Atmel’s 32 bit AVR, Freescale’s ColdFire, and Analog Devices’ Blackfin. Most of these processors have wide support from a number of software tool vendors as well as GNU open source tool support.

The tradeoffs in choosing the best processor for a product are more complex than the component costs. Engineering is often the art of compromise and choosing the most cost-effective processor is a compromise between component prices and development costs.

• • •
 

October 23, 2008

Picky-Picky C/C++ Style Conventions: postscript

Filed under: Rants, Programming — Harvey @ 6:20 pm

I goofed yesterday on one of the code examples as one reader, Michael, pointed out.  I was trying to show how to define a hardware register as a const pointer to volatile data and I wrote:

static const uint8_t* volatile dataOut = 0x20000010;

which is a volatile pointer to const data.  This may be useful in some bizarre application but it’s certainly not what I was trying to do.  Here is what I meant to do:

static volatile uint8_t* const dataOut = (volatile uint8_t*)0x20000010;

Fortunately, I had the code correct in the driver I’ve been working on and I hope I didn’t confuse anyone.  Actually if it had been real code instead of a web page, I’m sure the compiler would have complained the first time I tried to write to the register.  That’s one reason to have all of these qualifiers on your data.  Error messages are a great help, especially when you’ve been (trying to) code for eighteen hours.
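To see the compiler make that complaint without real hardware, here is a minimal sketch that puts the two declarations side by side. It assumes an ordinary variable, fake_register, standing in for the byte at 0x20000010 so that it builds and runs on a desktop. The names wrongOut, write_register and read_via_wrong_pointer are invented for the illustration.

```c
#include <stdint.h>

/* Stand-in for the hardware register so the sketch runs on a host PC. */
static uint8_t fake_register;

/* Intended form: const pointer to volatile data. */
static volatile uint8_t *const dataOut = &fake_register;

/* Mistaken form: volatile pointer to const data. */
static const uint8_t *volatile wrongOut = &fake_register;

void write_register(uint8_t value)
{
    *dataOut = value;          /* allowed: the data is writable */
    /* *wrongOut = value;         rejected: the data is const */
    /* dataOut = 0;               rejected: the pointer is const */
}

uint8_t read_via_wrong_pointer(void)
{
    return *wrongOut;          /* reading through it is still allowed */
}
```

Uncommenting either of the rejected lines should produce a compile error, which is exactly the safety net the qualifiers are there to provide.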

Thanks for pointing this out Michael.

• • •
 

October 22, 2008

Picky-Picky C/C++ Style Conventions

Filed under: Rants, Programming — harvey.sugar @ 2:38 pm

By convention, you can usually assume that a char in C is 8 bits, an int or a long is 32 bits and a short is 16 bits, but not always. That can be a problem in embedded system programming since we often need to know the exact size of a variable. We also tend to use unsigned variables a lot and typing out unsigned gets tiresome. So you often see people inventing their own coding conventions such as ubyte, ushort, and ulong, etc. I work on code every day that may use two or three different conventions depending on who worked on the code last.

There is a standard way of expressing variable sizes and whether they are signed or unsigned in the GNU and POSIX worlds. It is to use the header file stdint.h (part of standard C since C99), which defines a number of types of specific sizes: int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t. The stdint.h file is customized for the processor that you are working with so you can depend on the sizes. This header file also specifies things like the minimum and maximum values that can be represented by a type such as INT32_MAX, UINT32_MAX, and INT32_MIN, etc. All of this is documented in the man page for stdint.h if you are using Linux or Unix.
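As a quick sanity check on a new tool chain, something like the following sketch prints the widths and limits that stdint.h promises. The BITS_OF macro and the function name are invented for the example; the types and limit macros are the standard ones.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Width of a type in bits, computed rather than assumed. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Print the exact-width types and their limits for the current target. */
void print_stdint_summary(void)
{
    printf("uint8_t  is %zu bits\n", BITS_OF(uint8_t));
    printf("uint16_t is %zu bits\n", BITS_OF(uint16_t));
    printf("uint32_t is %zu bits\n", BITS_OF(uint32_t));
    printf("INT32_MIN  = %ld\n", (long)INT32_MIN);
    printf("INT32_MAX  = %ld\n", (long)INT32_MAX);
    printf("UINT32_MAX = %lu\n", (unsigned long)UINT32_MAX);
}
```

The casts to long and unsigned long are there because the underlying type of int32_t varies between targets, so the printf format has to be pinned to something known.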

Sometimes you don’t really care about the exact width of a variable but you want to use a data width of some minimum size that is the most efficient for your processor. stdint.h provides representations for that too, like: uint_fast8_t and int_fast32_t. There is also a definition for the widest types supported: intmax_t and uintmax_t. You could find out how many bytes wide these are by using sizeof(intmax_t), for example.
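A small sketch of how the "fast" types might be used: the loop counter only needs to hold 0 to 255, so uint_fast8_t lets the compiler pick whatever width is quickest on the target. The function names here are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum up to 255 bytes. The counter needs at least 8 bits, but the
   compiler is free to use a wider, faster type for it. */
uint32_t sum_bytes(const uint8_t *data, uint_fast8_t count)
{
    uint32_t total = 0;
    for (uint_fast8_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}

/* Size in bytes of the widest signed integer type on this target. */
size_t widest_integer_bytes(void)
{
    return sizeof(intmax_t);
}
```

On many 32 bit processors uint_fast8_t turns out to be a full 32 bit word, which avoids the masking instructions a true 8 bit counter can require.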

stddef.h is another useful header file. It defines things like NULL (though according to Stroustrup you should just use 0 for null pointers) and size_t. size_t is defined as the type of the result of the sizeof operator. size_t is useful for times when you don’t really care about the size of a variable, as long as it is large enough to represent the size of a data object in bytes. For example, suppose you want the bitwise complement of every byte in an array:

for (size_t i = 0; i < sizeof(array); i++)
{
    array[i] = ~array[i];
}
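The loop above relies on array being a real array of bytes that is in scope, since sizeof gives the size in bytes. Once the array is passed to a function it decays to a pointer and sizeof no longer gives its length, so a reusable version has to take the length as a size_t parameter. A minimal sketch, with invert_bytes as an invented name:

```c
#include <stddef.h>
#include <stdint.h>

/* Replace every byte in buf with its bitwise complement.
   The caller supplies the length because sizeof(buf) here would
   only give the size of the pointer. */
void invert_bytes(uint8_t *buf, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        buf[i] = (uint8_t)~buf[i];   /* cast back: ~ promotes to int */
    }
}
```

A caller that owns the array can still write invert_bytes(array, sizeof(array)) as long as the elements are single bytes.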

Finally, do you know what this represents?

static volatile uint8_t* const dataOut = (volatile uint8_t*)0x20000010;

This is a const pointer to a volatile byte at location 0x20000010: a hardware register.

You want a const pointer because the hardware registers shouldn’t be moving around. Declaring the pointer as const allows the compiler to catch you if you forget to dereference the pointer when writing to the location.

The value stored at location 0x20000010 could change on its own without the software changing it. Volatile tells the compiler not to optimize away any reads or writes to this location. For example if you only ever write to the register and never read it, the compiler might remove what seems to be unnecessary writes to this location. The optimizer might completely remove the variable from the object code. Volatile lets the compiler know that this location is special.
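The read side matters just as much. Here is a sketch of polling a status flag, assuming a hypothetical READY_BIT and a simulated register so it can run on a desktop. Without volatile an optimizer would be entitled to read the location once and then loop on the stale value.

```c
#include <stdbool.h>
#include <stdint.h>

#define READY_BIT 0x01u   /* hypothetical "ready" flag */

/* Simulated status register; on a target this would be a fixed address. */
static volatile uint8_t simulated_status;
static volatile uint8_t *const status = &simulated_status;

/* Test hook standing in for the hardware setting or clearing the flag. */
void simulate_hardware_ready(bool ready)
{
    simulated_status = ready ? READY_BIT : 0u;
}

/* Poll the register up to max_polls times. Because the data is
   volatile, every pass through the loop performs a real read. */
bool wait_until_ready(uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (*status & READY_BIT) {
            return true;
        }
    }
    return false;
}
```

Bounding the loop with max_polls is a design choice for the sketch: it keeps a dead peripheral from hanging the program forever.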

• • •
 

September 30, 2008

Hacker’s food

Filed under: Tools, Geeky Fun, Rants, Programming — harvey.sugar @ 4:23 pm

Some things that help make a normal life pleasant can get to be distractions when you’re way behind schedule on a project. Nerds are legendary for ignoring these distractions when a technical challenge requires their full attention. Details like hygiene and nutrition are the first casualties in battles against bugs and deadlines. It’s really hard to be fresh smelling and perky looking when you’re in the middle of marathon systems integration problems and have been working for eighteen hours straight.

Over the last couple of weeks, I’ve really noticed a decline in my eating habits. I usually try to prepare my food from fresh unadulterated ingredients; lots of vegetables, beans, salad and grains and a good bit of meat too. But I cook from scratch and watch the carbs and fat. That is until I hit systems integration.

I started out last week with salads for lunch and home made chili for dinner. When the chili and the lettuce ran out, I switched to carry out food. I tried sticking to wholesome stuff like the local kabob place, easy on the rice and Chipotle which is quite healthy and tasty without the rice and tortillas.

Then I started eating at odd hours, late at night or very early in the morning. I switched to the diet of the legendary first generation of hackers at MIT, like Richard Stallman. I started alternating between Chinese carry out and pizza washed down with lots of Coke.

I knew I hit bottom this morning. Taco Bell is open late around here. They call the time between midnight and two AM, “The Fourth Meal.” I found myself driving to the Taco Bell to get there before closing so I could get mine. A few hours later I was at McDonald’s getting a sausage, egg, and cheese McGriddle and another Coke.

I’m lost now and I’ll admit it. I’m not eating again until I can cook something for myself. Right after a shower and a twelve hour nap.

• • •
 

September 24, 2008

It’s (almost) never the hardware

Filed under: Rants, Programming — Harvey @ 10:47 pm

We embedded systems programmers have some special challenges in our work.  One of the major problems we face is that the hardware we work with is often not fully exercised until our software exercises it.  So, it is tempting to blame the hardware when things don’t work right and the cause is not obvious.  I’ve done my share of blaming the hardware, sometimes to the point of embarrassment when it turned out I was wrong.

Over the years I’ve learned to work with the hardware engineers to solve a problem rather than point my finger.  I’ve learned the hard way to devise some low level tests to isolate a suspected hardware problem before I go bother the hardware designer.  Sometimes a ’scope or a logic analyzer is the best software debugging tool an embedded systems programmer can have.  But sometimes you run across a bizarre software problem that masquerades as an obvious hardware problem.  A couple of these kinds of bugs will change your approach to debugging forever.

Will Rogers once said “There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves.”  For those who learn by reading or observation here are a couple of war stories about obvious hardware problems that weren’t.

The first story involves a very successful piece of test equipment.  If you ever worked with T1s and I told you the model number you would probably recognize it.  I worked on this product toward the end of its life cycle.  I did some of the last feature upgrades and became responsible for software maintenance on it.  This test set used a popular Intel counter-timer chip.  The same chip was used in the original IBM PC and there are incarnations of it in the system chips of personal computers to this day.  NEC was a second source for this part but for some reason this test set would not work with the NEC version of the timer chip.  There was even a note on the BOM (Bill of Materials) that only the Intel part could be used.

Well, one day, something fell through the cracks and a batch of these units were built with the NEC part.  I got a call from production test that the software was failing in these units.  So, I went over to the factory and picked up one of the failing units to look at with an ICE.  As soon as I took the unit apart, I saw the NEC timer and knew that was the problem but when I called the production manager, he insisted that I should take a second look at the software.  We used the same NEC timer in several other products with no problems and it was getting harder to get the Intel part.  I was certain that it was not a software problem.  There were literally hundreds of these test sets in the field with this software, working just fine.  I couldn’t very well say no so I set up a couple of breakpoints and ran the unit through its paces.  As I single-stepped through the ISR for the timer for about the tenth time, I noticed that it pushed one more register on the stack than it popped off at the end.  There it was in x86 assembly code, a function that should have crashed every time no matter what timer chip was used.  I fixed the routine and the test set worked just fine with the NEC part from then on.  To this day I have no idea why the problem never showed up with the Intel timer chip but if it weren’t for the persistence of that production manager, that bug might have shown up at another time in another way.

The second story is about another very popular product from the same company.  I never worked on that product but a coworker told me about this one.  Test equipment tends to have a very long product lifetime.  This product had undergone so many upgrades that they decided that it was time for a major software rewrite.  The project went well until system testing.  It seemed that the units in the lab worked just fine but when they were buttoned up with the covers on, the software crashed.  How could the presence of the cover affect the software?  It seemed like a classic hardware problem.  Perhaps it was a problem with noise or temperature.  The mystery was that the previous software release worked just fine, with or without the covers on.  Debugging this was a nightmare.  You couldn’t hook up an ICE with the cover on the unit so they had to write test code and install it and put the cover back on.  There didn’t seem to be any rhyme or reason to the software crashes.  Meanwhile they were also pursuing another seemingly unrelated problem.  If two of the pins on the RS-232 interface were shorted together (like RTS to CTS) the software would crash as soon as the test set was powered up.  It turned out that in the start up code in the new version, someone enabled interrupts before the rest of the hardware was initialized.  Shorting the RS-232 pins together or putting the cover on the unit added just enough noise on a floating interrupt line to cause it to trigger an interrupt.  Since the interrupt vector table had not been initialized at this point, the interrupt vectored to an invalid address and crashed the system.  Yep, another classic hardware problem that turned out to be software.
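The fix in that second story comes down to ordering in the start-up code: fill in every interrupt vector before interrupts are enabled. Here is a host-runnable sketch of the idea. The vector table, its size and all of the function names are invented stand-ins. A real target would write the CPU's actual vector table and use its own enable instruction.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_VECTORS 8            /* hypothetical table size */

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];   /* all NULL until initialized */
static bool interrupts_enabled;
static unsigned spurious_count;

/* Catch-all handler so a noisy, unused line cannot jump into the weeds. */
static void default_handler(void)
{
    spurious_count++;
}

static void init_vectors(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++) {
        vector_table[i] = default_handler;
    }
}

static void enable_interrupts(void)
{
    interrupts_enabled = true;   /* stand-in for the CPU's enable instruction */
}

/* Simulate the hardware raising interrupt n. Returns false where the
   real system would have vectored to an invalid address and crashed. */
bool raise_interrupt(size_t n)
{
    if (!interrupts_enabled || n >= NUM_VECTORS) {
        return true;             /* masked or out of range: nothing happens */
    }
    if (vector_table[n] == NULL) {
        return false;            /* uninitialized vector */
    }
    vector_table[n]();
    return true;
}

/* Start-up in the safe order: vectors first, interrupts last. */
void startup(void)
{
    init_vectors();
    /* ... initialize the rest of the hardware here ... */
    enable_interrupts();
}
```

Swapping the two calls in startup() reproduces the bug in miniature: a stray interrupt arriving between them finds a NULL vector.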

So what should you take away from these stories?  Never assume the problem is hardware.  If it seems like it is hardware, work with the hardware engineer to solve the problem.  Don’t just point your finger.  Use a ’scope or a logic analyzer if you know how, and if you don’t, get an EE to drive for you.  Some very bizarre software bugs can disguise themselves as hardware problems.

• • •
 

September 9, 2008

Open source voting machines?

Filed under: News, Rants — Harvey @ 9:38 pm

Every once in a while I go over to techdirt for some interesting reading while I’m “waiting for something to compile.” They’ve had a number of posts about e-voting and voting machines in general. One in particular caught my eye: Palm Beach County Lost 3,400 Votes; Claims Different Sequoia Scanners Count Differently. One of the issues that they raised is that it is difficult for local jurisdictions to verify the operation of voting machines because the manufacturers have been highly resistant to independent inspection of the systems.

In another article on techdirt, E-Voting Isn’t Perfect, But It Takes Less Work to Corrupt Big Elections, they explain why e-voting makes it so much simpler to rig elections. The article references a paper, Voting System Risk Assessment via Computational Complexity Analysis by Rice computer scientist Dan Wallach. Being the nerd that I am, I downloaded the paper and read it. It’s pretty scary if you believe in democracy.

I’m sure that the technical community can do better than this. If we don’t want companies like Diebold and Premier counting our votes using flawed equipment and security features, I think we need to do something about it ourselves. I think we should form an organization to develop an open source voting machine. I’m sure there are some very talented computer security researchers out there who could design procedures that would make vote tampering nearly impossible. Just look at OpenSSL which is used for thousands of financial transactions every day. I also think that the open source community is better at developing reliable software than for-profit enterprises that are more interested in minimizing cost than maximizing value.

An open source voting machine would have the following advantages:

  • The complete design, hardware, software, procedures, would be open for inspection by any government body or independent panel.
  • The project could attract the best and brightest in computer and systems security to ensure state of the art fraud prevention.
  • The project would also attract top talent in hardware and software development.
  • Local government agencies could have competitive bidding for the manufacture of the voting machines. The machines could even be made available to budding democracies in other countries.
  • An open source community would be more willing to admit to flaws and mistakes and correct them.
  • If fraud is suspected the open source community would be available to analyze the evidence.

I don’t know if this is a crackpot idea or a chance for us techies to help preserve democracy. I don’t know, there may already be a project like this going. I would be interested in what others think of this idea and if anyone might even want to work on a project like this.

• • •
 