Posts with «c++» label

Code Craft-Embedding C++: Hacking the Arduino Software Environment

The Arduino software environment, including the IDE, libraries, and general approach, are geared toward education. It’s meant as a way to introduce embedded development to newbies. This is a great concept but it falls short when more serious development or more advanced education is required. I keep wrestling with how to address this. One way is by using Eclipse with the Arduino Plug-in. That provides a professional development environment, at least.

The code base for the Arduino is another frustration. Bluntly, the use of setup() and loop() with main() being hidden really bugs me. The mixture of C and C++ in libraries and examples is another irritation. There is enough C++ being used that it makes sense it should be the standard. Plus a good portion of the library code could be a lot better. At this point fixing this would be a monumental task requiring many dedicated developers to do the rewrite. But there are a some things that can be done so let’s see a couple possibilities and how they would be used.

The Main Hack

As mentioned, hiding main() bugs me. It’s an inherent part of C++ which makes it an important to learning the language. Up until now I’d not considered how to address this. I knew that an Arduino main() existed from poking around in the code base – it had to be there because it is required by the C++ standard. The light dawned on me to try copying the code in the file main.cpp into my own code. It built, but how could I be sure that it was using my code and not the original from the Arduino libraries? I commented out setup() and it still built, so it had to be using my version otherwise there’d be an error about setup() being missing. You may wonder why it used my version.

When you build a program… Yes, it’s a “program” not a “sketch”, a “daughter board” not a “shield”, and a “linker” not a “combiner”! Why is everyone trying to change the language used for software development?

When you build a C++ program there are two main stages. You compile the code using the compiler. That generates a number of object files — one for each source file. The linker then combines the compiled objects to create an executable. The linker starts by looking for the C run time code (CRTC). This is the code that does some setup prior to main() being called. In the CRTC there will be external symbols, main() being one, whose code exists in other files.

The linker is going to look in two places for those missing symbols. First, it loads all the object files, sorts out the symbols from them, and builds a list of what is missing. Second, it looks through any included libraries of pre-compiled objects for the remaining symbols. If any symbols are still missing, it emits an error message.

If you look in the Arduino files you’ll find a main.cpp file that contains a main() function. That ends up in the library. When the linker starts, my version of main() is in a newly created object file. Since object files are processed first the linker uses my version of main(). The library version is ignored.

There is still something unusual about main(). Here’s the infinite for loop in main():

	for (;;) {
		loop();
		if (serialEventRun) serialEventRun();
	}

The call to loop() is as expected but why is there an if statement and serialEventRun? The function checks if serial input data is available. The if relies on a trick of the tool chain, not C++, which checks the existence of the symbol serialEventRun. When the symbol does not exist the if and its code are omitted.

Zapping setup() and loop()

Now that I have control over main() I can address my other pet peeve, the setup() and loop() functions. I can eliminate these two function by creating my own version of main(). I’m not saying the use of setup() and loop() were wrong, especially in light of the educational goal of Arduino. Using them makes it clear how to organize an embedded system. This is the same concept behind C++ constructors and member functions. Get the initialization done at the right time and place and a good chunk of software problems evaporate. But since C++ offers this automatically with classes, the next step is to utilize C++’s capabilities.

Global Instantiation

One issue with C++ is the cost of initialization of global, or file, scope class instances. There is some additional code executed before main() to handle this as we saw in the article that introduced classes. I think this overhead is small enough that it’s not a problem.

An issue that may be a problem is the order of initialization. The order is defined within a compilation unit (usually a file) from the first declaration to the last. But across compilation units the ordering is undefined. One time all the globals in file A may be initialized first and the next time those in file B might come first. The order is important when one class depends on another being initialized first. If they are in different compilation units this is impossible to ensure. One solution is to put all the globals in a single compilation unit. This may not work if a library contains global instances.

A related issue occurs on large embedded computer systems, such as a Raspberry Pi running Linux, when arguments from the command line are passed to main(). Environment variables are also a problem since they may not be available until main() executes. Global instance won’t have access to this information so cannot use it during their initialization. I ran into this problem with my robots whose control computer was a PC. I was using the robot’s network name to determine their initial behaviors. It wasn’t available until main() was entered, so it couldn’t be used to initialize global instances.

This is an issue with smaller embedded systems that don’t pass arguments or have environment values but I don’t want to focus only on them. I’m looking to address the general situation that would include larger systems so we’ll assume we don’t want global instances.

Program Class

The approach I’m taking and sharing with you is an experiment. I have done something similar in the past with a robotics project but the approach was not thoroughly analyzed. As often happens, I ran out of time so I implemented this as a quick solution. Whether this is useful in the long run we’ll have to see. If nothing else it will show you more about working with C++.

My approach is to create a Program class with a member run() function. The setup for the entire program occurs in the class constructor and the run() function handles all the processing. What would normally be global variables are data members.

Here is the declaration of a skeleton Program class and the implementation of run():

class Program {
public:
	void run();
	static Program& makeProgram() {
		static Program p;
		return p;
	}

private:
	Program() { }
	void checkSerialInput();
};

void Program::run() {
	for (;;) {
		// program code here
		checkSerialInput();
	}
}

We only want one instance of Program to exist so I’ve assured this by making the constructor private and providing the static makeProgram() function to return the static instance created the first time makeProgram() is called. The Program member function checkSerialInput() handles checking for the serial input as discussed above. In checkSerialInput() I introduced an #if block to eliminate the actual code if the program is not using serial input.

Here is how Program is used in main.cpp:


void arduino_init() {
	init();
	initVariant();
}

int main(void) {
	arduino_init();
	Program& p = Program::makeProgram();
	p.run();
	return 0;
}

The function initArduino() is inlined and handles the two initialization routines required to setup the Arduino environment.

One of the techniques for good software development is to hide complexity and provide a descriptive name for what it does. These functions hide not only the code but, in one case, the conditional compilation.

Redbot Line Follower Project

This code experiment uses a Sparkfun Redbot setup for line following. This is a two wheeled robot with 3 optical sensors to detect the line and an I2C accelerometer to sense bumping into objects. The computer is a Sparkfun Redbot Mainboard which is compatible with the Arduino Uno but provides a much different layout and includes a motor driver IC.

This robot is simple enough to make a manageable project but sufficiently complex to serve as a good test, especially when the project gets to the control system software. The basic code for handling these motors and sensors comes from Sparkfun and uses only the basic pin-level Arduino routines. I can’t possibly hack the entire Arduino code but using the Sparkfun code provides a manageable subset for experimenting.

For this article we’ll just look at the controlling the motors. Let’s start with the declaration of the Program class for testing the motor routines:

class Program {
public:
	void run();
	static Program& makeProgram() {
		static Program p;
		return p;
	}

private:
	Program() { }
	static constexpr int delay_time { 2000 };

	rm::Motor l_motor { l_motor_forward, l_motor_reverse, l_motor_pwm };
	rm::Motor r_motor { r_motor_forward, r_motor_reverse, r_motor_pwm };
	rm::Wheels wheels { l_motor, r_motor };

	void checkSerialInput();
};

There is a namespace rm enclosing the classes I’ve defined for the project, hence the rm:: prefacing the class names. On line 11 is something you may not have seen, a constexpr which is new in C++ 11 and expanded in C++14. It declares that delay_time is a true constant used during compilation and will not be allocated storage at run-time. There is a lot more to constexpr and we’ll see it more in the future. One other place I used it for this project is to define what pins to use. Here’s a sample:

constexpr int l_motor_forward = 2;
constexpr int l_motor_reverse = 4;
constexpr int l_motor_pwm = 5;
constexpr int r_motor_pwm = 6;
constexpr int r_motor_forward = 7;
constexpr int r_motor_reverse = 8;

The Motor class controls a motor. It requires two pins to control the direction and one pulse width modulation (PWM) pin to control the speed. The pins are passed via constructor and the names should be self-explanatory. The Wheels class provides coordinated movement of the robot using the Motor instances. The Motor instances are passed as references for the use of Wheels. Here are the two class declarations:

class Motor : public Device {
public:
	Motor(const int forward, const int reverse, const int pwm);

	void coast();
	void drive(const int speed);

	int speed() const {
		return mSpeed;
	}

private:
	void speed(const int speed);

	PinOut mForward;
	PinOut mReverse;
	PinOut mPwm;
	int mSpeed { };
};


class Wheels {
public:
	Wheels(Motor& left, Motor& right) :
			mLeft(left), mRight(right) {
	}

	void move(const int speed) {
		drive(speed, speed);
	}
	void pivot(const int speed) {
		drive(speed, -speed);
	}
	void stop() {
		mLeft.coast();
		mRight.coast();
	}

	void drive(const int left, const int right) {
		mLeft.drive(left);
		mRight.drive(right);
	}

private:
	Motor& mLeft;
	Motor& mRight;
};

The workhorse of Wheels is the function drive() which just calls the Motor drive() functions for each motor. Except for stop(), the other Wheels functions are utilities that use drive() and just make things easier for the developer. The compiler should convert those to a direct call to driver() since they are inline by being inside the class declaration. This is one of the interesting ways of using inline functions to enhance the utility of a class without incurring any cost in code or time.

The run() method in Program tests the motors by pivot()ing first in one direction and then the other at different speeds. A pivot() rotates the robot in place. Once the speed is set it continues until changed so the delay functions simply provide a little time for the robot to turn. Here’s the code:

void Program::run() {
	for (;;) {
		wheels.pivot(50);
		delay (delay_time);

		wheels.pivot(-100);
		delay(delay_time);

		checkSerialInput();
		if (serialEventRun) {
		}
	}
}

Wrap Up

The Redbot project is an interesting vehicle for demonstrating code techniques. The current test of the motor routines demonstrates how to override the existing Arduino main(). Even if you don’t like my approach with Program, the flexibility of using your own main() may come in handy for your own projects. The next article is going to revisit this program using templates.

THE EMBEDDING C++ PROJECT

Over at Hackaday.io, I’ve created an Embedding C++ project. The project will maintain a list of these articles in the project description as a form of Table of Contents. Each article will have a project log entry for additional discussion. Those interested can delve deeper into the topics, raise questions, and share additional findings.

The project also will serve as a place for supplementary material from myself or collaborators. For instance, someone might want to take the code and report the results for other Arduino boards or even other embedded systems. Stop by and see what’s happening.


Filed under: Arduino Hacks, Hackaday Columns, Software Development, software hacks

EG First 2nd rev bot

Primary image

What does it do?

Modified RC obstacle avoidance Car

Very new to this site. Been playing with this project on and off for 6 months. Completed once then went back to and it wasn't working. Tried to troubleshoot and burnt out the chip. Pulled out the whole circuit board. Then rewired the battery pack. At that point it is just the frame DC motors on front and back(front geared to turn right and left, back left as dc motor for fowards and back) . I pulled an H bridge chip from a motor drive expansion board and breadboarded it .

Cost to build

Embedded video

Finished project

Number

Time to build

Type

URL to more information

Weight

read more

Code Craft – Embedding C++: Templates

The language C++ is big. There is no doubting that. One reason C++ is big is to allow flexibility in the technique used to solve a problem. If you have a really small system you can stick to procedural code encapsulated by classes. A project with a number of similar but slightly different entities might be best addressed through inheritance and polymorphism.

A third technique is using generics, which are implemented in C++ using templates. Templates have some similarities with #define macros but they are a great deal safer. The compiler does not see the code inserted by a macro until after it has been inserted into the source. If the code is bad the error messages can be very confusing since all the developer sees is the macro name. A template is checked for basic syntax errors by the compiler when it is first seen, and again later when the code is instantiated. That first step eliminates a lot of confusion since error messages appear at the location of the problem.

Templates are also a lot more powerful. They actually are a Turing complete language. Entire non-trivial programs have been written using templates. All the resulting executable does is report the results with all the computation done by the compiler. Don’t worry, we aren’t going there in this article.

Template Basics

You can use templates to create both functions and classes. The way this is done is quite similar for both so let’s start with a template function example:

template<typename T, int EXP = 2>
T power(const T value, int exp = EXP) {
	T res { value };
	for (; exp > 1; --exp) {
		res *= value;
	}
	return res;
}

This is a template function for raising value by the integer exponent, exp. The keyword template is followed in angle brackets by parameters. A parameter is specified using either typename or class followed by a name, or by an integer data type followed by a name. You can also use a function or class as a template parameter but we won’t look at that usage.

The name of a parameter is used within the body of the class or function just as you would use any other type, or value. Here we use T as the type name for the input and return values of the function. The integer EXP is used to set a default value of 2 for the exponent, i.e. making power calculate the square.

When the compiler instantiates a template function or class, it creates code that is the same as a handwritten version. The data types and values are inserted as text substitutions. This creates a new version typed by the actual arguments to the parameters. Each different set of arguments creates a new function or type. For example, an instance of power() for integers is not the same as power() for floats. Similarly, as we’ll see in a moment, a class Triple for integers is not the same as one for float. Each are distinct types with separate code.

Since power() is a template function it will work directly for any numeric data type, integer or floating point. But what if you want to use it with a more complex type like the Triple class from the last article? Let’s see.

Using Templates

Here’s the declaration of Triple reduced to only what is needed for this article:

class Triple {
public:
	Triple(const int x, const int y, const int z);
	Triple& operator *=(const Triple& rhs);

	int x() const;
	int y() const;
	int z() const;

private:
	int mX { 0 };	// c++11 member initialization
	int mY { 0 };
	int mZ { 0 };
};

I switched the plus equal operator to the multiple equal operator since it is needed by the power() function.

Here is how the power() function is used for integer, float, and our Triple data types:

int p = power(2, 3);
float f = power(4.1, 2);

Triple t(2, 3, 4);
Triple res = power(t, 3);

The only requirement for using a user defined data type (UDT) like Triple with power() is the UDT must define an operator=*() member function.

Template Classes

Assume you’ve been using Triple in a project for awhile with integer values. Now a project requirement needs it for floating point values. The Triple class code is all debugged and working, and more complex than what we’ve seen here. It’s not a pleasant thought to create a new class for float. There are also hints that a long or double version might be needed.

With not much work Triple can be converted to a generic version as a template class. It’s actually fairly straightforward. Begin with the template declaration just as with the function power() and replace all the declarations of int with T. Also check member function arguments for passing parameters by value. They may need to be changed to references to more efficiently handle larger data types or UDTs. I changed the constructor parameters to references for this reason.

Here is Triple as a template class:

template<typename T>
class Triple {
public:
	Triple(const T& x, const T& y, const T& z);
	Triple& operator *=(const Triple& rhs);

	T x() const;
	T y() const;
	T z() const;

private:
	T mX { 0 };	// c++11 member initialization
	T mY { 0 };
	T mZ { 0 };
};

Not a lot of difference. Here’s how it could be used:

Triple<int> ires = power(Triple { 2, 3, 4 }, 3);
Triple fres = power(Triple(1.2F, 2.2, 3.3)); // calc square
Triple dres = power(Triple(1.2, 2.2, 3.3));// calc square
Triple lres = power(Triple(1, 2, 3.3), 2);

Unfortunately, the new flexibility comes at the cost of telling the template the data type to use for Triple. That is done by putting the data type inside brackets following the class name. If that is a hassle you can always use typedef or the new using to create an alias:

using TripleInt = Triple;
TripleInt ires = power(Triple { 2, 3, 4 }, 3);

Creating a template class like this saves debugging and maintenance costs overall. Once the code is working, it works for all related data types. If a bug is found and fixed, it’s fixed for all versions.

Template Timing and Code Size

The code generated by a template is exactly the same code as a handwritten version of the same function or class. All that changes between versions is the data type used in the instantiation. Since the code is the same as the handwritten version, the timing is going to be the same. Therefore there is no need to actually test timing. Phew!

Templates are Inline Code

Templates are inherently inline code. That means every time you use a template function or a template class member function the code is duplicated inline. Each instance with a different data type creates its own set of code, but that will be no more than if you’d written a class for each data type. There can be savings using template classes since member functions are not instantiated if they are not used. For example, if the Triple class getter functions – x(), y(), z() – are never used, their code is not instantiated. They would be for a regular class, although a smart linker might drop them from the executable.

Consider the following use of power() and Triple:

int i1 = power(2, 3);
int i2 = power(3, 3);
Triple t1 = power(Triple(1, 2, 3), 2);

This creates two inline integer versions of power even though both are instantiated for the same data type. Another instance is created for the Triple version. A single copy of the Triple class is created because the data type is always int.

Here we’re relying on implicit instantiation. That means we’re letting the compiler determine when and where the code is generated. There is also explicit instantiation that allows the developer to specify where the code is produced. This takes a little effort and knowledge of which data types are used for the templates.

Generally, implicit instantiation means inline function code with the possibility of duplication of code. Whether that matters depends on the function. When a function, not an inline function, is called there is overhead in invocation. The parameters to the function are pushed onto the stack along with the housekeeping information. When the function returns those operations are reversed. For a small function the invocation may take more code than the function’s body. In that case, inlining the function is most effective.

The power() function used here is interesting because the function’s code and the code to invoke it on an Uno are similar in size. Of course, both vary depending on the data type since large data types require more stack manipulation. On a Arduino Uno, calling power() with an int takes more code than the function. For float, the call is slightly larger. For Triple, the code to invoke is a good piece larger. On other processors the calling power() could be different. Keep in mind that power() is a really small function. Larger functions, especially member functions, are typically going to outweigh the cost to call them.

Specifying where the compiler generates the code is an explicit instantiation. This will force an out-of-line call with the associated overhead. In a source file you tell the compiler which specializations you need. For the test scenario we want them for int and Triple:

template int power(int, int);
template Triple<int> power(Triple<int>, int);

The compiler will create these in the source file. Then, as with any other function, you need to create an extern declaration. This tells the compiler to not instantiate them as inline. These declarations are just the same as above, only with extern added:

extern template int power(int, int);
extern template Triple power(Triple, int);

Scenario for Testing Code Size

It took me a bit to create a test scenario for demonstrating the code size differences between these two instantiations. The problem is the result from the power() function must be used later in the code or the compiler optimizes the call away. Adding code to use the function changes the overall code size in ways that are not relevant to the type of instantiation. That makes comparisons difficult to understand.

I finally settled on creating a class, Application, with data members initialized using the power() function. Adding data members of the same or different types causes minimal overall size changes so the total application code size closely reflects the changes only due to the type of instantiation.

Here is the declaration of Application:

struct Application {
public:
	Application(const int value, const int exp);

	static void loop() {
	}

	int i1;
	int i2;
	int i3;
	Triple t1;
	Triple t2;
	Triple t3;
};

and the implementation of the constructor:

Application::Application(int value, const int exp) :
		i1 { power(value++, exp) }, //
				i2 { power<int, 3="">(value++) }, // calcs cube
				i3 { power(value++, exp) }, //

				t1 { power(Triple(value++, 2, 3)) }, // calcs square
				t2 { power(Triple(value++, 4, 5), exp) }, //
				t3 { power(TripleInt(value++, 2, 3), exp) } //
{
}

The minimum application for an Arduino has just an empty setup and loop() functions which takes 450 bytes on a Uno. The loop() used for this test is a little more than minimum but it only creates an instance of Application and calls its loop() member function:

void loop() {
	rm::Application app(2, 3);
	rm::Application::loop();	// does nothing
}

Code Size Results

Here are the results for various combinations of implicit and explicit instantiation with different numbers of class member variables:

The first columns specify how many variables were included in the Application class. The columns under Uno and Due are the code size for those processors. They show the size for implicit instantiation, explicit instantiation of power() for just the Triple class, and explicit instantiation for both int and Triple data types.

The code sizes are dependent on a number of factors so can only provide a general idea of the changes when switching from implicit to explicit template instantiation. Actual results depend on the tool chains compiler and linker. Some of that occurs here using the Arduino’s GCC tool chain.

In all the cases with the Uno where one variable is used, the code size increases with explicit instantiation. In this case the function’s code plus the code for calling the function is, as expected, greater than the inline function’s code.

Now look at the Uno side of the table where there are 2 integers and 2 Triples, i.e. the fourth line. The first two code sizes remain the same at 928 bytes. The compiler optimized the code for the two Triples() by creating power() out-of-line without being told to do it explicitly. In the third column there is a decrease in code size when the integer version of power() is explicitly instantiated. It did the same a couple of lines below that when there are only the 2 Triples. These were verified by examing the assembly code running objdump on the ELF file.

In general, the Due’s code size did not improve with explicit instantiation. The larger word size of the Due requires less code to call a function. It would take a function larger than power() to make explicit instantiation effective in this scenario.

As I mentioned, don’t draw too many conclusions for these code sizes. I repeatedly needed to check the content of the ELF file using objdump to verify my conclusions. As a case in point, look at the Due side, with 2 integers and a Triple, with the two code sizes of 10092. They’re just coincidence. In one the integer version of power() is inlined and in the other, explicitly out-of-lined. The same occurs on the first line under Uno where there are just two integers and no Triples.

You can find other factors influencing code size. When three Triples are involved the compiler lifts the multiplication code from power(), but not the entire function. This isn’t because power() is a template function but just a general optimization, i.e. lifting code from inside a loop.

Wrap Up

Templates are a fascinating part of C++ with extremely powerful capabilities. As mentioned above, you can write an entire program in templates so the compiler actually does the computation. The reality is you probably are not going to be creating templates in every day programming. They are better suited for developing libraries and general utilities. Both power() and Triple fall into, or are close to, that category. This is why the C++ libraries consist of so many template classes. Creating a library requires attention to details beyond regular coding.

It’s important to understand templates even if you don’t write them. We’ve discussed some of the implications of usage and techniques for making optimal use of templates because they are an inherent part of the language. With them being such a huge part of C++ we’ll come back to them again to address places where they can be used.

The Embedding C++ Project

Over at Hackaday.io, I’ve created an Embedding C++project. The project will maintain a list of these articles in the project description as a form of Table of Contents. Each article will have a project log entry for additional discussion. Those interested can delve deeper into the topics, raise questions, and share additional findings.

The project also will serve as a place for supplementary material from myself or collaborators. For instance, someone might want to take the code and report the results for other Arduino boards or even other embedded systems. Stop by and see what’s happening.


Filed under: Arduino Hacks, Hackaday Columns, Software Development

ATtiny Does 170×240 VGA With 8 Colors

The Arduino is a popular microcontroller platform for getting stuff done quickly: it’s widely available, there’s a wealth of online resources, and it’s a ready-to-use prototyping platform. On the opposite end of the spectrum, if you want to enjoy programming every bit of the microcontroller’s flash ROM, you can start with an arbitrarily tight resource constraint and see how far you can push it. [lucas]’s demo that can output VGA and stereo audio on an eight-pin DIP microcontroller is a little bit more amazing than just blinking an LED.

[lucas] is using an ATtiny85, the larger of the ATtiny series of microcontrollers. After connecting the required clock signal to the microcontroller to get the 25.175 Mhz signal required by VGA, he was left with only four pins to handle the four-colors and stereo audio. This is accomplished essentially by sending audio out at a time when the VGA monitor wouldn’t be expecting a signal (and [lucas] does a great job explaining this process on his project page). He programmed the video core in assembly which helps to optimize the program, and only used passive components aside from the clock and the microcontroller.

Be sure to check out the video after the break to see how a processor with only 512 bytes of RAM can output an image that would require over 40 KB. It’s a true testament to how far you can push these processors if you’re determined. We’ve also seen these chips do over-the-air NTSC, bluetooth, and even Ethernet.


Filed under: ATtiny Hacks

Speech Recognition with Arduino and BitVoicer Server

In this post I am going to show how to use an Arduino board and BitVoicer Server to control a few LEDs with voice commands. I will be using the Arduino Micro in this post, but you can use any Arduino board you have at hand.

The following procedures will be executed to transform voice commands into LED activity:

read more

Audio Effects on the Intel Edison

With the ability to run a full Linux operating system, the Intel Edison board has more than enough computing power for real-time digital audio processing. [Navin] used the Atom based module to build Effecter: a digital effects processor.

Effecter is written in C, and makes use of two libraries. The MRAA library from Intel provides an API for accessing the I/O ports on the Edison module. PortAudio is the library used for capturing and playing back audio samples.

To allow for audio input and output, a sound card is needed. A cheap USB sound card takes care of this, since the Edison does not have built-in hardware for audio. The Edison itself is mounted on the Edison Arduino Breakout Board, and combined with a Grove shield from Seeed. Using the Grove system, a button, potentiometer, and LCD were added for control.

The code is available on Github, and is pretty easy to follow. PortAudio calls the audioCallback function in effecter.cc when it needs samples to play. This function takes samples from the input buffer, runs them through an effect’s function, and spits the resulting samples into the output buffer. All of the effect code can be found in the ‘effects’ folder.

You can check out a demo Effecter applying effects to a keyboard after the break. If you want to build your own, an Instructable gives all the steps.


Filed under: digital audio hacks

Code Craft-Embedding C++: Hidden Activities?

What is an embedded system? The general definition is a computer system dedicated to a specific purpose, i.e. not a general purpose system usable for different tasks. That is a very broad definition. I was just skimming the C++ coding guidelines for the Joint Strike Fighter. That’s a pretty big embedded system and the first DOD project that allowed C++! When you use an ATM to get money you’re using an embedded system. Those are basically hardened PCs. Then at the small end we have all the Internet of Things (IoT) gadgets.

The previous articles about embedding C++ discussing classes, virtual functions, and macros garnered many comments. I find both the positive and critical comments rewarding. More importantly, the critical comments point me toward issues or questions that need to be addressed, which is what got me onto the topic for this article. So thank you, all.

Let’s take a look at when embedded systems should or should not use C++, taking a hard look at the claim that there may be hidden activities ripe to upset your carefully planned code execution.

Limits of Embedded Development Boards

Embedded systems are often thought of as having limited resources, e.g. memory, processing power. Having real-time constraints is another requirement frequently brought up. While those do occur in embedded systems they are not defining characteristics.

At some point a processor or memory limits preclude using C++, and often even C. Vendors might resort to a restricted version of C on some processors to provide a high-level language capability, an effort that would be silly for C++.

But we’ve not hit the limit on the boards used in these articles. We see with the Arduino Uno and its relatives that C++ is usable. The Uno is restricted to a subset of C++, in part because the developers did not have a C++ standard library available. (If you really want one, there are ports of the STL for the Uno.) The compiler in the Uno toolset supports C++11 and there is some support for C++14, but I haven’t explored the latter to know what is usable. There are capabilities in C++11, and C+14, that improve C++ use in embedded systems.

The Due, a larger Arduino board I’ve used to contrast with the Uno, does have the full standard library. Switch over to the Raspberry Pi, or equivalents, where you not only get the GCC toolset but can run Eclipse on the board, and it feels like the sky’s the limit.

Should You C++?

While all the above is valid, it misses a critical point. The issue isn’t whether you can use C++ on the smaller systems but whether solving the problem needs C++’s capabilities. What I’m suggesting is changing the question from “Can you use C++?” to “Should you use C++?”

We’ve addressed some of the really basic objections to using C++. Code bloat is not the great explosion folks imagine. Virtual functions are not super slow. But the comments raise other issues. One comment advised against using C++ because of the hidden activities. Specifically mentioned were copy constructors, side effects, hidden allocation, and unexpected operations.

What is a copy constructor, and why do we need one? It’s a constructor that makes a copy of an existing instance. Any time a copy is made the copy constructor is called. Recall that all constructors initialize instances so they are ready to be used.

A copy constructor is required if you pass a parameter by value. That’s a copy. Returning a value from a function causes a copy, although a decent compiler will optimize it away. Assignment also involves making a copy.

With built in types the cost of a copy is low, except maybe if you are using long doubles at 16 bytes a value. For large data structures a copy can be expensive and can be tricky. Rather than bemoan that C++ does copies, we need to recognize they are a necessity. That recognition means we can work to avoid them and get them right when they are needed.

One way to avoid copies is to pass structures by reference. In C, passing by pointer is a pass by reference. C++ allows that and introduces the reference operator. The reference operator is not just syntactic sugar. For example, references eliminate the dangling pointer problem since you cannot have a null reference.

Which brings up the ownership problem with pointers and the questions they raise for data structure copies. Quite frequently, even in C++, a data structure contains a pointer to another data structure. When you make a copy who owns the structure at the end of the pointer? Do you copy the pointer or the data? If you just copy the pointer you are sharing data between the two copies. One copy can modify the data in the other copy. That is usually not a good thing. Copying the data might be expensive. Also, who ultimately decides when the target of a pointer is deleted, or even if it should be deleted?

C++ doesn’t introduce a problem with copy constructors; it highlights a requirement that needs to be addressed, sometimes by looking to the problem requirements. What is needed by the solution when a copy is made?

Copying Data

In my robotics work I use an inertial measurement unit (IMU) to help track position and bearing, the robot’s pose. Inside the IMU are an accelerometer, a gyroscope, and a compass. The accelerometer and gyroscope both provide data as a triple of data, i.e. measurements in x, y, and z axis. There are a number of operations that need to be done on that data to make it usable, many more than we want to look at here. But we can look at how to handle this triple of data and to add a triple of values together. This is done with the gyroscope since it reports the angular rate of change per unit of time. By accumulating those readings you can obtain, theoretically, the bearing of the robot.

C++ Implementation

Here’s the declaration of the class Triple and the overloaded addition operator:

class Triple {
public:
	Triple() = default; // C++11 use default constructor despite other constructors being declared
	Triple(const Triple& t);	// copy constructor so we can track usage
	Triple(const int x, const int y, const int z);

	const Triple& operator +=(const Triple& rhs);

	int x() const;
	int y() const;
	int z() const;
private:
	int mX { 0 };	// C++11 member initialization
	int mY { 0 };
	int mZ { 0 };
};

inline Triple operator+(const Triple& lhs, const Triple& rhs);

I’m using a number of C++11 features here. They’re marked, and the implications for most are obvious if you are familiar with earlier versions of C++. The line with Triple() = default; probably isn’t obvious. It requests that the compiler generate the default constructor. Without it we couldn’t create a variable with no arguments on the constructor: Triple t3;. Normally the default constructor is only created by the compiler when no other constructors are defined. Since Triple has two other constructors there would be no default constructor. I requested one using the notation so variables could be created without arguments.

The next constructor, Triple(const Triple& t), is the copy constructor. It is not needed for this class since C++ would have generated one by default that would have worked fine for this simple class. I created it to show how one works and illustrate where it is invoked. This uses a new C++11 feature where a constructor can invoke another constructor to handle the initialization. This came into being to avoid code duplication, which often led to errors, or the use of a class member to perform initialization.

The final constructor allows us to initialize a Triple with three values. Those three values are stored in the data members of the class.

The next function overloads the plus equals operator. It turns out that the most effective way to implement the actual addition operator, seen a few lines below, is to first implement this operator.

The remaining functions are getters because they allow us to get data from the class. Some classes also have setters that allow setting class values. We don’t want them in Triple.

Here are the implementations of the arithmetic operators:

inline const Triple& Triple::operator +=(const Triple& rhs) {
	mX += rhs.mX;
	mY += rhs.mY;
	mZ += rhs.mZ;
	return *this;
}

inline Triple operator+(const Triple& lhs, const Triple& rhs) {
	Triple left { lhs };
	left += rhs;
	return left;
}

The first operator is straightforward; it simply applies the plus equal operator to each value in the class and returns the instance as a reference. This operator modifies the data in the calling object so the returned reference is valid.

The addition operator uses the plus equal operator in its implementation. Here is where the copy constructor comes into play. We have to create a new object to hold the result so one is created from the lhs value. That’s a copy.

The rhs is added to the new object using plus equal operator and the result returned by value, not by reference. The return is another copy. It cannot be returned by reference because the result object, left, was created inside the function.

There are two possible copies in any arithmetic operator. However, C++ in the standard specifically allows compilers to optimize away the copy for the return value. This is the return value optimization. You’re welcome to try adjusting the code, but there is no way you can avoid creating a copy or two somewhere during this operation.

This code will run on an Arduino, but I created it and ran it on Linux so I could step through the operations to verify where the copy constructor was called and where it wasn’t.

How do you use this? Pretty much the same as any arithmetic operation:

	Triple t1 { 1, 2, 3 };
	Triple t2 { 10, 20, 30 };

	Triple t3 { t1 + t2 };

C Implementation

What would a similar implementation look like in C? How about this:

struct Triple {
	int mX;
	int mY;
	int mZ;
};

void init(struct Triple* t, const int x, const int y, const int z) {
	t->mX = x;
	t->mY = y;
	t->mZ = z;
}
struct Triple add(struct Triple* lhs, struct Triple* rhs) {
	struct Triple result;
	result.mX = lhs->mX + rhs->mX;
	result.mY = lhs->mY + rhs->mY;
	result.mZ = lhs->mZ + rhs->mZ;
	return result;
}

Overall it looks shorter and neater. The struct Triple contains the three data items for the axis. The routine init sets them to user specified values. The add function adds two Triples and returns the result. The add routine avoids initializing result because we know its content will be overwritten by the addition operations. That’s a bit of a savings for C. There is still a copy when the function returns the value. You just don’t have any control of how that copy is done. In this simple situation it doesn’t matter but with a more complicated data structure, say, one with pointers, the copy might be more challenging. We’d probably need to resort to an output parameter using pass by reference with pointers instead of a return value.

Here is how it is used:

	struct Triple t1;
	init(&t1, 1, 2, 3);

	struct Triple t2;
	init(&t2, 10, 20, 30);

	struct Triple t3 = add(&t1, &t2);

Two values are created and initialized and then added. Simple, but you’ve got to remember to take the addresses of the structures and to assure the init routine is only called once.

Consider how the two different versions would look if you implemented a complicated expression. I’ll just say I know which I would prefer.

Wrap Up

I didn’t start this article intending to do a direct comparison between the two languages. I only wanted to illustrate that the copy constructor is, if you insist, a necessary evil. Copies occur in multiple places in both C++ and C. They become critical to understand in C++ when using user defined data types, i.e. classes. Copying in C is less obvious but still necessary.

Since I didn’t intend to make a comparison, I don’t have code size or timings for the two versions. As I pointed out and demonstrated in the article on virtual functions, comparing these simple examples on those parameters is often misleading. A C++ capability is used to solve a problem, not just as an exercise of the language features. Only if an equivalent solution in C is created is a comparison valid.

The Embedding C++ Project

Over at Hackaday.io, I’ve created an Embedding C++project. The project will maintain a list of these articles in the project description as a form of Table of Contents. Each article will have a project log entry for additional discussion. Those interested can delve deeper into the topics, raise questions, and share additional findings.

The project also will serve as a place for supplementary material from myself or collaborators. For instance, someone might want to take the code and report the results for other Arduino boards or even other embedded systems. Stop by and see what’s happening.


Filed under: Hackaday Columns, Software Development

Code Craft: When #define is Considered Harmful

An icon of Computer Science, [Edsger Dijkstra], published a letter in the Communications of the Association of Computer Machinery (ACM) which the editor gave the title “Go To Statement Considered Harmful“. A rousing debate ensued. A similar criticism of macros, i.e. #define, in C/C++ may not rise to that level but they have their own problems.

Macros are part of the preprocessor for the C/C++ languages which manipulates the source code before the actual translation to machine code. But there are risks when macros generate source code. [Bjarne Stroustrup] in creating C++ worked to reduce the need and usage of the preprocessor, especially the use of macros. In his book, The C++ Programming Language he writes,

Don’t use them if you don’t have to. Almost every macro demonstrates a flaw in the programming language, in the program, or in the programmer.

As C retrofitted capabilities of C++, it also reduced the need for macros, thus improving that language.

With the Arduino using the GNU GCC compilers for C and C++ I want to show new coders a couple of places where the preprocessor can cause trouble for the unwary. I’ll demonstrate how to use language features to achieve the same results more cleanly and safely. Of course, all of this applies equally when you use any of these languages on other systems.

We’re only going to be looking at macros in this article but if you want to read more the details about them or the preprocessor see the GNU GCC Manual section on the preprocessor.

Basic Macro Usage

The preprocessor is complex, but described in simplified terms, it reads each line in a compilation unit, i.e. file, scanning for lines where the first non-whitespace character is a hash character (#). There may be whitespace before and after the #. The next token, i.e. a set of characters bounded by whitespace, is the name of the macro. Everything following the name is the argument. A macro has the form:

#define <name> <rest of line>

The simplest macro usage is to create symbols that are used to control the preprocessor or as text substitution in lines of code. A symbol can be created with or without a value. For example:

#define LINUX 
#define VERSION 23 

The first line defines the symbol LINUX but does not give it a value. The second line defines VERSION with the value 23. This is how constant values were defined pre-C++ and before the enhancements to C.

By convention, macro symbol names use all caps to distinguish them from variable and function names.

Symbols without values can only be used to control the preprocessor. With no value they would simply be a blank in a line of code. They are used in the various forms of the #if preprocessor directives to determine when lines of code are included or excluded.

When a symbol with a value appears in a line of code, the value is substituted in its place. Here is how using a macro with a value looks:

const int version_no = VERSION; 

which results in the code

const int version_no = 23; 

This type of macro usage doesn’t pose much of a threat that problems will arise. That said, there is little need to use macros to define constants. The language now provides the ability to declare named constants. One reason macros were used previously was to avoid allocating storage for a value that never changes. C++ changed this and constant declarations do not allocate storage. I’ve tested this on the Arduino IDE, and found that C does not appear to allocate storage but I’ve seen mention that C may do this on other systems.

Here is the current way to define constants:

const int version = 23;
enum {start=10, end=12, finish=24};   // an alternative for related integer consts

Function Macros

Another form of macro is the function macro which, when invoked looks like a function call, but it is not. Similar to the symbol macros, function macros were used to avoid the overhead of function calls for simple sequences of code. Another usage was to provide genericity, i.e. code that would work for all data types.

Function macros are used to pass parameters into the text replacement process. This is fraught with danger unless you pay close attention to the details. The use of inline functions is much safer as I’ll show below.

To illustrate here’s an example of a function macro to multiply two values.

#define MULT(lhs, rhs) lhs * rhs

This function macro is used in source code as:

int v_int = MULT(23, 25);
float v_float = MULT(23.2, 23.3);

Consider this use of the macro, its expansion, and its evaluation, which definitely does not produce the expected result:

int s = MULT(a+b, c+d);
// translates to: int s = a + b * c + d;
// evaluates as: a + (b * c) + d

This can be addressed by adding parenthesis to force the proper evaluation order of the resulting code. Adding the parenthesis results in this code:

#define MULT(lhs, rhs) ((lhs) * (rhs))
int s = MULT(a+b, c+d);
// now evaluates as: (a + b) * (c + d)

The parenthesis around lhs force (a + b) to be evaluated before the multiplication is performed.

Another ugly case is:

#define POWER(value) ((value) * (value))
int s = POWER(a++);
// evaluates as: ((a++) * (a++))

Now there are two problems. First, a is incremented twice, and, second, the wrongly incremented version is used for the calculation. Here again it does not produce the desired result.

It’s really easy to make a mistake like this with function macro definitions. You’re better off using an inline function which is not prone to these errors. The inline equivalents are:

inline int mult(const int x, const int y) { return (x * y); }
inline int power(const int x) { return (x * x); }
 

Now the values of x and y are evaluated before the function is called. The increment or arithmetic operators are no longer evaluated inside the actual function. Remember, an inline function does not produce a function call since it is inserted directly into the surrounding code.

In C, there is a loss of generality using inline over the macro. The inline functions shown only support integers. You can add similar functions for different data types, which the standard libraries do, but the names must reflect the data type. A few cases would be covered by mult_i, mult_f,  mult_l, and mult_d for integer, float, long and double, respectively.

This is less of a problem in C++ where there are two solutions. One is to implement separate functions, as in C, but the function names can all be mult relying on C++’s ability to overload function names.

A nicer C++ version is to use template functions. These really are straightforward for simple situations. Consider:

template <typename T>
inline T mult(const T x, const T y) { return (x * y); }
template <typename T>
inline T power(const T x) { return (x * x); }

You use these just like any other function call and the compiler figures out what to do. There is still one minor drawback. The mult cannot mix data types which MULT has no problem doing. You must use an explicit cast to make the types agree.

The code generated by the inline and template versions are going to be the same as the macro version, except they will be correct. You should restrict the use of macros to preprocessing of code,  not code generation. It’s safer and once you are used to the techniques it’s easy.

If these problems aren’t enough, take a look at the GNU preprocessor manual section which provides more details and examples of problems.

Stringification and Concatenation

The previous sections discussed the problems with macros and how to avoid them using C/C++ language constructs. There are a couple of valuable uses of macros that we’ll discuss in this section.

The first is stringification which converts a function macro argument into a C/C++ string. The second is concatenation which combines two arguments into a single string.

A string is created when a # appears before a token. The result is a string: #quit becomes “quit”.

Two arguments are concatenated when ## appears between them: quit ## _command becomes quit_command.

This is useful in building tables of data to use in a program. An illustration:

#define COMMAND(NAME) { #NAME, NAME ## _command }

struct command commands[] =
{
COMMAND (quit),
COMMAND (help),
...
};

expands to the code

struct command
{
char *name;
void (*function) (void);
};

struct command commands[] =
{
{ "quit", quit_command },
{ "help", help_command },
...
};

Wrapup

The C/C++ preprocessor is powerful and dangerous. The standards committees have followed Stroustrup’s lead in adding features that reduce the need to use the preprocessor. There is still a need for it and probably always will be since it is an inherent part of the languages. Be careful when and how you use #define, and use it sparingly.


Filed under: Hackaday Columns, Software Development, software hacks

Embed with Elliot: the Static Keyword You Don’t Fully Understand

One of our favorite nuances of the C programming language (and its descendants) is the static keyword. It’s a little bit tricky to get your head around at first, because it can have two (or three) subtly different applications in different situations, but it’s so useful that it’s worth taking the time to get to know.

And before you Arduino users out there click away, static variables solve a couple of common problems that occur in Arduino programming. Take this test to see if it matters to you: will the following Arduino snippet ever print out “Hello World”?

void loop()
{
	int count=0;
	count = count + 1;
	if (count &gt; 10) {
		Serial.println(&quot;Hello World&quot;);
	}
}

If you said, “Yes” you absolutely need to read this article. If you said “No, you need to define count as a global variable outside of the loop(), silly!”, you got the answer right in principle, but defining the variable as static is yet better. And we’ll see why.

But the shortest possible summary of this article is the following: if you want a variable to retain its value between calls to the function that contains it, declare that variable static.

And the summary rationale is that declaring a variable to be static sets the scope to be local, but makes the variable stick around for the lifetime of the program, rather than just the function call in which it was defined. Which means that it’s not re-initialized on every call to the function, and that lets us effectively store data within a function. It’s a great trick to have in your programmers’ toolbag.

Scope and Duration

To understand static, we’ll have to tease apart two related concepts of a variable’s availability to our code: scope and duration (or lifetime).

Most of you will know about variable scope. In many languages, if you declare a variable within a function or block, it stays inside that function or block. That’s called “local scope” or “function scope” or similar. Other functions can’t use that variable without our function explicitly passing the variable out. This is great, because it means that you can name all your counter variables count and they won’t collide with each other across functions. You probably know this.

Variables in C & Co. also have a lifetime that they’re guaranteed to be available for, also called their “duration”. After their duration is up, you won’t be able to use the variable any more. This is closely related to the idea of “scope” which specifies in which functional blocks of code the variable is available, but you’re going to want to keep them separate in your mind. Easiest is to think of duration as being a time, and scope as being a location (in code-space).

The default duration of a variable is the current call of the function. That’s why something like:

int counter (void)
{
	int count=0;
	count=count+1;
	return count;
}

will always return 1. Each time you call the function, count gets re-initialized to zero, and when the function returns, that variable’s storage gets reallocated because it has function duration. (And this is exactly the problem with the Arduino snippet above.)

We want our counter() function to keep track of how many times it’s been called. We need the variable count to have program duration — it needs to stick around as long as our code is running, and not get reallocated after every function call.

Globals?

The brute-force method is to declare count as a global variable by defining it outside of any functions. Global variables live for the duration of the program, so the value contained in count will stay available. So far, so good. But variables with global scope have the side effect of making count available from any other function. Sometimes this is exactly what we want, but a lot of the time it’s not.

The problem with global scope is that you can’t use any other global variables of the same name, so you need pick the names carefully to be unique so that they don’t clash. This isn’t a huge problem if you just use long, specific names: My_Global_Loop_Counter instead of count, for instance.

A more subtle problem arises when you intend to use another variable named count in a second function, but forget to declare it. The second function thinks that you meant to use the globally defined count, and havoc ensues.

Finally, it’s tempting to use global variables to pass data among different functions. (This is especially true for people who haven’t yet made peace with pointers and structures.) If you spread the functions using a global variable across different files, it can take heroic feats of debugging to track down every location where a popular global variable is accessed or modified. That is to say, global variables can lead to fragmented, hard to debug code.

No, Static.

Anyway, if you buy the arguments above, what you really want is a variable with program duration but function scope. And that’s exactly what static does when it’s applied to a variable inside a function. The language has precisely the feature you want, so you might as well use it.

And to bring it on home, the Arduino snippet up above will work just fine with a globally defined count variable,

int count;
void loop()
{
	count = count + 1;
	if (count &gt; 10) {
		Serial.println(&quot;Hello World&quot;);
	}
}

but it’s also a lot cleaner when written using static:

void loop()
{
	static int count;
	count = count + 1;
	if (count &gt; 10) {
		Serial.println(&quot;Hello World&quot;);
	}
}

because the name “count” isn’t globally exposed. It’s also declared in the function that uses it, so there’s no question of to whom it belongs. And it works. You can’t beat that.

Static and global variables (all variables with program duration) are pre-initialized to zero by default, so we didn’t need to write static int count = 0; above. However, if we wanted the initial value to be non-zero, we could define the variable like so: static int count = 42;.

There’s something creepy about static int count = 42;, though. It looks like we’re declaring the variable and then setting its value on every pass through the code. But the static keyword prevents this. Trust us, or modify the demo code above to test it out for yourself.

Data Hiding and Singletons and Stuff

There’s a second, related, use of static to mention. When used outside of any functions, at the top-level of a file, static limits the scope to the file in question. Contrast this with global variables which have truly global scope and thus are available across files. Static variables defined outside of functions, then, are like a quasi-global variable: available for the duration of the program from every function defined within the file, but not reachable from outside the file.

Here’s an example where this quasi-global functionality is useful. Say we’d like to spin off the counter code into a counter library. We’re starting off with something like this:

int counter(void){
	static int count;
	count++;
	return count;
}

Every time you call counter() from other code, the function keeps the old value of count, adds one to it, and the counter behaves like it should. This is the function-scope static definition.

But now imagine that we’d also like to reset the counter. We might want a reset() function that will set the value back to zero. But count has function scope, so our reset() function won’t be able to touch it. The solution here is to use the static type declaration at file scope.

You could have something like this in your file “counter.c”:

static int count;

int counter(void){
	count++;
	return count;
}

void reset_count(void){
	count = 0;
}

static void my_secret_function(void){
	count = 42;
}

Because count is defined static at file scope, all the functions in the file will be able to use it, and the value will persist between function calls. Additionally, because my_secret_function() is defined static, it’s not callable from outside this file, though functions defined here can call it as usual.

Defining variables (and functions) static at file scope is perfect for storing library-specific stuff that is needed for the library, but that doesn’t need to be seen on the outside. This is a lot like what object-oriented languages do with public and private data and methods.

In contrast to the object-oriented pattern, though, we’ve only got one instance of our data. You can’t create a second one easily. This is what the software engineer types call a singleton.  These singletons are great for keeping track of global state, where you just write some simple functions to modify and access the data. Since the data is static and file-local, you know that you can’t mess it up from outside, and the manipulation functions are available anywhere you can #include them.

The drawback of singletons are that there’s only ever one of them. If you wanted two counters, for instance, this code wouldn’t help. When you get to such cases, you’ll want full-blown object-oriented support.

… and Arduino

This was a lot of heavy C theory, so you might think it’s not applicable if you’re programming in Arduino. If you think like that, you weren’t paying attention during the last “Embed”; Arduino is C/C++ with added convenience libraries. And in fact, the particular way that Arduino repeatedly calls the loop() function makes knowing a bit about scoping nearly mandatory: all of your function-call-duration variables get wiped each time through the loop unless you do something.

As we saw in the introductory example, you can’t initialize a variable inside the loop() without it resetting each time through the loop. If you didn’t know about the static keyword, you’d probably take the brute-force solution and define the variable as global. A bunch of the Arduino examples do just this.

What could go wrong?

int i;

void loop(){
	i++;
	Serial.println(i);
	delay(100);
	doSomethingTwice();
}

void doSomethingTwice(){
	for (i=0; i&lt;2; i++){
		Serial.println(&quot;twice?&quot;);
	}
}

Well, say you’ve gotten in the habit, as we have, of using i as a generic loop variable. And say you re-used it without definition in another function, maybe even in another “.ino” file, that you call from within the loop. Because i is a global variable, the accidental second use is actually perfectly kosher. The compiler won’t be able to help you find the mistake, but your program won’t work right. Instead of counting up, the poor Arduino will just keep printing “3”.

There are two causes of this problem: forgetting to declare i inside doSomethingTwice(), and creating a global variable with an easy-to-reuse name. You will forget to declare variables from time to time. We all do. And when we do, we want the compiler to let us know.

So there are two ways to avoid the problem. The first solution is to name your global variables something crazy so that they’re unlikely to be reused. “My_Amazing_Global_Loop_Counter” would work well. Even typing it once makes me never want to type it again.

The “right” solution solves the initial problem that led us to use a global variable in the first place. We simply define our counter variable as static inside loop() and then it has program duration but function scope and all’s well. Now we can even call the counter variable i with impunity, because it’s scoped to loop().


void loop(){
	static int i;
	i++;
	Serial.println(i);
	delay(100);
	doSomethingTwice();
}

void doSomethingTwice(){
	// You'll get an error here b/c i isn't declared.  Thanks, compiler!
	for (i=0; i&lt;2; i++){
		Serial.println(&quot;twice?&quot;);
	}
}

And what do you do if you want to share variables among different functions within a single “sketch”? One way is to pass them directly as arguments, but again you’ll see lots of folks resort to global variables that can be simply accessed from each relevant function. It’s not a huge problem if you take care to name the variables well, so we won’t insist.

But you could also do it “correctly” and define the global variables as static at the file level instead. You’ll still be able to make subtle mistakes within your own functions, but by declaring the variables static you won’t run the chance of confusing them up with variables defined elsewhere in the Arduino infrastructure. It’s a belt when you’re already wearing suspenders, but it doesn’t cost you anything except a tiny bit more typing.

Conclusion

Use the static keyword for variables within a function when you want the value to persist across function calls, but you don’t need to enlarge the scope. Use the static keyword for functions and variables at the file level when you want them to behave like quasi-globals, being accessible everywhere within the file but hidden from the outside.

See? That’s not so mysterious after all.


Filed under: Hackaday Columns

Embed with Elliot: There is no Arduino “Language”

This installment of Embed with Elliot begins with a crazy rant. If you want to read the next couple of paragraphs out loud to yourself with something like an American-accented Dave-Jones-of-EEVBlog whine, it probably won’t hurt. Because, for all the good Arduino has done for the Hackaday audience, there’s two aspects that really get our goat.

(Rant-mode on!)

First off is the “sketch” thing. Listen up, Arduino people, you’re not writing “sketches”! It’s code. You’re not sketching, you’re coding, even if you’re an artist. If you continue to call C++ code a “sketch”, we get to refer to our next watercolor sloppings as “writing buggy COBOL”.

And you’re not writing “in Arduino”. You’re writing in C/C++, using a library of functions with a fairly consistent API. There is no “Arduino language” and your “.ino” files are three lines away from being standard C++. And this obfuscation hurts you as an Arduino user and artificially blocks your progress into a “real” programmer.

(End of rant.)

Let’s take that second rant a little bit seriously and dig into the Arduino libraries to see if it’s Arduinos all the way down, or if there’s terra firma just beneath. If you started out with Arduino and you’re looking for the next steps to take to push your programming chops forward, this is a gentle way to break out of the Arduino confines. Or maybe just to peek inside the black box.

Arduino is C/C++

Click on the “What is Arduino” box on the front page of arduino.cc, and you’ll see the following sentence:

“ARDUINO SOFTWARE: You can tell your Arduino what to do by writing code in the Arduino programming language…”

Navigate to the FAQ, and you’ll see

“the Arduino language is merely a set of C/C++ functions that can be called from your code”.

Where we come from, a bunch of functions written in a programming language is called a library. So which is it, Arduino?

(The Language Reference page is a total mess, combining parts of standard C with functions defined in the Arduino core library.)

Maybe that’s not as sexy or revolutionary as claiming to have come up with a new programming language, but the difference matters and it’s a damn good thing that it’s just a set of libraries. Because the beauty about the Arduino libraries is that you don’t have to use them, and that you can pick and choose among them. And since the libraries are written in real programming languages (C/C++), they’re a totally useful document if you understand those languages.

C and Assembly language, on the other hand, are different languages. If you’re writing assembler, you can easily specify exactly which of the chip’s native instructions to use for any particular operation — not so in C. Storing data in particular registers in the CPU is normal in assembler, but heroic in C. So if you start out writing your code in C, and then find out that you need some of the features of assembler, you’re hosed. You stop writing in C and port all your code over to assembler. You have to switch languages. You don’t get to pick and choose.

(Yes, there is inline assembler in GCC.  That’s cheating.)

This is not at all the case with Arduino: it’s not a programming language at all, and that’s a darned good thing. You’re writing in C/C++ with some extra convenience libraries on top, so where the libraries suck, or they’re just plain inconvenient, you don’t have to use them. It’s that simple.

A prime example is digitalWrite() in the Arduino’s core library, found in the “wiring_digital.c” file. It’s madness to use the ridiculously slow digitalWrite() functions when speed or timing matters. Compared to flipping bits in the output registers directly, digitalWrite() is 20-40x slower.

The scope shots here are from simply removing the delay statements from the Blink.ino example code that comes with Arduino — essentially toggling the LED pin at full speed. Upper left is using digitalWrite() to flip the pin state. Upper right is using direct bit manipulation in C: PORTB ^= (1 << LED_BIT); Because Arduino’s digitalWrite() command has a bunch of if...then statements in it that aren’t optimized away by the compiler, it runs 28 times slower.

(And worse, as you can see in the lower left, the Arduino code runs with occasional timing glitches, because an interrupt service routine gets periodically called to update the millisecond timer. That’s not a problem with digitalWrite() per say, but it’s a warning when attempting tight timing using the Arduino defaults.)

OK, so digitalWrite() is no good for timing-critical coding. If Arduino were a real language, and you were stuck with digitalWrite(), you wouldn’t be able to use the language for anything particularly sophisticated. But it’s just a convenience function. So you can feel free to use digitalWrite() in the setup() portion of your code where it’s not likely to be time critical. That doesn’t mean that you have to use it in the loop() portion when timing does matter.

And what this also means is that you’re no longer allowed to say “Arduino sucks”. Arduino is C/C++, and at least C doesn’t suck. (Zing! Take that, C++ lovers. De gustibus non disputandem est.) If you think that some of the Arduino libraries suck, you’re really going to have to specify which libraries in particular you mean, or we’ll call you out on it, because nobody’s forcing you to use them wholesale. Indeed, if you’re coding on an AVR-based Arduino, you’ve got the entire avr-libc project baked in. And it doesn’t suck.

The “.ino” is a Lie

So if Arduino is just C/C++, what’s up with the “.ino” filetype? Why is it not “.c” or “.cpp” like you’d expect? According to the Arduino build process documentation,

“The Arduino environment performs a few transformations to your main sketch file (the concatenation of all the tabs in the sketch without extensions) before passing it to the avr-gcc compiler.”

True C/C++ style requires you to declare (prototype) all functions that you’re going to use before you define them, and this is usually done in a separate header “.h” file. When C compiles your code, it simply takes each function and turns it into machine code. In a philosophically (and often practically) distinct step, references to a function are linked up with the compiled machine code representing them. The linker, then, only needs to know the names of each function and what types of variables it needs and returns — exactly the data in the function declaration.

Long story short: functions need prior declaration in C, and your “.ino” code defines setup() and loop() but never declares them. So that’s one thing that the Arduino IDE does for you. It adds two (or more, if you define more functions in your “.inos”) function prototypes for you.

The other thing the IDE’s preprocessor does is to add #include "Arduino.h" to the top of your code, which pulls in the core Arduino libraries.

(And then, for some mysterious reason, it also deletes all comments from your code, making it harder to debug later on. Does anyone out there know why the Arduino IDE does this?)

So that’s it. Three lines (or maybe a few more) of very simple boilerplate separate a “sketch” from valid C/C++ code. This was presumably done in the interest of streamlining the coding experience for newbies, but given that almost every newb is going to start off with the template project anyway, it’s not clear that this buys much.

On the other hand, the harm done to the microcontroller newbie is reasonably large. The newb doesn’t know that it’s actually C/C++ underneath the covers and doesn’t learn anything about one of the most introductory, although mindless, requirements of the language(s): function declarations.

When the newb eventually does want to include outside code, the newb will need to learn about #include statements anyway, so hiding #include "Arduino.h" is inconsistent and sets up future confusion. In short, the newb is blinded from a couple of helpful learning opportunities, just to avoid some boilerplate that’s templated out anyway.

Write C++ Directly in the Arduino IDE

And if you don’t believe that Arduino is C/C++, try the following experiment:

  1. Save a copy of the example Blink project.
  2. Go into the “sketch’s” directory and copy Blink.ino to Blink.cpp.
  3. Re-open the project in Arduino and delete everything from Blink.ino
  4. Add the required boilerplate to Blink.cpp. (One include and two function declarations.)
  5. Verify, flash, and whatever else you want.

You’ve just learned to write C/C++ directly from within the Arduino IDE.

(Note: for some reason, the Arduino IDE requires a Blink.ino file to be present, even if it’s entirely empty. Don’t ask us.)

main.cpp

So if Blink.ino turns into Blink.cpp, what’s up with the setup() and loop() functions? When do they ever get called? And wait a minute, don’t all C and C++ programs need a main() function to start off? You are on the path to enlightenment.

Have a look at the main.cpp file in hardware/arduino/avr/cores/arduino.

There’s your main() function! And although we’ve streamlined the file a little bit for presentation, it’s just about this straightforward.

The init() function is called before any of your code runs. It is defined in “wiring.c” and sets up some of the microcontroller’s hardware peripherals. Included among these tasks on the AVR platform is configuring the hardware timers for the milliseconds tick and PWM (analogOut()) functions, and initializing the ADC section. Read through the init() function and the corresponding sections of the AVR datasheet if you’ve never done any low-level initializations of an AVR chip; that’s how it’s done without Arduino.

And then we get to the meat. The setup() function is called, and in an endless for loop, the loop() function is continually called. That’s it, and it’s the same code you’d write in C/C++ for any other microcontroller on the planet. That’s the magic Arduino setup() and loop(). The emperor has no clothes, and the Wizard of Oz is just a pathetic little man behind a curtain.

If you want to dig around more into the internals of the Arduino core library, search for “Arduino.h” on your local install, or hit up the core library on Github.

The Arduino Compile Phase

So we’ve got C/C++ code. Compiling it into an Arduino project is surprisingly straightforward, and it’s well-documented in the Arduino docs wiki. But if you just want to see for yourself, go into Preferences and enable verbose logging during compilation. Now the entire build process will flash by you in that little window when you click “Verify”.

It’s a lot to take in, but it’s almost all repetitive. The compiler isn’t doing anything strange or unique at all. It’s compiling all of the functions in the Arduino core files, and putting the resulting functions into a big (static) library file. Then it’s taking your code and this library and linking them all together. That’s all you’d do if you were writing your own C/C++ code. It’s just that you don’t know it’s happening because you’re pressing something that looks like a play button on an old Walkman. But it’s not rocket science.

There is one more detail here. If you include a library file through the menu, and it’s not part of the core Arduino libraries, the IDE locates its source code and compiles and links it in to the core library for you. It also types the #include line into your “.ino” file. That’s nice, but hardly a deal-breaker.

If you’d like to see this build process in the form of a Makefile, here’s (our) primitive version that’s more aimed at understanding, and this version is more appropriate for production use.

Next Steps

If the Arduino is the embedded electronics world’s gateway drug, what are the next steps that the Arduino programmer should take to become a “real” embedded programmer slash oil-burning heroin junkie? That depends on you.

Are you already good at coding in a lower-level language like C/C++? Then you need to focus on the microcontroller-specific side of things. You’re in great shape to just dive into the Arduino codebase. Try to take a few of the example pieces, or even some of your own “sketches” and look through the included Arduino library’s source code. Re-write some simple code outside the IDE and make sure that you can link to the Arduino core code. Then replace bits of the core with your own code and make sure it still works. You’ll spend half of your time looking into the relevant micro’s datasheet, but that’s good for you.

Are you comfy with electronics but bewildered by coding? You might spend a bit of time learning something like C. Learn C the Hard Way is phenomenal, and although it’s aimed at folks working on bigger computers, it’s got a lot of the background that you’ll need to progress through and beyond the Arduino codebase. You’re going to need to learn about the (relatively trivial) language conventions and boilerplatey stuff to get comfortable in straight C/C++, and then you can dig in to the Arduino source.

Wherever you are, remember that Arduino isn’t a language: it’s a set of libraries written in C/C++, some of them really quite good, and some of them (we’re looking at you EEPROM) simply C++ wrappers on the existant avr-libc EEPROM library. And that means that for every Arduino project you’ve written, you’ve got the equivalent source code sitting around in C/C++, ready for you to dig into. Thank goodness they didn’t invent their own programming language!


Filed under: Hackaday Columns