
How to deal with API clients, the lazy way — from code generation to release management

This post is from Massimiliano Pippi, Senior Software Engineer at Arduino.

The Arduino IoT Cloud platform aims to make it very simple for anyone to develop and manage IoT applications, and its REST API plays a key role in that pursuit of simplicity. The IoT Cloud API at its core consists of a set of endpoints exposed by a backend service, but this alone is not enough to provide a full-fledged product to your users. What you need on top of your API service are:

  • Good documentation explaining how to use the service.
  • A number of plug-and-play API clients that wrap the API for different programming languages.

Both of those are difficult to maintain because they get outdated easily as your API evolves, but clients are particularly challenging: they’re written in different programming languages, and for each of those you should provide idiomatic code that works and is distributed according to the best practices defined by each language’s ecosystem.

Depending on how many languages you want to support, your engineering team might not have the resources to cover them all, and borrowing engineers from other teams just to release a specific client doesn’t scale well.

Finding ourselves in exactly this situation, the IoT Cloud team at Arduino had no choice but to streamline the entire process and automate as much of it as we could. This article describes how we provide documentation and clients for the IoT Cloud API.

Client generation workflow

When the API changes, a number of steps must be taken to ship an updated version of the clients.

What happens after an engineer releases an updated version of the API essentially boils down to the following macro steps:

1. Fresh code is generated for each supported client.
2. A new version of the client is released to the public.

The generation process

Part 1: API definition

Every endpoint provided by the IoT Cloud API is listed in a YAML file in OpenAPI v3 format, something like this (the full API spec is here):

/v2/things/{id}/sketch:
    delete:
      operationId: things_v2#deleteSketch
      parameters:
      - description: The id of the thing
        in: path
        name: id
        required: true
        schema:
          type: string
      responses:
        "200":
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ArduinoThing'
          description: OK
        "401":
          description: Unauthorized
        "404":
          description: Not Found

The format is designed to be human-readable, which is great because we start from a version automatically generated by our backend software and manually fine-tune it to get better results from the generation process. At this stage, you might need some help from the language experts on your team to do some trial and error and determine how good the generated code is. Once you’ve found a configuration that works, operating the generator doesn’t require any specific skill, which is why we were able to automate it.

Part 2: Code generation

To generate the API clients for the different programming languages we support, along with the API documentation, we use a CLI tool called openapi-generator. The generator parses the OpenAPI definition file and produces a number of source code modules in a folder on the filesystem of your choice. If you have more than one client to generate, you will notice very soon how cumbersome the process can get: you might need to invoke openapi-generator multiple times, with different parameters, targeting different places in the filesystem, maybe different git repositories; when the generation step is done, you have to go through all the generated code, add it to version control, maybe tag, push to a remote… You get the gist.
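
For example, generating a single client boils down to an invocation like this (the spec file name and output paths are illustrative, not our actual layout):

# Generate the Python client from the OpenAPI definition
openapi-generator generate -i arduino-iot-api.yaml -g python -o ./clients/python

# ...and again for every other target language, e.g. Go
openapi-generator generate -i arduino-iot-api.yaml -g go -o ./clients/go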

To streamline the process described above we use another CLI tool, called Apigentools, which wraps the execution of openapi-generator according to a configuration you can keep under version control. Once Apigentools is configured, it takes zero knowledge of the toolchain to generate the clients – literally anybody can do it, including an automated pipeline on a CI system.
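
A trimmed, hypothetical excerpt of such a configuration gives the idea (github_repo_name is the real parameter mentioned below; the surrounding field names follow Apigentools’ config format from memory and may differ between versions):

# config.yaml, kept under version control next to the spec
languages:
  python:
    github_org_name: arduino
    github_repo_name: iot-client-py
  go:
    github_org_name: arduino
    github_repo_name: iot-client-go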

Part 3: Automation

Whenever the API changes, the OpenAPI definition file hosted in a GitHub repository is updated accordingly, usually by one of the team’s backend engineers. A Pull Request is opened, reviewed and finally merged into the master branch. When the team is ready to generate a new version of the clients, we push a special git tag in semver format, and a GitHub workflow immediately starts running Apigentools, using a configuration stored in the same repository. If you look at the main configuration file, you might notice that for each language we want to generate clients for, there’s a parameter called ‘github_repo_name’: this is a killer feature of Apigentools that lets us push the automation process beyond the original plan. Apigentools can output the generated code to a local git repository, adding the changes in a new branch that’s automatically created and pushed to a remote on GitHub.
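
In outline, the tag-triggered workflow looks something like this (a simplified sketch, not our production workflow file):

# .github/workflows/generate-clients.yml
name: Generate API clients
on:
  push:
    tags:
      - 'v*'            # a semver tag kicks off a generation run
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run Apigentools
        run: |
          pip install apigentools
          apigentools generate   # reads the config and invokes openapi-generator per language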

The release process

To ease the release process and to better organize the code, each API client has its own repo: you’ll find Python code in https://github.com/arduino/iot-client-py, Go code in https://github.com/arduino/iot-client-go, and so on and so forth. Once Apigentools finishes its run, you end up with new branches containing the latest updates, pushed to each of the clients’ repositories on GitHub. As the branch is pushed, another GitHub workflow starts (see the one from the Python client as an example) and opens a Pull Request asking to merge the changes into the master branch. The maintainers of each client receive a Slack notification and are asked to review those Pull Requests – from then on, the process is mostly manual.

It doesn’t make much sense to automate further, mainly for two reasons:

  1. Each client has its own release mechanism: Python has to be packaged in a wheel and pushed to PyPI, JavaScript has to be published to npm, for Go a git tag is enough, and docs have to be made publicly accessible (a rough sketch of these steps follows this list).
  2. We want to be sure a human validates the code before it’s generally available through an official release.
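
For reference, the per-ecosystem release steps boil down to commands like the following (illustrative of each ecosystem’s conventions, not our exact release scripts):

# Python: build a wheel and upload it to PyPI
python setup.py bdist_wheel
twine upload dist/*

# JavaScript: publish the package to npm
npm publish

# Go: a semver git tag is all the module system needs
git tag v1.2.3
git push origin v1.2.3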

Conclusions

We’ve been generating API clients for the IoT Cloud API like this for a few months, performing multiple releases for each supported programming language, and we now have a good idea of the pros and cons of this approach.

On the bright side: 

  • The process is straightforward, easy to read, easy to understand.
  • The system requires very little knowledge to be operated.
  • The time between a change in the OpenAPI spec and a client release is a matter of minutes.
  • One engineer spent about two weeks setting up the system, and we feel we’re close to paying off that investment, if we haven’t already.

On the not-so-bright side: 

  • While operating the system is trivial, debugging the pipeline when something goes awry requires a deep dive into the tools described in this article, which takes a high level of skill.
  • If you stumble upon a weird bug in openapi-generator and the bug doesn’t get attention, contributing patches upstream might be extremely difficult because the codebase is complex.

Overall we’re happy with the results and we’ll keep building up features on top of the workflow described here. A big shoutout to the folks behind openapi-generator and Apigentools!

Arduino Education nominated in BETT Awards 2020

The Arduino Engineering Kit has been nominated as a finalist for the Bett Awards 2020 in the category “Higher Education or Further Education Digital Services”. The awards will take place on January 22nd.

ABOUT THE BETT AWARDS

The Bett Awards are a celebration of the inspiring creativity and innovation that can be found throughout technology for education. The awards form an integral part of Bett, the world’s leading showcase of education technology solutions, each year. The winners are seen to have excelled in ICT provision and support for nurseries, schools, colleges and special schools alike, with a clear focus on what works in the classroom.

ABOUT THE NOMINATION

The Arduino Engineering Kit, developed in partnership with MathWorks, is aimed at higher education engineering students. It features hands-on projects covering system modelling, controls, robotics, mechatronics, and other important engineering concepts.

Although Arduino and MathWorks products are among the most widely used in the engineering field all over the world, no product taught how to integrate MATLAB and Simulink software with Arduino hardware. Arduino and MathWorks therefore saw an opportunity to join forces and develop a learn-by-doing kit, with real-world example usage scenarios, that teaches both the software and the engineering fundamentals of the following:

  • Robotics
  • Mechatronics
  • Control systems
  • Image and video processing
  • Physics and mathematics

The kit is built on its own Education Learning Management System (LMS) with step-by-step instructions and lessons. It comes in a stackable toolbox for storage and years of reuse. Students also have access to a dedicated e-learning platform and other learning materials, including a one-year individual license for MATLAB and Simulink.

ARDUINO AT BETT

Fabio Violante, Arduino’s CEO, says: “We are delighted to feature a series of new Arduino Education programs at BETT 2020 which will expand STEAM learning for lower secondary to university students. Our technology, programming, and curriculum content are creative tools – just like brushes and paint – that students can use as they become part of our next generation of scientists and artists.”

Contributing back to Ansible — flexible secrets with some Sops

This post is from Edoardo Tenani, DevOps Engineer at Arduino.

In this blog, we’re going to answer: How does one store sensitive data in source code (in this case, Ansible playbooks) securely and in a way that the secrets can be easily shared with the rest of the team?

Ansible is an open source community project sponsored by Red Hat, and it’s the simplest way to automate IT. Ansible is the only automation language that can be used across entire IT teams, from systems and network administrators to developers and managers.

At Arduino, we started using Ansible around the beginning of 2018 and since then, most of our infrastructure has been provisioned via Ansible playbooks: from the frontend servers hosting our websites and applications (such as Create Web Editor), to the MQTT broker at the heart of Arduino IoT Cloud.

As soon as we started adopting it, we faced one of the most common security problems in software: How does one store sensitive data in source code (in this case, Ansible playbooks) securely and in a way that the secrets can be easily shared with the rest of the team?

The Ansible configuration system comes to the rescue here with its built-in mechanism for handling secrets, called Ansible Vault; unfortunately, it had some shortcomings for our use case.

The main disadvantage is that Vault is tied to the Ansible system itself: in order to use it, you have to install the whole Ansible stack. We preferred a more self-contained solution, ideally compiled into a single binary to ease portability (e.g. inside Docker containers).

The second blocker is the “single passphrase” approach Ansible Vault relies on: a shared password to decrypt the entire vault. This solution is very handy and simple to use for personal projects or when the team is small, but as a constantly growing company we preferred to rely on a more robust and scalable encryption strategy. Having the ability to encrypt different secrets with different keys, while being able to revoke access for specific users or machines at any time, was crucial to us.
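
To make the single-passphrase model concrete, day-to-day Vault usage looks roughly like this (a minimal sketch using the standard Ansible CLI; file names are illustrative):

# Encrypt a vars file; Vault asks for one passphrase...
ansible-vault encrypt group_vars/all/secrets.yml

# ...and every playbook run touching the vault needs that same shared passphrase
ansible-playbook site.yml --ask-vault-pass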

The first solution we identified was HashiCorp Vault, a backend service purposely created for storing secrets and sensitive data, with advanced encryption policies and access management capabilities. In our case, as the team was still growing, the operational cost of maintaining our own Vault cluster was considered too high (deploying a highly available service that acts as a single point of failure for your operations is something you want to handle properly and with due care).

Around that same time, while reading up on industry best practices and looking for something that could help us manage secrets in source code, we came across mozilla/sops, a simple command line tool that allows strings and files to be encrypted using a combination of AWS KMS keys, GCP KMS keys, or GPG keys.

Sops seemed to have all the requirements we were looking for to replace Ansible Vault:

  • A single binary, thanks to the port from Python to Golang that Mozilla recently did.
  • Able to encrypt and decrypt both entire files and single values.
  • Allows us to use identities from AWS KMS, which we already used for our web services and for which our operations team had access credentials.
  • A fallback to GPG keys to mitigate the AWS lock-in, allowing us to decrypt our secrets even in the case of AWS KMS disruption.
  • The same low operational cost.
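
To give a flavor of the workflow, encrypting and decrypting with Sops looks like this (the KMS key ARN and GPG fingerprint are placeholders):

# Encrypt with an AWS KMS key, keeping a GPG key as a fallback recipient
sops --encrypt \
  --kms "arn:aws:kms:us-east-1:111122223333:key/<key-id>" \
  --pgp "<gpg-fingerprint>" \
  secrets.yml > secrets.enc.yml

# Decrypt (requires valid AWS credentials or the GPG private key)
sops --decrypt secrets.enc.yml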

Sops’ adoption was a great success: the security team was happy and the implementation straightforward, with just one problem. When we tried to use Sops with the Ansible configuration system, we immediately noticed what a pain it was to encrypt variables.

We tried to encrypt/decrypt single values using a helper script to properly pass them as extra variables to ansible-playbook. It almost worked, but developers and operations were not satisfied: It led to errors during development and deployments and overall felt clumsy and difficult.

Next we tried to encrypt/decrypt entire files. The helper script was still needed, but the overall complexity decreased. The main downside was that we needed to decrypt all the files prior to running ansible-playbook, because the Ansible system didn’t have any clue about what was going on: those were basically plain Ansible var files. It was an improvement, but still lacked the smooth developer experience we wanted.
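
Concretely, the file-based flow needed a wrapper along these lines (a hypothetical reconstruction, not our actual helper script):

# Decrypt every Sops-encrypted vars file before Ansible runs
for f in group_vars/all/*.enc.yml; do
  sops --decrypt "$f" > "${f%.enc.yml}.yml"
done

ansible-playbook site.yml

# Remove the plaintext copies afterwards
for f in group_vars/all/*.enc.yml; do
  rm -f "${f%.enc.yml}.yml"
done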

As the Ansible configuration system already supports encrypted vars and entire files stored in Ansible Vault, the obvious choice was to work out how to replicate that behaviour using Sops as the encryption/decryption engine.

Following an idea from a feature request first opened upstream in the Ansible repository back in 2018 (Integration with Mozilla SOPS for encrypted vars), we developed a lookup plugin and a vars plugin that seamlessly integrate the Ansible configuration system with Sops.

No more helper scripts needed

Just ensure the Sops executable is installed and the correct credentials are in place (i.e. AWS credentials or a GPG private key), then run ansible-playbook as you normally would.
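
As an example, with the lookup plugin in place a playbook can reference a secret directly (a minimal sketch assuming the plugin is registered under the name sops and that secrets/api_key.enc.yml is a Sops-encrypted file):

- hosts: webservers
  vars:
    # The Sops lookup plugin decrypts the file transparently at runtime
    api_key: "{{ lookup('sops', 'secrets/api_key.enc.yml') }}"
  tasks:
    - name: Confirm the secret was loaded (without printing it)
      debug:
        msg: "API key loaded ({{ api_key | length }} characters)"

The vars plugin, in turn, is meant to remove even the lookup call, decrypting Sops-encrypted files in locations like group_vars/ automatically.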

We believe contributing to a tool we use and love is fundamental in following the Arduino philosophy of spreading the love for open source. 

Our sops plugins are currently under review in the mozilla/sops GitHub repository: “Add sops lookup plugin” and “Add sops vars plugin”.

You can test them out right away by downloading the plugin files from the PRs and adding them to your local Ansible controller installation. You will then be able to use both plugins from your playbooks. Documentation is available, as for all Ansible plugins, in the code itself at the beginning of each file; search for DOCUMENTATION if you can’t spot it.

If you can leave a comment or a GitHub reaction on the PR, that would be really helpful to expedite the review process.

What to do from now on?

If you’re a developer you can have a look at Sops’ issues list and contribute back to the project!

The Sops team is constantly adding new features (like the new command for publishing encrypted secrets in the latest 3.4.0 release, or Azure Key Vault support), but there are surely interesting issues to tackle. For example, the Kubernetes Secret integration being discussed in issue 401, or the --verify command discussed in issue 437.

Made with <3 by the Arduino operations team!

Ansible® is a registered trademark of Red Hat, Inc. in the United States and other countries.