OpenLX SP – Blog

Config files are trees

15 Jun 2022, 7:59 a.m.

In our last post we looked at the problems caused by inconsistent configuration file formats, but what's the answer? Most config files have some sort of tree structure. Consider a webserver: its configuration has a set of server names and, within each server, a set of configuration entries for locations and so on. Would some sort of tree of trees be suitable?
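As a rough illustration, a webserver configuration of this shape could be held as nested mappings. The server names, paths and option names below are invented for the example and don't follow any real webserver's syntax.

```python
# Hypothetical webserver config as a tree: servers -> locations -> settings.
# All names and values are made up for illustration.
webserver = {
    "servers": {
        "example.org": {
            "listen": 443,
            "locations": {
                "/": {"root": "/var/www/html"},
                "/api": {"proxy_pass": "http://127.0.0.1:8000"},
            },
        },
        "static.example.org": {
            "listen": 80,
            "locations": {"/": {"root": "/var/www/static"}},
        },
    }
}


def walk(node, path=()):
    """Yield (path, value) for every leaf in the tree."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    else:
        yield path, node


# Every setting has an unambiguous address made of the keys leading to it.
for path, value in walk(webserver):
    print(" > ".join(path), "=", value)
```

Because each setting has a path, tools can refer to "the root of / on example.org" without knowing anything about line numbers or indentation.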

With a tree structure, upgrades could easily insert a new node, or have a well-defined ‘tree diff’ of nodes to insert, change or remove, rather than the less robust method of trying to programmatically slice up and insert text into a config file.
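To make the ‘tree diff’ idea concrete, here is a minimal sketch that assumes a config is a nested dictionary. The function names `tree_diff` and `apply_diff` and the example settings are ours, not part of any existing tool.

```python
import copy


def tree_diff(old, new, path=()):
    """Return a list of (operation, path, value) tuples that turn old into new."""
    ops = []
    for key in sorted(set(old) | set(new)):
        here = path + (key,)
        if key not in new:
            ops.append(("remove", here, None))
        elif key not in old:
            ops.append(("insert", here, new[key]))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Both sides are branches: descend and diff the children.
            ops.extend(tree_diff(old[key], new[key], here))
        elif old[key] != new[key]:
            ops.append(("change", here, new[key]))
    return ops


def apply_diff(tree, ops):
    """Apply the operations to a copy of tree and return the copy."""
    result = copy.deepcopy(tree)
    for op, path, value in ops:
        node = result
        for key in path[:-1]:
            node = node[key]
        if op == "remove":
            del node[path[-1]]
        else:
            node[path[-1]] = copy.deepcopy(value)
    return result


# Hypothetical settings before and after a package upgrade.
installed = {"server": {"listen": 80, "gzip": "on"}}
upgraded = {"server": {"listen": 443, "tls": {"min_version": "1.2"}}}

ops = tree_diff(installed, upgraded)
for op in ops:
    print(op)
```

An upgrade could ship the list of operations rather than a patched text file, and applying it never depends on where a line happens to sit in the file.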

You could also have a grammar that defines the permitted entries in the tree, and that definition could be used for validation. Add per-node or per-branch permissions and you have a fine-grained permissions structure. Documentation could be included in the tree definition and extracted to build reference documentation, much as Javadoc does for Java programs or Sphinx's autodoc extension does for Python. ‘Types’ could be defined for nodes, e.g. enumerated lists of all possible options, which configuration checkers, documentation generators or GUI tools could then use.
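A sketch of how one definition could drive both validation and documentation is shown below. The schema layout (`type`, `choices`, `doc`, `children`) and the setting names are assumptions made for this example, not an existing standard.

```python
# Hypothetical tree definition: each node carries its type, its allowed
# values (if enumerated) and its own documentation.
SCHEMA = {
    "listen": {"type": int, "doc": "TCP port to listen on."},
    "log_level": {
        "type": str,
        "choices": ["debug", "info", "warning", "error"],
        "doc": "Minimum severity that is logged.",
    },
    "tls": {
        "doc": "Transport security settings.",
        "children": {
            "enabled": {"type": bool, "doc": "Turn TLS on or off."},
        },
    },
}


def validate(config, schema, path=""):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key, value in config.items():
        where = f"{path}/{key}"
        rule = schema.get(key)
        if rule is None:
            errors.append(f"{where}: not a permitted entry")
        elif "children" in rule:
            if isinstance(value, dict):
                errors.extend(validate(value, rule["children"], where))
            else:
                errors.append(f"{where}: expected a branch")
        elif type(value) is not rule["type"]:
            errors.append(f"{where}: expected {rule['type'].__name__}")
        elif "choices" in rule and value not in rule["choices"]:
            errors.append(f"{where}: must be one of {rule['choices']}")
    return errors


def document(schema, indent=0):
    """Build documentation lines from the same definition used for validation."""
    lines = []
    for key, rule in schema.items():
        lines.append("  " * indent + f"{key}: {rule['doc']}")
        lines.extend(document(rule.get("children", {}), indent + 1))
    return lines


good = {"listen": 443, "log_level": "info", "tls": {"enabled": True}}
bad = {"listen": "443", "log_level": "loud", "colour": "red"}
print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
print("\n".join(document(SCHEMA)))
```

The same `choices` list that rejects an invalid value could also populate a drop-down in a GUI tool, so the checker, the docs and the editor never drift apart.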

Add idempotent APIs to manipulate the trees and, with IaaS, configurations could be stored and applied: at setup, or at any point in the life of a server, a tree representing the config to deploy could be used. That is more robust than mechanically editing config files through pattern matching and anchor points, which can’t cope if a config file has deviated from what’s expected.
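Here is a minimal sketch of what ‘idempotent’ means in this context, again assuming nested dictionaries. The helper names `ensure` and `apply_desired` and the SSH settings are hypothetical.

```python
def ensure(tree, path, value):
    """Make sure the node at path holds value; return True if anything changed."""
    node = tree
    for key in path[:-1]:
        # Create missing branches on the way down (assumes they are dicts).
        node = node.setdefault(key, {})
    leaf = path[-1]
    if leaf in node and node[leaf] == value:
        return False
    node[leaf] = value
    return True


def apply_desired(tree, desired, path=()):
    """Merge a desired-state tree into the live tree; return the changed paths."""
    changed = []
    for key, value in desired.items():
        here = path + (key,)
        if isinstance(value, dict):
            changed.extend(apply_desired(tree, value, here))
        elif ensure(tree, here, value):
            changed.append(here)
    return changed


live = {"ssh": {"port": 22}}
desired = {"ssh": {"port": 2222, "permit_root_login": False}}

first = apply_desired(live, desired)   # applies two changes
second = apply_desired(live, desired)  # nothing left to do
print(first, second)
```

Running the same desired state twice leaves the tree untouched the second time, which is what makes it safe to apply at setup and again at any later point.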

This approach is not too dissimilar from Microsoft's Windows Registry, which provides a tree of config values along with files that can act as ‘tree diffs’, specifying nodes to add and so on.

Moving on

There are reasons why it became standard to use some sort of text file to hold configuration data. Many formats were created in an earlier era of computing, before the likes of YAML, JSON and XML had been invented. In some ways we’re just as guilty: when we needed a WiFi config for CodeBug Connect, we wanted something simple for users, so we went for a text file. In our defence, we do provide an API to update that text file, and for other configs we used JSON.

But if the configuration of everything in Linux were a tree, it would be easy to create web APIs that could manipulate it. There’s a clear need for this when you look at VyOS, pfSense (strictly speaking it’s based on BSD rather than Linux) or similar, which all have complicated logic to generate config files that drive applications and systems such as firewalls and VPNs. The GUIs are almost completely disconnected too: when a new option is added to a config file, completely separate logic needs to be implemented to make that option appear in a menu.
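To show why a tree maps so naturally onto a web API, here is a sketch of a dispatcher that treats a URL path as the address of a node. It is deliberately not tied to any web framework; the `handle` function, the status codes it returns and the firewall settings are assumptions for illustration.

```python
def handle(tree, method, url_path, body=None):
    """Dispatch a GET or PUT on a slash-separated path; return (status, payload)."""
    keys = [part for part in url_path.split("/") if part]
    if not keys:
        return (200, tree) if method == "GET" else (405, None)
    node = tree
    for key in keys[:-1]:
        if not isinstance(node, dict) or key not in node:
            return 404, None
        node = node[key]
    if not isinstance(node, dict):
        return 404, None
    leaf = keys[-1]
    if method == "GET":
        return (200, node[leaf]) if leaf in node else (404, None)
    if method == "PUT":
        node[leaf] = body
        return 200, body
    return 405, None


# Hypothetical firewall configuration.
config = {"firewall": {"rules": {"ssh": {"port": 22, "action": "accept"}}}}

print(handle(config, "GET", "/firewall/rules/ssh/port"))
print(handle(config, "PUT", "/firewall/rules/ssh/port", 2222))
print(handle(config, "GET", "/firewall/nat"))
```

The dispatcher knows nothing about firewalls. A new option added to the tree becomes readable and writable through the same code, with no extra logic per setting.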

There’s hope on the way, though, in that newer features of Linux handle configuration more sensibly. For example, Netplan, which sets up the network, uses a YAML format and has ‘renderers’ that apply it. There are also mechanisms, such as cloud-init, that allow these configuration files to be applied to new servers as they’re being commissioned.

The Elektra Project is taking early steps in this direction and we'll be watching closely to see if more applications adopt it.

There are still opportunities to make this work in a standard reusable way and to create standard tools that can find and merge differences in configs, or for graphical editors to provide more flexibility to users.

Wouldn’t it be easier if Linux, its applications and its utils all offered a well thought out API, achieved through a reusable set of libraries and tooling rather than a jumble of text files?

Is it time to move on from the humble config file?

12 May 2022, 9:50 a.m.

In the age of Infrastructure as a Service (IaaS), most cloud platforms have APIs to read and update the current configuration. Linux, in comparison, tends to have an array of text files. While there are tools such as Ansible, Chef and Puppet to automate the editing of config files, they often don’t really understand the text files. Instead, they apply mechanical text transformations, with complex expressions to match lines or words or to make substitutions. We’d argue there are a number of drawbacks to having lots of text files for config:

- A separate format for every program, which results in:
  - Each program maintaining its own parser. This isn’t ideal for security, and associated tooling, e.g. a means of generating the corresponding documentation, also has to be developed and maintained.
  - Users having to learn multiple syntaxes.
  - Little reuse of external configuration tools, e.g. Ansible has separate modules to handle configuration of each program.
  - No accessible parser for external tools to manipulate the config files. Instead, tools have to build their own parsers or some hideously complicated regular expression to ‘find and replace’ that hopefully doesn’t have unintended consequences.
- ‘Blunt’ version control. At best, version control considers the contents of the text file. There’s no ‘smartness’ that determines what the change actually means: has a setting been updated, or is it just an updated comment?
- Coarse permissions granularity. Permissions are only enforced by the file system at the level of the whole text file.
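The ‘find and replace’ risk in the list above can be sketched in a few lines. The INI-style file, its section names and its ports are invented. The structured half uses Python's standard `configparser` module, which only works here because the example happens to be in a format it already understands.

```python
import configparser
import re

# Hypothetical INI-style file; the section and key names are invented.
CONFIG = """\
[database]
port = 5432

[cache]
port = 6379
"""

# Mechanical edit: intended to change only the database port,
# but the pattern matches every line that starts with "port".
naive = re.sub(r"^port\s*=\s*\d+", "port = 5433", CONFIG, flags=re.MULTILINE)
print(naive.count("port = 5433"))  # counts how many lines were rewritten

# Structured edit: parse first, then address the setting by section and key.
parser = configparser.ConfigParser()
parser.read_string(CONFIG)
parser["database"]["port"] = "5433"
print(parser["database"]["port"], parser["cache"]["port"])
```

The regular expression could be tightened to look for the `[database]` header first, but that is exactly the kind of per-format cleverness every external tool ends up reinventing.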

As computer scientists we look for patterns and opportunities to reuse code. But many programs maintain their own parsers and their own mechanisms for generating documentation. Users also have multiple syntaxes to contend with: ‘What is the separator for this file: commas, tabs, colons or equals signs?’, ‘What about comments: do they start with # or ;, and must the marker be the first character of a line?’

When text files are compared without parsing them, it's difficult to know the meaning of what has changed. Consider what happens when Linux packages are upgraded: the current mechanism is to use diff to compare text files and look for lines that aren’t identical. Diff can’t distinguish between a harmless comment added to a line and a breaking change. If, instead, a standard structure were adopted, a standard tool could determine exactly which nodes had changed, and further logic could determine whether the changes resulted in conflicting settings and, if not, merge them.
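The difference between a textual and a structural comparison can be sketched as follows. The toy `key = value` format with `#` comments and the setting names are assumptions for the example; `difflib` is Python's standard text-diffing module.

```python
import difflib


def parse(text):
    """Parse 'key = value' lines into a dict, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def text_changed(a, b):
    """True if a line-based diff reports any difference at all."""
    return bool(list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm="")))


def changed_settings(a, b):
    """Map each setting whose value differs to its (old, new) pair."""
    old, new = parse(a), parse(b)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


packaged = "timeout = 30  # seconds\nretries = 3\n"
comment_only = "timeout = 30  # in seconds, see ticket\nretries = 3\n"
real_change = "timeout = 5  # seconds\nretries = 3\n"

print(text_changed(packaged, comment_only), changed_settings(packaged, comment_only))
print(text_changed(packaged, real_change), changed_settings(packaged, real_change))
```

Both edits look the same to a line-based diff, while the structural comparison shows that only the second one alters a setting.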

With text file configurations, there’s no fine-grained security. It’s not possible to restrict access to a single configuration setting; all that can be done is to limit read and/or write permissions on the entire file.
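By contrast, if settings lived in a tree, permissions could be attached to individual nodes or branches. The sketch below assumes a simple rule where the nearest ancestor with an access list decides. The role names, paths and the `may_write` helper are hypothetical.

```python
# Hypothetical access list: path prefix -> roles allowed to write beneath it.
ACL = {
    ("network",): {"netadmin"},
    ("network", "dns"): {"netadmin", "helpdesk"},
    ("users",): {"sysadmin"},
}


def may_write(role, path):
    """Allow a write if the nearest node on the path with an ACL entry lists the role."""
    for end in range(len(path), 0, -1):
        allowed = ACL.get(tuple(path[:end]))
        if allowed is not None:
            return role in allowed
    return False  # no entry anywhere on the path: deny by default


print(may_write("helpdesk", ("network", "dns", "servers")))
print(may_write("helpdesk", ("network", "interfaces", "eth0")))
print(may_write("sysadmin", ("packages",)))
```

Here the helpdesk role can change DNS settings without being able to touch the rest of the network configuration, something file permissions alone can't express when both live in the same file.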

In our next post we suggest where we think development is needed to bring configuration files up-to-date.

With Linux almost 30 years old, we ask: is it time to modernise some of its aspects?

3 Apr 2022, 2:23 p.m.

In a series of blog posts we look at opportunities to improve Linux so that it better serves the needs of enterprise. With Android being based on Linux, and with embedded systems such as the Raspberry Pi and industrial controllers built into all manner of routers, ticket machines and so on, it's certainly made its mark on the world. And that’s not even mentioning the majority of servers and cloud instances on the web that run it, nor Microsoft’s move to add WSL, which allows Linux to run inside its desktop operating system. But we think it has some shortcomings in the computing landscape of the 2020s.

At OpenLX we make extensive use of Linux, from our LoRaWAN base stations that collect sensor data to our data-warehouse and analytics systems in the cloud. We also use Amazon Web Services (AWS), Google Cloud and Oracle Cloud services, and have noticed an opportunity for Linux to do things differently. In all projects we need to manage configuration so that we or our users can make updates and ensure consistency across entire fleets of devices and deployments. For that, managed cloud services are easier to use.

Most of the cloud platforms take a similar approach to configuration and management, with role-based security and REST APIs to read and write configurations. In contrast, Linux configuration settings tend to be set in text files, with file-based permissions. In the next series of blog posts, we explore whether it’s time to reconsider how configuration and security are handled in Linux.

New CodeBug Connect launch on Kickstarter

18 Nov 2020, 2:45 p.m.

We're delighted to announce that we've launched the latest member of the CodeBug family, CodeBug Connect. If you want to get your hands on one as soon as they're produced, back us on Kickstarter.

CodeBug Connect

Online Quizzing with QuizIt Live!

5 Jul 2020, 2:38 p.m.

With lockdown we're all finding it hard not being able to see friends and family. That's why we used our expertise with real-time remote messaging to create QuizIt Live!

QuizIt Live Website

We designed and created it while we've not been able to make our usual site visits.

A new spin on Zoom quizzing, the website works like a slick TV game show, firing each multiple-choice question off to all the players at the same time. And the eight-second timer means it’s fast and furious fun (plus there’s no time to sneakily Google the answer!).

The website (www.quizit.net) comes jam-packed with more than 200 questions and also has a scoreboard feature that automatically records everyone’s scores. And best of all, only the host needs to sign up - everyone else plays for free!

We're also running a lockdown offer: the enhanced features for just £1.99 a month. These include the option of setting your own questions and giving teams more than eight seconds to answer. There’s also a free version with a limited number of questions each month for those who only want to run the occasional quiz.
