Kaleidoscope's implementation - possible improvements

@merlin is working on some interesting ideas to change key positions/addresses from two byte storage to one byte, in order to safe memory. When discussing with him, I was reminded about some related aspects that could be improved in Kaleidoscope.

In order not to forget them, I will summarize most of those below, together with ideas how to improve them.

Dear developers, please don’t feel offended by me writing this. You have done a great job so far.

What I state here are just ideas about how things could be optimized from the perspective of a long-time C++ programmer. None of the points I address below are related to any bugs/errors (hopefully). Most is about readability, type safety and portability. None of the improvements proposed here would affect neither performance, nor memory consumption. Only some changes would break existing code, but are still worth being considered carefully.

I believe that it is worthwhile optimizing things (if agreed on) as early as possible, before there are more and more users and devs around. Keyboardio is soon going to ship thousands of keyboards and then there will be hundreds of developers working with Kaleidoscope. The more people are going to use it, the more of them are affected by possible future interface changes. That’s why I believe it is a good idea to polish things as early as possible.

There are likely more improvement ideas apart from the ones stated below. Let’s collect and discuss them here!

Type Safety

Macros and Macro Functions

There are far too many macros and macro functions used in Kaleidoscope. None of them are type safe. I know that this is the way it is done in C, but C++ provides features that make things safer for the programmer, so why not use them.
All preprocessor macros, such as key definitions could actually be static constexpr constants

// in key_defs.h
constexpr ModHeldType CTRL_HELD = B00000001;
// instead of
#define CTRL_HELD B00000001
// ModHeldType would be uint8_t on most platforms (see below)

or

// in key_defs_keyboard.h
constexpr Key_ Key_A = { HID_KEYBOARD_A_AND_A, KEY_FLAGS };
// instead of 
#define Key_A (Key) { HID_KEYBOARD_A_AND_A, KEY_FLAGS }

and macro functions could be inline constexpt functions to make things type safe. All that without performance loss.

inline constexpr Key LCTRL(const Key &k) { return Key(k.keyCode, k.flags | CTRL_HELD); }

This also would avoid compiler warnings (e.g. about narrowing conversions) and errors on other platforms (see portability). Of course, if this would be changed, we would have to stick with the old upper case macro names, to preserve compatibility.

Changing this is very easy, as it can mostly be done with regex based replacement in single files.

Apples and Oranges

Currently almost everything is uint8_t or uint16_t in Kaleidoscope. Without any performance loss and with a great gain in readability every type could be turned into a class, or at least into a typedef. Using classes (like Key) for such things prevents accidental assignment of information that is e.g. stored as uint8_t but of completely unrelated meaning (apples and oranges). This is also related to the portability aspects, discussed below.

This is harder to achieve than replacing macros, as it requires a careful analysis of the different types of data that is used throughout the whole firmware. Changes must be done with care.

Encapsulation

Packing and unpacking of information (juggling with bits) definitively belongs into a class. That’s nice about the Key class, but it is still far from perfect as it is abused to store/pack all sorts of information (keycodes and control information of different type). Here I would love to see a dedicated derived classes of Key_ that would code and decode information and would name what is actually stored. This would make things way more transparent.

There seem to be different ways, modifiers are coded in a Key_ object and in the keyboard HID report. For new contributers (like me), this is pretty confusing. Here dedicated classes definitely could improve readability.

That is a restricted amount of work as it means introducing proper derived classes of Key. Also it would require to change some functions from passing Key by value to passing if by (const) reference, for inheritance to work.

Portability

Also related to using dedicated types (classes or typedefs) instead of uint8_t. Doing so also simplifies porting Kaleidoscope to other architectures with other bitness. Some might be able to process e.g. uint16_t faster than uint8_t. It can sometimes pay off to waste memory (if there is enough - not on the atmega32u4, of course) to gain speed of execution. Classes like Key_ are a good way to allow different implementations of packed storage of data for different platforms.

Expense for changes, see Apples and Oranges above.

API Documentation

Currently all developers of new plugins have to browse the Kaleidoscope core that is distributed across files and (Arduino) libraries to find out what interface functions are available. This is tedious and scary, and thus possibly repelling for new devs.
What about using a tool like doxygen to document the plugin API? Or better, all of Kaleidoscope. The latter seems preferable as there is no well-defined plugin API, appart from plugin hook callback functions. Browsing plugin code, one can see that plugins are calling all sorts of functions they can access.

6 Likes

For future readers: The basis for the start of this discussion is version v1.22 of Kaleidoscope.

Something like this? We started down that path a while ago, but it’s been on the backburner, due to more pressing issues needing attention.

There is merit in the other things you are saying, but I only did a quick read so far. Will come back to it when my plate has less stuff on it. Making the code easier to work with is a great goal, doubly so if it does not come with increased size or any other unwanted side effects.

I think there will be cases where we will have to sacrifice Good Code on the altar of practicality, but if we can keep those cases to the minimum, I’ll be happy.

2 Likes

Your work log certainly proofs that.

1 Like

Some more…

Symbol Names and File Names

It is a good habit in C++ to store classes in files whose names match class names. In an optimal case, one class per file. This makes it easy to find class definitions and implementations via simple file search.

In Kaleidoscope there are sometimes several classes in one file, sometimes intermixed with other stuff. Cleaning this up would make things easier for users and maintainers.

This can be done without breaking existing code and without a huge amount of work. To guarantee compatibility, the existing headers would live on and serve as wrapper headers for newly introduced ones.

Namespaces

Namespaces are currently used inconsistently in Kaleidoscope. Using at least namespace kaleidoscope {...} throughout the whole firmware would significantly reduce the probability of symbol conflicts.

Nested namespaces (below kaleidoscope) might improve readability and allow for a namespace to file relationship when a file contains, e.g. a lot of constant definitions. An example would be the Kaleidoscope-Ranges.h header. Once macros would be replaced by C++ constexpr constants, they are symbols that would better live in a namespace kaleidoscope::ranges or similar.

This is relatively complicated to fix and can break existing code.

Preprocessor Macros and Macro Functions (reloaded)

In my initial post, I forgot to state the main reason why proprocessor macros and macro functions are considered as evil:
Macro names have a great potential to collide (as they are not scoped). Naming collisions can cause very strange compiler errors which can be very hard to understand, track down and solve, especially for newbies. This is why C++ encourages to use constexpr definitions instead of macros and inline functions instead of macro functions. As both feature a scope (function/class/namespace), naming collisions can be prevented.

Include Directives

Most include directives in Kaleidoscope only use the file basename in the include statement, e.g.

#include "layers.h"

File names are often short and collision-prone. It is likely to have a header named layers.h in another module, e.g. a user plugin. It would e.g. be quite annoying for a plugin developer that had a header foo.h as part of her/his plugin and one day the firmware core introduces a plugin with the same name.

To avoid such scenarios, it can help to add parts of the include file’s path instead.

#include "Kaleidoscope/src/layers.h"

Most large C++ projects do so, see e.g. the code of the boost project for an example.

This change can be accomplished relatively easy using regex based replacement for the whole project. It would (possibly) require a slight adaption of the build system to make sure that the directory .../hardware/avr/libraries is passed to the compiler via -I....

1 Like

FWIW, we have a coding style guide (arguably, a work in progress still). Some of the things you mention are covered there already. We haven’t updated everything to follow the style guide yet, nor is it finished, either.

This is one case where half the codebase has been converted to the style guide, but the other half has not been. The goal is to move everything (within reason) to kaleidoscope::.

One reason why we use #include "something.h" is because Arduino tools aren’t the smartest thing out there. They use include statements to discover which libraries we depend on. Thus, to include things from another Arduino Library, we need to use #include "something.h", without any path components. There is no way around this.

Within a library, we can - and should - use #include "Kaleidoscope/something-else.h", as is done by a lot of the plugins I wrote. Perhaps we can take that a step further and use #include "Kaleidoscope/SomePlugin/Foo.h" - that would be in-line with the current guide lines, but… I’m not entirely convinced it is worth the effort. Though, this is not a strongly held opinion.

Hrm… on a second read, it may be possible to include hardware/avr/libraries in the include path… but we’d still require the path-less includes for dependency discovery for Arduino, so we wouldn’t win much in the end, even if we changed tool arguments.

Needless to say :wink:

I read that thoroughly a while ago (although I was already quite familiar with Google’s original version). It is a very good idea to have a style guide. Unfortunately, tools like astyle can only enforce a subset of it.

I feel at least some of the blame for this stalling is mine. I was pushing for it, and said that I would take on the task of documenting, but I haven’t followed through at all yet, partly because I got sidetracked by plugin-writing and core design experiments, partly because I didn’t feel sufficiently expert to write the documentation without a thorough understanding of the code, and partly because the bits that I do understand well seem likely to change, causing a lack of motivation to write them up in their current incarnation.

1 Like

@noseglasses - Thank you for all the detailed suggestions. There’s a lot of very good stuff here.

Indeed, astyle is a code formatter and not a linter.

The best freely available linter we’ve found to date is google’s cpplint.py. (That’s not a real link. It’s just discourse trying to be helpful)

The current automated tests to enforce cpplint run this invocation:

    $(PLUGIN_TEST_SUPPORT_DIR)/cpplint.py  --quiet --filter=-whitespace,-legal/copyright,-build/include,-readability/namespace  --recursive --extensions=cpp,h,ino src examples

The “noisier” version that we have available but don’t currently enforce is:

    -$(PLUGIN_TEST_SUPPORT_DIR)/cpplint.py  --filter=-legal/copyright,-build/include,-readability/namespace,,-whitespace/line_length  --recursive --extensions=cpp,h,ino --exclude=$(BOARD_HARDWARE_PATH) --exclude=$(TRAVIS_ARDUINO) src examples

Join the club :crazy_face:

3 Likes

@jesse & @algernon: Is there a plan for converting names (among other things, but variable names really stand out to me) to match the style guide at some point?

Also, the style guide currently says that class data members should have trailing underscores, but struct data members should not. I think it has been indicated somewhere in a PR that the distinction should actually be private vs public members, not class vs struct. That makes much more sense to me; am I correctly recalling that?

1 Like

Another question:

Would it be helpful if plugins each used their own namespaces, such that everything in plugin Foo was in the namespace kaleidoscope::foo? This seems logical to me, but as is surely clear by now, I’m no C++ expert. :wink:

1 Like

Yes, but the style guide isn’t finished yet, either.

Yeah. There seems to be an agreement that public should be data_member, private should be data_member_. As structs are public, their members should not have a trailing underscore, indeed.

I… thought about that, but there are a few things that make this a harder decision:

  • We need to have a top level Kaleidoscope-Something.h, because of Arduino.
  • The style guide currently says that the directory layout should follow the namespace, so kaleidoscope::foo should reside in src/kaleidoscope/foo/.
  • If Kaleidoscope-Something is in the kaleidoscope::something namespace, and the class is also Something, that seems repetitive.
  • If we put plugins under kaleidoscope::plugin, that may make more sense (no repetition, but still namespaced).

Not sure it would increase clarity… it would help a little with potential naming conflicts, but most plugins have only two-three files, named after the plugin…

Come to think of it… we could have kaleidoscope::something for stuff used by the plugin, but otherwise not necessarily public, and have a kaleidoscope::Something class - while this would help avoid conflicts, would not have repetition, the difference between the two is the case of a single letter. I’m not sure I’d be happy with that.

I thought of this shortly after I wrote up my query. Since the class is basically already a namespace, it does seem unnecessary.

I feel the same way.

Here’s one situation that might benefit from a plugin namespace: if a plugin library consists of multiple classes (one of which is the plugin itself). It would be nice for them to all share their own namespace. Maybe the plugins could all be class kaleidoscope::something::Plugin? I like that, with mild ambivalence – it definitely makes sense for Qukeys, where I have a Qukey class and a Qukeys class right now; it would be better (in my opinion) to have qukeys::Qukey and qukeys::Plugin.

Can you post this idea to the style guide issue too? We’d have all the ideas in the same place then, which will make review easier once we get there.

I would simplify this. Just add an underscore to all class data members. The whole point of the underscore is to make it easier to see if a variable is a class member or a global, in case its not a function argument. This helps to determine quickly where a variable is defined. At least, that’s the way most projects I came across did it.

I’ll do it whichever way is decided, but I much prefer the public/private distinction, because it distinguishes things that can be modified outside the class/struct from things that are strictly internal. I find that to be a much more useful distinction in Kaleidoscope (and in a less constrained environment, there’s very little reason to have public data members, so the distinction shouldn’t matter).

1 Like

There is no reason at all to have public data members. Not because of constraints of the environment. When inlined, setters and getters do not cause any runtime cost at all. So, there is no excuse for using structs in interfaces. For internal use they might be ok, though.

Don’t setters and getters require copy operations that can sometimes be skipped with direct access, even if the code is inlined?

And I would argue that if I have a class with a data member that I want other code to be able to read and write without restriction for performance reasons, the only reason I have to use setters and getters is because that might change in a future version. And even then, backwards compatibility can be maintained by (ab)using operator overloading and typecasting tricks.

Also, direct access to public members makes for much clearer and understandable code, in my opinion.