Narrowing down parameter space for international layouts

andrewg · November 14, 2017, 3:04pm

Forking this from another discussion, because it’s veering off topic:

There is a danger here that we fork off into a near-infinite number of layout discussions, when the same principles apply to the majority of them. Most scancodes will not change, no matter the language. So maybe it’s better to concentrate on only those scancodes that will.

At most six keys need their base-layer scancodes changed from the stock firmware to accommodate an arbitrary Latin-script language keymap.

The number row, and all keys in line with the number row (40 keys in total) MUST use the same scancodes across all languages.
In most languages, ANSI LeftBracket MUST go to the right of P. This is also true of Dvorak.
In most languages, ANSI Apostrophe MUST remain to the right of ANSI Semicolon.
ANSI Backtick is fine where it is.
NonUsBackslashAndPipe SHOULD go on the bottom left key, beside ANSI Z. PageDown MUST therefore go elsewhere, and PageUp SHOULD go with it.

The remaining free scancodes are ANSI Backslash, Minus, Equals and RightBracket.

The bottom-right key and Any key are available for two of those four scancodes.
At least one scancode MUST move to the left hand (unless we want to displace Enter/Return!), and left of ANSI A SHOULD be available (see above).
The options for the last scancode are then (realistically) limited to Prog, LED or Num.
- Mapping Prog can cause problems
- Using Num rather than LED minimizes the number of scancodes that change hands. Num can go to LED or into the Function layer.

So we can reduce our variables to the distribution of four scancodes (ANSI Backslash, Minus, Equals and RightBracket) across four physical keys (call them LEFT_OF_A, LEFT_OF_6, RIGHT_OF_0, and RIGHT_OF_SLASH). Note that all of these scancodes are used for brackets in at least one language, and therefore need to be pairable. The bracket pairs are:

(LeftBracket, RightBracket) in English, Italian, Romanian and Spanish (es_es) QWERTY, Belgian AZERTY, Swiss QWERTZ.
(Minus, Equals) in Dvorak and AZERTY (paired with numbers)
(Apostrophe, Backslash) in Spanish (es_es, es_latam) QWERTY, Swiss QWERTZ
(RightBracket, Backslash) in Brazilian QWERTY and Slovak QWERTZ

Some layouts require two paired sets of bracket keys. Slovak QWERTZ has square and curly brackets as per US QWERTY, but pairs right square bracket and backslash for left and right parentheses respectively (using AltGr). Most AZERTY variants use both of the (Minus, Equals) keys for right curly and square brackets, and the corresponding left brackets are elsewhere in the number row.

So we can define a ‘standard international’ variant where the nesting of the AZERTY and Dvorak brackets is preserved and the English QWERTY brackets are close to each other, with ANSI Backslash displaced:

#define LEFT_OF_Z Key_NonUsBackslashAndPipe
#define LEFT_OF_A Key_Backslash
#define LEFT_OF_6 Key_Minus
#define RIGHT_OF_0 Key_Equals
#define RIGHT_OF_P Key_LeftBracket
#define RIGHT_OF_SLASH Key_RightBracket

And an ‘alternative international’ variant where both of the Spanish bracket pairs remain paired and ANSI Equals is displaced:

#define LEFT_OF_Z Key_NonUsBackslashAndPipe
#define LEFT_OF_A Key_Equals
#define LEFT_OF_6 Key_Minus
#define RIGHT_OF_0 Key_RightBracket
#define RIGHT_OF_P Key_LeftBracket
#define RIGHT_OF_SLASH Key_Backslash
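
One way to see what the two variants do to each bracket pair is to tabulate them. The sketch below is only an illustration: it transcribes the two sets of #defines above into Python dictionaries, adds the apostrophe key (Key_Quote) at a position called RIGHT_OF_SEMICOLON (a name assumed here by analogy with the others), and reports which physical keys each pair lands on. The helper name `where` is made up for the example.

```python
# Physical position -> scancode, transcribed from the two #define sets above.
# RIGHT_OF_SEMICOLON is an assumed name for the key that keeps ANSI Apostrophe.
STANDARD = {
    "LEFT_OF_Z": "NonUsBackslashAndPipe",
    "LEFT_OF_A": "Backslash",
    "LEFT_OF_6": "Minus",
    "RIGHT_OF_0": "Equals",
    "RIGHT_OF_P": "LeftBracket",
    "RIGHT_OF_SEMICOLON": "Quote",
    "RIGHT_OF_SLASH": "RightBracket",
}
ALTERNATIVE = {
    "LEFT_OF_Z": "NonUsBackslashAndPipe",
    "LEFT_OF_A": "Equals",
    "LEFT_OF_6": "Minus",
    "RIGHT_OF_0": "RightBracket",
    "RIGHT_OF_P": "LeftBracket",
    "RIGHT_OF_SEMICOLON": "Quote",
    "RIGHT_OF_SLASH": "Backslash",
}

# Scancode pairs that act as brackets in at least one language (listed above).
BRACKET_PAIRS = [
    ("LeftBracket", "RightBracket"),
    ("Minus", "Equals"),
    ("Quote", "Backslash"),
    ("RightBracket", "Backslash"),
]


def where(variant, pair):
    """Return the physical keys that hold the two scancodes of a pair."""
    position_of = {scancode: key for key, scancode in variant.items()}
    return tuple(position_of[scancode] for scancode in pair)


if __name__ == "__main__":
    for name, variant in (("standard", STANDARD), ("alternative", ALTERNATIVE)):
        for pair in BRACKET_PAIRS:
            print(name, pair, "->", where(variant, pair))
```

In the alternative variant both Spanish pairs land on keys named RIGHT_OF_*, which is the property that variant was chosen for; in the standard variant the (Minus, Equals) pair sits on either side of the 6 to 0 number keys.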

Is there any Latin-script language for which neither of these will work?

Japanese is going to be harder, because the Model01 doesn’t have enough keys for the Kana buttons.

merlin · November 14, 2017, 5:52pm

I can almost entirely get behind this proposal, but I have a few suggestions:

1. It seems like it would work better to map RIGHT_OF_0 to Key_Minus, since that’s the key that’s usually to the right of zero (unless my assumption is wrong, and that key has a different scancode on non-US-QWERTY keyboards). Then only one of those two (Key_Equals) gets moved out of its normal position in the number row.
2. I’m pretty sure that mapping the LED functions to the prog key won’t cause any problems when trying to flash the keyboard. If I’m right, maybe it makes sense to add RIGHT_OF_5 to the list of candidates. And if most of those “international” keyboards use the QWERTY brackets as brackets of some kind, it would be easy to remember them if they were mirrored (and frustrating if one of them is in the approximate usual place, but the other is in some other, semi-random position), so using RIGHT_OF_5 & LEFT_OF_6 might be the best choice.
3. Alternatively, it might make sense to move Key_Backtick to RIGHT_OF_5, maintaining its position relative to tab, and freeing up LEFT_OF_Q for Key_RightBracket, mirroring the left bracket key, and keeping them both in their original row.

I definitely think PageUp & PageDown belong on the same layer as the other navigation keys, not on the base layer.

Jennigma · November 14, 2017, 6:30pm

This is the first link I found that seems to give information and images about foreign layouts:

I would like us to end up with at least charts of the keycodes necessary to implement these foreign language keyboards on the Model 01 (or some similar list), and preferably keymaps and images. It seems like we have folks already in the community who can pull together starter charts and maps for most of the Latin languages.

I know at least one keyboard has gone to Japan. Perhaps a good starting place would be to get the list of countries the Model 01 has been or will be shipped to, and start with that as a priority list.

Alternatively, we could figure out which keyboard layouts Microsoft and Apple support and aim for parity with that.

ETA: This is probably a better link:

ETAA: Another one:

ETAAA:

Here is a Microsoft developer page with tables of key translations.

https://msdn.microsoft.com/en-us/library/ms892472.aspx

andrewg · November 14, 2017, 6:54pm

Mapping RIGHT_OF_0 to Key_Minus was my original idea, but it doesn’t nest the brackets properly in either Dvorak or AZERTY.

> I’m pretty sure that mapping the LED functions to the prog key won’t cause any problems when trying to flash the keyboard. If I’m right, maybe it makes sense to add RIGHT_OF_5 to the list of candidates. And if most of those “international” keyboards use the QWERTY brackets as brackets of some kind, it would be easy to remember them if they were mirrored (and frustrating if one of them is in the approximate usual place, but the other is in some other, semi-random position), so using RIGHT_OF_5 & LEFT_OF_6 might be the best choice.

Using LED as a printable is entirely doable. However, RIGHT_OF_P is the “proper place” for LeftBracket in many more languages than it is a bracket. I have considered using LED and ANY as Minus and Equals, as this maintains both Dvorak and AZERTY bracket nesting.

I’d consider (RIGHT_OF_5, LEFT_OF_6) = (Key_Minus, Key_Equals) a viable “standard international” option for en_GB, Dvorak and AZERTY. I’m not sure what the various QWERTZ languages would make of it though, as Key_Minus is usually a letter there, and hand-swapping it might be a step too far.

(If we did decide to use LED, then Num is left as-is on the base layer? I’d be tempted to leave Prog alone, as some people are working on macro plugins that require it to be accessible. Dropping LED to the function layer sounds good to me - how many times a day does it get used?)

> Alternatively, it might make sense to move Key_Backtick to RIGHT_OF_5, maintaining its position relative to tab, and freeing up LEFT_OF_Q for Key_RightBracket, mirroring the left bracket key, and keeping them both in their original row.

Putting Backtick on RIGHT_OF_5 was in @celtic’s original proposal, although for different reasons.

Mirroring bracket pairs across the keyboard makes sense. But this way leaves them anti-nested, with the closing brackets/braces on the left and the opening ones on the right. And if you’re going to hand-swap one set of opening brackets then in Spanish you probably want to hand-swap the other to match. So, for “alternative international” you would get something like (ordering the #defines to make more visual sense):

#define RIGHT_OF_5 Key_Backtick
#define LEFT_OF_Q Key_LeftBracket
#define LEFT_OF_A Key_Quote
#define LEFT_OF_Z Key_NonUsBackslashAndPipe

#define LEFT_OF_6 Key_Equals
#define RIGHT_OF_P Key_RightBracket
#define RIGHT_OF_SEMICOLON Key_Backslash
#define RIGHT_OF_SLASH Key_Minus

That should nest both pairs of Spanish brackets on (LEFT_OF_Q, RIGHT_OF_P) and (LEFT_OF_A, RIGHT_OF_SEMICOLON). It’s a lot more radical than I had considered up to now.

It’s very hard to visualize these, because the scancodes are all defined in reference to ANSI and I don’t (yet!) have a reverse lookup table for the various languages. And I could run them all through a keyboard visualizer, but it’s hard to look at a whole keyboard diagram and just see the relevant changes.

andrewg · November 14, 2017, 7:10pm

That’s a good one. I’ve been using Wikipedia a lot, and the problem there is that a) the pictures aren’t visually consistent, and b) they’re sorted into QWERTY, AZERTY etc. on different pages. From this, I can see that even Turkish-F and Arabic are probably viable using “standard international”. Thanks!

andrewg · November 14, 2017, 7:19pm

That’s interesting, but not very parseable. I was planning to mine the XKB database again to do something like ‘SELECT us.name, '=', lang.name FROM us JOIN lang ON us.scancode = lang.scancode’ for each value of lang.
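
A minimal sketch of that join, using Python's built-in sqlite3 module. The table and column names follow the query above; the rows are a tiny hand-typed sample rather than anything mined from XKB, and the German and Hungarian entries are the commonly documented characters on the ANSI Minus and Equals positions. Treat it as an illustration of the lookup, not as the actual XKB import.

```python
import sqlite3

# Hand-typed sample rows (scancode name, printable name). A real table would
# be generated from the XKB symbol files; that import step is not shown here.
US = [("Minus", "-"), ("Equals", "="), ("Y", "y"), ("Z", "z")]
DE = [("Minus", "ß"), ("Equals", "dead acute"), ("Y", "z"), ("Z", "y")]
HU = [("Minus", "ü"), ("Equals", "ó"), ("Y", "z"), ("Z", "y")]


def reverse_lookup(lang_rows):
    """Join the US table to one language table on the shared scancode."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE us (scancode TEXT PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE lang (scancode TEXT PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO us VALUES (?, ?)", US)
    db.executemany("INSERT INTO lang VALUES (?, ?)", lang_rows)
    rows = db.execute(
        "SELECT us.name, '=', lang.name FROM us "
        "JOIN lang ON us.scancode = lang.scancode ORDER BY us.scancode"
    ).fetchall()
    db.close()
    return rows


if __name__ == "__main__":
    for us_name, eq, lang_name in reverse_lookup(DE):
        print(us_name, eq, lang_name)
```

Each output row reads as "the key that is X on a US layout is Y in this language", which is the reverse lookup that makes the ANSI-named scancodes easier to reason about.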

Jennigma · November 14, 2017, 8:11pm

I was realizing right about the time I found this that the investigation would eat my morning if I let it. What I was seeing on this page (and I agree it is murky! =) were tables that started with key location, followed by some gobbledygook that appeared to be keycodes and generated characters.

What I am imagining is a table something like:

Model 01 key location	Kaleidoscope keycode	hex scancode	character (shifted)
r1c1	Key_Q	10	q (Q)
r1c1+fn
r1c1+num
…

if that makes sense. My example is pretty uninteresting; perhaps I should have picked one of the numpad keys, and perhaps I’m overthinking. Just trying to remember what confused me when I sat down and looked at this, and trying to imagine what it would be like to start climbing the knowledge hill here with the intent of ending up with, for instance, a katakana layout.

At minimum we need a |Keycode|Scancode|Character| translation table that covers the “common” layouts. I don’t know what those are, but the layouts for the countries where the Model 01 has been shipped seem like a necessary set.

andrewg · November 14, 2017, 8:18pm

I’ve now implemented this in my live branch (as ‘merlin-2’) with minus and equals on LED and Any respectively, which under my Dvorak mapping gives ‘[’ and ‘]’ in practice. It’s nice, and I might keep it.

Any QWERTZ users have an opinion? @ngetal? It will turn out as:

German: ß on LED and dead keys on Any
Hungarian: Ü on LED and Ó on Any
Everyone else: various symbols

andrewg · November 14, 2017, 8:25pm

I think that’s over-egging it slightly. I doubt that the Fn or Numpad layers will need to change much, except maybe to host a few keys that have been evicted from the base layer (PageUp and PageDown seem to be first in line). So long as we can get all the important keys onto the base layer, that’s all we should need to worry about - and remember that all the hard work of translating scancodes into printables is being done in the OS. We just need to stay out of its way.

Jennigma · November 14, 2017, 8:55pm

I’m imagining that novice users will want to do what I wanted to do, which is find the character they wanted and work from there to the keycode they need to put in the map. I am guessing there are languages (eg Japanese) for which some primary characters will need to get moved to modifier layers. Perhaps it would make more sense to index my imaginary table by generated character.

It’s easy now for me to understand how things interact and skip steps, but I’m trying to bear in mind when documenting this that the average reader of the documentation is not going to understand the basic

keypress --> keycode --> scancode --> OS interpretation --> character/action

process that happens between what they do with their fingers and something changing on their screen. I think it’s a good idea to create documentation that doesn’t skip steps, because showing every step reinforces the learning process.
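
The chain above can be illustrated with a toy model. In this sketch the firmware step is a table from Kaleidoscope key names to USB HID usage IDs, and the OS step is a per-layout table from usage ID to character. The usage IDs are the standard HID keyboard-page values; the layout tables hold only a handful of well-known entries, and the function names are invented for the example.

```python
# Firmware side: key name -> USB HID keyboard usage ID (the "scancode" sent).
HID_USAGE = {"Key_A": 0x04, "Key_Q": 0x14, "Key_Y": 0x1C, "Key_Z": 0x1D}

# OS side: usage ID -> unshifted character, per layout selected in the OS.
# Only a few entries are filled in for illustration.
OS_LAYOUTS = {
    "us": {0x04: "a", 0x14: "q", 0x1C: "y", 0x1D: "z"},
    "fr_azerty": {0x04: "q", 0x14: "a", 0x1C: "y", 0x1D: "w"},
    "de_qwertz": {0x04: "a", 0x14: "q", 0x1C: "z", 0x1D: "y"},
}


def firmware_sends(key_name):
    """keycode -> scancode: what the keyboard reports over USB."""
    return HID_USAGE[key_name]


def os_interprets(usage_id, layout):
    """scancode -> character: what the OS layout makes of that report."""
    return OS_LAYOUTS[layout][usage_id]


def type_key(key_name, layout):
    """Run one keypress through both steps."""
    return os_interprets(firmware_sends(key_name), layout)


if __name__ == "__main__":
    for layout in OS_LAYOUTS:
        print(layout, type_key("Key_Q", layout))
```

The same Key_Q entry in the keymap produces a different character depending only on the layout selected in the OS, which is why the firmware map has to be chosen with the OS layout in mind.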

I had a vague notion that my keyboard was smarter than a set of switches when I received it, but I was not prepared for the complexity I encountered. It was a steep hill to climb, and recent enough that I can remember it quite clearly. Exposing that complexity to new people while giving them clear paths from where they are to where they want to be is my goal.

I am imagining the international docs as starting places for experimentation. My goal is to make it as easy as possible for folks who are not fluent in English to get their keyboards to the same starting point as a native English speaker receiving a Model 01. Ideally the basic docs would have translations available.

I don’t think I’m doing a great job of explaining my goals, though, and frankly I’m feeling my way towards a goal I don’t really understand, being myself American. I have never had to accommodate myself to a keyboard, so I don’t know what that process is like.

andrewg · November 14, 2017, 9:38pm

For CJK languages there is an extra abstraction layer between the glyph produced by the standard sequence of mappings, and the glyph you eventually get. But it’s an extra layer of abstraction at the very far end of the process, and it’s one which is highly visible to the user. We should be able to treat it as a sophisticated editing application, and therefore outside the scope of our documentation.

From the user’s point of view the invisible abstraction layers are the same in Japanese as they are in English. Here is what is printed on the physical key, and this should match what comes out the far end of the black box, even if the thing that comes out is slightly more abstract than an English speaker might expect. The conceptual division between printables, modifiers and state changes is much the same. And if you look at the sample keyboards in the link you posted earlier, there are only two extra keys, which we should be able to crowbar onto the thumbs; we just need a native speaker or two to help arrange the specifics.

andrewg · November 14, 2017, 11:43pm

OK, I’ve had a look at the Japanese keyboard specs, and there are five extra scancodes (defined in Kaleidoscope already as Key_International1 - Key_International5). These are commonly aliased as:

Key_Ro
Key_KatakanaHiragana
Key_Yen
Key_Henkan
Key_Muhenkan

Of these, only Ro and Yen are printable keys. Key_Ro is normally at RIGHT_OF_SLASH, and Key_Yen is tucked away in the top right, where some old keyboards keep their backslash (but there’s also a standard backslash key). Japanese keyboards don’t have a NonUsBackslashAndPipe, so it could possibly go to LEFT_OF_Z. The Backtick key is a modal key rather than a printable in Japanese, so there are exactly the same number of printables as on a standard international keyboard. This should just amount to a rearrangement of the usual orphan keys.

KatakanaHiragana, Henkan and Muhenkan are modal keys, as well as the Backtick scancode, which is marked as ZenkakuHankaku. K/H and Z/H can be thought of as analogous to Caps Lock, but for character sets. Henkan and Muhenkan are menu iterators. Now, the Apple compact keyboard linked in the Medium article above does not appear to have all four of those - only Henkan and KatakanaHiragana. Whether this is due to the specific input method in OSX, whether the other two keys are hidden under Fn-SOMETHING, or whether it is possible in general to get away without the other two modal keys, I can’t say. But it should be possible to squeeze at least three of them onto the thumb buttons without too much trouble (by default the Model01 has two Control keys, two Alt keys and two Shift keys, and AltGr is not used in Japanese).
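
For reference, the five keys can be summarized as a small table. The usage IDs below are the USB HID keyboard-page values for International1 through International5; the 'printable' and 'modal' tags just restate the description above, and the dictionary layout and function name are arbitrary choices for this sketch.

```python
# alias -> (Kaleidoscope name, HID usage ID, kind), following the list above.
JAPANESE_KEYS = {
    "Key_Ro": ("Key_International1", 0x87, "printable"),
    "Key_KatakanaHiragana": ("Key_International2", 0x88, "modal"),
    "Key_Yen": ("Key_International3", 0x89, "printable"),
    "Key_Henkan": ("Key_International4", 0x8A, "modal"),
    "Key_Muhenkan": ("Key_International5", 0x8B, "modal"),
}


def keys_of_kind(kind):
    """Return the aliases tagged with the given kind, in table order."""
    return [alias for alias, (_, _, k) in JAPANESE_KEYS.items() if k == kind]


if __name__ == "__main__":
    print("printable:", keys_of_kind("printable"))
    print("modal:", keys_of_kind("modal"))
```

Only the two printable entries compete with the orphan keys for base-layer positions; the three modal ones are the candidates for the thumb buttons.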

ngetal · November 15, 2017, 9:40am

I guess it’s fine, although I ended up going back to the standard layout on the right side. Somehow it felt weird to type “-” by reaching that far towards the middle with my index finger. It feels quicker to have those keys right next to where my pinky rests.

github.com

imrekoszo/Model01-Firmware/blob/ed5bad80ec40fa0543c8e139f92c7208a3ab7f75/Model01-Firmware.ino#L97


      
const Key keymaps[][ROWS][COLS] PROGMEM = {

  [QWERTY] = KEYMAP_STACKED
  (Key_Escape,                   Key_1, Key_2, Key_3, Key_4, Key_5, ___,
   Key_Backtick,                 Key_Q, Key_W, Key_E, Key_R, Key_T, Key_Tab,
   Key_Backslash,                Key_A, Key_S, Key_D, Key_F, Key_G,
   Key_NonUsBackslashAndPipe,    Key_Z, Key_X, Key_C, Key_V, Key_B, ___,
   Key_LeftGui, Key_Backspace, Key_LeftControl, Key_LeftShift,
   ShiftToLayer(FUNCTION),

   Key_LeftBracket, Key_6, Key_7, Key_8,     Key_9,         Key_0,         Key_RightBracket,
   Key_Enter,       Key_Y, Key_U, Key_I,     Key_O,         Key_P,         Key_Equals,
                    Key_H, Key_J, Key_K,     Key_L,         Key_Semicolon, Key_Quote,
   Key_RightAlt,    Key_N, Key_M, Key_Comma, Key_Period,    Key_Slash,     Key_Minus,
   Key_RightShift, Key_RightControl, Key_Spacebar, Key_LeftAlt,
   ShiftToLayer(FUNCTION)),

  [FUNCTION] =  KEYMAP_STACKED
  (___, Key_F1,          Key_F2,                     Key_F3,                   Key_F4,                   Key_F5, Key_LEDEffectNext,
   ___, ___,             Consumer_ScanPreviousTrack, Consumer_PlaySlashPause,  Consumer_ScanNextTrack,   ___,    ___,
   ___, ___,             Key_Mute,                   Consumer_VolumeDecrement, Consumer_VolumeIncrement, ___,

andrewg · November 15, 2017, 6:46pm

So I’ve thought a little more about this and @merlin’s suggestion number 2 about square brackets has a lot of merit. I’ve seen discussions elsewhere on the site about putting multibracket keys on LED/Any, so it appears to be a convergent meme (if that’s not a term, I’m hereby inventing it!).

So how about this for a proposal: we group language layouts primarily by how we have to mangle their brackets. This will lead us to create more than two international layouts, but should still be a manageable number.

In all cases where non-AltGr paired brackets exist, they should end up in the same place: LED/Any.
For keymaps which don’t have non-AltGr paired brackets, we prioritize keeping RIGHT_OF_P and RIGHT_OF_SEMICOLON invariant as in such cases these tend to be letters - which in practice means using the same layout as AZERTY/Dvorak.
In cases such as Spanish or Swiss where we have more than one pair of brackets, we prioritize them in the order (parentheses, brackets, braces).
In cases where the brackets are on the AltGr layer of letters, we prioritize the letter placement over the bracket placement.
When deciding which key to move to LEFT_OF_A, the least used or most inaccessible or unpaired one of the remaining keys (Key_Backslash, Key_RightBracket or Key_Equals) should be chosen.

So, we have the following classes:

1. Brackets on the number row or no paired brackets and/or letters on RIGHT_OF_P, RIGHT_OF_SEMICOLON (AZERTY, Dvorak, most QWERTZ, Nordic, Canadian-multi, Italian, …): merlin-2 as defined above
2. Bracket pair right of P (en_qwerty): stock layout with brackets on LED/Any; Key_Backslash on LEFT_OF_A, Key_NonUsBackslashAndPipe on LEFT_OF_Z (as per @ngetal above, but using LED instead of Num)
3. Bracket pair right of semicolon (es_latam): Key_Quote ([) and Key_Backslash (]) on LED/Any, Key_RightBracket (+) on LEFT_OF_A and Key_Minus, Key_Equals (upright and inverted question marks) on RIGHT_OF_SEMICOLON, RIGHT_OF_SLASH (unsure which way around)
4. Bracket pair on Key_RightBracket and Key_Backslash (pt_BR): Key_RightBracket ([) and Key_Backslash (]) on LED/Any, and Key_Minus, Key_Equals on RIGHT_OF_SLASH, LEFT_OF_A respectively
5. Bracket pairs as per both 2 and 3 above (es_es, Swiss) but not on base layer - beats me, suggestions welcome.

Jennigma · November 15, 2017, 7:30pm

I’m so glad you’re working through this so I don’t have to spend cycles on it.

Jennigma · November 15, 2017, 11:39pm

Just found this blog page:

http://d.hatena.ne.jp/mobitan/20160201

Someone working on a Japanese layout.

andrewg · November 16, 2017, 12:02am

That’s interesting. Not so much the layout of the printables, which I think is a little too radical to be a starting point for us - but they only use Kana and Henkan mode keys, like the Apple keyboard above. This is good news, as it seems to indicate we have enough keys.

(Note also the convergent brackets meme)

Unfortunately my Japanese is too rusty to make much sense out of the associated text…

merlin · November 16, 2017, 3:18am

Ah, but the translation in Chrome from Google is just a joy to behold!

(I’ve been watching a lot of Google Translate Sings videos recently)

andrewg · November 16, 2017, 10:07am

Machine translation is always fun.

I was on mobile at the time and didn’t have the convenient option. I see it now. Not much to add over and above what I deduced.

merlin · November 16, 2017, 11:13am

Chrome offers (semi-)automatic translation on mobile (Android, anyway).
