| Revision History | | |
|---|---|---|
| Revision 1.0 | 2017-04-27 | Initial version |
This document is targeted at rc-core developers and people interested in rc-core development. It describes the current situation, some of the issues with the current approach and proposes solutions to those issues. The plan is to update this document as development progresses and consensus is found and/or further issues are identified and addressed.
By far, the most common kind of remote control hardware (receivers and transmitters for receiving/generating the kind of signals used by handheld remote controls - like the kind you would normally expect to use to control e.g. your TV set) is based on InfraRed, IR, technology. This kind of hardware is often referred to as Consumer InfraRed, CIR, in order to distinguish it from other types of IR hardware, such as the wireless data transfer technology standardized by the Infrared Data Association, IrDA.
There are, however, other solutions based on e.g. radio-frequency
(RF), HDMI, etc. One such example
that is supported by the Linux kernel is the
ATI Remote Wonder which uses RF
rather than IR to transmit commands. For that
reason, this document will use the terminology “remote
control”, RC, rather than e.g.
IR. This is also the approach adopted in the kernel,
as evidenced by the
drivers/media/rc directory and
rc-core module in the kernel sources.
The granddaddy of RC drivers in the Linux ecosystem is the LIRC project. LIRC was developed out-of-kernel and contained drivers as well as software tools to receive/send IR signals. The focus was traditionally on hardware which measured and generated pulse-space timings from IR signals. Such hardware, which will be referred to as raw IR hardware, has the advantage that it will (in theory) work with more or less any kind of remote, no matter which specific IR protocol it uses - as long as there is an appropriate decoder written for that protocol. In LIRC, such decoders were implemented in a userspace daemon which received captured pulse-space values from the in-kernel hardware driver.
In addition to special-purpose IR hardware, many digital and analog TV/video capture cards included hardware to receive IR signals. These cards often come bundled with a remote and typically do not support IR transmission. Many of the cards also have on-board decoders for received IR signals, meaning that rather than delivering raw pulse-space timings, the cards only provide already decoded commands. The advantage is that no further decoding of the signal is necessary, but the hardware is typically limited to supporting one, or at most a few, IR protocols. Some hardware decoders do not even return the full decoded signal, but only parts of it (e.g. only the command bits and not the address bits), making it impossible to know the full signal. Hardware which does some kind of hardware decoding will be referred to as cooked in this article.
As cooked hardware was typically found on digital and analog video
cards, the IR support for this kind of hardware
developed in the linux kernel (via the linux-media/linux-tv projects)
independently from LIRC. As a result of discussions
on how to merge the LIRC tree into the kernel,
rc-core was created in order to merge the two
approaches into a consistent API. The legacy of the
two code-bases is, however, still very evident today if one looks at
the rc-core code (i.e. everything under
drivers/media/rc/) in detail (example: the
in-kernel keymaps were traditionally associated with cooked hardware
while the in-kernel decoders and lirc kernel
module are examples of code specifically for raw hardware).
The following picture provides an overview of how rc-core works today. Note that dashed lines denote optional parts.
As part of its initialization, a kernel hardware driver calls
rc_register_device() which returns a
struct rc_dev. This struct holds all the
relevant state of the rc device (similar to the role played by
struct input_dev in the input subsystem). All
further interaction with rc-core (via the
functions defined in include/media/rc-core.h) uses
the struct rc_dev to keep track of the
relevant device.
For each registered struct rc_dev,
rc-core maintains a keytable,
which is simply an array with scancode-keycode mappings. Scancodes are
protocol specific, but most IR protocols divide the
transmitted message into system and
command bits (though the terminology varies).
Roughly, the system bits define the device that is the intended
recipient (which is why the remote control for the stereo will,
hopefully, not have any effect on the TV and vice versa, even if they
use the same protocol).
For example, a message in the RC5 protocol has 5 system bits and 7 (originally 6) command bits, as well as one toggle bit (used to distinguish a long keypress from several short ones; it is not part of the scancode). The command bits are stored in the lowest 8 bits of the scancode and the system bits in the next 8 bits. Given a scancode of 0x0000031F, we can therefore deduce that the system was 0x03 and the command was 0x1F (assuming the scancode is an RC5 scancode).
Keycodes are defined in include/uapi/linux/input.h
as integer constants like KEY_A,
KEY_ESC, etc. The keytable is typically populated
at module load by the rc hardware driver setting the
map_name of the struct
rc_dev to the name of a keymap. Default keymaps
are implemented as loadable kernel modules (see
drivers/media/rc/keymaps/ for a long list) and
rc-core will make sure that the corresponding
keymap kernel module is loaded and used to populate the keytable.
For raw hardware, rc-core
will also load various decoding modules (per default all known modules
are loaded) which are used to decode the pulse-space timings produced
by this kind of hardware into the scancodes explained above.
rc-core also exposes a sysfs
API (see /sys/class/rc/rcX/)
which can be used to, inter alia, enable/disable
various protocols during runtime for raw as well as cooked hardware
(the former simply enabling/disabling the use of the software decoder
while the latter usually means that the underlying hardware is
reconfigured).
rc-core also maintains an input
device (struct input_dev) per
struct rc_dev. As rc messages are received
and, optionally, decoded, the keytable is consulted. If a matching
scancode-keycode entry is found, the event (e.g.
KEY_A is pressed) is reported to the input
subsystem. In any case, the scancode is reported to the input
subsystem (as an
EV_MSC/MSC_SCAN event),
allowing keymaps to be constructed in userspace for new remotes.
The input subsystem, in turn, maintains a chardev per input device
(e.g. /dev/input/event12) which allows interested
userspace applications to be notified of input events on the relevant
device(s) by open():ing the chardev and
read():ing struct
input_event events from it (this is how e.g. the
X server and libinput
library receive events). The input device that
rc-core creates looks just like a keyboard to a
userspace application, allowing it to be blissfully unaware of any rc
related details.
The input subsystem also contains functionality to allow userspace to
manipulate the keytable via ioctl():s on the event
chardev. The traditional ioctl():s
(EVIOCGETKEYCODE and
EVIOCSKEYCODE) only took a pair of integers
(scancode/keycode) as arguments. Since scancodes could be larger than
an integer, the EVIOCGKEYCODE_V2 and
EVIOCSKEYCODE_V2 ioctl():s,
which are able to take much larger scancodes, were introduced. The new
functions also allow scancode/keycode mappings to be looked up by index
rather than by scancode, which allows the whole keytable to be dumped
in a simple fashion.
Finally, the rc subsystem includes the optional
lirc module, which acts like a bridge between
rc-core and userspace applications which expect the
LIRC API. For implementation reasons, the
lirc module is implemented as a codec, meaning
that it will receive the raw pulse/space events for raw hardware. If
lirc is loaded, rc-core
maintains a struct lirc_codec per
struct rc_dev, which maintains the relevant
state.
The lirc module maintains a per-device
/dev/lircX chardev which a userspace application
can open() and then perform
read(), write(), and
ioctl() on in order to receive and send data in
the form of pulse-space timing integers and to control various
parameters of the hardware (such as carrier, duty cycle, timeout, etc).
The lirc API is IR specific.
As noted above, setting and getting keycodes in the input subsystem
used to be done via the EVIOC[GS]KEYCODE
ioctl() with an unsigned
int[2] parameter (one int for scancode and one for
the keycode).
The interface has since been extended to use the
EVIOC[GS]KEYCODE_V2 ioctl()
which uses an input_keymap_entry
structure:
```c
struct input_keymap_entry {
	__u8  flags;
	__u8  len;
	__u16 index;
	__u32 keycode;
	__u8  scancode[32];
};
```
Note that the scancode can be even bigger, thanks to the
len field.
This is what is currently used in rc-core, thereby
allowing arbitrary sized scancodes and other advantages (such as
index-based lookup of keytable entries).
The problem is that even though the kernel is aware of the protocol
which was used to generate any given scancode/keypress, it has no way
of communicating that information to userspace. While userspace can get
the scancode of an unknown remote control from the kernel (assuming the
kernel was able to decode it) via reading
EV_MSC/MSC_SCAN events from
the input device, there is no way to get the corresponding protocol.
Furthermore, there is no way for userspace to tell the kernel which protocol a given scancode belongs to, meaning that the kernel cannot record the protocol of the scancodes in its keytable.
Scancodes can, and will, overlap:
- an RC5 message to address 0x05, command 0x03 has scancode 0x00000503
- a JVC message to address 0x05, command 0x03 has scancode 0x00000503
It is only possible to distinguish (and parse) scancodes by knowing both the
scancode and the protocol.
The NEC protocol represents a special case of the problems
identified above. While the kernel correctly distinguishes
between protocols within the same family that are genuinely different
(e.g. Sony12 and Sony15,
which have different lengths), it also distinguishes between
NEC16, NEC24 and
NEC32 (which are not genuinely different).
The NEC protocol historically contained 8 bits of
address and 8 bits of command,
plus the inverse of the address and of the
command (which acts as a checksum and also makes the
transmission time constant). So, in reality, 32 bits were used to
transmit 16 bits.
The address space of the NEC protocol was quickly
used up and NEC24 was created by removing the
address redundancy (i.e. the 32 bits were redefined as
address_low, address_high,
command and command_inverted).
This was later extended to do the same trick with the
command bits, giving 16 bits of
address and 16 bits of
command in what the kernel calls
NEC32.
Distinguishing between the three variations in the kernel creates
ambiguity, not just between protocol scancodes (as shown above), but
also within the NEC protocol itself.
- a NEC16 message to address 0x05, command 0x03 has scancode 0x00000503
- a NEC24 message to address 0x0005, command 0x03 has scancode 0x00000503
These two messages look completely different when transmitted, but result in the same scancode, and userspace will never be able to tell them apart. Furthermore, the pointless in-kernel distinction between the scancodes means that the same code is repeated over and over again throughout different drivers.
```c
if (buf[14] == (u8) ~buf[15]) {
	if (buf[12] == (u8) ~buf[13]) {
		/* NEC */
		state->rc_keycode = RC_SCANCODE_NEC(buf[12],
						    buf[14]);
		proto = RC_TYPE_NEC;
	} else {
		/* NEC extended */
		state->rc_keycode = RC_SCANCODE_NECX(buf[12] << 8 |
						     buf[13],
						     buf[14]);
		proto = RC_TYPE_NECX;
	}
} else {
	/* 32 bit NEC */
	state->rc_keycode = RC_SCANCODE_NEC32(buf[12] << 24 |
					      buf[13] << 16 |
					      buf[14] << 8 |
					      buf[15]);
	proto = RC_TYPE_NEC32;
}
```
(the above example is taken from
drivers/media/usb/dvb-usb-v2/af9015.c).
The only solution is to make the protocol explicit, for userspace and
for the kernel. This requires the
EVIOC[GS]KEYCODE_V2 ioctl():s
to include the protocol. The proposed solution changes how the
input_keymap_entry struct is interpreted by
rc-core by casting it to a new
rc_keymap_entry struct:
```c
/* proposed */
struct rc_keymap_entry {
	__u8 flags;
	__u8 len;
	__u16 index;
	__u32 keycode;
	union {
		struct rc_scancode rc;
		__u8 raw[32];
	};
};

/* current, for comparison */
struct input_keymap_entry {
	__u8 flags;
	__u8 len;
	__u16 index;
	__u32 keycode;
	__u8 scancode[32];
};
```
The u64 scancode field is large
enough for all current protocols and because of the
len field, it would be possible to define a
larger scancode, should it be necessary for
some exotic protocol.
The proposed situation would thus look something like this (compare to the previous situation in the first figure):
The advantages with this change are:
- The protocol is made explicit for both userspace and the kernel.
- struct rc_map no longer hardcodes the protocol, meaning that keytables with mixed entries are possible.
- The kernel can use the protocol information to automatically enable/disable the right protocols in hardware or software, making the use of /sys/class/rc/rcX/protocols unnecessary in most cases.
The disadvantages are:
- For userspace applications which use the old style ioctl():s, we have no choice but to guess the protocol which was intended when a call to EVIOC[GS]KEYCODE[_V2] is made. That guess might be wrong, meaning we break userspace.
- The change means that the protocol number definitions become part of the userspace ABI and API. Any changes to the protocol list (save for additions) are therefore impossible once a change like this has been made.
The userspace breakage is unavoidable if we ever
want to escape the current protocol-less situation since, as noted
above, we have no choice but to guess when ioctl()
calls are made where the protocol is missing.
However, it should be noted that LIRC applications are unaffected, as are any “RC unaware” applications which only consume input events. The damage is limited to applications which manipulate the kernel keytables (and if those tools fail, then input consumers will of course be affected).
Fortunately, that means that it is essentially only ir-keytable which is affected and needs to be updated. Updating various tools as the kernel is updated is a job that distributions are good at.
There are two possible solutions for the NEC
protocol issue. One is to simply perform the above mentioned change to
the new ioctl() structs and export three different
NEC protocols to userspace. The alternative solution is to use the
opportunity provided by the changeover to the new structs to also move
over to always use NEC32 in the kernel and in the
kernel ⇄ userspace interface.
The advantage of the first solution is that it requires less change to
existing drivers and that some “cooked” hardware only
supports a subset of the NEC protocol types
(usually, the hardware only supports NEC16 and
only reports two bytes to the kernel driver). Separating the
NEC protocols in the API makes
it simpler to report that this is the case to userspace (not that there
is much userspace can do with that knowledge).
The major advantage of the second solution is that we can remove a lot
of pointless code from the kernel and simplify the kernel ⇄ userspace
communication at the same time, since the NEC32
representation allows an unambiguous definition of any scancode. At
present, the duplicated NEC
“decoding” logic in the kernel amounts to around 300 lines of code,
and each additional driver risks adding more.
Of course, userspace programs would still be free to represent
NEC scancodes in a suitable manner as 16, 24 and
32 bit variations. We're just talking about the kernelspace ⇄ userspace
API here.
To use an analogy, the current situation is as if the kernel reported the IPv6 loopback address “::1/128” as some kind of special protocol because it “knows” that it has a special meaning.