Previously, I covered some basic issues about the kernel, scheduling, preemption and preempt_rt in part 1. In this presentation, preempt_rt is explained in more detail.
Since only one process per CPU can run at any one time, multitasking operating systems use a concept called multiprogramming to schedule time for each process to run on a CPU. A scheduler is responsible for giving each process time on the CPU. When the current time slice expires, the scheduler suspends the current process and gives CPU time to the next one. Some scheduling systems include:
First Come First Served, Shortest Process Next, Shortest Remaining Time, Round Robin Scheduling, Preemption, Priority Scheduling
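The time-slice mechanism described above can be sketched as a small simulation. This is only an illustration of round-robin ordering, not how the Linux scheduler is implemented; the function name `round_robin` and the `NAME:BURST` argument format are assumptions made up for this example.

```sh
# round_robin QUANTUM NAME:BURST...
# Simulates round-robin scheduling: the task at the head of the queue runs
# for at most QUANTUM time units, then goes to the back if work remains.
# The NAME:BURST format is an illustrative assumption.
round_robin() {
    quantum=$1
    shift
    clock=0
    while [ "$#" -gt 0 ]; do
        task=$1
        shift
        name=${task%%:*}
        left=${task##*:}
        if [ "$left" -le "$quantum" ]; then
            # The task completes within this time slice.
            clock=$((clock + left))
            echo "$name finished at t=$clock"
        else
            # Time slice expired: preempt and requeue with the remaining work.
            clock=$((clock + quantum))
            set -- "$@" "$name:$((left - quantum))"
        fi
    done
}

round_robin 2 A:5 B:3 C:1
```

With a quantum of 2, the short task C finishes first (t=5), then B (t=8), then A (t=9), even though A was queued first.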
In computing, preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such a change is known as a context switch. It is normally carried out by a privileged task or part of the system known as a preemptive scheduler, which has the power to preempt, or interrupt, and later resume, other tasks in the system.
In the Linux kernel, the scheduler is called after each timer interrupt (that is, quite a few times per second). It determines what process to run next based on a variety of factors, including priority, time already run, etc.
Pros/Cons: Making a scheduler preemptible has the advantage of better system responsiveness and scalability, but comes with the disadvantage of race conditions (a race occurs when the correctness of the program depends on one thread reaching point x before another thread reaches point y).
The term preemptive multitasking is used to distinguish a multitasking operating system, which permits preemption of tasks, from a cooperative multitasking system wherein processes or tasks must be explicitly programmed to yield when they do not need system resources.
Although multitasking techniques were originally developed to allow multiple users to share a single machine, it soon became apparent that multitasking was useful regardless of the number of users. Many operating systems, from mainframes down to single-user personal computers and no-user control systems (like those in robotic spacecraft), have recognized the usefulness of multitasking support for a variety of reasons. Multitasking makes it possible for a single user to run multiple applications at the same time, or to run “background” processes while retaining control of the computer.
A kernel is a computer program that manages input/output requests from software, and translates them into data processing instructions for the central processing unit and other electronic components of a computer.
Monolithic kernels, which have traditionally been used by Unix-like operating systems, contain all the operating system core functions and the device drivers (small programs that allow the operating system to interact with hardware devices, such as disk drives, video cards and printers). This is the traditional design of UNIX systems. A monolithic kernel is one single program that contains all of the code necessary to perform every kernel related task.
The main disadvantages of monolithic kernels are the dependencies between system components – a bug in a device driver might crash the entire system.
A microkernel runs most of the operating system’s background processes in user space, to make the operating system more modular and, therefore, easier to maintain.
The main criticisms of monolithic kernels from microkernel advocates are that:
Kernel preemption is a method used mainly in monolithic and hybrid kernels where all or most device drivers are run in kernel space, whereby the scheduler is permitted to forcibly perform a context switch (i.e. preemptively schedule; on behalf of a runnable and higher priority process) on a driver or other part of the kernel during its execution, rather than co-operatively wait for the driver or kernel function (such as a system call) to complete its execution and return control of the processor to the scheduler.
Priority inversion: a lower-priority process effectively blocks a higher-priority one. The lower-priority process's ownership of a lock prevents the higher-priority process from running.
The solution to priority inversion is priority inheritance: temporarily raise the priority of the process holding the lock to that of the highest-priority process waiting for it, until the lock is released.
Kernel mode: The Linux kernel System Call Interface, Process scheduling subsystem, IPC subsystem, Memory management subsystem, Virtual files subsystem, Network subsystem, Other components
User mode: System daemons (daemon: in multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user), Windowing system, C standard library, Other libraries.
Priority inheritance support has been available since Linux 2.6.18.
The Linux kernel is a preemptive operating system. When a task runs in user-space mode and gets interrupted by an interrupt, if the interrupt handler wakes up another task, that task can be scheduled as soon as we return from the interrupt handler.
However, when the interrupt comes while the task is executing a system call, this system call has to finish before another task can be scheduled. By default, the Linux kernel does not do kernel preemption.
This means that the time before which the scheduler will be called to schedule another task is unbounded.
Hard real-time systems: required to complete a critical task within a guaranteed amount of time.
Soft real-time systems: require that critical processes receive priority over less fortunate ones.
Real-time applications have operational deadlines between some triggering event and the application’s response to that event. To meet these operational deadlines, programmers use real-time operating systems (RTOS) on which the maximum response time can be calculated or measured reliably for the given application and environment. A typical RTOS uses priorities. The highest priority task wanting the CPU always gets the CPU within a fixed amount of time after the event waking the task has taken place.
Traditionally, the Linux kernel allows one process to preempt another only under certain circumstances:
If kernel code is executing when some event takes place that requires a high priority thread to start executing, the high priority thread cannot preempt the running kernel code until the kernel code explicitly yields control. In the worst case, the latency could potentially be hundreds of milliseconds or more.
The Linux 2.6 configuration option CONFIG_PREEMPT_VOLUNTARY introduces checks to the most common causes of long latencies, so that the kernel can voluntarily yield control to a higher priority task waiting to execute. This can be helpful, but while it reduces the occurrences of long latencies (hundreds of milliseconds to potentially seconds or more), it does not eliminate them. However, unlike CONFIG_PREEMPT (discussed below), CONFIG_PREEMPT_VOLUNTARY has a much lower impact on the overall throughput of the system. (As always, there is a classical tradeoff between throughput, that is, the overall efficiency of the system, and latency. With the faster CPUs of modern-day systems, it often makes sense to trade off throughput for lower latencies, but server class systems that do not need minimum latency guarantees may very well choose to use either CONFIG_PREEMPT_VOLUNTARY, or to stick with the traditional non-preemptible kernel design.)
The 2.6 Linux kernel has an additional configuration option, CONFIG_PREEMPT, which causes all kernel code outside of spinlock-protected regions and interrupt handlers to be eligible for non-voluntary preemption by higher priority kernel threads. With this option, worst case latency drops to (around) single digit milliseconds, although some device drivers can have interrupt handlers that will introduce latency much worse than that. If a real-time Linux application requires latencies smaller than single-digit milliseconds, use of the CONFIG_PREEMPT_RT patch is highly recommended.
The RT-Preempt patch converts Linux into a fully preemptible kernel. The magic is done with:
Architectures supported by the CONFIG_PREEMPT_RT patch:
There are systems representing the x86, x86_64, ARM, MIPS, and Power architectures using the CONFIG_PREEMPT_RT patch. However, in many ways this is the wrong question. Support for real-time is not just about the instruction set architecture, but also about supporting the high resolution timer provided by the CPU and/or CPU support chipset, the device drivers for the system being well behaved, etc.
Please refer to the "platforms tested and in use with CONFIG_PREEMPT_RT" section in this wiki for a list of platforms that members of the -rt community have used successfully.
The normal Linux kernel allows preemption of a task by a higher priority task only when the user space code is getting executed.
In order to reduce the latency, the CONFIG_PREEMPT_RT patch forces the kernel to non-voluntarily preempt the task at hand on the arrival of a higher priority kernel task. This is bound to reduce the overall throughput of the system, since there will be more context switches and the lower priority tasks won't get much of a chance to run.
This is the current status of Realtime Linux using the Realtime Preempt patches (aka PREEMPT_RT):
1) Since kernel 2.6.24 in mainline Linux
2) Since kernel 2.6.25 in mainline Linux
3) Realtime-Preempt patches 184.108.40.206-rt15 or higher required
4) Since kernel 2.6.30 in mainline Linux
5) Not yet adapted to generic interrupt code
6) Since kernel 2.6.33 in mainline Linux
7) Since kernel 2.6.39 in mainline Linux
    ketchup 3.12.6
    ketchup -s 2.6-rt    # find latest RT revision
    bzcat ../patch-220.127.116.11-rt11.bz2 | patch -p1
    make
    make modules
    make modules_install
    make install
No Forced Preemption (Server): This is the traditional Linux preemption model, geared towards throughput. It will still provide good latencies most of the time, but there are no guarantees and occasional longer delays are possible. Select this option if you are building a kernel for a server or scientific/computation system, or if you want to maximize the raw processing power of the kernel, irrespective of scheduling latencies.
Voluntary Kernel Preemption (Desktop): This option reduces the latency of the kernel by adding more “explicit preemption points” to the kernel code. These new preemption points have been selected to reduce the maximum latency of rescheduling, providing faster application reactions, at the cost of slightly lower throughput. This allows reaction to interactive events by allowing a low priority process to voluntarily preempt itself even if it is in kernel mode executing a system call. This allows applications to run more ‘smoothly’ even when the system is under load. Select this if you are building a kernel for a desktop system.
Preemptible Kernel (Low-Latency Desktop): This option reduces the latency of the kernel by making all kernel code (that is not executing in a critical section) preemptible. This allows reaction to interactive events by permitting a low priority process to be preempted involuntarily even if it is in kernel mode executing a system call and would otherwise not be about to reach a natural preemption point. This allows applications to run more ‘smoothly’ even when the system is under load, at the cost of slightly lower throughput and a slight runtime overhead to kernel code. Select this if you are building a kernel for a desktop or embedded system with latency requirements in the milliseconds range.
Preemptible Kernel (Basic RT): This option is basically the same as (Low-Latency Desktop) but enables changes which are preliminary for the full preemptible RT kernel.
Fully Preemptible Kernel (RT): All and everything.
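To see which of the preemption models described above a given kernel was built with, one can inspect its `.config` file. The following is a minimal sketch: the function name `preempt_model` is made up for this example, and the exact option names are an assumption that depends on the kernel and RT patch version (`CONFIG_PREEMPT_RT_FULL` comes from the RT patch set, `CONFIG_PREEMPT_RT` from newer kernels).

```sh
# preempt_model CONFIG_FILE
# Prints which preemption option is set to "y" in a kernel .config file.
# RT options are checked first because RT kernels also set CONFIG_PREEMPT=y.
# The option list is an assumption and varies between kernel versions.
preempt_model() {
    config=$1
    for opt in PREEMPT_RT_FULL PREEMPT_RT PREEMPT PREEMPT_VOLUNTARY PREEMPT_NONE; do
        if grep -q "^CONFIG_${opt}=y" "$config"; then
            echo "$opt"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

On many distributions the configuration of the running kernel is available as `/boot/config-$(uname -r)`, so `preempt_model "/boot/config-$(uname -r)"` is a typical call; that path is a distribution convention, not a guarantee.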
The content of this text is mainly compiled from various resources.
These topics are covered in presentation:
The Chipsee PandaBoard expansion set comes with an Ubuntu Precise kernel. Even though the modified files are already listed in the readme files, as far as I know there isn't any vanilla kernel integration of the expansion set drivers.
To achieve that, I analyzed some Linux kernel patches* from version 3.2 through version 3.10.80, and then modified both the Chipsee-provided files and the kernel files.
These are the changed files:
    + drivers/video/omap2/displays/panel-chipsee-dpi.c
    M drivers/video/omap2/displays/Kconfig
    M drivers/video/omap2/displays/Makefile
    M drivers/input/touchscreen/ads7846.c
    M arch/arm/mach-omap2/board-omap4panda.c
    M arch/arm/mach-omap2/dss-common.c
And this is the related commit:
Although it is not proper to implement platform specifics in dss-common.c, I didn't hesitate to do that since this is experimental work.
After patching the kernel, one should calibrate the touchscreen. Tslib (http://processors.wiki.ti.com/index.php/Tslib) is the hardware handling utility, and it has the ts_calibrate tool for that. However, before executing ts_calibrate, some environment variables should be set:
    export TSLIB_CONFFILE=/etc/ts.conf
    export TSLIB_FBDEVICE=/dev/fb0
    export TSLIB_TSDEVICE=/dev/input/event0
    export TSLIB_PLUGINDIR=/usr/lib/ts
    export TSLIB_CALIBFILE=/etc/pointercal
Moreover, in TSLIB_CONFFILE, the module_raw input line has to be uncommented so that tslib reads from the Linux input event device.
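Since ts_calibrate depends on all of these variables, a quick check before running it can save some debugging. The helper below is a hypothetical sketch (the name `check_tslib_env` is not part of tslib); it only reports which of the variables listed above are unset or empty.

```sh
# check_tslib_env
# Prints each tslib variable that is unset or empty and returns non-zero
# if any is missing. Hypothetical helper, not part of tslib itself.
check_tslib_env() {
    missing=0
    for var in TSLIB_CONFFILE TSLIB_FBDEVICE TSLIB_TSDEVICE \
               TSLIB_PLUGINDIR TSLIB_CALIBFILE; do
        # Indirect lookup of the variable named in $var (POSIX sh has no ${!var}).
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical use would be `check_tslib_env && ts_calibrate`, so that calibration only starts when the environment is complete.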
After all of that, to run a Qt application on the Linux framebuffer, these arguments should be passed:
-platform linuxfb -plugin evdevkeyboard:/dev/input/eventX -plugin evdevmouse:/dev/input/eventY -plugin tslib:/dev/input/eventZ
The X, Y and Z numbers can be looked up in /sys/class/input.
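One way to do that lookup is to print the device name behind every event node. This is a small sketch under the assumption that each `eventN` directory exposes its name at `device/name`, as sysfs normally does; the function name `list_input_events` is made up for this example.

```sh
# list_input_events [SYSFS_DIR]
# Prints "eventN: device name" for every event node found under SYSFS_DIR
# (default /sys/class/input), so X, Y and Z can be picked by device name.
list_input_events() {
    dir=${1:-/sys/class/input}
    for node in "$dir"/event*; do
        name_file=$node/device/name
        # Skip nodes without a readable name (also covers an unmatched glob).
        [ -r "$name_file" ] || continue
        printf '%s: %s\n' "${node##*/}" "$(cat "$name_file")"
    done
}
```

Running `list_input_events` without arguments on the target lists the real devices; the number after `event` is what goes into the `-plugin` arguments.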
*: Here are the kernel patches that I analyzed:
While building bash 4.3.30 with Buildroot, the following errors may occur:
    bashline.o: In function `bash_event_hook':
    bashline.c:(.text+0x2328): undefined reference to `rl_signal_event_hook'
    bashline.o: In function `bash_execute_unix_command':
    bashline.c:(.text+0x2d5c): undefined reference to `rl_executing_keyseq'
    bashline.o: In function `bashline_set_event_hook':
    bashline.c:(.text+0x3734): undefined reference to `rl_signal_event_hook'
    bashline.o: In function `bashline_reset_event_hook':
    bashline.c:(.text+0x3748): undefined reference to `rl_signal_event_hook'
    bashline.o: In function `initialize_readline':
    bashline.c:(.text+0x4464): undefined reference to `rl_filename_stat_hook'
    bashline.o: In function `attempt_shell_completion':
    bashline.c:(.text+0x4b0c): undefined reference to `rl_filename_stat_hook'
    bashline.o: In function `bashline_reset':
    bashline.c:(.text+0x4bb8): undefined reference to `rl_filename_stat_hook'
    bashline.c:(.text+0x4bc0): undefined reference to `rl_signal_event_hook'
    bashline.o: In function `command_word_completion_function':
    bashline.c:(.text+0x572c): undefined reference to `rl_filename_stat_hook'
    collect2: error: ld returned 1 exit status
    make: *** [bash] Error 1
This error may depend on your external toolchain; however, the solution is as easy as making a symlink. The error comes from the readline library: it seems that readline 6.2 and 6 (afaik) do not provide the symbols above.
In the Buildroot sysroot directory, libreadline.so exists in usr/lib/arm-linux-gnueabihf and is linked to lib/arm-linux-gnueabihf/libreadline.so.6, which in turn is linked to libreadline.so.6.2. Changing this link to point to libreadline.so.6.3:
    cp usr/lib/libreadline.so.6.3 lib/arm-linux-gnueabihf/
    cd usr/lib
    ln -sf libreadline.so.6.3 libreadline.so.6
should solve the build problem.
libcgroup is an abstraction layer over Linux cgroups. The standard cross-compile steps would normally be enough:
    ./configure --host=arm-linux-gnueabihf
    make
However, this error occurs during compilation:
    make: Entering directory `/home/arcelik/1511/tools/libcgroup-0.41/src/daemon'
    CC cgrulesengd.o
    CCLD cgrulesengd
    cgrulesengd.o: In function `cgre_store_unchanged_process':
    /home/arcelik/1511/tools/libcgroup-0.41/src/daemon/cgrulesengd.c:310: undefined reference to `rpl_realloc'
    cgrulesengd.o: In function `cgre_store_parent_info':
    /home/arcelik/1511/tools/libcgroup-0.41/src/daemon/cgrulesengd.c:223: undefined reference to `rpl_realloc'
    ../../src/.libs/libcgroup.so: undefined reference to `rpl_malloc'
    collect2: error: ld returned 1 exit status
The undefined references to rpl_malloc and rpl_realloc can be fixed by explicitly telling the configure script that working malloc and realloc functions exist. When cross-compiling, configure cannot run its malloc/realloc test programs, so it pessimistically assumes the functions are unusable and substitutes rpl_malloc and rpl_realloc, which are not provided here. With the cache variables preset, the linking error above should not happen.
However, the target platform has to be a glibc system to avoid runtime errors in the future.
    ac_cv_func_malloc_0_nonnull=yes ac_cv_func_realloc_0_nonnull=yes ./configure --host=arm-linux-gnueabihf
    make
Note: an extension of the Linaro 2014.09 toolchain was used for cross-compiling. https://github.com/eckucukoglu/arm-linux-gnueabihf.git
There are already some blog posts and articles about SELinux mode configuration, which can easily be found by searching Google for “how to enable/disable selinux” or “how to configure selinux”. Moreover, The SELinux Notebook 4th edition has information about SELinux modes and global configuration files in chapters 2.15 and 3.2.1, respectively. However, I think SELinux mode configuration has some controversial points, and none of these resources is good enough to clear up the confusion.
First of all, the Linux kernel has some configuration options which allow or disallow SELinux to be disabled or enabled. These options are:
So far, we know that these arguments can be passed to the kernel: selinux=0 or selinux=1, and enforcing=0 or enforcing=1.
Additionally, from Dan Walsh’s blogpost, there is also a boot argument to relabel the system with exact security contexts.
Furthermore, another way to configure SELinux modes is the global configuration file, /etc/selinux/config:
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
SELinux modes can be changed through these options. However, which option overrides which is confusing.
There are two ways to switch between permissive and enforcing at runtime. Note that after a reboot these settings will be overridden by the system defaults. Moreover, switching to permissive/enforcing mode is only possible if SELinux is not disabled.
    # switching to enforcing
    echo 1 > /sys/fs/selinux/enforce
    # switching to permissive
    echo 0 > /sys/fs/selinux/enforce
or, equivalently:
    # switching to enforcing
    setenforce 1    # or: setenforce Enforcing
    # switching to permissive
    setenforce 0    # or: setenforce Permissive
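The same selinuxfs node can also be read to find out the current mode. The sketch below assumes selinuxfs is mounted at /sys/fs/selinux, as in the commands above; the function name `selinux_mode` is made up, and treating a missing node as "disabled" is a simplifying assumption.

```sh
# selinux_mode [ENFORCE_FILE]
# Prints enforcing, permissive or disabled based on the selinuxfs
# "enforce" node. A missing node is treated as disabled, which is an
# assumption: it may also mean that selinuxfs is simply not mounted.
selinux_mode() {
    file=${1:-/sys/fs/selinux/enforce}
    if [ ! -r "$file" ]; then
        echo disabled
    elif [ "$(cat "$file")" = 1 ]; then
        echo enforcing
    else
        echo permissive
    fi
}
```

Calling `selinux_mode` after `setenforce 0` should then report `permissive`; where the policy tools are installed, `getenforce` gives the same answer.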
The second confusing thing about SELinux mode configuration is that even though kernel boot parameters override the config file, the exact opposite is also possible. To clarify these options, I ran some tests on my ARM platform. Note that I compiled the kernel with these configs:
    CONFIG_SECURITY_SELINUX=y
    CONFIG_SECURITY_SELINUX_BOOTPARAM=y
    CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
    CONFIG_SECURITY_SELINUX_DISABLE=y
    CONFIG_SECURITY_SELINUX_DEVELOP=y
    CONFIG_SECURITY_SELINUX_AVC_STATS=y
    CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
    # CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
So, the boot argument selinux=0 overrides /etc/selinux/config, whereas selinux=1 does not. Moreover, when enforcing=0 is passed, the system starts in permissive mode even though /etc/selinux/config includes SELINUX=enforcing. However, if the config file includes SELINUX=disabled, the system starts without SELinux. I think this is confusing and kind of inconsistent, but there should be a good rationale for it. Most people hardly ever run into these situations.
As mentioned in the gentoo:selinux tutorials and here, if the system was booted with SELinux disabled, the filesystem needs to be relabeled before SELinux can be enabled again. After disabling SELinux, switching straight back to enforcing mode crashes the system, since relabeling is not possible in enforcing mode. Therefore, switching from disabled to enabled is only possible through permissive mode. After booting in permissive mode, the filesystem can be relabeled; a kernel boot parameter can force the system to relabel, too. In my experiments, after disabling SELinux, passing enforcing=1 as a kernel parameter causes a kernel panic (as expected). Here are the logs:
    [ 8.245513] SELinux: Disabled at runtime.
    [ 8.474853] type=1404 audit(946686015.345:2): selinux=0 auid=4294967295 ses=4294967295
    can't load SELinux Policy. Machine is in enforcing mode. Halting now.
If your intention is to disable SELinux permanently and never enable it again, then, even though it is not recommended, passing selinux=0 as a kernel boot parameter is the best option. In this case, the kernel boot argument overrides the options in /etc/selinux/config. However, unless SELinux is intended to be disabled, the proper action is to pass selinux=1 (or nothing) as the kernel boot parameter and to modify /etc/selinux/config according to your intention.
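The precedence observed in these experiments can be summarized as a small decision function. This is only a sketch of the behavior described in this post on one ARM setup, not a specification of SELinux; the function name `effective_selinux_mode` is made up, and the case of enforcing=1 combined with SELINUX=permissive was not tested above, so its result here is an assumption based on boot parameters overriding the config file.

```sh
# effective_selinux_mode CMDLINE CONFIG_VALUE
# CMDLINE is the kernel command line, CONFIG_VALUE the SELINUX= value from
# /etc/selinux/config. Encodes only the precedence observed in this post.
effective_selinux_mode() {
    cmdline=" $1 "
    config=$2
    # selinux=0 on the command line always wins.
    case $cmdline in
        *" selinux=0 "*) echo disabled; return ;;
    esac
    # SELINUX=disabled in the config file wins over selinux=1 and enforcing=0.
    if [ "$config" = disabled ]; then
        case $cmdline in
            # No policy is loaded while the kernel insists on enforcing.
            *" enforcing=1 "*) echo halt ;;
            *) echo disabled ;;
        esac
        return
    fi
    case $cmdline in
        *" enforcing=0 "*) echo permissive ;;
        *" enforcing=1 "*) echo enforcing ;;  # assumption: untested above
        *) echo "$config" ;;
    esac
}
```

For example, `effective_selinux_mode "selinux=1 enforcing=0" enforcing` prints `permissive`, matching the second observation above.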
In Linux, it is possible to extend the size of a disk image without losing its existing content.
For example, to extend a rootfs image to roughly 1 GB (bs=1M with seek=1000 gives 1000 MiB):
    dd if=/dev/zero of=rootfs.ext2 bs=1M seek=1000 count=0
    e2fsck -f rootfs.ext2
    resize2fs rootfs.ext2
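The reason the dd step is safe is that with count=0 nothing is copied; dd only sets the file length to the seek offset. The sketch below isolates that step so it can be tried on a scratch file first. The function name `grow_image` is made up, and the block size is spelled out in bytes instead of 1M purely for portability across dd implementations.

```sh
# grow_image FILE SIZE_MIB
# Extends FILE to SIZE_MIB mebibytes without rewriting existing data:
# with count=0, dd copies nothing and only sets the file length at the
# seek offset. Only meant for growing; a smaller size would cut the file.
grow_image() {
    dd if=/dev/zero of="$1" bs=1048576 seek="$2" count=0 2>/dev/null
}
```

For a real ext2 image, `grow_image rootfs.ext2 1000` would be followed by the e2fsck and resize2fs commands shown above, so that the filesystem is grown to fill the enlarged file.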
Redirection (>, <) and pipe (|) are both used to pass on the output of a process. However, there is one fundamental difference between them: redirection passes output to a stream, which can also be a file, whereas a pipe passes the output of one process to another process.
For example, this command does that:
dmesg | grep selinux > temp.log
In fact, the above command can also be written like this:
dmesg > dmesg.out && grep selinux < dmesg.out > temp.log
This does the same job, except that it also writes the dmesg output to a file named dmesg.out. Using a pipe instead of double redirection makes the job easier.
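The equivalence is easy to check without root access or a real kernel log. In the sketch below, the `log_lines` function stands in for dmesg and prints a few made-up lines; the file names are likewise just placeholders in a temporary directory.

```sh
# Stand-in for dmesg: prints a few made-up log lines.
log_lines() {
    printf '%s\n' "audit: selinux=1" "usb 1-1: new device" "selinux: policy loaded"
}

workdir=$(mktemp -d)

# Pipe: grep reads the producer's output directly.
log_lines | grep selinux > "$workdir/piped.log"

# Redirection only: the output goes through an intermediate file.
log_lines > "$workdir/all.out" && grep selinux < "$workdir/all.out" > "$workdir/redirected.log"

# Both approaches yield the same result.
cmp "$workdir/piped.log" "$workdir/redirected.log" && echo "identical"
```

The only lasting difference is the extra `all.out` file left behind by the redirection variant.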