Using Linux Kernel Sequence Files / Habr

A characteristic feature of modern programming is the use of the global network as a source of reference information, in particular, a source of patterns for solving unknown or little-known problems for a specific programmer. Such an approach saves a lot of time and often gives quite qualitative results. However, the solutions laid out in the network although usually correct, do not always take into account all the subtleties of solving a problem, which leads to the appearance in the source code of sections that usually work correctly, but under not quite standard circumstances become sources of unpleasant surprises.

Consider the topic of using sequence files in the Linux kernel, such files are considered to be the most convenient mechanism for printing from kernel mode. But in practice, using them correctly is much more difficult than you would think.

A lot of materials on this topic are available online. The best is the source code of the kernel itself which has quite detailed comments. The problem with this source of information is its volume. If you do not know exactly what to look for, it is better if you only have limited time, not to try at all. For me, when I became interested in the topic, Google provided several seemingly excellent sources of information relating to my search: the famous book The Linux Kernel Module Programming Guide and a series of articles by Rob Day. These sources are not new, but very solid.

Let's first consider in more detail when it is natural to use sequence files. The most common situation is to create your own file in the /proc file system. By reading the files of this system, you can get a variety of information about the equipment used, its drivers, RAM, processes, etc.

It would seem that the printout of anything is the simplest task in programming. But working in kernel mode the OS imposes many restrictions that may seem completely unimaginable to the developer of the application software. In kernel mode, the size of the print buffer is limited by the size of a virtual memory page. For the x86 architecture it is four kilobytes. Therefore, a good program when printing large amounts of data must first achieve filling the buffer to the maximum, then print it, and then repeat this iteration until the data for printing is completely exhausted. You can of course, print character by character, which would greatly simplify everything, but we are talking about good programs.

The above-mentioned sources were slightly worse than expected. In the book for example, some of the information turned out to be generally incorrect and that is what moved me to write this note. It is common to consider that information given in the form of a scheme-picture is the easiest to understand and use. However in this book the picture related to the subject is incorrect. Using such a scheme can lead to serious errors, although the example in the book works correctly and follows this very scheme. This is due to the fact that in this example, only a few bytes are printed at a time when /proc/iter is accessed. If you use it as a template for printing texts larger than a page of memory, there will be surprises. The above-mentioned series of articles does not contain obvious errors, but does not report on some details that are important for understanding the topic.

So, let's first consider the correct scheme of how to work with a sequence file.

To work with such a file, you need to create the functions start(), stop(), next() and show(). The names of these functions can be any, I chose the shortest words that correspond in meaning to the actions of the functions. When such functions are present and properly connected to kernel systems, they start working automatically when accessing the file associated with them in the /proc directory. The most confusing thing is the use of the stop() function, which can be called in three different contexts. Calling it after start() means ending the print job. Calling it after show() means that the last print operation to the buffer (usually the seq_printf function is used for this) overflowed the page buffer and this print operation was canceled. Its call after next() is the most interesting case that occurs when printing some data to the buffer ends and you need to either finish the job or use new data. For example, suppose that our file in the /proc directory, when accessing it, first produces some information on block devices, and then on character ones. Firstly, the start() function initializes printing for block devices, and the next() and, possibly, show() functions use this initialization data to print step-by-step information about the block devices. When everything is ready, after the last call to next(), the considered call to stop() will be made, after which start() is called, which this time should already initiate further printing for character devices.

I give a slightly modified example (the contents of the file evens.c) from the article by Rob Day. I had to replace the call of a function, which is absent in modern kernels with its actual equivalent. The comments are also slightly changed.

#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/fs.h>
#include <linux/seq_file.h>
#include <linux/slab.h>

static int limit = 10; //default value, it can be changed here or
module_param(limit, int, S_IRUGO); //transfered as a module parameter

static int* even_ptr; //we will work with dynamic memory

/**
 * start
 */
static void *ct_seq_start(struct seq_file *s, loff_t *pos) {
     printk(KERN_INFO "Entering start(), pos = %Ld, seq-file pos = %lu.\n", *pos, s->count);

     if (*pos >= limit) {     // are we done?
         printk(KERN_INFO "Apparently, we're done.\n");
         return NULL;
     }
     //Allocate an integer to hold our increasing even value
     even_ptr = kmalloc(sizeof(int), GFP_KERNEL);

     if (!even_ptr)     // fatal kernel allocation error
         return NULL;

     printk(KERN_INFO "In start(), even_ptr = %pX.\n", even_ptr);
     *even_ptr = (*pos)*2;
     return even_ptr;
}

/**
 * show
 */
static int ct_seq_show(struct seq_file *s, void *v) {
     printk(KERN_INFO "In show(), even = %d.\n", *(int*)v);
     seq_printf(s, "The current value of the even number is %d\n", *(int*)v);
     return 0;
}

/**
 * next
 */
static void *ct_seq_next(struct seq_file *s, void *v, loff_t *pos) {
     printk(KERN_INFO "In next(), v = %pX, pos = %Ld, seq-file pos = %lu.\n", v, *pos, s->count);

     (*pos)++;              //increase my position counter
     if (*pos >= limit)     //are we done?
          return NULL;

     *(int*)v += 2;         //to the next even value

     return v;
 }

/**
 * stop
 */
static void ct_seq_stop(struct seq_file *s, void *v) {
     printk(KERN_INFO "Entering stop().\n");

     if (v)
         printk(KERN_INFO "v is %pX.\n", v);
     else
         printk(KERN_INFO "v is null.\n");

     printk(KERN_INFO "In stop(), even_ptr = %pX.\n", even_ptr);

     if (even_ptr) {
         printk(KERN_INFO "Freeing and clearing even_ptr.\n");
         kfree(even_ptr);
         even_ptr = NULL;
     } else
         printk(KERN_INFO "even_ptr is already null.\n");
}

/**
 * This structure gathers functions which control the sequential reading
 */
static struct seq_operations ct_seq_ops = {
     .start = ct_seq_start,
     .next  = ct_seq_next,
     .stop  = ct_seq_stop,
     .show  = ct_seq_show
};

/**
 * This function is called when a file from /proc is opened
 */
static int ct_open(struct inode *inode, struct file *file) {
     return seq_open(file, &ct_seq_ops);
};

/**
 * This structure gathers functions for a /proc-file operations
 */
static struct file_operations ct_file_ops = {
     .owner   = THIS_MODULE,
     .open    = ct_open,
     .read    = seq_read,
     .llseek  = seq_lseek,
     .release = seq_release
};

/**
 * This function is called when this module is loaded into the kernel
 */
static int __init ct_init(void) {
     proc_create("evens", 0, NULL, &ct_file_ops);
     return 0;
}

/**
 * This function is called when this module is removed from the kernel
 */
static void __exit ct_exit(void) {
     remove_proc_entry("evens", NULL);
}

module_init(ct_init);
module_exit(ct_exit);

MODULE_LICENSE("GPL");

Functions to work with a sequence file use two pointers with overlapping functionality (this is also somewhat confusing). One of them should point to the current object to be printed to the buffer by show() – it is `v'-pointer in the program. The other pointer `pos' is usually used to point to the counter.

For those who may for the first time want to run their program in kernel mode, I give an example of a Makefile for a successful build. Of course, for a successful build, you must have Linux kernel source headers in the system.

obj-m += evens.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Connecting to the kernel is done with the command sudo insmod evens.ko, checking the functionality of the /proc/evens-file, that appeared after this, with the command cat /proc/evens, reading the event log explaining the system operations with the command sudo cat /var/log/messages.

To overflow the page buffer, set the limit parameter to a larger value, for example, 200. This value can be entered into the text of the program or used when loading the module with a command sudo insmod events.ko limit=200.

Log analysis can explain the remaining unclear points. For example, you may notice that before calling stop() after next() or start(), the system zeroes the variable `v'. You may also notice that before calling start() after stop(), the system prints the contents of the buffer.

I would be grateful if someone would report any inaccuracies found in my note or anything else that should be mentioned.

The source code is also available here.