(Haiku Blog-O-Sphere) Bits and Pieces: The Small BCardLayout
2012-01-21
21:19
A short post about something that's not really documented. When working on a communication application for Haiku, I needed to create a typical configuration wizard window. I required a few views to be present in one spot, with only one being shown at a time - and with the ability to switch between them when the user presses the Next/Prev buttons. Since Haiku exports a neat layout API, I wanted to use one of its layouts if at all possible. And then I found the BCardLayout.
Come visit my Haiku Blog-O-Sphere page and read my new blog-entry - Bits and Pieces: The Small BCardLayout.
Really short post, but still it's something that's not really mentioned anywhere. Have a read if you're interested in Haiku operating system programming. Enjoy!
Plymouth bits
2011-12-29
16:27
Quite recently I had the need and 'pleasure' of playing around with the Plymouth bootsplash. For those that don't know, Plymouth is an application which runs very early during the boot process and displays either a textual or a graphical boot animation, hiding the actual boot process in the background.
There isn't much documentation available on the configuration and installation process - usually this is done by system distributors, not users themselves. As noted on the homepage, Plymouth isn't really designed to be built from source by end users. You can find some basic how-tos around the internet, but today I would like to concentrate on the few bits that are harder to find.
For Plymouth to work correctly, it needs to be included in the initial ramdisk of the kernel (initrd/initramfs) - since we want the bootsplash to appear even before the actual filesystem is available. After building/cross-compiling Plymouth to the platform of interest, all that is left is installing it to the initial filesystem.
The source package for Plymouth has an INSTALL file with some basic information on how to configure, build Plymouth, prepare the initramfs and how to proceed in the init scripts with running plymouth during the boot process.
In Ubuntu-based systems, the plymouth and initramfs-tools packages provide some interesting tools for this in the /usr/share/initramfs-tools directory. The hooks/plymouth script can be used for installing the necessary plymouth files into the selected initramfs directory from the currently running system (this can be done, for instance, by a DESTDIR="/initramfs" hooks/plymouth call). This hook copies all libraries for both the text and graphical themes that are currently selected in the system. The script is also a useful hint as to which files are needed.
On the other hand, the scripts/ subdirectory includes some Plymouth initialization scripts used by the Ubuntu initramfs. Look into the scripts/init-top/plymouth, scripts/init-bottom/plymouth and scripts/panic/plymouth files for more information.
After preparing the initramfs, there are a few things that are useful to know (thanks to Steve Langasek for clearing things up for me!):
- For the splash-screen to be visible, the kernel needs to be given the splash boot parameter. Otherwise, no splash screen will be shown.
- When the graphical theme cannot be used, Plymouth falls back to the default text theme available.
- Adding a console= command line parameter might confuse Plymouth ('might' is the keyword here, since it's only something I've been told by someone)
- For the graphical theme - not every library as noted by the hooks/plymouth script is needed for Plymouth to work. Most of these libraries are needed for the label plugin to work, not the actual splash screen.
The last point is important when for some reason we cannot fit all these libraries in our initramfs. Take, for instance, the ubuntu-logo theme available in the Ubuntu repositories: besides the actual graphical theme, a large number of X and font libraries are installed as well, all of which add up to several additional megabytes.
These are required by the label.so plugin, which is responsible for displaying text messages during the boot phase. It is used when an error appears and the user needs to be notified or prompted for interaction. If such features are not required, the label.so, fonts and its dependencies (most of which are installed in /usr/lib and /usr/share in the initramfs directory) can be omitted. All that is required are the core graphics libraries and renderers.
For instance, in the case of ubuntu-logo, the required files would be: ubuntu-logo/*, script.so, renderers/*, ubuntu-logo.png and the like.
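As a rough sketch of the trimming step, here is a small helper that removes the label plugin from an initramfs staging directory. The helper name prune_label_plugin and the directory layout of the mock tree are assumptions made up for this illustration; the real locations depend on the distribution, and the fonts and library dependencies would still have to be removed separately.

```shell
# Hypothetical helper: delete every label.so found below a staging directory
# and print the paths that were removed.
prune_label_plugin() {
    root=$1
    find "$root" -type f -name 'label.so' -print -exec rm -f {} +
}

# Demo on a mock staging tree (the layout below is invented for the example).
staging=$(mktemp -d)
mkdir -p "$staging/lib/plymouth/renderers" \
         "$staging/lib/plymouth/themes/ubuntu-logo"
touch "$staging/lib/plymouth/label.so" \
      "$staging/lib/plymouth/script.so" \
      "$staging/lib/plymouth/renderers/frame-buffer.so"

prune_label_plugin "$staging"
```

After the call, script.so and the renderers are still in place, which matches the minimal set of files listed above.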
I think I'll play around with creating custom themes for Plymouth in my free time, since it seems quite easy. For now, I still have some work to do in other projects. I seem to be so busy lately - and this flu certainly isn't helping either. There might be a short Haiku-related post coming up in the next few days. Hope you'll enjoy it!
Maliit Input Method
2011-11-02
20:59
Recently, I did some experimenting with the available OSKs (on-screen keyboards), ultimately focusing my attention on Maliit. Maliit is an OSK project mainly known for its use on the MeeGo mobile platform - but in reality it can also be used as an input method for both Qt and GTK+ standard applications on any Linux-based operating system. Since the project is being actively developed and changes are made quite rapidly, a bit of work was needed to make it work for all possible IM cases. Nothing too complicated though. Let me help you dive into the world of Maliit.
Big thanks to all Maliit developers for their swift and professional help!
Maliit is certainly a great OSK framework. Its design and implementation allow for much customization, with the default environment providing 'all that is needed'. Let's proceed with a short guide on how to get started.
To get everything working for Qt, GTK-2.0 and GTK-3.0 all together, we will need the most recent version from the maliit-framework git repository (available here). The latest changes include the merging of meego-inputmethodbridges for GTK+ IM support, as well as new fixes for making GTK+ support work once again. Big thanks to Jon Nordby for finding and fixing the root cause of this issue. Cooperation regarding this bug was magnificent!
After fetching, compiling and installing the framework, we will also need the maliit-plugins package installed (most recent sources - here). Detailed instructions on how to build the sources can be found on the Maliit webpage, but in short - basically it's nothing more than just: qmake; make; sudo make install.
Maliit consists of the framework (API and the OSK server etc.) and plugins. Actually, a plugin is the keyboard that we see being displayed on screen. By default, a QML based plugin is built and used, but anyone is free to create their own plugins and use them instead.
How do we get Maliit up and running? As noted in the running Maliit section of their web page - after installing both packages, we need to make the application of interest use Maliit as the current input method. Of course, we can do it manually through the "Input method" context menu. Another way is just setting the QT_IM_MODULE (Qt) and GTK_IM_MODULE (GTK+) environment variables. The web page is slightly out of date here, as currently we should set both of them to Maliit.
If needed, we can use pam_env for setting the environment variables for all applications on the system (e.g. the /etc/environment file).
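A quick sketch of the environment-variable approach for a single shell session, using the value Maliit named above. The same two NAME=value pairs are what would go into /etc/environment for a system-wide setting.

```shell
# Select Maliit as the input method for Qt and GTK+ applications
# started from this shell.
export QT_IM_MODULE=Maliit
export GTK_IM_MODULE=Maliit

# Show what a child process would inherit.
env | grep -E '^(QT|GTK)_IM_MODULE='
```

Any application launched from this shell afterwards picks up both variables.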
Now we need to get the Maliit OSK server running. Just running maliit-server somewhere in the background is enough, as long as it is started as part of the current session. It is important for the server to be running all the time (for now, because of small problems with GTK+ support - those will be fixed soon), otherwise GTK+ applications will crash (bug report 23949). We can use XDG autostart for starting maliit-server on startup. An example /etc/xdg/autostart/maliit-server.desktop entry could look like this:
[Desktop Entry]
Name=Maliit OSK
Exec=/usr/bin/maliit-server -bypass-wm-hint
Terminal=false
Type=Application
X-GNOME-Autostart-Phase=Initialization
X-GNOME-AutoRestart=true
NoDisplay=true
Enabling GTK+ support sadly requires an additional step: the GTK+ input method module caches need to be updated. Currently the installation script does not do this for us, so we have to do it manually for two caches - one for GTK-2.0 and one for GTK-3.0. On an Ubuntu-based system (11.10 in my case), this would look like this:
eval `dpkg-architecture -s` # This is needed to get $DEB_HOST_MULTIARCH
/usr/bin/gtk-query-immodules-3.0 >/usr/lib/gtk-3.0/3.0.0/immodules.cache
/usr/bin/gtk-query-immodules-2.0 >/usr/lib/$DEB_HOST_MULTIARCH/gtk-2.0/2.10.0/gtk.immodules
Now, Maliit should be enabled for all GTK+ and Qt applications.
Maliit supports keyboard rotation (landscape, portrait) - although it's not really useful for Desktop uses. You can see how it works through the maliit-exampleapp-plainqt example application. Writing custom plugins for the framework is also very pleasant (I might return to this a bit later). I personally use the QML-based quick plugin with a few smaller modifications.
As far as on-screen keyboards go, I prefer Maliit over Onboard or Florence, so I certainly recommend giving it a try. Currently, the GTK+ input method has a bug that might pop up sometimes, making the plugin invisible until the server is restarted. This should be fixed soon enough, since jonnor and mikhas from Maliit already mentioned a way to hack-fix it. Based on their suggestions, I prepared a small patch as a fix - it's available here. If you encounter a problem with the OSK not re-appearing, apply this patch with patch -p1, rebuild and reinstall. Then, re-run the server with maliit-server -force-show -bypass-wm-hint and it should work. Have fun!
UPDATE! It seems Jon put up a merge request for an almost identical fix to the mainstream! Meaning soon no patching will be required. You can check out the merge request here. Thanks!
UPDATE! Automatic IM cache updating has been added! The patch I prepared for updating the Ubuntu caches during make install has been merged into the official tree. Jon added Fedora support as well. Magnificent!
Basic kernel debugging
2011-10-10
20:46
A modified kernel, a custom system - this can lead to the kernel not being able to boot properly. What to do in such a case? Usually we can try getting as much information as possible to locate the underlying problem. We can use some quite basic techniques to achieve our goal.
When working with a system that isn't prepared for more sophisticated debugging, it's best to just see what the kernel says and deduce which part causes the system to halt. In most cases, if there is a display device present and configured, we should be able to see the kernel messages on this device - if, of course, the respective kernel config variables are set (in this case, CONFIG_VGA_CONSOLE or CONFIG_FRAMEBUFFER_CONSOLE).
The case is different when a display is not present. We can either use a serial console or a net-console here, whichever is available. The easiest approach is using a serial console. We just need to be sure that our kernel configuration includes all necessary entries, such as CONFIG_SERIAL_CORE, CONFIG_SERIAL_CORE_CONSOLE and the respective serial drivers (e.g. CONFIG_SERIAL_8250 and CONFIG_SERIAL_8250_CONSOLE in the case of an 8250 UART chip). We then just append the console=ttyS[console number],[baud rate] parameter to the CONFIG_CMDLINE configuration and we're ready to go.
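Before building, it can save time to check that the options mentioned above are really enabled in the kernel .config. Below is a minimal sketch of such a check; the helper name check_kconfig is made up for this post, and the demo runs against a mock .config written to a temporary file rather than a real kernel tree.

```shell
# Hypothetical helper: report which of the given options are not set to "y"
# in a kernel .config file. Returns non-zero if at least one is missing.
check_kconfig() {
    config=$1; shift
    missing=0
    for opt in "$@"; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Demo against a mock .config in which the 8250 console option is disabled.
cfg=$(mktemp)
printf '%s\n' \
    'CONFIG_SERIAL_CORE=y' \
    'CONFIG_SERIAL_CORE_CONSOLE=y' \
    'CONFIG_SERIAL_8250=y' \
    '# CONFIG_SERIAL_8250_CONSOLE is not set' > "$cfg"

report=$(check_kconfig "$cfg" CONFIG_SERIAL_CORE CONFIG_SERIAL_CORE_CONSOLE \
    CONFIG_SERIAL_8250 CONFIG_SERIAL_8250_CONSOLE) || true
echo "$report"
```

The same helper can be pointed at any other set of options, as long as they are expected to be built in (=y) rather than built as modules.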
In some cases, however, the kernel halts even before we can see any actual output, for instance before the console driver or the video device is set up. In this case, we might get lucky by using the so-called earlyprintk mechanism. The Linux kernel has a feature allowing it to output messages to the serial console or VGA buffer directly, even before the real console code is initialized. This feature can be enabled by setting the CONFIG_EARLY_PRINTK variable in the kernel config and additionally providing an earlyprintk= parameter in the boot arguments. Its value can be either vga, or ttyS0/ttyS1 (with the baud rate added as necessary). After the real console is initialized, the earlyprintk console is disabled by default - but if you want, you can keep it running by appending a ,keep argument to the earlyprintk parameter. Most of the time, though, this is not needed.
This can give us a good overview of where the problem lies. There are also some flags and kernel command-line parameters which can aid us in debugging certain features, e.g. initcall_debug for making initcall execution more verbose. This can help a bit when the kernel hangs and we have trouble locating the source of the problem.
More useful parameters can be found in Documentation/kernel-parameters.txt in the kernel source.
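To make the parameters above a bit more concrete, here is what a debugging command line could look like, together with a tiny helper for pulling a single value out of it. The port ttyS0, the baud rate 115200 and the helper name cmdline_get are assumptions chosen only for this example.

```shell
# An example command line combining the parameters discussed above.
cmdline='console=ttyS0,115200 earlyprintk=ttyS0,115200,keep initcall_debug'

# Hypothetical helper: print the value of a key=value parameter,
# or return non-zero if the key is not present.
cmdline_get() {
    key=$1; line=$2
    for word in $line; do
        case $word in
            "$key"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

cmdline_get earlyprintk "$cmdline"
```

On a running system, the same helper could be fed the contents of /proc/cmdline to double-check what the bootloader actually passed.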
My common way for fast problem localization is using the usual "print it!" debugging, using printk()'s around suspicious kernel areas. Early printk's help in this as well.
If we know that the kernel itself has no problems but problems probably appear during or right after rootfs setup, we can also try preparing a small initramfs to include in our image instead. An initramfs is a file-system image that resides directly in the kernel image and is loaded into RAM during boot. We can then, with the available tools, try hacking the real rootfs manually. Busybox is a good choice for a fast, lightweight and working environment for the RAM file-system. To include an initramfs in our image, we need to set the CONFIG_BLK_DEV_INITRD config option and set CONFIG_INITRAMFS_SOURCE to point either to the directory to be included or to the .cpio archive with our prepared RAM rootfs. We will also need to specify whether the initramfs should be compressed or not, setting the necessary flags as needed. CONFIG_INITRAMFS_SOURCE also accepts files containing specifications for directories and device nodes to be created in the initramfs while building the kernel image. More about this can be found in Documentation/filesystems/ramfs-rootfs-initramfs.txt.
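As an illustration of the specification-file variant, the sketch below writes a small list in the format described in the kernel documentation mentioned above. The source paths initramfs/busybox and initramfs/init.sh are placeholders for wherever the prepared files live; CONFIG_INITRAMFS_SOURCE would then point at the generated file.

```shell
# Write an example initramfs specification file to a scratch location.
spec=$(mktemp)
cat > "$spec" <<'EOF'
# directories, a console device node, busybox and the init script
dir /dev 755 0 0
nod /dev/console 644 0 0 c 5 1
dir /bin 755 0 0
file /bin/busybox initramfs/busybox 755 0 0
slink /bin/sh busybox 777 0 0
dir /proc 755 0 0
dir /sys 755 0 0
file /init initramfs/init.sh 755 0 0
EOF

cat "$spec"
```

The device node entry is the handy part: the node is created while the kernel image is built, so no root privileges are needed to prepare it in a directory beforehand.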
Another useful tool is the SysRq magic key. If we configure our kernel with the CONFIG_MAGIC_SYSRQ option, we can use the specified key combination to command the kernel regardless of what it is currently doing (most of the time). The key combination varies from architecture to architecture, but from experience I know that usually it's the same ALT + SysRq + [command key] set. The SysRq key is also known on some keyboards as the Print Screen button. If you're working on a serial console, you can try sending the combination through the terminal instead: Documentation/sysrq.txt describes sending a BREAK followed by the command key.
Most useful commands:
- b - reboot the system immediately
- k - kill all programs on the current console - this might be useful if something holds up your system
- m - dump current memory info, useful for debugging memory issues
The SysRq mechanisms are very well documented in the Documentation/sysrq.txt file.
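When no keyboard is attached at all, the same commands can be issued by writing the command key to /proc/sysrq-trigger as root. The wrapper below is only a sketch: the function name sysrq and the SYSRQ_TRIGGER override are made up here, and the demo deliberately writes to a scratch file so that nothing is actually triggered.

```shell
# Hypothetical wrapper: send a single SysRq command key.
# SYSRQ_TRIGGER can be overridden for a dry run; by default it is the
# real trigger file, which requires root and CONFIG_MAGIC_SYSRQ.
sysrq() {
    printf '%s' "$1" > "${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
}

# Dry run against a scratch file instead of the real trigger.
SYSRQ_TRIGGER=$(mktemp)
sysrq m
cat "$SYSRQ_TRIGGER"; echo
```

With SYSRQ_TRIGGER unset, sysrq m would ask the kernel to dump the current memory info to the console.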
If we encounter a kernel Oops or, even worse, a kernel panic - it is also nice to know how to dig out as much information as possible from such a crash. When working on a remote device it is also wise to add the panic=[timeout] kernel parameter to our arguments. This way, when the kernel panics, the device will try to restart itself after the set timeout period, allowing us access to the bootloader without performing a hard reset. We can set it to a bigger value to still be able to analyze the crash log.
As for handling the actual crash, the kernel documentation again has a very nicely written guide to Oops handling. Check it out in Documentation/oops-tracing.txt in the source code.
There are cases in which all these methods are useless, and we need something more sophisticated and/or low-level. When such a need arises, we can try our chances with either kgdb-gdb debugging or Linux/gdb-aware JTAG hardware. But I will try to cover these some other time.
Most of the time printk's (and early printk's) will help in finding the problem. Sometimes some disassembly is necessary - for instance a closer look at some parts of the vmlinux image or, specifically, particular object files composing the bootable image. Kernel debugging is usually like crime-solving. It takes much effort, clue-searching, time and thinking. And, as it is also with crime-solving - sometimes you might simply fail. But one must try not to demotivate oneself. If all ideas have been already used up - take your time, switch context, and return with a fresh mindset after a while. This helps.
Gone Canonical
2011-09-08
19:11
Today's post is more private-life related than the others, but still technical in a way. I am proud to announce that I have officially joined the Canonical team as a Software Engineer! From now on, I will help enhance the overall Ubuntu experience, mostly working on their flagship Unity environment.

I intend to contribute to the Ubuntu community as much as possible, earnestly carrying out my new responsibilities. Time will tell how well I will do. But I hope for the best! Right now I'm preparing everything that is needed, since my work will commence on the 19th of September.
Taking this occasion, I would like to thank the whole ASN team for everything up to now. It was great working with you people, these few years were really magical. Anyway, these guys can do real magic with technology! So remember - if you're looking for someone to code something for you, ASN is probably the best choice there is. There is no challenge too big for this team.
I'll miss working with you! But no worries, I'll be around. Watching!
A little bit of profiling
2011-08-28
11:18
Code profiling is a very important aspect of computer programming - almost every software engineer knows that well. It helps in finding bottlenecks in your code and in telling which parts need improvement, which cause trouble etc. There are many tools for this purpose available around the internet. This short post lists a few of them and gives a brief introduction to a really simple and naive solution I made in the past.
Here are some tools helpful in application profiling:
- Valgrind - an excellent set of tools for profiling and memory-leak testing - especially callgrind and memcheck
- gprof - the GNU profiler, one of the most essential code profilers available, part of GNU Binutils
- QTestLib - for Qt applications, the QTest unit-testing framework also provides benchmarking functionality
In the past, while working on the Flatconf project for ASN, I needed to do some fast but non-complicated time-profiling for my application. Since I was mostly interested in just knowing the average time spent in selected functions and I somehow couldn't make the existing tools do what I wanted - I wrote a really simple library for the time profiling I needed.
The timeprof profiler uses the instrumentation callbacks offered by gcc. The library itself isn't very interesting, since it was written in a short period of time just to measure the time spent in called functions, but it nicely shows the use of the -finstrument-functions flag. You can find the library in its respective ASN Labs git repository here.
The instrumentation handlers can be used for any specific analysis of function behaviour in a program. All that is needed is specifying the flag during compilation and providing definitions of the __cyg_profile_func_enter() and __cyg_profile_func_exit() callbacks. We can tell the compiler which functions we do not wish to analyse by declaring them with the __attribute__((no_instrument_function)) attribute.
The instrument-functions mechanism is rather well documented, so there's no use in duplicating information. But it's really useful to know about its existence when specific analysis or debugging is required. Consult the timeprof source code to see an example of its usage.
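For those who just want to see the flag in action, here is a minimal, self-contained sketch. It is not the timeprof code: the tiny C program below only counts how many times the two callbacks fire, and the file names and the work() function are made up for the demo. The script writes the program to a scratch directory and builds it only if gcc is available.

```shell
demo_dir=$(mktemp -d)

# A tiny C program providing both callbacks; it only counts the calls.
cat > "$demo_dir/demo.c" <<'EOF'
#include <stdio.h>

static unsigned enters, exits;

/* The callbacks themselves must not be instrumented. */
void __cyg_profile_func_enter(void *fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *fn, void *call_site)
{
    (void)fn; (void)call_site;
    enters++;
}

void __cyg_profile_func_exit(void *fn, void *call_site)
{
    (void)fn; (void)call_site;
    exits++;
}

static int work(int n)
{
    return n * 2;
}

int main(void)
{
    int result = work(21);
    /* main() has been entered but not left yet at this point. */
    printf("result=%d enter=%u exit=%u\n", result, enters, exits);
    return 0;
}
EOF

# Build without optimization (so work() is not inlined) and run the demo.
if command -v gcc >/dev/null 2>&1 &&
   gcc -O0 -finstrument-functions -o "$demo_dir/demo" "$demo_dir/demo.c"; then
    demo_out=$("$demo_dir/demo")
    echo "$demo_out"
fi
```

A real profiler would record timestamps and the function address in the callbacks instead of bumping counters, but the wiring stays the same.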
On a side note - lately many different, interesting things happened, that is why today's post is a rather short one. But I'll return soon and hopefully explain the reasons why. Thanks!
(Haiku Blog-O-Sphere) Bits and Pieces: Notifications and Menu Builders
2011-07-27
20:09
During the weekends, I'm working on enhancing a very old BeOS application long lost in time. While browsing the Haiku kit and application source tree, I sometimes stumble upon new (at least for me) and interesting small elements that the Haiku developers added to the Haiku API over time. I like to try these elements out. Most of these API additions might change or even disappear in the near future, since I understand their development process is not yet finished, but they're interesting to know nevertheless.
Come visit my Haiku Blog-O-Sphere page and read my new blog-entry - Bits and Pieces: Notifications and Menu Builders.
I finally added a new post to my long forgotten Blog-O-Sphere. Oh, and go give Haiku a try while you're at it. It's worth it. Thank you and stay tuned!
Flatconf 1.5 concept?
2011-07-06
12:13
The Flatconf project has interested me for a long time - from the very first moment I started cooperating with the ASN Labs team. It is not a very well known project; from what I know, it was only used in the old Lintrack distribution. It's more of an interesting experimental concept than an innovative solution. Flatconf is an attempt at creating a universal configuration system based on the idea of 'flat files' and the usage of the file-system as a natural database. Currently two versions of the specification exist, with the newer one (2.0 draft) using a new concept for holding variable meta-data - not an entirely good one though. But just now I thought about the original idea and came up with some thoughts for modifying it, making it a little bit more feasible. Flatconf 1.5, anyone?
The original concept of Flatconf 1.0 had uber-flatness in mind. To understand the motivation, first we need to define what a flat file means in our context. A flat file is a file whose contents are not structured in any additional way besides holding data. This means that getting the actual 'data' from a flat file is just a matter of reading its contents - because its contents are plain data with no unnecessary structural meta-data. A structured (not flat) file would be any file which has to be parsed first to retrieve the data of interest.
Flatconf uses flat files for holding configuration variables, i.e. a separate file for every variable. Structuring of the data is achieved by using the file-system directory hierarchy. Each variable can have meta-data associated with it - such as information about the variable type, a user-readable description of the variable and many more. The main problem and difference between versions 1.0 and 2.0 is the way these meta-data are stored.
- In 1.0, meta-data were flat and also stored as separate files on the file-system. Each configuration sub-directory had a hidden .fc directory in it which held meta-data for all respective variables residing in the given directory.
- In 2.0, meta-data were held in structured files in a different directory tree, formatted using the custom-made FCML (Flatconf Meta Language).
The 1.0 approach had the problem of bloating the whole data directory tree with unnecessary files and directories at every hierarchy level. The 2.0 approach, well, wasn't flat anymore, and since it used a newly designed meta language - created many, many problems and had many drawbacks. It removed the ease of variable manipulation - forcing the system to parse meta-data first, adding complexity.
But what if we kept the flat concept of 1.0, only slightly modified? What if we just moved all the meta-data from the data tree to a separate directory like in 2.0, but left them as flat files? This requires the overall specification to be changed, since now we might have a bit more freedom.
It is best explained with an example of a Flatconf configuration tree.
data/                  # Main fc data directory
|- hostname            # - \
|- welcome_text        # - Text variables
\- net/                # A directory variable
   |- ip
   |- gw
   \- ifaces/          # A list variable, i.e. a directory with user element-addition possibilities
      |- eth0/         # eth0 interface element added by user
      \- wlan0/        # wlan0 interface element added by user
                       # (both are directories holding more variables inside, like MAC address etc.)
metadata/              # Main fc directory for holding respective meta-data
|- hostname/           # Every variable has a directory named the same way holding its meta-data
|  |- .type            # Every meta-data is a separate file, starting with a '.'
|  |- .descr           # e.g. this is the variable human-readable description
|  \- .onapply         # And this is a script/application that is fired on variable modification
|- welcome_text/
|  \- (...)
\- net/
   |- .type            # e.g. directories have "dir" in their .type contents
   |- .descr-en
   |- .descr-pl        # Descriptions can be locale-specific
   |- ip/
   |- gw/
   \- ifaces/
      |- .type
      |- .skel/        # The skeleton directory, holding the base meta-data for every new list element
      |  |- .type      # Like for instance a .type data with contents "dir"
      |  \- (...)
      |- eth0/         # Every new list element has a named copy of .skel created
      |  |- .type      # Same contents as in .skel/
      |  \- (...)
      \- wlan0/
This example can be better understood if one already knows the basics behind the Flatconf concept.
In this approach, fetching the description string meta-data for the net/ip variable is as simple as issuing cat $FC_METATREE/net/ip/.descr in bash. The same goes for fetching the variable's data: cat $FC_DATATREE/net/ip.
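A minimal sketch of how this could look in practice. The helper names fc_get and fc_meta are made up for this post, and the trees are mock ones created in a scratch directory with invented values (including the "string" type), just so the example can be run anywhere.

```shell
# Mock data and meta-data trees; all values are invented for the demo.
FC_ROOT=$(mktemp -d)
FC_DATATREE=$FC_ROOT/data
FC_METATREE=$FC_ROOT/metadata
mkdir -p "$FC_DATATREE/net" "$FC_METATREE/net/ip"
echo '192.168.1.10' > "$FC_DATATREE/net/ip"
echo 'string' > "$FC_METATREE/net/ip/.type"
echo 'IP address of the device' > "$FC_METATREE/net/ip/.descr"

# Hypothetical helpers: a variable's value and one of its meta-data fields.
fc_get()  { cat "$FC_DATATREE/$1"; }
fc_meta() { cat "$FC_METATREE/$1/.$2"; }

fc_get net/ip
fc_meta net/ip descr
```

No parsing is involved anywhere: both helpers are plain cat calls on a path built from the variable name.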
Of course, in this approach the meta-tree is not read-only anymore, as adding elements to or deleting them from a list involves creating or removing directories in the list's meta-data directory. But for this concept it is of little concern.
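To illustrate what adding and deleting a list element would involve, here is a small sketch working on a mock tree. The helper names fc_list_add and fc_list_del are hypothetical, and the only skeleton meta-data in the demo is a .type file containing "dir".

```shell
# Mock trees with an empty ifaces list and its skeleton meta-data.
FC_ROOT=$(mktemp -d)
FC_DATATREE=$FC_ROOT/data
FC_METATREE=$FC_ROOT/metadata
mkdir -p "$FC_DATATREE/net/ifaces" "$FC_METATREE/net/ifaces/.skel"
echo 'dir' > "$FC_METATREE/net/ifaces/.skel/.type"

# Hypothetical helper: add an element to a list variable by creating its
# data directory and a named copy of the list's .skel meta-data.
fc_list_add() {
    list=$1; name=$2
    mkdir "$FC_DATATREE/$list/$name" &&
    cp -R "$FC_METATREE/$list/.skel" "$FC_METATREE/$list/$name"
}

# Hypothetical helper: remove an element, data and meta-data alike.
fc_list_del() {
    list=$1; name=$2
    rm -rf "$FC_DATATREE/$list/$name" "$FC_METATREE/$list/$name"
}

fc_list_add net/ifaces eth0
cat "$FC_METATREE/net/ifaces/eth0/.type"
```

Both trees are touched on every addition and removal, which is exactly why the meta-tree can no longer be treated as read-only.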
Why do meta-data variables start with a '.' (dot) character? To remove ambiguity. Because of this small modification, we can easily create directories holding new element meta-data named the same way as the element being added, without having to worry that someone would add a descr list element, bringing confusion to the system (confusion like: "is this the descr meta-data variable, or a list element called descr?").
As for list ordering and visibility - we could add a .listorder trivially structured meta-data variable listing the element order or even list element visibility.
Well, all this is just an idea.
As stated before, Flatconf is an experiment. Using the file-system as a natural database for structuring and holding variables is an interesting idea, similar to the GNU/Linux concept of 'everything is a file'. Actually, Flatconf is very similar - in visualization - to the /proc/sys sysctl interface. But is it efficient? When writing bash scripts, it might indeed be much easier just browsing the file-system and reading file contents instead of parsing contents by hand. But when writing C/C++ code - the costs might be higher. One of these days I will do some benchmarks to test performance issues.
I noticed that many of my posts are about less-known, niche topics. This does not mean the unfamiliar cannot be interesting. There is always much to learn from the non-mainstream!
Haiku, the HaikuAPI and the menu
2011-06-27
22:20
Most of you have probably at least heard about the Haiku operating system. For those who haven't or just know the name, Haiku is the open source recreation and spiritual successor of BeOS - an alternative multimedia-oriented operating system discontinued some time ago. Today's post will be a short collection of brief, random bits of information regarding its application programming interface (known as HaikuAPI). An API that I consider very consistent and intuitive to use.
It's just an overview of all the interesting issues though, since sadly I have many work-related things on my mind right now. Just to say it this way - wish me luck! But now back to the topic at hand.
In the past, BeOS was relatively popular in the personal computer world. Even long after it was no longer developed, I used it quite frequently (during my university days), writing small BeOS applications, learning the insides of the BeAPI during these processes. History aside, the operating system lives till this day in the form of the still developed Haiku operating system.
The API stays more or less consistent with the original, with a few very interesting additions. It would be useless writing an introductory tutorial right now, since there are many other places where the basics of the Haiku API can be learned from. For instance, Learning to Program with Haiku by DarkWyrm, The Haiku Book and the good old - but still very informative - legacy Be Book.
What does the HaikuAPI have to offer? A short overview of its structure: the API is divided into a variety of so called kits - sets of classes and object types for given purposes. For instance, the Interface Kit contains all the widgets and view specific classes, while the Storage Kit defines file system access primitives.
The Haiku modifications of the BeAPI include the addition of the so-called Layout API. This was the one thing missing in the old BeOS system, as all layout work had to be done manually by the programmer. Which was, as everywhere, very tedious. The Layout API is still in development, so it might change in the near future, but here is a very good article written by my past GSoC mentor Ryan Leavengood explaining its basic usage - Laying It All Out, Part 1, available on the Haiku Blog-O-Sphere.
But quite recently I found another interesting new feature in the Haiku API. The BLayoutBuilder class also offers the functionality of easily building menu contents! Similarly to building the layout, we can also create a menu with all its respective BMenuItems in a fast and easy way. Consider the following BPopUpMenu:
// This piece of code is actually part of the new Toku Toku development source
BPopUpMenu *menu = new BPopUpMenu("contact_menu", false, false);
BLayoutBuilder::Menu<>(menu)
    .AddItem("Informacje", CONTACTMENU_INFO)        // "Information"
    .AddItem("Rozmowa", BEGG_PERSON_ACTION)         // "Conversation"
    .AddItem("Dziennik rozmów", CONTACTMENU_LOGS)   // "Conversation log"
        .SetEnabled(false)
    .AddSeparator()
    .AddItem("Usuń z listy", CONTACTMENU_REMOVE)    // "Remove from list"
        .SetEnabled(false)
    .AddItem("Ignoruj", CONTACTMENU_IGNORE)         // "Ignore"
        .SetEnabled(false)
;
The newly created BPopUpMenu, a pop-up context menu displayed after a right-click mouse action, is filled with menu items - each carrying a message with a different identifier (like CONTACTMENU_INFO, a constant defined elsewhere). Every item is enabled by default, but we can easily change that on the fly using the SetEnabled() method during menu construction. We can add separating elements and also sub-menus - everything in one sequence. Check here for more details.
What do I like about the HaikuAPI? It's consistent. Its internal data structures are easy to use, without the complexity of the usual high-end C++ code. It uses classes and object-oriented design, but at the same time it still feels like writing standard C code. Just with classes. The messaging is well designed, and the API allows for much freedom. It's nicely bonded with the operating system.
With all the other, various cross-platform application frameworks around, like Qt, GTK+ or XULRunner, it's sad that no one has succeeded in nicely detaching the HaikuAPI or BeAPI into an external toolkit to be used on other, more popular systems. I knew of some initiatives with similar ideas in the past, but from what I know, none of them survived.
Or maybe some did, but I just don't know about it? Nevertheless, as an old BeAPI fan, I would certainly like to write a few multi-platform applications using the Haiku API!
Might consider writing a new post to my Haiku Blog-O-Sphere. It's been a while!
Ubiquiti RSPro RedBoot, OpenWRT and the exec
2011-05-22
11:28
Not sure if this is a common bug for everyone using a hand-built OpenWRT on Ubiquiti RouterStation Pro platforms, but I notice it on all the boards I have in my possession. When booting the OpenWRT kernel and watching the printk() output, rubbish data can normally be seen in the kernel command line parameters. Usually this does not break anything, but as we know, RedBoot on Ubiquiti boards passes board-specific parameters to the kernel command line with information such as board type, ethernet MAC address etc. Sometimes those parameters are not passed and parsed correctly because of this.
I did a small investigation why this happens.
First of all, it is best to understand how the Linux kernel fetches the command line in case of MIPS AR71xx boards. In the OpenWRT kernel 2.6.37, all boot-specific parameters are passed through the a0, a1, a2, a3 processor registers during start-up, which are then copied to the respective kernel variables fw_arg0 up to fw_arg3 - and used in this form later on. We are interested in the first three variables. Their meaning is similar to how normal GNU/Linux applications fetch information from the system:
- fw_arg0 - the argc equivalent, tells the kernel how many command line arguments have been passed.
- fw_arg1 - the argv equivalent, a pointer to an array containing the arguments.
- fw_arg2 - the environment table equivalent, an array of pointers used to pass the environment settings.
When in RedBoot, a call to exec executes the loaded program, setting the environment and allowing passing additional kernel arguments to the cmdline (through the -c option). We need to remember that calls to the RedBoot go command only execute the kernel, while exec also sets all the platform-specific parameters in the environment beforehand.
Information such as the aforementioned board type, ethernet address etc. is passed through the environment, so in our case through fw_arg2. The variables fw_arg0 and fw_arg1 are used only in the case of passing additional arguments through the exec -c command. The problem lies in how these two variables are used by the bootloader. Anyway, first the command line arguments are copied into the kernel command line, and then the environment parameters are appended to them.
cmdline=[User command line arguments] [Environment variables]
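To make the ordering concrete, here is a minimal userspace sketch in C of the concatenation described above. It is only a model of what this post describes, not the actual OpenWRT or RedBoot code: the names build_cmdline() and append_word() are made up for illustration, and so are the sample values used in the usage note.

```c
#include <stddef.h>
#include <string.h>

/* Append one word to out, separated by a space; -1 if it does not fit. */
static int append_word(char *out, size_t outlen, const char *word)
{
    size_t used = strlen(out);
    size_t need = strlen(word) + (used > 0 ? 1 : 0);

    if (used + need + 1 > outlen)
        return -1;
    if (used > 0)
        out[used++] = ' ';
    strcpy(out + used, word);
    return 0;
}

/*
 * Hypothetical model: user arguments (argv[1..argc-1]) come first,
 * followed by every entry of the NULL-terminated envp table.
 * Returns 0 on success, -1 if the result does not fit into out.
 */
int build_cmdline(char *out, size_t outlen,
                  int argc, char *const argv[], char *const envp[])
{
    int i;

    if (outlen == 0)
        return -1;
    out[0] = '\0';

    for (i = 1; i < argc; i++)
        if (append_word(out, outlen, argv[i]) != 0)
            return -1;

    for (i = 0; envp != NULL && envp[i] != NULL; i++)
        if (append_word(out, outlen, envp[i]) != 0)
            return -1;

    return 0;
}
```

Called with argc = 2, an empty argv[0], argv[1] = "console=ttyS0,115200" and an environment holding "board=UBNT-RSPRO" and an "ethaddr=..." entry, the buffer ends up with the user argument first and the two environment entries after it.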
It seems the Ubiquiti-modified RedBoot, whose sources aren't available (but should be), always passes a constant 2 as the number of cmdline arguments. fw_arg1[0] is always an empty, NUL-terminated string and all the other, actual arguments are squeezed inside fw_arg1[1] as a single string. Besides being strange and illogical, up until now there is no real problem. But sadly, Ubiquiti's RedBoot (at least version 0.9.00318M.0905121200 - built 12:01:38, May 12 2009, which is on all my RouterStation Pro boards) seems not to initialize the fw_arg1[1] memory area during the boot sequence. It only does so when we explicitly pass some additional kernel arguments, like exec -c "blah" or even exec -c "" (for no parameters). Without that, there's nothing but chaos in our cmdline.
Some might be thinking - "But wait! If this is how it is, why does the firmware still work correctly most of the time? Shouldn't rubbish in the cmdline make the environment also unreadable?".
Yes, that's true. Since the first part of the cmdline that is passed to the kernel is made out of fw_arg1, with the environment glued onto it afterwards, the important parameters should not be visible because of all the invalid data. But since the memory area is uninitialized, there is a good chance that sooner or later a 0 byte will appear somewhere in the rubbish filling that big buffer allocated for the command line. So there still is a chance that the environment will be appended and visible. The kernel ignores most of the chaos, since it is unable to parse it, and simply moves on to the environment variables. Or at least that's how I explain it.
At least that seems to be the case on my boards. The fastest way to deal with this problem is either using exec -c "" instead of exec in your bootscripts, or modifying the kernel source to ignore the user-given command line arguments (or adding a magic-sequence mechanism perhaps?).
Lately I'm a bit busy with different work and research stuff, so I don't have much time to spend on my hobby projects. But expect some Haiku OS related posts soon, since I intend to get back to Toku Toku as soon as possible.
Stackguard in gcc
2011-04-27
14:42
While programming in C for work yesterday, I ran into a small issue I did not expect. I was puzzled because a piece of code that I thought would work (and it did, but in other circumstances) this time did not. I wanted to better understand this problem and in the end learned a bit about the Stackguard in gcc. Some of you have probably heard about it already. Consider the following piece of code.
#include <stdio.h>
#include <string.h>

struct data {
    char stuff[2];
    char more[510];
};

int main(void)
{
    struct data foo;

    /* Deliberately writes past stuff[2] and into more[510]. */
    strcpy((char *)foo.stuff, "Something bigger than 2 bytes long");
    return 0;
}
When compiled with the standard gcc test.c -o test line, the program will probably work normally. Yes, stuff is only 2 bytes long, but afterwards we still have 510 more bytes available, so everything is fine - there's plenty of space. We can even add __attribute__ ((__packed__)) to make sure of that.
But then, let's try compiling the same code with optimization. On my compiler version (4.4.3-4ubuntu5), building with -O1 or higher suddenly made a buffer overflow be detected both during compilation and at runtime - which is understandable, of course. But why now? The warning mentions a call to __builtin___strcpy_chk(). In the disassembly we see that indeed, instead of the standard strcpy() we wanted, a different, safe version of the function is called. In libssp/strcpy-chk.c of the gcc 4.3.3 source code we can see the internals of __strcpy_chk(). This version is called instead whenever a static buffer of known size is used. It first checks if the copied string can fit in the destination buffer, and bails out when it cannot. Such mechanisms are part of the GNU C compiler's Stackguard - you can read a bit more about it here.
Why during optimization? It seems that in the Ubuntu version of gcc, even the basic optimization levels have the _FORTIFY_SOURCE define set to 2 by default. This can be disabled by adding -D_FORTIFY_SOURCE=0 to our gcc flags during compilation when needed. Then we can finally do whatever we want with our static buffers. At least almost!
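The check itself is simple enough to sketch. The following is my own simplified illustration of the idea, not the libssp source: the name checked_strcpy() is made up, and instead of aborting the program the way the real checked function does on failure, this version just returns NULL so that the behaviour is easy to observe.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dest only if it fits, terminating NUL included.
 * destlen is the size of the destination object, which the compiler
 * knows for fixed-size buffers (cf. __builtin_object_size()).
 * Returns dest on success, NULL when the copy would overflow.
 */
char *checked_strcpy(char *dest, const char *src, size_t destlen)
{
    size_t need = strlen(src) + 1;

    if (need > destlen)
        return NULL;    /* the real check terminates the program here */

    memcpy(dest, src, need);
    return dest;
}
```

With the struct from the example, checked_strcpy(foo.stuff, "Something bigger than 2 bytes long", sizeof(foo.stuff)) is refused, because only the 2 bytes of stuff count as the destination, no matter how much room more provides right behind it.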
Atom feed added
2011-04-13
22:57
Quite recently, I was asked by a colleague to maybe include an RSS feed on my web page. I thought that it might not be a bad idea, considering that I am not doing updates too frequently. So I added an Atom Feed for the development part of my web page. Since I am not using any CMS here - only some minor PHP scripting help - and having only a small amount of time, I decided to write a quick Python app converting my HTML content into an Atom web feed XML file.
Putting aside the question of why I do not want to use a CMS for my web content, the Python script I prepared is really, really badly written. I probably shouldn't even post it here, but since it works, maybe at least one person will find it a bit useful. For parsing, I used the HTMLParser module (known as html.parser in 3.0). The script works as a very naive finite state machine, looking for particular div's and other HTML tags I use and fetching their contents. One day I might clean it up and make it better. For now, something that just works is fine.
You can check out this ugly piece of Python code [here].
UPDATE! It seems the server had some problems with serving files with the .py extension. Now the download should work correctly. I apologise for the problem.
In the nearest update, I will also add an Atom feed for the art section of my web log. Stay tuned.
UniConf: part I
2011-03-26
In this post, I will try to write about how to use the basic UniConf API. This is an unofficial guide to UniConf, so be advised. I will concentrate on the native C++ version of the API. This can be thought of as something like a small UniConf tutorial.
If you read my previous UniConf post, you probably have an overview of how simple and straightforward it can be. Everything starts off with the creation of an UniConfRoot object defining our UniConf configuration root. If we interpret the configuration tree as a hierarchical tree similar to those in file systems, the UniConfRoot is the variable tree that is mounted at the root path ("/"). Its constructor, in one of the most commonly used forms, accepts a moniker string as the argument.
Monikers are strings used to represent generators available in a UniConf system. I already mentioned some of them previously. Generators make it possible to use different configuration system backends and modifiers. Monikers can be mixed together.
Some of the available monikers:
- ini: a standard .ini file-like parser
- temp: a writeable hierarchy stored only in volatile memory, used for temporary values
- unix: a UNIX domain socket communication with an uniconfd daemon
- tcp: a TCP socket communication with an uniconfd daemon
- ssl: an SSL-encrypted TCP socket communication with an uniconfd daemon
- readonly: make the tree read-only
- cache: use cache to make variable-access faster
- list: makes UniConf browse through the list of other generators in order to find the variable
By defining the root UniConf entry, we need to decide what generators we want to use. Let us consider 2 .ini files we would like to use as part of our configuration: foo.ini and bar.ini.
foo.ini:
hostname = ala.lan
ip_address = 192.168.0.41/24
gateway = 192.168.0.3
bar.ini:
[misc]
welcome_text = Hello world!
[terminal]
width = 90
height = 40
As stated above, we first start by defining the UniConfRoot. We will use both files at once. Consider the following code:
#include <iostream>
#include <wvstreams/uniconfroot.h>
using namespace std;
int main(void)
{
UniConfRoot root("cache:list:ini:foo.ini ini:bar.ini");
UniConf alias(root["/terminal"]);
cout << "hostname = " << root["hostname"].getme().cstr() << endl;
cout << "/terminal/height = " << alias["height"].getme().cstr() << endl;
root["hostname"].setme("yuki.lan");
root.commit();
cout << "hostname (new) = " << root["hostname"].getme().cstr() << endl;
return 0;
}
We defined the UniConfRoot to include caching of values for faster accessibility, and told UniConf to use a list containing two ini: generators for fetching the configuration variables. This means that in result, mounted on top of the UniConf tree we will have the contents of our two .ini files at once. Once we want to access a variable, UniConf will browse the whole list looking for the variable definition.
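The way I picture the list: lookup can be shown with a small conceptual model in plain C. This is not UniConf code and says nothing about how the library implements it; the struct and function names are invented, and the "first backend that knows the key wins" rule is my assumption about the behaviour.

```c
#include <stddef.h>
#include <string.h>

struct entry {
    const char *key;
    const char *value;
};

/* One backend, e.g. the parsed contents of a single .ini file.
 * The entries array is terminated by an entry with key == NULL. */
struct backend {
    const struct entry *entries;
};

static const char *backend_get(const struct backend *b, const char *key)
{
    const struct entry *e;

    for (e = b->entries; e->key != NULL; e++)
        if (strcmp(e->key, key) == 0)
            return e->value;
    return NULL;
}

/* Ask each backend in order; the first one that has the key answers. */
const char *list_get(const struct backend *list, size_t count,
                     const char *key)
{
    size_t i;

    for (i = 0; i < count; i++) {
        const char *value = backend_get(&list[i], key);

        if (value != NULL)
            return value;
    }
    return NULL;
}
```

With one backend holding the foo.ini values and a second one holding the bar.ini values, a query for hostname is answered by the first backend, while terminal/height falls through to the second.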
Almost every object in a UniConf tree is of UniConf type. This is quite intuitive, because if we consider the configuration tree as a directory hierarchy tree, even the root of the tree is in fact just another directory. The UniConf type (and, since UniConfRoot is derived from it, that type as well) provides a handy [] operator by which the programmer can easily access variables with a given path relative to that variable.
As its argument, this operator takes a UniConfKey - a very useful type for UniConf path storage. But since a UniConfKey can be constructed from standard C strings, we can pass a string as the variable path to the [] operator. The operator returns the UniConf object representing the given variable/object relative to the queried object.
In our example above, the code root["hostname"] will return the object corresponding to foo.ini's hostname variable, whose VFS-like path is /hostname (because we mounted the configuration file at the root).
.ini file sections act as directories used for holding section variables. After defining the root, we define an 'alias' to the terminal section present in the bar.ini file. What I called an 'alias' is nothing more than simply the UniConf object of the terminal object (section). We can now use the alias object to access variables in the terminal .ini section. So, now, instead of writing root["terminal/height"] we can use alias["height"].
Now for fetching variable values. Every UniConf object exports a getme() method that can be used for this purpose. The method returns a WvString, which is the WvStreams library string format. In case you do not want to use any elements of the WvStreams library apart from the UniConf part, WvString provides a cstr() method that returns the standard C char string equivalent of the string.
Setting variables is done the same way. Analogously, every UniConf object has a setme() method. As an argument it requires a WvStringParm (aka WvFastString) object, which is nothing more than a faster WvString for passing function parameters. It's faster and more memory-efficient, because these objects are created only from const char *'s. It does not allocate its own memory for holding the string and copying it, but uses the const char * directly. Using them for other purposes is not advised.
We can therefore use root["hostname"].setme("yuki.lan"); for modifying the contents of the variable hostname to yuki.lan.
After we set the new value of the variable, we want to make sure the change has been propagated to the given configuration subsystem. That is why we commit the changes with a root.commit() call. After changes are committed, the new variable values will be saved to their corresponding .ini files. We can then check if the change really happened by reading the value again and checking the configuration files later.
But now, let us suppose that we have a new configuration tree we want to attach to our configuration system. We could of course define a new, detached UniConfRoot for this purpose, but let us suppose we want it present in the tree we already have. Consider adding the following code.
(...)
root["/tcp"].mount("tcp:localhost:4111");
root["/tcp/test"].setme("something");
cout << "From uniconfd = " << root["/tcp/test"].getme().cstr() << endl;
(...)
The mount() method mounts a given generator at the selected UniConf key. In this case, we try to mount the tree exported by an uniconfd server running on localhost's port 4111 to the key at path /tcp. This way we can extend our UniConf configuration tree dynamically using a similar scheme to the one present on Unix filesystems.
For this example to work, one would need a running uniconfd server on localhost. The simplest way would be running uniconfd in the following way:
# As for uniconfd version 4.6.1
uniconfd -l tcp:4111 /=temp:
This way we will have an uniconfd server running an empty in-memory configuration - consult the uniconfd manual and help text for more details. Since we use the temp: generator, the /tcp configuration tree is empty by default. So to have any variables to access, we have to create them by ourselves first.
One last thing I will explain during this post are iterators. UniConf provides us with a handful of different iterators that can iterate through our variable tree. The most basic one is UniConf::Iter, which simply iterates through all immediate children of an UniConf node. This means we can use this iterator to browse through variables on one level, not looking into sub-branches of the tree (like iterating through files in a given directory, not including sub-directories). If we want a depth-first recursive search of a given branch, we can use the UniConf::RecursiveIter iterator. There are also sorted iterator versions of the two - the UniConf::SortedIter and UniConf::SortedRecursiveIter - which traverse the variables in alphabetic order, by full path.
Consider the following piece of code:
(...)
UniConf::RecursiveIter i(root);
for (i.rewind(); i.next(); )
cout << i.ptr()->fullkey().cstr() << " = " << i.ptr()->getme().cstr() << endl;
(...)
We create a recursive iterator to browse through our whole tree. We first call rewind() to position the iterator at the beginning of the branch in question, and then move the iterator forward through successive next() calls at every iteration. To get the object currently pointed at by the given iterator, all we need to do is a ptr() call. The fullkey() is another UniConf method that returns the UniConfKey object of the given variable. It contains the full path to the variable. We print it along with the variable's value.
This is all for today's post. In the near future I will try to explain some other, maybe a little less basic aspects of UniConf, such as notifications, copying and many others. But as you have probably noticed by now, UniConf is simple. Very simple. And that makes it so interesting.
You can download the source code used in this post (with small changes and a Makefile) [here].
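The difference between the flat and the recursive iterator can be modelled in a few lines of plain C. Again, this is only a conceptual sketch of the traversal, not UniConf code: the struct node type and both functions are invented, and visiting a node before its children is my assumption about the order.

```c
#include <stdio.h>
#include <string.h>

struct node {
    const char *name;
    const char *value;
    const struct node *children;  /* array ended by name == NULL, or NULL */
};

/* Flat iteration: look only at the immediate children of n. */
size_t count_children(const struct node *n)
{
    const struct node *k;
    size_t count = 0;

    if (n->children == NULL)
        return 0;
    for (k = n->children; k->name != NULL; k++)
        count++;
    return count;
}

/*
 * Recursive iteration: depth-first walk below n. Appends one
 * "path = value" line per visited node to out, which must already
 * hold a NUL-terminated string. Returns the number of nodes visited.
 */
size_t walk(const struct node *n, const char *prefix,
            char *out, size_t outlen)
{
    const struct node *k;
    size_t visited = 0;

    if (n->children == NULL)
        return 0;
    for (k = n->children; k->name != NULL; k++) {
        char path[128];
        size_t used = strlen(out);

        snprintf(path, sizeof(path), "%s/%s", prefix, k->name);
        snprintf(out + used, outlen - used, "%s = %s\n", path, k->value);
        visited += 1 + walk(k, path, out, outlen);
    }
    return visited;
}
```

For a root with a hostname leaf and a terminal branch holding width and height, count_children() sees 2 nodes, while walk() visits all 4 and descends into terminal right after visiting it.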
Remember to have UniConf development libraries and headers installed beforehand - and uniconfd, if you want to check how fetching through the tcp: generator works.
UniConf: introduction
2011-02-15
While writing my thesis some time ago, I did a small review of existing configuration systems in use. One of them especially caught my eye - UniConf. The authors once called it the "One True Configuration System", which might sound a bit provocative, but has a seed of truth in it.
In short - UniConf aims to meld many configuration systems together. This means UniConf tries to understand most existing configuration systems, written in various file formats, and exports them in a unified way. UniConf, on its own, does not really define one particular way of storing variables, but provides an API for reading most of the common configurations (through the use of so-called generators) and adds some features that can be useful when designing/using/implementing a configuration system.

In the graph above, temp: cache: unix: ssl: are some of the possible generators that can be used by UniConf. For instance, the temp: moniker makes the variables of the given UniConf tree be only saved in memory, where unix: makes use of UNIX sockets to read the configuration variables from a listening UniConf server (uniconfd - we'll get to that). As another example, an ini: generator can be used to read from and store variables in the standard .ini configuration file format.
UniConf provides a C++ API with respective wrappers for the C language. Designing a simple UniConf application is easy and rather convenient.
#include <iostream>
#include <wvstreams/uniconfroot.h>
using namespace std;
int main(void)
{
// Define the root of the configuration.
// Use an .ini file stored in the current working directory
// Since this is our first UniConf tree, we mount it as "/"
UniConfRoot root("ini:tmp.conf");
// Reading the value of a variable
cout << "Value before: " << root["ala"].getme().cstr() << endl;
// Setting the value of a variable
root["ala"].setme("is the best");
// We can also ask for the virtual name and 'pathname' of the variable
cout << "Path: " << root["ala"].fullkey().cstr() << endl;
// In this choice of monikers, if we want to save the changes, we
// need to commit them
root.commit();
return 0;
}
Besides the basic get/set variable functionality, UniConf also offers features such as variable on-change notification (by using the add_callback() method) or mounting multiple configurations at different mount points of a single UniConf tree. But since this is only an introductory post, we will get to these features later.
But configurations not only need to be accessed locally. UniConf includes a configuration serving daemon, uniconfd, which can be used either to export variables outside of the local space or act as a local server, binding configuration systems and making them available in one consistent form. The uniconfd server can either listen on a given TCP port or use a UNIX socket for communication with clients. Useful, although from what I know some features seem to be still missing.
I am rather fond of UniConf, since the idea of a hybrid system that could bind configurations together reminds me of Flatconf. One problem might be the lack of beginner-oriented documentation, since about the only programming-related help one can get is found either in the source code or in the doxygen documentation. But I'll try to shed some more light on the programming aspects of this system in my future UniConf posts. Stay tuned.
Haiku optional packages
2011-01-24
Another short post. I've seen some people having trouble with properly including development tools in the Haiku image. The process is very simple, but indeed there is not really much information about it. All the development tools (like gcc, ld, autotools, perl and more) are packed in so-called Haiku Optional Packages. You can browse the list of available optional software in the build/jam/OptionalPackages file in the Haiku source tree. Some of them can be installed by using the in-system application installoptionalpackages as well, but that's another story.
As for our dev-tools, there are 3 packages available for this purpose: Development, DevelopmentMin and DevelopmentBase, each of them adding some set of tools. To add them to the image during build time, the user-friendly build/jam/UserBuildConfig script can be used. What the configuration file should look like can be seen in the UserBuildConfig.sample or UserBuildConfig.ReadMe files in the same directory - but right now we're only interested in the AddOptionalHaikuImagePackages command. A package set with this command is downloaded from the internet, prepared and included in the image. Easy.
AddOptionalHaikuImagePackages Development ;
AddOptionalHaikuImagePackages DevelopmentBase ;
AddOptionalHaikuImagePackages DevelopmentMin ;
HAIKU_IMAGE_SIZE = 400 ;
With this, the development tools are added to the Haiku image during the build process. Note the HAIKU_IMAGE_SIZE, since without increasing the size of the image our additional packages might not fit.
Now you can experiment with the Haiku API in real-time.
Slow git status response
2010-11-27
This is a trick that my colleague Michał Wróbel showed me when I had problems with slow git response on my local repository.
Sometimes it strangely happens that git status becomes really, really slow, taking even a few minutes to complete - on a big, multi-submodule repository in this case. This can be really annoying. Quoting my colleague:
"Under some (still not fully known) circumstances git starts to work in a strange mode scanning the contents of the files instead of just stat()ing them:"
Right:
firmware/code/linux$ strace git status
(...)
lstat(".mailmap", {st_mode=S_IFREG|0664, st_size=4021, ...}) = 0
lstat("COPYING", {st_mode=S_IFREG|0664, st_size=18693, ...}) = 0
lstat("CREDITS", {st_mode=S_IFREG|0664, st_size=94027, ...}) = 0
(...)
Wrong:
firmware/code/linux$ strace git status
(...)
lstat(".mailmap", {st_mode=S_IFREG|0664, st_size=4021, ...}) = 0
open(".mailmap", O_RDONLY) = 3
read(3, "#\n# This list is used by git-sho"..., 4021) = 4021
close(3) = 0
lstat("COPYING", {st_mode=S_IFREG|0664, st_size=18693, ...}) = 0
open("COPYING", O_RDONLY) = 3
read(3, "\n NOTE! This copyright does *n"..., 18693) = 18693
close(3) = 0
lstat("CREDITS", {st_mode=S_IFREG|0664, st_size=94027, ...}) = 0
open("CREDITS", O_RDONLY) = 3
mmap(NULL, 94027, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f52ad1fb000
munmap(0x7f52ad1fb000, 94027) = 0
close(3) = 0
(...)
The trick that worked for me was doing git checkout <your_branch_here> and git update-index --refresh on the main repository and all its submodules, with an additional git submodule update at the end, just in case.
Kernel sysctl
2010-10-02
More basics. The Linux kernel offers an interface for browsing and modifying system parameters, mostly kernel related. This interface is called sysctl. In Linux, sysctl variables are available to the user as normal, editable files through a virtual filesystem browsable in the /proc/sys directory - or through the sysctl application. In today's post I would like to concentrate on how to create sysctl configurations in kernel code.
As with most Linux kernel related topics, the internal mechanisms and definitions vary from version to version. Most of my knowledge comes from the one used in the 2.6.31 series kernels, but I also have some experience with the more recent 2.6.34 series - and there were some changes made somewhere in-between those two releases.
In short - kernel code and modules can export internal variables and data through mechanisms of sysctl or procfs. Usually the proc interface is used for read-only variables and internally structured data, where sysctl is used typically for read/write operations on short, non-structured data.
Both choices have a hierarchical tree-like structure, in which directories can be used to organize variables.
sysctl variables are defined in code by ctl_table structures. Each sysctl variable (both file and directory) has its own ctl_table object representing the variable. The ctl_table structures need to be grouped into arrays representing parts of a given level in the directory tree (e.g. the contents of the /proc/sys/dev directory). Such arrays need to be terminated by an empty ctl_table entry - i.e. one with the name (procname) set to NULL and the ID (ctl_name, but only in 2.6.31 and similar) set to 0.
/* 2.6.31 */
struct ctl_table
{
int ctl_name; /* Binary ID, not present in later versions */
const char *procname; /* Text ID for /proc/sys, or zero */
void *data;
int maxlen;
mode_t mode;
struct ctl_table *child;
struct ctl_table *parent; /* Automatically set */
proc_handler *proc_handler; /* Callback for text formatting */
ctl_handler *strategy; /* Callback function for all r/w, not present in later versions */
void *extra1;
void *extra2;
};
The procname defines the name of the variable and ctl_name the ID for the given variable in the current directory. The data and maxlen fields can be used by some handling functions, while mode defines the basic access rights to the variable. There are also the extra1 and extra2 fields - reserved for any extra data you may need. The important fields are child and proc_handler. When the child pointer is not NULL, the variable in question is a directory and proc_handler is ignored. The child pointer then points to another ctl_table array defining the contents of the subdirectory. A directory of a given path can be defined by more than one ctl_table array.
The proc_handler function pointer is the function that is called during I/O operations performed on the sysctl variable.
typedef int proc_handler (struct ctl_table *ctl, int write, struct file * filp,
void __user *buffer, size_t *lenp, loff_t *ppos);
There is a set of predefined routines for basic operations on variables that can be used as a proc_handler. These include proc_dostring() for reading/writing a string, proc_dointvec() for reading/writing one or more integers - as well as a few other variants of the latter function. When these functions are used, the data and maxlen fields come into play. data points to the buffer holding the variable in-system, and maxlen is the length of that buffer.
A kernel programmer can also define his/her own proc_handler function. In this case, the write function parameter shows whether the operation was a read (write == 0) or write (write == 1) operation. The buffer pointer is a pointer to the buffer with the data being read/written. The lenp is a pointer to the size of the user buffer holding (or to be used for holding) the data, and ppos is the offset from the beginning of the sysctl file of the variable during the I/O operation. These two are pointers so that they can be modified during handling.
So, what changed between 2.6.31 and 2.6.34? As noted in the comments, the ctl_name and strategy fields have been removed. I never used the strategy field before, but it seems it was used to optionally initialize and format data before display or storage. The proc_handler functions no longer take the filp parameter either. No big changes really. At least I didn't notice anything else of interest.
The ctl_name field had indeed been useless for a long time. Most variables used CTL_UNNUMBERED as the ctl_name since they did not care about a unique ID. There were times it was useful though, for instance while creating one proc_handler for many sysctl variables, later identified by ctl_name - but the 'extra' fields can still be used for that now. Or even a strcmp of the procname field.
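The roles of write, lenp and ppos are easier to see in a toy version. Below is a userspace model in C, not kernel code: it drops the ctl_table and filp parameters and the __user annotation, uses plain memcpy() where a real handler would have to use copy_from_user()/copy_to_user(), and the names model_handler() and name_value are made up.

```c
#include <stddef.h>
#include <string.h>

/* The variable hidden behind our imaginary sysctl file. */
static char name_value[32] = "ala0";

/* Same parameter roles as a proc_handler, in a simplified form. */
int model_handler(int write, char *buffer, size_t *lenp, long long *ppos)
{
    size_t len;

    if (write) {
        len = *lenp;
        if (len >= sizeof(name_value))
            len = sizeof(name_value) - 1;  /* truncate to what fits */
        memcpy(name_value, buffer, len);
        name_value[len] = '\0';
        *ppos += (long long)*lenp;         /* all input was consumed */
        return 0;
    }

    if (*ppos > 0 || *lenp == 0) {
        *lenp = 0;                         /* nothing more to read */
        return 0;
    }

    len = strlen(name_value);
    if (len > *lenp)
        len = *lenp;
    memcpy(buffer, name_value, len);
    *lenp = len;                           /* bytes actually produced */
    *ppos += (long long)len;               /* advance the file offset */
    return 0;
}
```

A first read with *ppos == 0 fills the buffer and shrinks *lenp to the number of bytes produced. A second read with the advanced offset sets *lenp to 0, which is how the caller learns that the end of the file has been reached.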
But how are these variables positioned in the sysctl tree? The function register_sysctl_table() needs to be called for the main ctl_table array. The main root of all sysctl variables is /proc/sys. From that, you need to provide the ctl_table's of all directories in the path e.g. if we want to have a variable accessible at /proc/sys/dev/ala0/name, the ctl_table arrays for dev and ala0 need to be created and linked with each other using the child fields. If no other kernel code already defined the given directory, it is created in the virtual filesystem.
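The chain of arrays for the /proc/sys/dev/ala0/name example can be modelled in userspace C as follows. The struct model_table type is a cut-down stand-in I made up for the kernel's struct ctl_table, keeping only procname, data and child, and model_lookup() is a hypothetical helper that mimics a path lookup. In real kernel code the top-level array would be handed to register_sysctl_table() instead.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct ctl_table. */
struct model_table {
    const char *procname;             /* NULL terminates an array */
    void *data;                       /* payload of a leaf entry */
    const struct model_table *child;  /* non-NULL means directory */
};

static char ala0_name[] = "ala0";

/* Contents of /proc/sys/dev/ala0 */
static const struct model_table ala0_dir[] = {
    { "name", ala0_name, NULL },
    { NULL, NULL, NULL }
};

/* Contents of /proc/sys/dev */
static const struct model_table dev_dir[] = {
    { "ala0", NULL, ala0_dir },
    { NULL, NULL, NULL }
};

/* Contents of /proc/sys, the array that would get registered */
const struct model_table root_table[] = {
    { "dev", NULL, dev_dir },
    { NULL, NULL, NULL }
};

/* Resolve a path such as "dev/ala0/name" relative to /proc/sys. */
const struct model_table *model_lookup(const struct model_table *table,
                                       const char *path)
{
    while (table != NULL) {
        const char *slash = strchr(path, '/');
        size_t n = slash ? (size_t)(slash - path) : strlen(path);
        const struct model_table *t;

        for (t = table; t->procname != NULL; t++)
            if (strlen(t->procname) == n &&
                strncmp(t->procname, path, n) == 0)
                break;
        if (t->procname == NULL)
            return NULL;       /* no such entry on this level */
        if (slash == NULL)
            return t;          /* last path component found */
        table = t->child;      /* descend; NULL if not a directory */
        path = slash + 1;
    }
    return NULL;
}
```

model_lookup(root_table, "dev/ala0/name") follows the child pointers through dev and ala0 and ends at the leaf whose data points to the string, while a path through a missing entry or through a leaf yields NULL.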
The sysctl interface is one of the recommended ways of exchanging data between the user and the kernel. Just remember to always use copy_from_user()/copy_to_user() when writing your own proc_handler functions! I must say that I like the idea of how sysctl configurations are created, accessed and exported - reminds me of Flatconf somehow... But more about this in the near future.
g++ and C++ - class method definitions and shared objects
2010-08-07
During work on some things regarding my master thesis, I stumbled upon something I did not know about before. I'm not much of a reverse engineer, and I don't really have the time to look into it closer.
I wanted to create a shared object file using a class of my creation. I wanted to export an already defined object of that class to be accessible through the applications dynamically linking the object.
Consider the following example:
#include <iostream>
class ala {
public:
int test(void) {
return 1;
}
};
extern "C" {
ala test;
}
Those who had to deal with .so files in Linux systems already know that there is a difference in the way how symbols are stored in C and C++. C++ names are mangled to support function overloading, so for us to be able to correctly access a variable we need to use the extern "C" qualifier.
Our example should work perfectly fine. We can now dlopen() the object file and dlsym() the symbol test. Everything is as it should be. Virtualization is also fine, as long as there are no loose ends on both classes. What should be remembered: you cannot create class objects from a shared object using the default new operator. Creation of new objects needs to be done in shared object code, e.g. using wrapper functions. But this you probably already know from reading other articles on the internet.
I don't use C++ too much. What I did not know is that there is a slight difference between defining a method inside the class body and defining it outside, leaving just the declaration inside. In the first case, when the g++ compiler doesn't see the method being used, the method doesn't seem to be included in the binary at all.
Interesting. But maybe it's just a coincidence? I would have to check gcc source code to be sure.
Kernel writing to file
2010-05-31
Today a quick post about something obvious - file reading/writing in kernel space. First of all, I'm obliged to inform you that this practice is very bad and shouldn't be done for purposes other than e.g. temporary debugging. You can find out why it's bad and how to do it properly here.
But there are times when you want to dump something (e.g. binary data) to a file on a filesystem fast, just once and just for debugging.
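struct file *file;
loff_t pos = 0;
mm_segment_t old_fs = get_fs();

/* Let vfs_write() accept a kernel-space buffer. */
set_fs(KERNEL_DS);
file = filp_open("/tmp/dump", O_WRONLY | O_CREAT, 0644);
/* filp_open() returns an ERR_PTR() value on failure, never NULL. */
if (!IS_ERR(file)) {
    vfs_write(file, mem_addr, mem_length, &pos);
    filp_close(file, NULL);
}
/* Restore the original address limit. */
set_fs(old_fs);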
Reading can be done analogously, using a different flag instead (O_RDONLY or O_RDWR) and vfs_read().
Fakebox
2010-05-06
Today will include a small advertisement-like post. Cross compilation has always been a bothersome process. Right now it's not as bad as it once was though - yet it can still be time consuming, especially if you want to create your own GNU/Linux system for a different platform. For proper cross-compilation, we first need a specific toolchain present that will be able to generate code for our architecture of interest (we can use crosstool or OpenWrt here, for instance). Usually, this can be troublesome as well, even more so if the platform we're interested in is relatively unpopular - but we will skip this case for now and return to it in a later post.
There are many ways of building applications for non-local architectures. Having a ready toolchain, it's really just a matter of using its binaries instead of ours. OpenWrt, for instance, uses their so-called buildroot - a cross compilation system utilizing Makefiles. Most cross-magic is done there thanks to the power of configure and Makefile scripts, allowing different binaries from the toolchain to be used instead of the local ones. Not bad.
In one of my first posts I wrote that I prefer Fakebox - a cross compilation toolkit developed some time ago by ASN (advertisement!). Fakebox, similarly to Scratchbox, attempts to emulate a Linux machine on a Linux machine. I never used Scratchbox before, but from what I heard from my colleagues, even though it really does everything you need and it does it really well, it can be a troublesome and complex beast.
That is why they created Fakebox, a much simpler toolkit for the same purpose. It's really interesting how such a simple thing can work well for most uses.
Fakebox is nothing more than a batch of shell scripts and configuration files. To get it working, all you need is the toolchain and qemu installed. How does it work? The Fakebox website actually explains it nicely. What Fakebox does is:
- change $PATH so shell chooses development tools from Fakebox wrappers instead of /usr/bin, etc.
- most of these wrappers are symbolic links to one simple shell script which basically just adds toolchain prefix
- replaces uname with trivial shell script which returns contents of $FB_UNAME
- registers a binfmt_misc wrapper so binaries compiled for the target CPU architecture are executed via qemu emulation, in the target root filesystem
After this is done, we can run and compile programs for the specified architecture as we please. Since we are using binfmt_misc, all target executables are run using qemu emulation (e.g. qemu-mips or qemu-arm), so we can use them later on in the build process. Fakebox has its problems, but for what it is, it's still sufficient.
For instance, if we would like to set up an environment for a mips architecture in Fakebox (latest version from the git repositories), our fakebox.conf could look like this:
#!/bin/bash
FB_QEMU=mips
FB_UNAME="Linux amatsu 2.6.31 #1 Mon May 7 20:21:51 CEST 2007 mips unknown"
FB_TOOLCHAIN=$FB_TOP/mips/toolchain
FB_FTPAGENT="wget --continue --passive-ftp --tries=3 --waitretry=3 --timeout=10"
FB_PKGDW=$FB_TOP/mips/src
FB_PKGDEST=$FB_TOP/mips/build
FB_ROOTFS=$FB_TOP/mips/rootfs
FB_PATH="$FB_TOP/some_additional_tools_of_ours"
After putting this configuration file in the mips/ directory, we can run our new environment by typing "fakebox mips/". Fakebox then sets the $PATH variable to include our custom paths (FB_PATH), our toolchain path, our fakebox wrappers and so on, registers binfmt_misc for our architecture (if supported) and runs a new shell. Besides emulation, Fakebox also offers an integrated simple package manager with building capabilities (pkgtools) with overall functionality similar to the ones seen in OpenWrt (with the exception of being implemented in bash instead).
This is a very nice feature of Fakebox as well. You can organize your applications/libraries as packages, then download and build them easily when needed. After a package gets built, you can install/uninstall it on the target root file system with one command. It saves some time and makes development cleaner. Since it's an important feature, I will return to it in the next post about Fakebox. Stay tuned.
In-kernel module unloading and the usermode helper
2010-04-27
It seems safely unloading a Linux kernel module in kernel space is not a very straightforward task, but once you see how it's done, it seems awfully trivial - and strange. Those of you who have some kernel programming experience probably know about request_module() and its non-blocking version request_module_nowait(), both declared in include/linux/kmod.h. Just provide the name of the module you want to load as the parameter and the function takes care of the rest. Sadly, there is no such function for removing a loaded module.
There are many tools that could be used for this purpose in the Linux kernel, but none of them are explicitly exported for us to use.
But if we look into the __request_module function internals (the base for both the request_module functions) in kernel/kmod.c, we can see that it actually doesn't do anything with the module at all. All it does is run modprobe (yes, the one from userspace) to do the module loading instead - at least in versions 2.6.28 and 2.6.31.
It does this using a usermode helper, with the call_usermodehelper() function. Its syntax is similar to the execve() user mode function - requiring the binary path, a NULL-terminated string array of arguments, a NULL-terminated string array of environment variables and a flag indicating the wait policy. The wait policy defines whether the caller should wait for the process to finish (UMH_WAIT_PROC), just wait for the exec call to finish (UMH_WAIT_EXEC) or not to wait at all (UMH_NO_WAIT).
Now that we know how request_module() works, we can do the same thing when we need to remove a module by name. We can either do a modprobe -r or a rmmod call. This way, we're not unloading modules entirely by force, and it's safe for the operating system. This can be done, for instance, like this:
static char *argv[] = { "/sbin/rmmod", "my_module", NULL },
            *env[] = { "HOME=/", "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };
if (call_usermodehelper(argv[0], argv, env, UMH_WAIT_PROC))
    PRINTD(KERN_WARNING "Failed unloading the module\n");
Just remember that this alone does not tell us for certain whether the module has actually been unloaded. That requires additional checking done by ourselves later on.
You can of course use usermode helpers to call any other user mode applications when needed. Just use it with care. It's best if the kernel doesn't ask for help from user space. Kernel is strong.
Mikrotik RouterBoard 433AH
2010-04-12
Some time ago, I was given the opportunity to work on the Mikrotik RouterBoard 433AH platform. The RB433AH is mostly the same as the RB433, just a bit more powerful. It has the standard Atheros AR7130 chipset clocked at 640 MHz (some weaker versions involve a 300MHz processor), 128MB of RAM, 3 10/100 ethernet ports, 3 MiniPCI slots and a microSD card reader. A more detailed hardware specification can be found on the RouterBoard page here. Pretty interesting piece of hardware.
One thing that we need to get ready for is the serial port, since in this platform we need a null modem cable for connection.

By default, it comes with a pre-installed RouterBOOT bootloader and a RouterOS Level5 system. The best thing is that the RB433AH chipset is supported by OpenWRT out of the box. Building the system for this platform is quite straightforward, since it only requires choosing the AR71xx target system during configuration - but the OpenWRT team provided some additional information on their old wiki page here if necessary.
Thanks to this we have an open path for lightweight hacking.
The RouterBOOT uses the 'kernel' partition to load the kernel during system boot from NAND flash memory. The partition needs to be a yaffs2 filesystem with the kernel as an ELF executable of the same name placed in / of the partition directory tree.
The toolchain built during OpenWRT compilation can be later used for building your own system using any cross-compilation toolkit available like Scratchbox or Fakebox (or even no toolkit at all). My choice is, of course, Fakebox - as to why, I will get back to it in a different post later.
I also had to hack the toolchain a little bit, since I needed a full libiberty library with all the extra object files. But with that done, the rest is just compiling. Building the kernel requires us to apply the AR71xx-specific patches from OpenWRT first. Now - the interesting part. In the OpenWRT support patches, the flash partition table is defined statically at compile time in drivers/mtd/nand/rb4xx_nand.c. This cannot be modified by a mtdparts cmdline parameter in the kernel. The code is quite obvious. Why not mtdparts? Maybe because the RB4xx are MIPS based and make use of the RouterBOOT bootloader, which seems not to have any means of passing additional commandline arguments? Not sure.
static struct mtd_partition rb4xx_nand_partitions[] = {
    {
        .name = "booter",
        .offset = 0,
        .size = (256 * 1024),
        .mask_flags = MTD_WRITEABLE,
    },
    {
        .name = "kernel",
        .offset = (256 * 1024),
        .size = (4 * 1024 * 1024) - (256 * 1024),
    },
    {
        .name = "rootfs",
        .offset = MTDPART_OFS_NXTBLK,
        .size = MTDPART_SIZ_FULL,
    },
};
After examining the kernel output, we can notice that the RB433 has some interesting cmdline arguments passed to it at boot time, such as board, boot, HZ, console etc. These parameters are not set in CONFIG_CMDLINE in the .config file. How do those parameters get passed to the kernel?
This is another important thing worth noticing in the RB4xx boards - the Atheros SoC has an internal PROM with some board specific data included. The MIPS-configured kernel reads the PROM during boot time and appends the parameters. All the code regarding this procedure can be found in arch/mips/ar71xx/prom.c.
With all that, the Mikrotik RouterBoard 433 platform is a relatively powerful device for experimenting. Worth noting is also the fact that the smaller RB411 uses the very same chipset, with almost no notable differences (besides lacking a few features). It can run the same firmware with no problems. Killing two birds with one stone.
EDIT: After more experiments and gaining experience, I now know that not everything is as I thought while writing this post.
A welcoming post
2010-04-10
With this first post, I would like to welcome everyone to my personal web log. I intend to post here about my experiences with technologies that I encounter at work and during private research. Since my interests include embedded systems, kernel hacking, GNU/Linux programming and the Haiku operating system, you can expect a bit of those in the near future.
Since I am also an amateur artist, you can find some of my artworks in the Art section.
Just a reminder - this page is still under construction. I want to keep it as simple as possible, but there still is much to do. Also, I am not a web designer - keep that in mind when browsing through.
Enjoy.