2014 GSoC LSB projects

Go back to the main GSoC Linux Foundation page

Mailing list: lsb-discuss at lists dot linux-foundation dot org

IRC: #lsb on irc.freenode.net

Workgroup Resources

Code License: GPL/BSD, specs: GNU FDL

Mentors: Alexey Khoroshilov (alexey dot khoroshilov at gmail dot com), Jeff Licquia, Vladimir Rubanov, Denis Silakov (dsilakov at gmail dot com).

Implement C++ Support for the LSB Data Import Tools

The core of the LSB infrastructure is its database that stores different information about standardized elements (libraries, functions, etc.). In addition, the database stores information about the major Linux distributions (so called “community data”) – which libraries they contain, which interfaces provides, etc. This information is actively used when choosing candidates to be included in LSB.

The size of data stored in the LSB DB is huge (thousands of standardized elements and hundreds of millions of elements in different distributions). The data is mainly added to the database by means of automated tools - LSB Library Import Tools and Distribution and Application analyzers.

One of the weakness of these tools is poor support for C++-specific aspects. The tools are able to recognize C++ libraries, unmangle names of binary symbols (functions and variables) and decide the classes these symbols belong to, but that's all.

However, besides functions and variables, for LSB it is important to know structure of class virtual tables. Currently such data is added manually for standardized elements and it is completely missing in the community part of the database.

The aim of this project is to add possibility to collect data about virtual tables to LSB data collection tools. In addition, it will be required to improve LSB DB schema to store these data and improve LSB Navigator to display data about virtual tables of different classes in different distributions. Hint: one of the possible ways is to integrate the vtable-dumper tool with LSB data collection tools.

Desired knowledge: C++, Perl (LSB data collection tools are written in Perl), MySQL, PHP (LSB Navigator is written in PHP).

Mentor: Denis Silakov

Extend and Update LSB Core Test Suite

LSB Core tests [1,2] is the main part of LSB certification tests. Originally they were developed for POSIX (IEEE Std 1003.1-1990) and LSB 3.1. LSB specification was expanded since that time (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to fix open issues in the tests, i.e. to fix issues appearing on modern Linux distributions and to accomodate all relevant changes in text of specifications, as well as to expand coverage of the test suite to newly added LSB Core interfaces.

VSX-PCTS was implemented using TET/VSXgen test harness. The OLVER test suite was developed using model-based testing technology UniTESK. The technology follows well-known Design by Contract approach to represent formal specifications. Program contracts consist of preconditions and postconditions of interface operations and data type invariants.  The main advantage of program contracts is that they enable automatic construction of test oracles that check the conformance of the target system behavior to the specifications.

Traceability of formal specifications to text of original LSB/POSIX specifications is supported using the following ideas:

  • each elementary requirement in original specification obtains a unique identifier;
  • each of the requirements is explicitly linked to a fragment of text in the original specification;
  • all checks in preconditions and postconditions are marked by identifiers of corresponding requirements.

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/olver-core-tests/files 
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/runtime-test/files/

Desired knowledge: C, SEC, model based testing
  
Mentor: Alexey Khoroshilov, Roman Zybin

Develop New LSB Desktop Test Suite

LSB Desktop tests [1,2] is a part of LSB certification tests. Existing tests do not cover all interfaces included into LSB and LSB specification continues to grow (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to develop tests for newly included libraries and for interfaces that have no tests yet (http://linuxbase.org/navigator/coverage/cov2.php).

Gtk3/Gdk3 and friends are at the moment very high on our list of untested items. The test development ideally should be coordinated with upstream projects so the tests can be useful for them as well.

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/desktop-test/files
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/t2c-desktop-tests/files

Desired knowledge: C, Testing

Mentor: Alexey Khoroshilov, Roman Zybin

Parallel Verification of Linux Kernel Modules

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. Currently LDV tools is used to check Linux kernel modules against more then 30 safety rules, each of them formally defines what is a correct usage of a particular part of the Linux kernel API. By the moment, the framework has already helped to find and fix more than 130 issues in modules of the vanilla Linux kernel.

The fastest verification tools can check all modules of current versions of the Linux kernel against one rule specification for about a day. The number of Linux kernel modules, rule specifications, verification tools and their configurations is already big and it continues to grow rapidly. Eventually all this entailed that at the moment just a default verification tool of LDV tools (BLAST) in its default configuration is used to verify all modules of major releases of the Linux kernel against the most valuable rule specifications. And even this requires too much manual efforts to set up a verification environment, to track verification process fails and successes and to collect final verification results.

This project is aimed to extend LDV tools with the ability of parallel verification of Linux kernel modules. It includes parallelism both at the level of module preparation for verification (extraction, generation an environment model and instrumentation with a specified rule specification) and at the level of verification tools invocation. The former requires deep understanding of LDV tools internals. The latter requires experience in organization of efficient cluster systems since verification tools need huge computational resources and thus should be launched on different machines.

We expect that parallel verification of Linux kernel modules will considerably speed up verification of modules against many rule specifications as well as it will simplify deploying of LDV tools on many machines.

Desired knowledge: Perl (LDV tools is written mainly in Perl), cluster technologies, Linux kernel modules verification

Mentor: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Public Pool of Bugs in Linux Kernel Modules

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. Currently LDV tools is used to check Linux kernel modules against more then 30 safety rules, each of them formally defines what is a correct usage of a particular part of the Linux kernel API. By the moment, the framework has already helped to find and fix more than 130 issues in modules of the vanilla Linux kernel.

Last two years LDV tools aided to find even more bugs in Linux kernel modules then framework developers can report. From one hand, this points to further encouraging trends of advanced verification techniques application. From the other hand, it is a reason to establish more collaboration with developers of Linux kernel modules. First step in this direction is to share possible but not yet reported bugs in modules with them, so that they will be able to examine that these bugs are actual, fix them and report to LDV tools developers that bugs were fixed in respective commits.

To make collaboration more efficient it is required to develop an infrastructure for arrangement of a public pool of bugs in Linux kernel modules. This infrastructure should be based on a knowledge base - an internal base of LDV Toolkit used to mark up verification results (both possible bugs and false alarms). The knowledge base should be extended to keep information on when bugs were found, when they were fixed, who fixed them, what commits do that, etc.

The infrastructure should keep in a consistent state the knowledge base and the public pool of bugs. It should provide to module developers all data needed for analysis, like a kernel version, a module name, a rule specification checked and a link to a corresponding error trace that shows a path in module source code leading to a possible bug. And LDV Toolkit developers need convinient tools to manage the knowledge base and the public pool.

As a result we expect to awake several Linux kernel module developers interest in analysis of possible bugs in modules and to get feedback from them.

Desired knowledge: Linux kernel development, Perl (LDV tools is written mainly in Perl), MySQL, PHP (for a web user interface to the public pool), C, basis of verification techniques is welcome.

Mentor: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Formalization and Checking of Correct Usage of Kernel Core API

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. Currently LDV tools are used to check Linux device drivers against several safety rules, each of them formally defines what is correct usage of a particular part of kernel core API. Another way of looking at the safety rules is that a rule represents one or more typical misuse of kernel core API.

By the moment, the framework contains around 30 rules that has already helped to find and fix more than 130 issues in device drivers of vanilla Linux kernel.

This project (or several ones) is aimed at extending number of rules supported by the framework.

We have a collection of several dozens rules written in informal way, but other rules can be used as well. Formalization process usually includes the following steps:

  • identification of all elements of kernel core API relevant to the rule;
  • writing a finite state machine like formal model that consists of
    • model state (one or more C variables);
    • transitions between states linked to appropriate events in device driver (usually calls of kernel core API);
    • a set of error states (that represents misuse of kernel core API);
  • writing unit tests covering all the interfaces;
  • checking all drivers against the rule, analyzing results, reporting bugs and fixing the rule.

A notation used for formal representation of the rules is C code and AspectJ-like notation to merge the model code with original source code of device driver in appropriate places.

More than one application can be approved. Formalization of ~5 rules is expected from each student.

Desired knowledge: C, aspect-oriented programming, Linux kernel space API
  
Mentor: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Groups: