Blog posts for tags/bsd

  1. psutil 5.0.0 is twice as fast

    OK, this is a big one. Starting from psutil 5.0.0 you can query multiple pieces of Process information around twice as fast as with previous versions (see original ticket and updated doc). It took me 7 months, 108 commits and a massive refactoring of psutil internals (here is the big commit), and I can safely say this is one of the best improvements, and one of the longest-standing issues, to be addressed in a major psutil release. Here goes.

    The problem

    Except for some cases, the way different pieces of process information are retrieved varies depending on the OS. Sometimes it requires reading a file in the /proc filesystem (Linux), other times it requires using C (Windows, BSD, OSX, SunOS), but every time it's done differently. Psutil abstracts this complexity by providing a nice high-level interface so that you can, say, call Process.name() without worrying about what happens behind the curtains or what OS you're on.

    Internally, it is not rare for multiple pieces of process info (e.g. name(), ppid(), uids(), create_time()) to be fetched by the same routine. For example, on Linux we read /proc/PID/stat to get the process name, terminal, CPU times, creation time, status and parent PID, but only one value is returned and the others are discarded. On Linux the code below reads /proc/PID/stat 6 times:

    >>> import psutil
    >>> p = psutil.Process()
    >>> p.name()
    >>> p.cpu_times()
    >>> p.create_time()
    >>> p.ppid()
    >>> p.status()
    >>> p.terminal()
    

    Another example is BSD. In order to get process name, memory, CPU times and other metrics, a single sysctl() call is necessary, but again, because of how psutil has worked so far, that same sysctl() call is executed every time (see here, here, and so on): one piece of information is returned (say name()) and the rest is discarded. Not anymore.
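
    To make the Linux case concrete, here is a minimal sketch of how a single read of /proc/PID/stat already yields several of the values listed above. It is not psutil's actual parser: the function name parse_stat and the dictionary keys are made up for this illustration, and the field positions are taken from the proc(5) man page.

```python
def parse_stat(line):
    """Parse one /proc/PID/stat line into a dict of selected fields.

    Illustrative only: the name and the keys are this sketch's own labels.
    """
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the last ')'.
    head, _, tail = line.rpartition(")")
    pid, _, name = head.partition(" (")
    fields = tail.split()  # fields[0] is field 3 (state) in proc(5)
    return {
        "pid": int(pid),
        "name": name,
        "status": fields[0],           # field 3: state
        "ppid": int(fields[1]),        # field 4: parent PID
        "tty_nr": int(fields[4]),      # field 7: controlling terminal
        "utime": int(fields[11]),      # field 14: user time, clock ticks
        "stime": int(fields[12]),      # field 15: system time, clock ticks
        "starttime": int(fields[19]),  # field 22: start time, clock ticks
    }
```

    On a Linux box you could feed it open("/proc/self/stat").read(); one read, several values, which is exactly the work that was being thrown away on every call.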

    Do it in one shot

    Clearly the approach described above is not efficient, especially considering that applications similar to top, htop, ps or glances usually collect more than one piece of info per process. psutil 5.0.0 introduces a new oneshot() context manager. When used, the internal routine is executed once (in the example below on name()) and the other values are cached. The subsequent calls sharing the same internal routine (read /proc/PID/stat, call sysctl() or whatever) will return the cached value. With psutil 5.0.0 the code above can be rewritten like this, and on Linux it will run 2.4 times faster:

    >>> import psutil
    >>> p = psutil.Process()
    >>> with p.oneshot():
    ...     p.name()
    ...     p.cpu_times()
    ...     p.create_time()
    ...     p.ppid()
    ...     p.status()
    ...     p.terminal()
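
    For a top-like tool the same pattern is usually wrapped in a small helper that is applied to every process. The sketch below is one way to do it; the helper name snapshot and the choice of fields are mine, and it only relies on the Process methods already shown above.

```python
def snapshot(proc):
    """Collect several fields for one process inside a single oneshot()."""
    with proc.oneshot():
        # All calls below share the cached result of the grouped routine.
        return {
            "name": proc.name(),
            "ppid": proc.ppid(),
            "status": proc.status(),
            "cpu_times": proc.cpu_times(),
            "create_time": proc.create_time(),
        }
```

    Typical usage would be something like [snapshot(p) for p in psutil.process_iter()], keeping in mind that a process may disappear or deny access while you are iterating, so real code should catch psutil.NoSuchProcess and psutil.AccessDenied.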
    

    Implementation

    One great thing about psutil's design is its abstraction. It is divided into 3 "layers". The first layer is the main Process class (Python), which dictates the end-user high-level API. The second layer is the OS-specific Python module, which is a thin wrapper on top of the OS-specific C extension module (third layer). Because the code was organized this way (modularly), the refactoring was reasonably smooth. I first refactored those C functions collecting multiple pieces of info and grouped them in a single function (e.g. see BSD implementation). Then I wrote a decorator which enables the cache only when requested (when entering the context manager) and decorated the "grouped functions" with it. The whole thing is enabled on request by the highest-level oneshot() context manager, which is the only thing exposed to the end user. Here's the decorator:

    import functools
    
    def memoize_when_activated(fun):
        """A memoize decorator which is disabled by default. It can be
        activated and deactivated on request.
        """
        @functools.wraps(fun)
        def wrapper(self):
            if not wrapper.cache_activated:
                return fun(self)
            else:
                try:
                    ret = cache[fun]
                except KeyError:
                    ret = cache[fun] = fun(self)
                return ret
    
        def cache_activate():
            """Activate cache."""
            wrapper.cache_activated = True
    
        def cache_deactivate():
            """Deactivate and clear cache."""
            wrapper.cache_activated = False
            cache.clear()
    
        cache = {}
        wrapper.cache_activated = False
        wrapper.cache_activate = cache_activate
        wrapper.cache_deactivate = cache_deactivate
        return wrapper
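
    To show how the pieces fit together, here is a toy version of the wiring between the decorator and a oneshot() context manager. The decorator is repeated (slightly condensed) so the block runs on its own; FakeProcess, its _read_stat() method and the reads counter are hypothetical and only loosely mimic what the real OS-specific modules do.

```python
import contextlib
import functools


def memoize_when_activated(fun):
    """Same decorator as above, repeated so this block is self-contained."""
    @functools.wraps(fun)
    def wrapper(self):
        if not wrapper.cache_activated:
            return fun(self)
        try:
            return cache[fun]
        except KeyError:
            ret = cache[fun] = fun(self)
            return ret

    def cache_activate():
        wrapper.cache_activated = True

    def cache_deactivate():
        wrapper.cache_activated = False
        cache.clear()

    cache = {}
    wrapper.cache_activated = False
    wrapper.cache_activate = cache_activate
    wrapper.cache_deactivate = cache_deactivate
    return wrapper


class FakeProcess:
    """Hypothetical stand-in for an OS-specific Process implementation."""

    def __init__(self):
        self.reads = 0  # how many times the expensive routine actually ran

    @memoize_when_activated
    def _read_stat(self):
        # Pretend this reads /proc/PID/stat or calls sysctl().
        self.reads += 1
        return {"name": "python", "ppid": 1}

    def name(self):
        return self._read_stat()["name"]

    def ppid(self):
        return self._read_stat()["ppid"]

    @contextlib.contextmanager
    def oneshot(self):
        # Turn the cache on for the duration of the block, then clear it.
        self._read_stat.cache_activate()
        try:
            yield
        finally:
            self._read_stat.cache_deactivate()
```

    Outside the context manager each call to name() or ppid() bumps the counter; inside it the grouped routine runs once and later calls are served from the cache.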
    

    Finally, in order to measure the various speedups, I wrote a benchmark script (well, two actually) and kept tuning until I was sure the various changes actually made psutil faster. The benchmark scripts calculate the speedup you get if you call all the "grouped" methods together (best case scenario).
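
    The real benchmark scripts live in the psutil repository; what follows is only a rough sketch of the idea, written with timeit. The function names are mine, and the set of methods is just the Linux example from earlier, not the exact list the real scripts use.

```python
import timeit


def call_plain(proc):
    """Call a handful of methods one by one, without any caching."""
    proc.name()
    proc.cpu_times()
    proc.create_time()
    proc.ppid()
    proc.status()


def call_oneshot(proc):
    """Call the same methods inside oneshot() so they share one read."""
    with proc.oneshot():
        call_plain(proc)


def speedup(proc, number=1000):
    """Return how many times faster the oneshot() variant ran."""
    plain = timeit.timeit(lambda: call_plain(proc), number=number)
    grouped = timeit.timeit(lambda: call_oneshot(proc), number=number)
    return plain / grouped
```

    With psutil installed you would run speedup(psutil.Process()); expect the number to move around between runs and machines.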

    Linux: +2.56x speedup

    The Linux process module is the only pure-Python implementation, as (almost) all process info is gathered by reading files in the /proc filesystem. /proc files typically contain several pieces of information about the process, and /proc/PID/stat and /proc/PID/status are perfect examples. That's why on Linux we aggregate them in 3 groups. The relevant part of the Linux implementation can be seen here.

    Windows: from +1.9x to +6.5x speedup

    Windows is an interesting one. In normal circumstances, if we're querying a process owned by our user, we group together only the process' num_threads(), num_ctx_switches() and num_handles(), getting a +1.9x speedup if we access those methods in one shot. Windows is peculiar though, because certain methods use a dual implementation: a "fast method" is attempted first, but if the process is owned by another user it fails with AccessDenied. In that case psutil falls back on a second, "slower" method (see here for example). The second method is slower because it iterates over all PIDs but, unlike the "plain" Windows APIs, it can be used to get multiple pieces of info in one shot: num threads, context switches, handles, CPU times, create time and IO counters. That is why querying processes owned by other users results in an impressive +6.5x speedup.
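
    The dual implementation boils down to a try/except fallback. Here is a stripped-down sketch of that pattern; the AccessDenied class below is a local stand-in (psutil has its own), and the fast and slow callables are placeholders for the real C-backed routines.

```python
class AccessDenied(Exception):
    """Local stand-in for the error raised when the fast path is refused."""


def query(pid, fast, slow):
    """Try the fast per-process call; fall back to the slower bulk one."""
    try:
        return fast(pid)
    except AccessDenied:
        # The slow path returns every grouped metric at once, so caching
        # its result pays off most for processes owned by other users.
        return slow(pid)
```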

    OSX: +1.92x speedup

    On OSX we can get 2 groups of information. With the sysctl() syscall we get process parent PID, uids, gids, terminal, create time and name. With the proc_info() syscall we get CPU times (for PIDs owned by another user), memory metrics and ctx switches. Not bad.

    BSD: +2.18x speedup

    BSD was an interesting one, as we gather a ton of process info just by calling sysctl() (see implementation). In a single shot we get process name, ppid, status, uids, gids, IO counters, CPU and create times, terminal and ctx switches.

    SunOS: +1.37x speedup

    The SunOS implementation is similar to the Linux one in that it reads files in the /proc filesystem but, unlike Linux, this is done in C. Here too, we can group different metrics together (see here and here).

  2. psutil NetBSD support

    Roughly two months have passed since I announced that psutil added support for OpenBSD platforms. Today I am happy to announce we also have NetBSD support! This was contributed by Thomas Klausner, Ryo Onodera and myself in PR #570.

    Differences with FreeBSD (and OpenBSD)

    The NetBSD implementation has limitations similar to the ones I encountered with OpenBSD. Again, FreeBSD presents itself as the BSD variant with the best support in terms of kernel functionality.

    • Process.memory_maps() is not implemented. The kernel provides the necessary pieces but I didn't do this yet (hopefully later).
    • Process.num_ctx_switches()'s involuntary field is always 0. kinfo_proc provides this info but it is always set to 0.
    • Process.cpu_affinity() (get and set) is not supported.
    • psutil.cpu_count(logical=False) always returns None.

    As for the rest: it is all there. All memory, disk, network and process APIs are fully supported and functioning.
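
    If you write code that has to run on several BSD variants, these gaps are easy to paper over. A small sketch follows; both helper names are mine, and the second one assumes that an unsupported method is simply missing from the Process class on that platform, which you should double check against the psutil version you use.

```python
def physical_or_logical_cores(physical, logical):
    """Fall back to the logical count when the physical one is unknown."""
    return logical if physical is None else physical


def cpu_affinity_or_none(proc):
    """Return the affinity list, or None where the method does not exist.

    Assumption: on unsupported platforms the attribute is absent.
    """
    method = getattr(proc, "cpu_affinity", None)
    return method() if method is not None else None
```

    The first helper would be called as physical_or_logical_cores(psutil.cpu_count(logical=False), psutil.cpu_count()).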

    Other enhancements available in this psutil release

    Other than NetBSD support this new release has a couple of interesting enhancements:

    • #708: [Linux] psutil.net_connections() and Process.connections() on Python can be up to 3x faster in case of many connections.
    • #718: process_iter() is now thread safe.

    You can read the rest in the HISTORY file, as usual.

    Move to Prague

    As a personal note I'd like to add that I'm currently in Prague (Czech Republic) and I'm thinking about moving down here for a while. The city is great and girls are beautiful. ;-)

  3. psutil OpenBSD support

    OK, this is a big one: starting from version 3.3.0 (released just now) psutil will officially support OpenBSD platforms. This was contributed by Landry Breuil (thanks dude!) and myself in PR-615. The interesting parts of the code changes are this and this.

    Differences with FreeBSD

    As expected, the OpenBSD implementation is very similar to FreeBSD's (which was already in place), which is why I decided to merge most of it in a single C file (_psutil_bsd.c) and use 2 separate C files for the parts where the two implementations differed too much: freebsd.c and openbsd.c. In terms of functionality, here are the differences with FreeBSD. Unless specified otherwise, these differences are due to the kernel not providing the information natively (meaning we can't do anything about it).

    • Process.memory_maps() is not implemented. The kernel provides the necessary pieces but I didn't do this yet (hopefully later).
    • Process.num_ctx_switches()'s involuntary field is always 0. kinfo_proc provides this info but it is always set to 0.
    • Process.cpu_affinity() (get and set) is not supported.
    • Process.exe() is determined by inspecting the command line, so it may not always be available (in which case it returns None).
    • psutil.swap_memory() sin and sout (swap in and swap out) values are not available and hence are always set to 0.
    • psutil.cpu_count(logical=False) always returns None.

    Similarly to FreeBSD, the OpenBSD implementation of Process.open_files() is also problematic, as it is not able to return file paths (FreeBSD sometimes can). Other than these differences the functionality is all there and pretty much the same, so overall I'm pretty satisfied with the result.

    Considerations about BSD platforms

    psutil has supported FreeBSD basically since the beginning (year 2009). At the time it made sense to support FreeBSD instead of other BSD variants because it is the most popular, followed by OpenBSD and NetBSD. Compared to FreeBSD, OpenBSD appears to be more "minimal" both in terms of facilities provided by the kernel and the number of system administration tools available. One thing which I appreciate a lot about FreeBSD is that the source code of all CLI tools installed on the system is available under /usr/src, which was a big help for implementing all psutil APIs. OpenBSD source code is also available but it uses CVS and I am not sure it includes the source code for all CLI tools. There are still two more BSD variants for which it may be worth adding support: NetBSD and DragonflyBSD (in this order). About a year ago someone provided a patch adding basic NetBSD support, so it is likely that will happen sooner or later.

    Other enhancements available in this release

    The only other enhancement is issue #558, which allows specifying a different location of /proc filesystem on Linux.
