Blog posts for tags/compatibility

  1. Reimplementing ifconfig in Python

    Here we are. It's been a long time since my last blog post and my last psutil release. The reason? I've been travelling! I mean... a lot. I've spent 3 months in Berlin, 3 weeks in Japan and 2 months in New York City. While I was there I finally had the chance to meet my friend Jay Loden in person. We originally started working on psutil together 7 years ago.

    Back then I didn't know any C (and I'm still a terrible C developer), so he was crucial in developing the initial psutil skeleton, including macOS and Windows support. Needless to say, this release builds on that work.

    net_if_addrs()

    We're now able to list network interface addresses similarly to the ifconfig command on UNIX:

    >>> import psutil
    >>> from pprint import pprint
    >>> pprint(psutil.net_if_addrs())
    {'ethernet0': [snic(family=<AddressFamily.AF_INET: 2>,
                        address='10.0.0.4',
                        netmask='255.0.0.0',
                        broadcast='10.255.255.255'),
                   snic(family=<AddressFamily.AF_PACKET: 17>,
                        address='9c:eb:e8:0b:05:1f',
                        netmask=None,
                        broadcast='ff:ff:ff:ff:ff:ff')],
     'localhost': [snic(family=<AddressFamily.AF_INET: 2>,
                        address='127.0.0.1',
                        netmask='255.0.0.0',
                        broadcast='127.0.0.1'),
                   snic(family=<AddressFamily.AF_PACKET: 17>,
                        address='00:00:00:00:00:00',
                        netmask=None,
                        broadcast='00:00:00:00:00:00')]}
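    As a rough illustration of how this structure can be consumed, the sketch below collects the IPv4 addresses of each interface. The helper name ipv4_by_interface and the snic stand-in tuple are made up for this example; the function takes the mapping as an argument so it can be tried on sample data, and with psutil installed you would pass it psutil.net_if_addrs() instead.

```python
import socket
from collections import namedtuple

# Stand-in for the tuples psutil returns; only the field names matter here.
snic = namedtuple("snic", ["family", "address", "netmask", "broadcast"])


def ipv4_by_interface(addrs):
    """Map interface name -> list of IPv4 addresses (hypothetical helper)."""
    result = {}
    for name, entries in addrs.items():
        ipv4 = [e.address for e in entries if e.family == socket.AF_INET]
        if ipv4:
            result[name] = ipv4
    return result


# Sample data shaped like the session output; 17 is the AF_PACKET value shown there.
sample = {
    "ethernet0": [
        snic(socket.AF_INET, "10.0.0.4", "255.0.0.0", "10.255.255.255"),
        snic(17, "9c:eb:e8:0b:05:1f", None, "ff:ff:ff:ff:ff:ff"),
    ],
    "localhost": [snic(socket.AF_INET, "127.0.0.1", "255.0.0.0", "127.0.0.1")],
}
print(ipv4_by_interface(sample))
```

    Filtering on the family field rather than on the shape of the address string keeps the helper independent of how each platform formats link-layer entries.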
    

    This is limited to AF_INET (IPv4), AF_INET6 (IPv6) and AF_LINK (Ethernet) address families; on Linux the link-layer family shows up as AF_PACKET, as in the interpreter session above. If you want something more powerful (e.g. AF_BLUETOOTH) you can take a look at the netifaces extension. The implementation is split into separate code for POSIX and Windows.

    net_if_stats()

    This new function returns information about network interface cards:

    >>> import psutil
    >>> from pprint import pprint
    >>> pprint(psutil.net_if_stats())
    {'ethernet0': snicstats(isup=True,
                            duplex=<NicDuplex.NIC_DUPLEX_FULL: 2>,
                            speed=100,
                            mtu=1500),
     'localhost': snicstats(isup=True,
                            duplex=<NicDuplex.NIC_DUPLEX_UNKNOWN: 0>,
                            speed=0,
                            mtu=65536)}
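    A small sketch of how this mapping might be used: listing only the interfaces that are up. The helper name active_interfaces, the snicstats stand-in tuple and the wifi0 entry are invented for illustration; with psutil installed you would pass psutil.net_if_stats() instead of the sample dict.

```python
from collections import namedtuple

# Stand-in for psutil's result tuples; only the field names matter here.
snicstats = namedtuple("snicstats", ["isup", "duplex", "speed", "mtu"])


def active_interfaces(stats):
    """Return (name, speed, mtu) for every interface that is up, sorted by name."""
    return sorted(
        (name, st.speed, st.mtu) for name, st in stats.items() if st.isup
    )


# Sample data; "wifi0" is a made-up interface that is down.
sample = {
    "ethernet0": snicstats(isup=True, duplex=2, speed=100, mtu=1500),
    "localhost": snicstats(isup=True, duplex=0, speed=0, mtu=65536),
    "wifi0": snicstats(isup=False, duplex=0, speed=0, mtu=1500),
}
for name, speed, mtu in active_interfaces(sample):
    print(f"{name}: speed={speed} mtu={mtu}")
```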
    

    The implementation differs on each platform.

    Also in 3.0

    Beyond the network-interface APIs, psutil 3.0 ships a few other notable changes.

    Several integer/string constants (IOPRIO_CLASS_*, NIC_DUPLEX_*, *_PRIORITY_CLASS) now return enum values on Python 3.4+.
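    To show why this change is mostly harmless for existing code, here is a sketch that assumes the integer constants behave like enum.IntEnum members, which is what the reprs in the net_if_stats() output suggest. The NicDuplex class below is a stand-in written for this example, not psutil's own definition.

```python
import enum


class NicDuplex(enum.IntEnum):
    # Stand-in mirroring the member names and values seen in the output above.
    NIC_DUPLEX_UNKNOWN = 0
    NIC_DUPLEX_FULL = 2


duplex = NicDuplex.NIC_DUPLEX_FULL

# Old code comparing against the plain integer keeps working...
print(duplex == 2)
# ...while new code gets a readable name for logs and debugging.
print(duplex.name)
```

    Under that assumption, the members are still integers, so comparisons and arithmetic written against the old plain values do not need to change.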

    Support for zombie processes on UNIX was broken and has now been fixed. This is covered in a separate post.

    Removal of deprecated APIs

    All aliases deprecated in the psutil 2.0 porting guide (January 2014) are gone. For the full list see the changelog.

    Final words

    I must say I'm pretty satisfied with how psutil is evolving and with the enjoyment I still get every time I work on it. It now gets almost 800,000 downloads a month, which is quite remarkable for a Python library.

    At this point, I consider psutil almost "complete" feature-wise, meaning I'm starting to run out of ideas for what to add next (see TODO). Going forward, development will likely focus on supporting more exotic platforms (OpenBSD #562, NetBSD PR-557, Android #355).

    There have also been discussions on the python-ideas mailing list about including psutil in the Python stdlib, but even if that happens, it's still a long way off, as it would require a significant time investment that I currently don't have.

  2. Proper zombie process handling

    This is part of the psutil 3.0 release (see the full release notes).

    Except on Linux and Windows (which does not have them), support for zombie processes was broken. The full story is in #428.

    The problem

    Say you create a zombie process and instantiate a Process for it:

    import os
    import time
    
    import psutil
    
    def create_zombie():
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child exits immediately
        time.sleep(0.1)  # give the child time to exit
        return pid  # parent does NOT call wait(), so the child stays a zombie
    
    pid = create_zombie()
    p = psutil.Process(pid)
    

    Up until psutil 2.X, every time you tried to query it you'd get a NoSuchProcess exception:

    >>> p.name()
      File "psutil/__init__.py", line 374, in _init
        raise NoSuchProcess(pid, None, msg)
    psutil.NoSuchProcess: no process found with pid 123
    

    This was misleading, because the PID technically still existed:

    >>> psutil.pid_exists(p.pid)
    True
    

    Depending on the platform, some process information could still be retrieved:

    >>> p.cmdline()
    ['python']
    

    Worst of all, psutil.process_iter() didn't return zombies at all. That was a real problem, because identifying them is a legitimate use case: a zombie usually indicates a bug where a parent process spawns a child, kills it, but never calls wait() to reap it.
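    For contrast, this is roughly what a well-behaved parent does: it calls wait() (here os.waitpid()) so the kernel can drop the child's entry from the process table. It is a POSIX-only sketch using the standard library; the function name spawn_and_reap and the exit code 7 are arbitrary choices for the example.

```python
import os


def spawn_and_reap(exit_code=7):
    """Fork a child that exits at once, then reap it (POSIX only)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: terminate without running cleanup handlers
    # Parent: waitpid() blocks until the child has exited, then collects its
    # status, which removes the zombie entry from the process table.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status)


pid, reaped_pid, code = spawn_and_reap()
print(pid == reaped_pid, code)
```

    Once the child has been reaped, a second os.waitpid() on the same PID raises ChildProcessError, because there is nothing left to collect.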

    What changed

    • A new ZombieProcess exception is raised whenever a process cannot be queried because it is a zombie.
    • It replaces NoSuchProcess, which was incorrect and misleading.
    • ZombieProcess inherits from NoSuchProcess, so existing code keeps working.
    • psutil.process_iter() now correctly includes zombie processes, so you can reliably identify them:
    import psutil
    
    zombies = []
    for p in psutil.process_iter():
        try:
            if p.status() == psutil.STATUS_ZOMBIE:
                zombies.append(p)
        except psutil.NoSuchProcess:
            pass
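    Because ZombieProcess is a subclass of NoSuchProcess, the order of except clauses matters if you want to tell the two cases apart. The sketch below uses stand-in exception classes that only mirror the hierarchy described above, so it runs without psutil; in real code you would catch psutil.ZombieProcess and psutil.NoSuchProcess. The function names are invented for the example.

```python
class NoSuchProcess(Exception):
    """Stand-in for psutil.NoSuchProcess."""


class ZombieProcess(NoSuchProcess):
    """Stand-in for psutil.ZombieProcess (a subclass, as described above)."""


def classify(query):
    """Call query() and report how it failed, most specific exception first."""
    try:
        query()
    except ZombieProcess:
        return "zombie"
    except NoSuchProcess:
        return "gone"
    return "alive"


def legacy(query):
    """Older code that only knows about NoSuchProcess still catches zombies."""
    try:
        query()
    except NoSuchProcess:
        return "gone"
    return "alive"


def raise_zombie():
    raise ZombieProcess()


def raise_gone():
    raise NoSuchProcess()


print(classify(raise_zombie), classify(raise_gone), legacy(raise_zombie))
```

    If the two except clauses in classify() were swapped, the NoSuchProcess branch would swallow zombies too and the more specific branch would never run.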
    
  3. Announcing psutil 2.0

    psutil 2.0 is out. This is a major rewrite and reorganization of both the Python and C extension modules. It cost me four months of work and more than 22,000 changed lines (going by the diff against the old 1.2.1). Many of the changes are not backward compatible; I'm sure this will cause some pain, but I think it's for the better and needed to be done.

    API changes

    I already wrote a detailed blog post about this, so use that as the official reference on how to port your code.

    RST documentation

    I've never been happy with the old doc hosted on Google Code. The markup language provided by Google is pretty limited, plus it's not under revision control. The new doc is more detailed, uses reStructuredText as the markup language, lives in the same code repository as psutil, and is hosted on the excellent Read the Docs: http://psutil.readthedocs.org/

    Physical CPUs count

    You're now able to distinguish between logical and physical CPUs. The full story is in #427.

    >>> psutil.cpu_count()  # logical
    4
    >>> psutil.cpu_count(logical=False)  # physical cores only
    2
    

    Process instances are hashable

    psutil.Process instances can now be compared for equality and used in sets and dicts. The most useful application is diffing process snapshots:

    >>> before = set(psutil.process_iter())
    >>> # ... some time passes ...
    >>> after = set(psutil.process_iter())
    >>> new_procs = after - before  # processes spawned in between
    

    Equality is not just PID-based. It also includes the process creation time, so a Process whose PID got reused by the kernel won't be mistaken for the original. The full story is in #452.
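    The idea can be sketched with a small stand-in class whose equality and hash are both derived from the (PID, creation time) pair. This is not psutil's code; ProcessKey and the sample values are hypothetical and only show why a reused PID does not compare equal to the original process.

```python
class ProcessKey:
    """Hypothetical stand-in showing identity based on PID plus creation time."""

    def __init__(self, pid, create_time):
        self.pid = pid
        self.create_time = create_time

    def _ident(self):
        return (self.pid, self.create_time)

    def __eq__(self, other):
        if not isinstance(other, ProcessKey):
            return NotImplemented
        return self._ident() == other._ident()

    def __hash__(self):
        # Must agree with __eq__ so instances behave correctly in sets and dicts.
        return hash(self._ident())


original = ProcessKey(pid=1234, create_time=1000.0)
same = ProcessKey(pid=1234, create_time=1000.0)
reused = ProcessKey(pid=1234, create_time=2000.0)  # same PID, recycled later

before = {original}
after = {same, reused}
new_procs = after - before  # only the process that reused the PID remains
print(len(new_procs))
```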

    Speedups

    • #477: Process.cpu_percent() is about 30% faster.
    • #478: (Linux) almost all APIs are about 30% faster on Python 3.X.

    Other improvements and bugfixes

    • #424: published Windows installers for Python 3.X 64-bit.
    • #447: the psutil.wait_procs() timeout parameter is now optional.
    • #459: a Makefile is now available for running tests and other repetitive tasks (also on Windows).
    • #463: the timeout parameter of cpu_percent* functions defaults to 0.0, because the previous default was a common source of slowdowns.
    • #340: (Windows) Process.open_files() no longer hangs.
    • #448: (Windows) fixed a memory leak affecting Process.children() and Process.ppid().
    • #461: namedtuples are now pickle-able.
    • #474: (Windows) Process.cpu_percent() is no longer capped at 100%.
  4. Porting your code to psutil 2.0

    This post is about psutil 2.0, a major release in which I decided to reorganize the existing API for the sake of consistency. At the time of writing, psutil 2.0 is still under development, and this post is meant to serve as the official reference on how to port your existing code base. Along the way I will also explain why I made these changes. Even though many old APIs will remain available as aliases pointing to the new ones, the changes are numerous and many of them are not backward compatible. I'm sure many people will be sorely bitten, but I think this is for the better and it needed to be done, hopefully for the first and last time.

    Module constants turned into functions

    What changed

    Old name              Replacement
    psutil.BOOT_TIME      psutil.boot_time()
    psutil.NUM_CPUS       psutil.cpu_count()
    psutil.TOTAL_PHYMEM   psutil.virtual_memory().total

    Why I did it

    I already talked about this more extensively in the previous blog post, Making constants part of your API is evil. In short: besides introducing unnecessary slowdowns, calculating a module-level constant at import time is dangerous because if something goes wrong the whole app will crash. Also, the values represented may change over time (think about the system clock), but a constant cannot be updated. Thanks to this hack, accessing the old constants still works and produces a DeprecationWarning.
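    The general shape of such a trick can be sketched as follows. This is an illustration of the technique, not psutil's actual code: the _CompatModule class, the fakeutil module name and the fixed boot time value are all made up. The point is that the old constant name is resolved lazily, on access, and warns while doing so.

```python
import types
import warnings


class _CompatModule(types.ModuleType):
    """Hypothetical module object that keeps a removed constant readable."""

    def boot_time(self):
        # Stand-in value; a real implementation would ask the OS each time.
        return 1400000000.0

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for old names.
        if name == "BOOT_TIME":
            warnings.warn(
                "BOOT_TIME is deprecated; use boot_time() instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return self.boot_time()
        raise AttributeError(name)


mod = _CompatModule("fakeutil")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    value = mod.BOOT_TIME  # old spelling: still works, but warns

print(value, caught[0].category.__name__)
```

    Nothing is computed at import time here, so a failure in boot_time() surfaces only when the value is actually requested.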

    Renamed module functions

    What changed

    Old name                 Replacement
    psutil.get_boot_time()   psutil.boot_time()
    psutil.get_pid_list()    psutil.pids()
    psutil.get_users()       psutil.users()

    Why I did it

    They were the only module-level functions with a get_ prefix. None of the others had one.

    Renamed Process class methods

    All methods lost their get_ and set_ prefixes. A single method can now be used for both getting and setting (if a value is passed). Assuming p = psutil.Process():

    Old name                   Replacement
    p.get_children()           p.children()
    p.get_connections()        p.connections()
    p.get_cpu_affinity()       p.cpu_affinity()
    p.get_cpu_percent()        p.cpu_percent()
    p.get_cpu_times()          p.cpu_times()
    p.get_io_counters()        p.io_counters()
    p.get_ionice()             p.ionice()
    p.get_memory_info()        p.memory_info()
    p.get_ext_memory_info()    p.memory_info_ex()
    p.get_memory_maps()        p.memory_maps()
    p.get_memory_percent()     p.memory_percent()
    p.get_nice()               p.nice()
    p.get_num_ctx_switches()   p.num_ctx_switches()
    p.get_num_fds()            p.num_fds()
    p.get_num_threads()        p.num_threads()
    p.get_open_files()         p.open_files()
    p.get_rlimit()             p.rlimit()
    p.get_threads()            p.threads()
    p.getcwd()                 p.cwd()

    ...as for set_* methods:

    Old name               Replacement
    p.set_cpu_affinity()   p.cpu_affinity(cpus)
    p.set_ionice()         p.ionice(ioclass, value=None)
    p.set_nice()           p.nice(value)
    p.set_rlimit()         p.rlimit(resource, limits=None)

    Why I did it

    I wanted to be consistent with system-wide module-level functions, which have no get_ prefix. After I got rid of the get_ prefixes, removing set_ too seemed natural and helped reduce the number of methods.
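    The combined getter/setter pattern looks roughly like this. FakeProcess is a hypothetical stand-in that just stores the value instead of talking to the operating system; only the calling convention is meant to match the tables above.

```python
class FakeProcess:
    """Hypothetical stand-in; it stores the value instead of asking the OS."""

    def __init__(self):
        self._nice = 0

    def nice(self, value=None):
        """Return the niceness when called bare, set it when given a value."""
        if value is None:
            return self._nice
        self._nice = int(value)


p = FakeProcess()
p.nice(10)        # setter form, replaces set_nice(10)
print(p.nice())   # getter form, replaces get_nice()
```

    Using None as the "no argument" marker works here because None is never a valid value to set.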

    Process properties are now methods

    What changed

    Assuming p = psutil.Process():

    Old name        Replacement
    p.cmdline       p.cmdline()
    p.create_time   p.create_time()
    p.exe           p.exe()
    p.gids          p.gids()
    p.name          p.name()
    p.parent        p.parent()
    p.ppid          p.ppid()
    p.status        p.status()
    p.uids          p.uids()
    p.username      p.username()

    Why I did it

    Different reasons:

    • Having a mixed API that uses both properties and methods for no particular reason is confusing and messy, because you don't know whether to use () or not.
    • A property is usually expected not to perform heavy computations internally, whereas psutil invokes a function every time it is accessed. This has two drawbacks:
      • You may get an exception just by accessing the property (e.g. p.name may raise NoSuchProcess or AccessDenied).
      • You may erroneously think properties are cached, but this is true only for name, exe, and create_time.

    CPU percent intervals

    What changed

    The timeout parameter of cpu_percent* functions (spelled interval in the function signatures) now defaults to 0.0 instead of 0.1. The functions affected are:

    • Process.cpu_percent()
    • psutil.cpu_percent()
    • psutil.cpu_times_percent()

    Why I did it

    I originally set 0.1 as the default timeout because you need to wait some time in order to get a meaningful percent value. Having an API that "sleeps" by default is risky, though, because it's easy to forget that it does so. That is particularly problematic when calling Process.cpu_percent() for all processes: it's very easy to forget to pass a zero timeout (interval=0 in the actual signature), resulting in dramatic slowdowns that are hard to spot. For example, with the old default this code snippet blocks for about 0.1 seconds per process, so its total run time grows with the number of active processes:

    >>> # this will be slow
    >>> for p in psutil.process_iter():
    ...    print(p.cpu_percent())
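    The reason a wait is needed at all is that a CPU percentage is a ratio between two samples: CPU seconds consumed divided by wall-clock seconds elapsed. The sketch below shows that arithmetic on made-up numbers, with one shared wait covering every process instead of one sleep per process. It is a simplification that ignores multi-core details and is not psutil's implementation; the function names and sample values are hypothetical.

```python
def cpu_percent(cpu_before, cpu_after, elapsed):
    """CPU seconds consumed during `elapsed` wall-clock seconds, as a percentage."""
    if elapsed <= 0:
        return 0.0  # no time has passed, so there is nothing to measure
    return 100.0 * (cpu_after - cpu_before) / elapsed


def percent_for_all(before, after, elapsed):
    """Compare two {pid: cpu_seconds} snapshots taken `elapsed` seconds apart."""
    return {
        pid: cpu_percent(before[pid], after[pid], elapsed)
        for pid in before.keys() & after.keys()  # skip processes that came or went
    }


# Made-up samples: one shared 0.5 s wait covers every process.
before = {101: 1.00, 102: 4.00, 103: 0.20}
after = {101: 1.25, 102: 4.00, 104: 0.10}
print(percent_for_all(before, after, elapsed=0.5))
```

    With this shape the cost of waiting is paid once per sampling round, no matter how many processes are being watched.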
    

    Migration strategy

    Except for Process properties (name, exe, cmdline, etc.), all the old APIs are still available as aliases that point to the newer names and raise DeprecationWarning. psutil tells you exactly what to use instead of the deprecated API, as long as you start the interpreter with the -Wd option. This enables deprecation warnings, which have been silenced by default since Python 2.7 (IMHO, from a developer standpoint this was a bad decision).

    giampaolo@ubuntu:/tmp$ python -Wd
    Python 2.7.3 (default, Sep 26 2013, 20:03:06)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import psutil
    >>> psutil.get_pid_list()
    __main__:1: DeprecationWarning: psutil.get_pid_list is deprecated; use psutil.pids() instead
    [1, 2, 3, 6, 7, 13, ...]
    >>>
    >>>
    >>> p = psutil.Process()
    >>> p.get_cpu_times()
    __main__:1: DeprecationWarning: get_cpu_times() is deprecated; use cpu_times() instead
    pcputimes(user=0.08, system=0.03)
    >>>
    

    If you have a solid test suite, you can run tests and fix the warnings one by one. As for the Process properties that were turned into methods, it's more difficult because, whereas psutil 1.2.1 returns the actual value, psutil 2.0.0 returns the bound method:

    # psutil 1.2.1
    >>> psutil.Process().name
    'python'
    >>>
    
    # psutil 2.0.0
    >>> psutil.Process().name
    <bound method Process.name of psutil.Process(pid=19816, name='python') at 139845631328144>
    >>>
    

    What I would recommend, if you want to drop support for 1.2.1, is to grep for ".name", ".exe", etc. and just replace them with ".exe()" and ".name()" one by one. If, on the other hand, you want to write code that works with both versions, I see two possibilities:

    • #1 check version info, like this:
    >>> PSUTIL2 = psutil.version_info >= (2, 0)
    >>> p = psutil.Process()
    >>> name = p.name() if PSUTIL2 else p.name
    >>> exe = p.exe() if PSUTIL2 else p.exe
    
    • #2 get rid of all ".name", ".exe" occurrences you have in your code and use Process.as_dict() instead:
    >>> p = psutil.Process()
    >>> pinfo = p.as_dict(attrs=["name", "exe"])
    >>> pinfo
    {'exe': '/usr/bin/python2.7', 'name': 'python'}
    >>> name = pinfo['name']
    >>> exe = pinfo['exe']
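    A third option, offered only as a sketch, is a tiny helper that calls the attribute when it turns out to be a method and returns it unchanged otherwise. The get_attr name and the two stand-in classes are invented for this example, and the approach assumes the values themselves are never callable (true for strings, lists and numbers).

```python
def get_attr(proc, name):
    """Return proc.<name>, calling it first if it turns out to be a method."""
    value = getattr(proc, name)
    return value() if callable(value) else value


class OldStyle:
    """Mimics 1.2.1, where name is a plain attribute/property."""

    name = "python"


class NewStyle:
    """Mimics 2.0, where name() is a method."""

    def name(self):
        return "python"


print(get_attr(OldStyle(), "name"), get_attr(NewStyle(), "name"))
```

    Compared with the version check, this keeps the call sites identical for both releases, at the price of hiding which API is actually in use.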
    

    New features introduced in 2.0.0

    psutil 2.0.0 is not only about code breakage. I also had the chance to integrate a bunch of interesting features.

    • #427: you're now able to distinguish between the number of logical and physical CPUs:
    >>> psutil.cpu_count()  # logical
    4
    >>> psutil.cpu_count(logical=False)  # physical cores only
    2
    
    • #452: Process instances are now hashable and can be checked for equality. That means you can use Process objects with sets (finally!).
    • #447: the timeout parameter of psutil.wait_procs() is now optional.
    • #461: functions returning namedtuples are now picklable.
    • #459: a Makefile is now available to automate repetitive tasks such as build, install, running tests, etc. There's also a make.bat for Windows.
    • Introduced the unittest2 module as a requirement for running tests.
