Blog posts for tags/psutil

  1. Real process memory and environ in Python

    New psutil 4.0.0 is out, with some interesting news about process memory metrics. I'll just get straight to the point and describe what's new.

    "Real" process memory info

    Determining how much memory a process really uses is not an easy matter (see this and this). RSS (Resident Set Size), which is what most people usually rely on, is misleading because it includes both the memory which is unique to the process and the memory shared with other processes. What would be more interesting in terms of profiling is the memory which would be freed if the process was terminated right now. In the Linux world this is called USS (Unique Set Size), and this is the major feature which was introduced in psutil 4.0.0 (not only for Linux but also for Windows and OSX).

    USS memory

    The USS (Unique Set Size) is the memory which is unique to a process and which would be freed if the process was terminated right now. On Linux this can be determined by parsing all the "private" blocks in /proc/pid/smaps. The Firefox team pushed this further and managed to do the same also on OSX and Windows, which is great. New version of psutil is now able to do the same:

    >>> psutil.Process().memory_full_info()
    pfullmem(rss=101990, vms=521888, shared=38804, text=28200, lib=0, data=59672, dirty=0, uss=81623, pss=91788, swap=0)
    

    PSS and swap

    On Linux there are two additional metrics which can also be determined via /proc/pid/smaps: PSS and swap. PSS, aka "Proportional Set Size", represents the amount of memory shared with other processes, accounted in a way that the amount is divided evenly between the processes that share it. I.e. if a process has 10 MBs all to itself (USS) and 10 MBs shared with another process, its PSS will be 15 MBs. "swap" is simply the amount of memory that has been swapped out to disk. With memory_full_info() it is possible to implement a tool like this, similar to smem on Linux, which provides a list of processes sorted by "USS". It is interesting to notice how RSS differs from USS:

    ~/svn/psutil$ ./scripts/procsmem.py
    PID     User    Cmdline                            USS     PSS    Swap     RSS
    ==============================================================================
    ...
    3986    giampao /usr/bin/python3 /usr/bin/indi   15.3M   16.6M      0B   25.6M
    3906    giampao /usr/lib/ibus/ibus-ui-gtk3       17.6M   18.1M      0B   26.7M
    3991    giampao python /usr/bin/hp-systray -x    19.0M   23.3M      0B   40.7M
    3830    giampao /usr/bin/ibus-daemon --daemoni   19.0M   19.0M      0B   21.4M
    20529   giampao /opt/sublime_text/plugin_host    19.9M   20.1M      0B   22.0M
    3990    giampao nautilus -n                      20.6M   29.9M      0B   50.2M
    3898    giampao /usr/lib/unity/unity-panel-ser   27.1M   27.9M      0B   37.7M
    4176    giampao /usr/lib/evolution/evolution-c   35.7M   36.2M      0B   41.5M
    20712   giampao /usr/bin/python -B /home/giamp   45.6M   45.9M      0B   49.4M
    3880    giampao /usr/lib/x86_64-linux-gnu/hud/   51.6M   52.7M      0B   61.3M
    20513   giampao /opt/sublime_text/sublime_text   65.8M   73.0M      0B   87.9M
    3976    giampao compiz                          115.0M  117.0M      0B  130.9M
    32486   giampao skype                           145.1M  147.5M      0B  149.6M
    

    Implementation

    In order to get these values (USS, PSS and swap) we need to pass through the whole process address space. This usually requires higher user privileges and is considerably slower than getting the "usual" memory metrics via Process.memory_info(), which is probably the reason why tools like ps and top show RSS/VMS instead of USS. A big thanks goes to the Mozilla team which figured out all this stuff on Windows and OSX, and to Eric Rahm who put the PRs for psutil together (see #744, #745 and #746). For those of you who don't use Python and would like to port the code on other languages here's the interesting parts:

    Memory type percent

    After reorganizing process memory APIs I decided to add a new memtype parameter to Process.memory_percent(). With this it is now possible to compare a specific memory type (not only RSS) against the total physical memory. E.g.

    >>> psutil.Process().memory_percent(memtype='pss')
    0.06877466326787016
    

    Process environ

    Second biggest improvement in psutil 4.0.0 is the ability to determine the process environment variables. This opens interesting possibilities about process recognition and monitoring techniques. For instance, one might start a process by passing a certain custom environment variable, then iterate over all processes to find the one of interest (and figure out whether it's running or whatever):

    import psutil
    for p in psutil.process_iter():
        try:
            env = p.environ()
        except psutil.Error:
            pass
        else:
            if 'MYAPP' in env:
                ...
    

    Process environ was a long standing issue (year 2009) who I gave up to implement because the Windows implementation worked for the current process only. Frank Benkstein solved that and the process environ can now be determined on Linux, Windows and OSX for all processes (of course you may still bump into AccessDenied for processes owned by another user):

    >>> import psutil
    >>> from pprint import pprint as pp
    >>> pp(psutil.Process().environ())
    {...
     'CLUTTER_IM_MODULE': 'xim',
     'COLORTERM': 'gnome-terminal',
     'COMPIZ_BIN_PATH': '/usr/bin/',
     'HOME': '/home/giampaolo',
     'PWD': '/home/giampaolo/svn/psutil',
      }
    >>>
    

    It must be noted that the resulting dict usually does not reflect changes made after the process started (e.g. os.environ['MYAPP'] = '1'). Again, for whoever is interested in doing this in other languages, here's the interesting parts:

    Extended disk IO stats

    psutil.disk_io_counters() has been extended to report additional metrics on Linux and FreeBSD:

    • busy_time, which is the time spent doing actual I/Os (in milliseconds).
    • read_merged_count and write_merged_count (Linux only), which is number of merged reads and writes (see iostats doc).

    With these new metrics it is possible to have a better representation of actual disk utilization, similarly to iostat command on Linux.

    OS constants

    Given the increasing number of platform-specific metrics I added a new set of constants to quickly differentiate what platform you're on: psutil.LINUX, psutil.WINDOWS, etc. Main bug fixes:

    • #734: on Python 3 invalid UTF-8 data was not correctly handled for proces name(), cwd(), exe(), cmdline() and open_files() methods, resulting in UnicodeDecodeError. This was affecting all platforms. Now surrogateescape error handler is used as a workaround for replacing the corrupted data.
    • #761: [Windows] psutil.boot_time() no longer wraps to 0 after 49 days.
    • #767: [Linux] disk_io_counters() may raise ValueError on 2.6 kernels and it's broken on 2.4 kernels.
    • #764: psutil can now be compiled on NetBSD-6.X.
    • #704: psutil can now be compiled on Solaris sparc.

    Complete list of bug fixes is available here.

    Porting code

    Being 4.0.0 a major version, I took the chance to (lightly) change / break some APIs.

    • Process.memory_info() no longer returns just an (rss, vms) namedtuple. Instead it returns a namedtuple of variable length, changing depending on the platform (rss and vms are always present though, also on Windows). Basically it returns the same result of old memory_info_ex(). This shouldn't break your existent code, unless you were doing rss, vms = p.memory_info().
    • At the same time process_memory_info_ex() is now deprecated. The method is still there as an alias for memory_info(), issuing a deprecation warning.
    • psutil.disk_io_counters() returns 2 additional fields on Linux and 1 additional field on FreeBSD.
    • psutil.disk_io_counters() on NetBSD and OpenBSD no longer return write_count and read_count metrics because the kernel do not provide them (we were returning the busy time instead). I also don't expect this to be a big issue because NetBSD and OpenBSD support is very recent.

    Final notes and looking for a job

    OK, this is it. I would like to spend a couple more words to announce the world that I'm currently unemployed and looking for a remote gig again! =) I want remote because my plan for this year is to remain in Prague (Czech Republic) and possibly spend 2-3 months in Asia. If you know about any company who's looking for a Python backend dev who can work from afar feel free to share my Linkedin profile or mail me at g.rodola [at] gmail [dot] com.

  2. psutil OpenBSD support

    OK, this is a big one: starting from version 3.3.0 (released just now) psutil will officially support OpenBSD platforms. This was contributed by Landry Breuil (thanks dude!) and myself in PR-615. The interesting parts of the code changes are this and this.

    Differences with FreeBSD

    As expected, OpenBSD implementation is very similar to FreeBSD's (which was already in place), that is why I decided to merge most of it in a single C file (_psutil_bsd.c) and use 2 separate C files for when the two implementations differed too much: freebsd.c and openbsd.c. In terms of functionality here's the differences with FreeBSD. Unless specified, these differences are due to the kernel which does not provide the information natively (meaning we can't do anything about it).

    • Process.memory_maps() is not implemented. The kernel provides the necessary pieces but I didn't do this yet (hopefully later).
    • Process.num_ctx_switches()'s involuntary field is always 0. kinfo_proc provides this info but it is always set to 0.
    • Process.cpu_affinity() (get and set) is not supported.
    • Process.exe() is determined by inspecting the command line so it may not always be available (return None).
    • psutil.swap_memory() sin and sout (swap in and swap out) values are not available and hence are always set to 0.
    • psutil.cpu_count(logical=False) always return None.

    Similarly to FreeBSD, also OpenBSD implementation of Process.open_files() is problematic as it is not able to return file paths (FreeBSD can sometimes). Other than these differences the functionalities are all there and pretty much the same, so overall I'm pretty satisfied with the result.

    Considerations about BSD platforms

    psutil has been supporting FreeBSD basically since the beginning (year 2009). At the time it made sense to support FreeBSD instead of other BSD variants because it is the most popular, followed by OpenBSD and NetBSD. Compared to FreeBSD, OpenBSD appears to be more "minimal" both in terms of facilities provided by the kernel and the number of system administration tools available. One thing which I appreciate a lot about FreeBSD is that the source code of all CLI tools installed on the system is available under /usr/bin/src, which was a big help for implementing all psutil APIs. OpenBSD source code is also available but it uses CSV and I am not sure it includes the source code for all CLI tools. There are still two more BSD variants for which it may be worth to add support for: NetBSD and DragonflyBSD (in this order). About a year ago some guy provided a patch for adding basic NetBSD support so it is likely that will happen sooner or later.

    Other enhancements available in this release

    The only other enhancement is issue #558, which allows specifying a different location of /proc filesystem on Linux.

    External discussions

  3. psutil 3.0, aka how I reimplemented ifconfig in Python

    Here we are. It's been a long time since my last blog post and my last psutil release. The reason? I've been travelling! I mean... a lot. I've spent 3 months in Berlin, 3 weeks in Japan and 2 months in New York City. While I was there I finally had the chance to meet my friend Jay Loden in person. Jay and I originally started working on psutil together 7 years ago.

    Back then I didn't know any C (and I still am a terrible C developer) so he's been crucial to develop the initial psutil skeleton including OSX and Windows support. I'm back home now (but not for long ;-)), so I finally have some time to write this blog post and tell you about the new psutil release. Let's see what happened.

    net_if_addrs()

    In a few words, we're now able to list network interface addresses similarly to "ifconfig" command on UNIX:

    >>> import psutil
    >>> from pprint import pprint
    >>> pprint(psutil.net_if_addrs())
    {'ethernet0': [snic(family=<AddressFamily.AF_INET: 2>,
                        address='10.0.0.4',
                        netmask='255.0.0.0',
                        broadcast='10.255.255.255'),
                   snic(family=<AddressFamily.AF_PACKET: 17>,
                        address='9c:eb:e8:0b:05:1f',
                        netmask=None,
                        broadcast='ff:ff:ff:ff:ff:ff')],
     'localhost': [snic(family=<AddressFamily.AF_INET: 2>,
                        address='127.0.0.1',
                        netmask='255.0.0.0',
                        broadcast='127.0.0.1'),
                   snic(family=<AddressFamily.AF_PACKET: 17>,
                        address='00:00:00:00:00:00',
                        netmask=None,
                        broadcast='00:00:00:00:00:00')]}
    

    This is limited to AF_INET (IPv4), AF_INET6 (IPv6) and AF_LINK (Ethernet) address families. If you want something more poweful (e.g. AF_BLUETOOTH) you can take a look at netifaces extension. And here's the code which does these tricks on POSIX and Windows:

    Also, here's some doc.

    net_if_stats()

    This will return a bunch of information about network interface cards:

    >>> import psutil
    >>> from pprint import pprint
    >>> pprint(psutil.net_if_stats())
    {'ethernet'0: snicstats(isup=True,
                            duplex=<NicDuplex.NIC_DUPLEX_FULL: 2>,
                            speed=100,
                            mtu=1500),
     'localhost': snicstats(isup=True,
                            duplex=<NicDuplex.NIC_DUPLEX_UNKNOWN: 0>,
                            speed=0,
                            mtu=65536)}
    

    Again, here's the code for each platform:

    ...and the doc.

    Enums

    Enums are a nice new feature introduced in Python 3.4. Very briefly (or at least, this is what I appreciate the most about them), they help you write an API with human-readable constants. If you use Python 2 you'll see something like this:

    >>> import psutil
    >>> psutil.IOPRIO_CLASS_IDLE
    3
    

    On Python 3.4 you'll see a more informative:

    >>> import psutil
    >>> psutil.IOPRIO_CLASS_IDLE
    <IOPriority.IOPRIO_CLASS_IDLE: 3>
    

    They are backward compatible, meaning if you're sending serialized data produced with psutil through the network you can safely use comparison operators and so on. The psutil APIs returning enums (on Python >=3.4) are:

    • psutil.net_connections() (the address families):
    • psutil.Process.connections() (same as above)
    • psutil.net_if_stats() (all NIC_DUPLEX_* constants)
    • psutil.Process.nice() on Windows (for all the *_PRIORITY_CLASS constants)
    • psutil.Process.ionice() on Linux (for all the IOPRIO_CLASS_* constants)

    All the other existing constants remained plain strings (STATUS_*) or integers (CONN_*).

    Zombie processes

    This is a big one. The full story is here but basically the support for zombie processes on UNIX was broken (except on Linux, and Windows doesn't have zombie processes). Up until psutil 2.X we could instantiate a zombie process:

    >>> pid = create_zombie()
    >>> p = psutil.Process(pid)
    

    ...but every time we queried it we got a NoSuchProcess exception:

    >>> psutil.name()
      File "psutil/__init__.py", line 374, in _init
        raise NoSuchProcess(pid, None, msg)
    psutil.NoSuchProcess: no process found with pid 123
    

    That was misleading though because the PID technically still existed:

    >>> psutil.pid_exists(p.pid)
    True
    

    Furthermore, depending on what platform you were on, certain process stats could still be queried (instead of raising NoSuchProcess):

    >>> psutil.cmdline()
    ['python']
    

    Also process_iter() did not return zombie processes at all. This was probably the worst aspect because being able to identify them is an important use case, as they signal an issue with process: if a parent process spawns a child, terminates it (via kill()), but doesn't wait() for it it will create a zombie. Long story short, the way this changed in psutil 3.0 is that:

    • we now have a new ZombieProcess exception, raised every time we're not able to query a process because it's a zombie
    • it is raised instead of NoSuchProcess (which was incorrect and misleading)
    • it is still backward compatible (meaning you won't have to change your old code) because it inherits from NoSuchProcess
    • process_iter() finally works, meaning you can safely identify zombie processes like this:
    import psutil
    zombies = []
    for p in psutil.process_iter():
        try:
            if p.status() == psutil.STATUS_ZOMBIE:
                zombies.append(p)
        except NoSuchProcess:
            pass
    

    Removal of deprecated APIs

    This is another big one, probably the biggest. In a previous blog post I already talked about deprecated APIs. What I did back then (January 2014) was to rename and officially deprecate different APIs and provide aliases for them so that people wouldn't yell at me because I broke their existent code. The most interesting deprecation was certainly the one affecting module constants and the hack which was used in order to provide "module properties". With this new release I decided to get rid of all those aliases. I'm sure this will cause problems but hey! This is a new major release, right? =). Plus the amount of crap which was removed is impressive (see the commit). Here's the old aliases which are now gone for good (or bad, depending on how much headache they will cause you):

    Removed module functions and constants

    Already deprecated name New name
    psutil.BOOT_TIME() psutil.boot_time()
    psutil.NUM_CPUS() psutil.cpu_count()
    psutil.TOTAL_PHYMEM() psutil.virtual_memory().total
    psutil.avail_phymem() psutil.virtual_memory().free
    psutil.avail_virtmem() psutil.swap_memory().free
    psutil.cached_phymem() psutil.virtual_memory().cached
    psutil.get_pid_list() psutil.pids().cached
    psutil.get_process_list()  
    psutil.get_users() psutil.users()
    psutil.network_io_counters() psutil.net_io_counters()
    psutil.phymem_buffers() psutil.virtual_memory().buffers
    psutil.phymem_usage() psutil.virtual_memory()
    psutil.total_virtmem() psutil.swap_memory().total
    psutil.used_virtmem() psutil.swap_memory().used
    psutil.used_phymem() psutil.virtual_memory().used
    psutil.virtmem_usage() psutil.swap_memory()

    Process methods (assuming p = psutil.Process()):

    Already deprecated name New name
    p.get_children() p.children()
    p.get_connections() p.connections()
    p.get_cpu_affinity() p.cpu_affinity()
    p.get_cpu_percent() p.cpu_percent()
    p.get_cpu_times() p.cpu_times()
    p.get_io_counters() p.io_counters()
    p.get_ionice() p.ionice()
    p.get_memory_info() p.memory_info()
    p.get_ext_memory_info() p.memory_info_ex()
    p.get_memory_maps() p.memory_maps()
    p.get_memory_percent() p.memory_percent()
    p.get_nice() p.nice()
    p.get_num_ctx_switches() p.num_ctx_switches()
    p.get_num_fds() p.num_fds()
    p.get_num_threads() p.num_threads()
    p.get_open_files() p.open_files()
    p.get_rlimit() p.rlimit()
    p.get_threads() p.threads()
    p.getcwd() p.cwd()
    p.set_cpu_affinity() p.cpu_affinity()
    p.set_ionice() p.ionice()
    p.set_nice() p.nice()
    p.set_rlimit() p.rlimit()

    If your code suddenly breaks with AttributeError after you upgraded psutil it means you were using one of those deprecated aliases. In that case just take a look at the table above and rename stuff in accordance.

    Bug fixes

    I fixed a lot of stuff (full list here), but here's the list of things which I think are worth mentioning:

    • #512: [FreeBSD] fix segfault in net_connections().
    • #593: [FreeBSD] Process.memory_maps() segfaults.
    • #606: Process.parent() may swallow NoSuchProcess exceptions.
    • #614: [Linux]: cpu_count(logical=False) return the number of physical CPUs instead of physical cores.
    • #628: [Linux] Process.name() truncates process name in case it contains spaces or parentheses.

    Ease of development

    These are not enhancements you will directly benefit from but I put some effort into making my life easier every time I work on psutil.

    • I care about psutil code being fully PEP8 compliant so I added a pre-commit GIT hook which runs flake8 on every commit and rejects it if the coding style is not compliant. The way I install this is via make install-git-hooks.
    • I added a make install-dev-deps command which installs all deps and stuff which is useful for testing (ipdb, coverage, etc).
    • A new make coverage command which runs coverage. With this I discovered some of parts in the code which weren't covered by tests and I fixed that.
    • I started using tox to easily test psutil against all supported Python versions (from 2.6 to 3.4) in one shot.
    • I reorganized tests so that now they can be easily executed with py.test and nose (before, only unittest runner was fully supported)

    Final words

    I must say I'm pretty satisfied with how psutil is going and the satisfaction I still get every time I work on it. Right now it gets almost 800.000 download a month, which is pretty great for a Python library. As of right now I consider psutil almost "completed" in terms of features, meaning I'm basically running out of ideas on what I should add next (see TODO). From now on the future development will probably focus on adding support for more exotic platforms (OpenBSD, NetBSD, Android). There also have been some discussions on python-ideas mailing list about including psutil into Python stdlib but, assuming that will ever happen, it's still far away in the future as it would require a lot of time which I currently don't have. That should be all. I hope you will all enjoy this new release.

  4. psutil 2.1.2 and Python wheels

    psutil 2.1.2 is out. This release has been cooking for a while now, and that's because I've been travelling for the past 3 months between Spain, Japan and Germany. Hopefully I will be staying in Berlin for a while now, so I will have more time to dedicate to the project. The main new "feature" of this release is that other than the exe files, Windows users can now also benefit of Python wheels (full story is here) which are available on PYPI. Frankly I don't know much about the new wheels packaging system but long story short is that Windows users can now install psutil via pip and therefore also include it as a dependency into requirements.txt. Other than this 2.1.2 can basically be considered a bug-fix release, including some important fixes amongst which:

    • #506: restored Python 2.4 compatibility
    • #340: Process.get_open_files() no longer hangs on Windows (this was a very old and high-priority issue)
    • #501: disk_io_counters() may return negative values on Windows
    • #504: (Linux) couldn't build RPM packages via setup.py

    The list of all fixes can be found here. For the next release I plan to drop support for Python 2.4 and 2.5 and hopefully network interfaces information similarly to ifconfig.

  5. psutil 2.0

    The time has finally come: psutil 2.0 is out! This is a release which took me a considerable amount of effort and careful thinking during the past 4 months as I went through a major rewrite and reorganization of both python and C extension modules. To get a sense of how much has changed you can compare the differences with old 1.2.1 version by running "hg diff -r release-1.2.1:release-2.0.0" which will produce more than 22,000 lines of output! In those 22k lines I tried to nail down all the quirks the project had accumulated since its start 4 years ago and the resulting code base is now cleaner than ever, more manageable and fully compliant with PEP-7 and PEP-8 guidelines. There were some difficult decisions because many of the changes I introduced are not backward compatible so I was concerned with the pain this may cause existing users. I kind of still am, but I'm sure the transition will be well perceived on the long run as it will result in more manageable user code. OK, enough with the preface and let's see what changed.

    API changes

    I already wrote a detailed blog post about what changed so I recommend you to use that as the official reference on how to port your code.

    RST documentation

    I've never been happy with old doc hosted on Google Code. The markup language provided by Google is pretty limited, plus it's not put under revision control. New doc is more detailed, it uses reStructuredText as the markup language, it lives in the same code repository as psutil's and it is hosted on the excellent readthedocs web site: http://psutil.readthedocs.org/

    Physical CPUs count

    You're now able to distinguish between logical and physical CPUs:

    >>> psutil.cpu_count()  # logical
    4
    >>> psutil.cpu_count(logical=False)  # physical cores only
    2
    

    Full story is in issue 427.

    Process instances are hashable

    Basically this means process instances can now be checked for equality and can be used with set()s:

    >>> p1 = psutil.Process()
    >>> p2 = psutil.Process()
    >>> p1 == p2
    True
    >>> set((p1, p2))
    set([<psutil.Process(pid=8217, name='python') at 140007043550608>])
    

    Full story is in issue 452.

    Speedups

    • #477: process cpu_percent() is about 30% faster.
    • #478: (Linux) almost all APIs are about 30% faster on Python 3.X.

    Other improvements and bugfixes

    • #424: Windows installers for Python 3.X 64-bit
    • #447: psutil.wait_procs() timeout parameter is now optional.
    • #459: a Makefile is now available for running tests and other repetitive tasks (also on Windows)
    • #463: timeout parameter of cpu_percent* functions default to 0.0 because it was a common trap to introduce slowdowns.
    • #340: (Windows) process open_files() no longer hangs.
    • #448: (Windows) fixed a memory leak affecting children() and ppid() methods.
    • #461: namedtuples are now pickle-able.
    • #474: (Windows) Process.cpu_percent() is no longer capped at 100%

    OK, that's all folks. I hope you will enjoy this new version and report your feedback.

Social

Feeds