Blog posts for tags/psutil

  1. FreeBSD process environ and resource limits

    New psutil 5.7.3 is out. This release adds support for two functionalities which were previously not available on BSD platforms: the ability to get the process environment (all BSD variants) and to get or set process resource limits (FreeBSD only), similar to what can already be done on Linux.

    Process environ

    Quite simply:

    >>> import psutil, pprint, os
    >>> pid = os.getpid()
    >>> pprint.pprint(psutil.Process(pid).environ())
    {'BLOCKSIZE': 'K',
     'EDITOR': 'vi',
     'GROUP': 'vagrant',
     'HOME': '/home/vagrant',
     'HOST': 'freebsd',
     'HOSTTYPE': 'FreeBSD',
     'LOGNAME': 'vagrant',
     'MACHTYPE': 'x86_64',
     'MAIL': '/var/mail/vagrant',
     'OSTYPE': 'FreeBSD',
     'PAGER': 'less',
     'PATH': '/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/home/vagrant/bin',
     'PWD': '/home/vagrant/psutil',
     'REMOTEHOST': '10.0.2.2',
     'SHELL': '/bin/csh',
     'SHLVL': '1',
     'SSH_CLIENT': '10.0.2.2 58102 22',
     'SSH_CONNECTION': '10.0.2.2 58102 10.0.2.15 22',
     'SSH_TTY': '/dev/pts/0',
     'TERM': 'xterm-256color',
     'USER': 'vagrant',
     'VENDOR': 'amd'}
    

    This feature was already available on all other platforms except BSD. It was contributed by Armin Gruner in PR-1800 and supports all BSD variants (FreeBSD, NetBSD and OpenBSD). BSD kernels expose this information via the kvm_getenvv(3) function, but there are subtle differences between the BSD variants which made this integration particularly thorny. This is how it's done:

    PyObject *
    psutil_proc_environ(PyObject *self, PyObject *args) {
        int i, cnt = -1;
        long pid;
        char *s, **envs, errbuf[_POSIX2_LINE_MAX];
        PyObject *py_value=NULL, *py_retdict=NULL;
        kvm_t *kd;
    #ifdef PSUTIL_NETBSD
        struct kinfo_proc2 *p;
    #else
        struct kinfo_proc *p;
    #endif
    
        if (!PyArg_ParseTuple(args, "l", &pid))
            return NULL;
    
    #if defined(PSUTIL_FREEBSD)
        kd = kvm_openfiles(NULL, "/dev/null", NULL, 0, errbuf);
    #else
        kd = kvm_openfiles(NULL, NULL, NULL, KVM_NO_FILES, errbuf);
    #endif
        if (!kd) {
            convert_kvm_err("kvm_openfiles", errbuf);
            return NULL;
        }
    
        py_retdict = PyDict_New();
        if (!py_retdict)
            goto error;
    
    #if defined(PSUTIL_FREEBSD)
        p = kvm_getprocs(kd, KERN_PROC_PID, pid, &cnt);
    #elif defined(PSUTIL_OPENBSD)
        p = kvm_getprocs(kd, KERN_PROC_PID, pid, sizeof(*p), &cnt);
    #elif defined(PSUTIL_NETBSD)
        p = kvm_getproc2(kd, KERN_PROC_PID, pid, sizeof(*p), &cnt);
    #endif
        if (!p) {
            NoSuchProcess("kvm_getprocs");
            goto error;
        }
        if (cnt <= 0) {
            NoSuchProcess(cnt < 0 ? kvm_geterr(kd) : "kvm_getprocs: cnt==0");
            goto error;
        }
    
    // On *BSD kernels there are a few kernel-only system processes without an
    // environment (see e.g. "procstat -e 0 | 1 | 2 ..." on FreeBSD).
    // Some system processes have no stats attached at all
    // (they are marked with P_SYSTEM).
    // On FreeBSD, it's possible that the process is swapped or paged out,
    // in which case there is no access to the environ stored in the
    // process' user area.
    // On NetBSD, we cannot call kvm_getenvv2() for a zombie process.
    // To make the unittest suite happy, return an empty environment.
    #if defined(PSUTIL_FREEBSD)
    #if (defined(__FreeBSD_version) && __FreeBSD_version >= 700000)
        if (!((p)->ki_flag & P_INMEM) || ((p)->ki_flag & P_SYSTEM)) {
    #else
        if ((p)->ki_flag & P_SYSTEM) {
    #endif
    #elif defined(PSUTIL_NETBSD)
        if ((p)->p_stat == SZOMB) {
    #elif defined(PSUTIL_OPENBSD)
        if ((p)->p_flag & P_SYSTEM) {
    #endif
            kvm_close(kd);
            return py_retdict;
        }
    
    #if defined(PSUTIL_NETBSD)
        envs = kvm_getenvv2(kd, p, 0);
    #else
        envs = kvm_getenvv(kd, p, 0);
    #endif
        if (!envs) {
            // Map to "psutil" general high-level exceptions
            switch (errno) {
                case 0:
                // Process has cleared its environment; return an empty one
                    kvm_close(kd);
                    return py_retdict;
                case EPERM:
                    AccessDenied("kvm_getenvv");
                    break;
                case ESRCH:
                    NoSuchProcess("kvm_getenvv");
                    break;
    #if defined(PSUTIL_FREEBSD)
                case ENOMEM:
                    // Unfortunately, under FreeBSD kvm_getenvv() returns
                    // failure for certain processes (e.g. try
                    // "sudo procstat -e <pid of your XOrg server>").
                    // Map the error condition to AccessDenied.
                    sprintf(errbuf,
                            "kvm_getenvv(pid=%ld, ki_uid=%d): errno=ENOMEM",
                            pid, p->ki_uid);
                    AccessDenied(errbuf);
                    break;
    #endif
                default:
                    sprintf(errbuf, "kvm_getenvv(pid=%ld)", pid);
                    PyErr_SetFromOSErrnoWithSyscall(errbuf);
                    break;
            }
            goto error;
        }
    
        for (i = 0; envs[i] != NULL; i++) {
            s = strchr(envs[i], '=');
            if (!s)
                continue;
            *s++ = 0;
            py_value = PyUnicode_DecodeFSDefault(s);
            if (!py_value)
                goto error;
            if (PyDict_SetItemString(py_retdict, envs[i], py_value)) {
                goto error;
            }
            Py_DECREF(py_value);
        }
    
        kvm_close(kd);
        return py_retdict;
    
    error:
        Py_XDECREF(py_value);
        Py_XDECREF(py_retdict);
        kvm_close(kd);
        return NULL;
    }
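    The final parsing loop is the same trick one would use in pure Python: split each "KEY=VALUE" entry on the first '=' and skip malformed entries containing no '=' at all. A minimal sketch:

```python
def parse_environ(envs):
    # Mimic the C loop: split each "KEY=VALUE" entry on the first '='
    # and skip malformed entries that contain no '=' at all.
    env = {}
    for entry in envs:
        key, sep, value = entry.partition("=")
        if not sep:
            continue  # no '=' found; skip, just as the C code does
        env[key] = value
    return env

print(parse_environ(["HOME=/home/vagrant", "SHLVL=1", "garbage"]))
```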
    

    Process resource limits

    There's a Linux-only syscall named prlimit which lets you get or set resource limits on a per-process basis. I found out very recently that this can be emulated on FreeBSD via sysctl + KERN_PROC_RLIMIT. Here's the PR and here's the relevant part. An example of how to get and set resource limits:

    >>> import psutil, os
    >>> pid = os.getpid()
    >>> p = psutil.Process(pid)
    >>> p.rlimit(psutil.RLIMIT_NOFILE, (128, 128))   # process can open max 128 file descriptors
    >>> p.rlimit(psutil.RLIMIT_FSIZE, (1024, 1024))  # can create files no bigger than 1024 bytes
    >>> p.rlimit(psutil.RLIMIT_FSIZE)                # get
    (1024, 1024)
    >>>
    

    Relevant C code:

    PyObject *
    psutil_proc_getrlimit(PyObject *self, PyObject *args) {
        pid_t pid;
        int ret;
        int resource;
        size_t len;
        int name[5];
        struct rlimit rlp;
    
        if (! PyArg_ParseTuple(args, _Py_PARSE_PID "i", &pid, &resource))
            return NULL;
    
        name[0] = CTL_KERN;
        name[1] = KERN_PROC;
        name[2] = KERN_PROC_RLIMIT;
        name[3] = pid;
        name[4] = resource;
        len = sizeof(rlp);
    
        ret = sysctl(name, 5, &rlp, &len, NULL, 0);
        if (ret == -1)
            return PyErr_SetFromErrno(PyExc_OSError);
    
    #if defined(HAVE_LONG_LONG)
        return Py_BuildValue("LL",
                             (PY_LONG_LONG) rlp.rlim_cur,
                             (PY_LONG_LONG) rlp.rlim_max);
    #else
        return Py_BuildValue("ll",
                             (long) rlp.rlim_cur,
                             (long) rlp.rlim_max);
    #endif
    }
    
    
    PyObject *
    psutil_proc_setrlimit(PyObject *self, PyObject *args) {
        pid_t pid;
        int ret;
        int resource;
        int name[5];
        struct rlimit new;
        struct rlimit *newp = NULL;
        PyObject *py_soft = NULL;
        PyObject *py_hard = NULL;
    
        if (! PyArg_ParseTuple(
                args, _Py_PARSE_PID "iOO", &pid, &resource, &py_soft, &py_hard))
            return NULL;
    
        name[0] = CTL_KERN;
        name[1] = KERN_PROC;
        name[2] = KERN_PROC_RLIMIT;
        name[3] = pid;
        name[4] = resource;
    
    #if defined(HAVE_LONG_LONG)
        new.rlim_cur = PyLong_AsLongLong(py_soft);
        if (new.rlim_cur == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
        new.rlim_max = PyLong_AsLongLong(py_hard);
        if (new.rlim_max == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
    #else
        new.rlim_cur = PyLong_AsLong(py_soft);
        if (new.rlim_cur == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
        new.rlim_max = PyLong_AsLong(py_hard);
        if (new.rlim_max == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
    #endif
        newp = &new;
        ret = sysctl(name, 5, NULL, 0, newp, sizeof(*newp));
        if (ret == -1)
            return PyErr_SetFromErrno(PyExc_OSError);
        Py_RETURN_NONE;
    }
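    For the current process only, Python's stdlib resource module exposes the same getter/setter semantics, which makes the C code above easier to relate to (a sketch; actual limit values vary per system):

```python
import resource

# Read the current (soft, hard) limits for open file descriptors,
# i.e. the stdlib counterpart of the getrlimit path above.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)

# Setting the limits back to their current values is always permitted
# and demonstrates the setter's signature without changing anything.
resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```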
    

    It turns out some of this can probably also be emulated on Windows via SetInformationJobObject and QueryInformationJobObject (see ticket), so that is something I'm looking forward to experimenting with.

  2. New Pelican website

    Hello there. I present to you my new blog / personal website! This is something I've been wanting to do for a very long time, since the old blog hosted at https://grodola.blogspot.com/ was... well, too old. =) This new site is based on Pelican, a static website generator similar to Jekyll. Unlike Jekyll, it uses Python instead of Ruby, which is why I chose it. It's minimal, straight to the point, and I feel in control of things. This is what Pelican gave me out of the box:

    • blog functionality
    • ability to write content by using reStructuredText
    • RSS & Atom feed
    • easy integration with GitHub pages
    • ability to add comments via Disqus

    To this I added a mailing list (via FeedBurner), so that users can subscribe and receive an email every time I publish a new blog post. As you can see the website is very simple, but it's exactly what I wanted (minimalism). As for the domain name, I opted for gmpy.dev, mostly because I know my name is hard to type and pronounce for non-English speakers. And also because I couldn't come up with a better name. ;)

    Git-based workflow

    The main reason why I blogged so rarely over the years was that blogger.com provided no way to edit content in reST or Markdown, and had no Git integration. This made me lazy. With Pelican + GitHub Pages the workflow to create and publish new content is very straightforward. I use 2 branches: gh-pages, which holds the source code of this website, and master, which is where the generated HTML content lives and is served by GitHub Pages. This is what I do when I have to create a new blog post:

    • under gh-pages branch I create a new file, e.g. content/blog/2020/new-blog-post.rst:
    New blog post
    #############
    
    :date: 2020-06-26
    :tags: announce, python
    
    Hello world!
    
    • commit it:
    git add content/blog/2020/new-blog-post.rst
    git commit -am "new blog post"
    git push
    
    • publish it:
    make github
    

    Within a minute or so, GitHub will automatically serve gmpy.dev with the updated content. And this is why I think I will start blogging more often. =) The core of Pelican is pelicanconf.py, which lets you customize a lot of things while remaining independent of the theme. I still ended up modifying the default theme though, writing a customized "archives" view and editing the CSS to make the site look better on mobile phones. All in all, I am very satisfied with Pelican, and I'm happy to recommend it to anyone who doesn't need dynamic content.
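    For reference, a minimal pelicanconf.py looks something like this (the values below are illustrative, not my actual configuration):

```python
# Minimal Pelican configuration sketch; all values are illustrative.
AUTHOR = "Giampaolo Rodola"
SITENAME = "gmpy.dev"
SITEURL = ""              # left empty during local development
PATH = "content"          # where the .rst / .md sources live
TIMEZONE = "Europe/Rome"
DEFAULT_LANG = "en"
# Feed generation (Atom) is typically enabled only when publishing.
FEED_ALL_ATOM = "feeds/all.atom.xml"
```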

    About me

    I spent most of last year (2019) in China, dating my girlfriend, working remotely from a shared office space in Shenzhen, and I even taught some Python to a class of Chinese folks with no previous programming experience. The course was about the basics of the language + basic filesystem operations, and the most difficult thing to explain was indentation. I guess that shows the difference between knowing a language and knowing how to teach it.

    I got back to Italy in December 2019, just before the pandemic occurred. Because of my connections with China, I knew about the upcoming pandemic sooner than the rest of my friends, which for a while (until the lockdown) made them think I was crazy. =)

    Since I knew I would be stuck at home for a long time, I bought quite a nice acoustic guitar (a Taylor) after many years, and resumed playing (and singing). I learned a bunch of new songs, mostly by Queen, including Bohemian Rhapsody, which is something I've been wanting to do since forever.

    I also spent some time working on a personal project that I'm keeping private for the moment, something to speed up file copies, which follows the experiments I made in BPO-33671. It's still very beta, but I managed to make file copies around 170% faster than the cp command on Linux, which is pretty impressive (and I think I can push it even further). I will blog about that once I have something more solid / stable. Most likely it'll become my next OSS project, even though I have mixed feelings about that, since the amount of time I'm devoting to psutil is already a lot.

    Speaking of which, today I'm also releasing psutil 5.7.1, which adds support for Windows Nano.

    I guess this is all. Cheers and subscribe!

  3. System load average on Windows in Python

    The new psutil 5.6.2 release implements an emulation of os.getloadavg() on Windows, kindly contributed by Ammar Askar, who originally implemented it for CPython's test suite. This idea has been floating around for quite a while. The first proposal dates back to 2010, when psutil was still hosted on Google Code, and it popped up multiple times throughout the years. There is a bunch of info on the internet mentioning the bits which theoretically make this possible (the so-called System Processor Queue Length), but I couldn't find any real implementation. A Google search suggests there is quite some demand for this, but very few tools out there provide it natively (the only ones I could find are sFlowTrend and Zabbix), so I'm very happy this finally landed in psutil / Python.
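    In a nutshell, the emulation periodically samples the Processor Queue Length counter and feeds it into the same exponential-decay formula Unix kernels use. Below is a simplified Python sketch of the idea for the 1-minute average only (the real implementation is in C and keeps separate decay factors for the 1-, 5- and 15-minute windows):

```python
import math

SAMPLING_INTERVAL = 5  # seconds between queue-length samples
# Decay factor for the 1-minute window.
F1 = math.exp(-SAMPLING_INTERVAL / 60.0)

def update_load(load, queue_length, factor=F1):
    # Classic Unix exponential decay: the old load fades out while the
    # freshly sampled run-queue length fades in.
    return load * factor + queue_length * (1.0 - factor)

# A steady queue of 2 runnable threads converges toward a load of 2.
load = 0.0
for _ in range(200):
    load = update_load(load, 2)
print(round(load, 2))
```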

    Other improvements and bugfixes in psutil 5.6.2

    The full list is here but I would like to mention a couple:

    • 1476: the ability to set a process' I/O priority to high on Windows
    • 1458: colorized test output. I admit nobody will use this directly, but it's very cool and I'm porting it to a bunch of other projects I work on (e.g. pyftpdlib). Also, perhaps this could be a good candidate for a small module to put on PyPI, one which could also include some functionalities taken from pytest which I'm gradually re-implementing in the unittest module, amongst which:
      • 1478: re-running failed tests
      • display test timings/durations: this is something I'm contributing to CPython; see BPO-4080 and PR-12271

    About me

    I'm currently in China (Shenzhen) for a mix of vacation and work, and I will take a break from Open Source for a while (likely 2.5 months, during which I will also go to the Philippines and Japan - I love Asia ;-)).


  4. psutil 5.6.0 and process parents

    Hello world =)

    It's been a long time since my last blog post (over a year and a half). During this time I moved between Italy, Prague and Shenzhen (China), and also contributed a couple of nice patches to Python which I want to blog about once Python 3.8 is out: zero-copy for the shutil.copy() functions and the socket.create_server() utility function. But let's move on and talk about what this blog post is about: the next major psutil version.

    Process parents()

    From the doc: return the parents of this process as a list of Process instances. If no parents are known, return an empty list.

    >>> import psutil
    >>> p = psutil.Process(5312)
    >>> p.parents()
    [psutil.Process(pid=4699, name='bash', started='09:06:44'),
     psutil.Process(pid=4689, name='gnome-terminal-server', started='09:06:44'),
     psutil.Process(pid=1, name='systemd', started='05:56:55')]
    

    Nothing really new here, as it's a convenience method based on the existing parent() method, but it's still something nice to have built in, and it can be used in conjunction with the children() method to work with process trees. The idea was proposed by Ghislain Le Meur.
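    The idea can be sketched in a few lines of Python: keep calling parent() and collect the results until no parent is known. FakeProc below is a stand-in for psutil.Process, just to make the sketch self-contained:

```python
def get_parents(proc):
    # Walk the chain of parent() calls, collecting each ancestor,
    # until no parent is known (parent() returns None).
    parents = []
    parent = proc.parent()
    while parent is not None:
        parents.append(parent)
        parent = parent.parent()
    return parents

class FakeProc:
    # Minimal stand-in exposing only the parent() method.
    def __init__(self, pid, parent=None):
        self.pid = pid
        self._parent = parent
    def parent(self):
        return self._parent

init = FakeProc(1)
term = FakeProc(4689, init)
bash = FakeProc(4699, term)
print([p.pid for p in get_parents(FakeProc(5312, bash))])
```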

    Windows

    A bunch of interesting improvements occurred on Windows.

    The first one is that certain Windows APIs which must be dynamically loaded from DLLs are now loaded only once at startup (instead of once per function call), significantly speeding up several functions and methods. This is described and implemented in PR #1422, which also provides benchmarks.

    Another one concerns Process' suspend() and resume() methods. Previously they used CreateToolhelp32Snapshot() to iterate over all of the process' threads, which was somewhat unorthodox and didn't work if the process had been suspended via Process Hacker. Now they rely on the undocumented NtSuspendProcess and NtResumeProcess APIs, which is the same approach used by Process Hacker and other famous Sysinternals tools. The change was proposed and discussed in issue #1379 and implemented in PR #1435. I may later propose the addition of suspend() and resume() methods to Python's subprocess module.

    The last nice Windows improvement concerns SE DEBUG mode. SE DEBUG mode can be seen as a "bit" which you can set on the Python process at startup so that it has better chances of querying processes owned by other users, including many owned by Administrator and Local System. Practically speaking, this means getting fewer AccessDenied exceptions for low-PID processes. It turns out the code doing this had been broken, presumably for years, and never actually set SE DEBUG. This is fixed now; the change was made in PR #1429.

    Removal of Process.memory_maps() on OSX

    This was somewhat controversial. The history of memory_maps() on OSX is a painful one. It was based on an undocumented and probably broken Apple API called proc_regionfilename(), which made memory_maps() either randomly raise EINVAL or result in a segfault! Also, memory_maps() could only be used for the current process, limiting its usefulness to os.getpid() only. For any other process it raised AccessDenied. This has been a known problem for a long time, but sometime over the last few years I got tired of seeing random test failures on Travis that I couldn't reproduce locally, so I commented out the unit test and forgot about it until last week, when I realized the real impact this has on production code. I tried looking for a solution once again, spending quite some time looking for public source code which managed to do this right, with no luck. The only tool I'm aware of which does this right is vmmap from Apple, but it's closed source. After careful thinking, since no solution was found, I decided to just remove memory_maps() from OSX. This is not something I took lightly, but considering the alternative is getting a segfault, I decided to sacrifice backward compatibility (hence the major version bump).

    Improved exceptions

    One problem which has afflicted psutil maintenance over the years was receiving bug reports including tracebacks which didn't provide any information on which syscall exactly failed. This was especially painful on Windows, where a single routine can invoke different Windows APIs. Now the OSError (or WindowsError) exception includes the syscall from which the error originated; see PR #1428.
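    The concept can be illustrated in pure Python with a hypothetical helper (psutil does this at the C level): embed the name of the failing syscall in the OSError message so that it shows up in tracebacks.

```python
import errno
import os

def raise_with_syscall(syscall, err):
    # Attach the originating syscall name to the OSError message so
    # that tracebacks in bug reports say *which* call failed.
    raise OSError(err, f"{os.strerror(err)} (originated from {syscall})")

try:
    raise_with_syscall("kvm_getenvv", errno.EPERM)
except OSError as e:
    msg = str(e)
    code = e.errno
print(msg)
```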

    Other important bugfixes

    • #1353: process_iter() is now thread safe
    • #1411: [BSD] segfault could occur on Process instantiation
    • #1427: [OSX] Process cmdline() and environ() may erroneously raise OSError on failed malloc().
    • #1447: the original exception wasn't turned into NoSuchProcess / AccessDenied exceptions when using the Process.oneshot() context manager.

    A full list of enhancements and bug fixes is available here.

  5. psutil 5.4.0 and AIX support

    After a long time psutil finally adds support for a brand new exotic platform: AIX! Honestly, I am not sure how many AIX Python users are out there (I suppose not many), but still, here it is. For this we have to thank Arnon Yaari, who started working on the port a couple of years ago. To be honest I was skeptical at first, because AIX is the only platform which I cannot virtualize and test on my laptop, and that made me a bit nervous, but Arnon did a very good job. The final PR is huge; it required a considerable amount of work on his part and a review process of over 140 messages exchanged between us over the course of more than a month, during which I was travelling through China. The final result is very good: basically (almost) all original unit tests pass, and the quality of the submitted code is awesome, which (I must say) is kind of unusual for an external contribution like this one. Kudos to you, Arnon! ;-)

    Other than AIX support, release 5.4.0 also includes a couple of important bug fixes for the sensors_temperatures() and sensors_fans() functions on Linux, and a fix for a bug on OSX which could cause a segmentation fault when using Process.open_files(). The complete list of bug fixes is here.

    In terms of future contributions for exotic and still unsupported platforms, it is worth mentioning a (still incomplete) PR for Cygwin which looks promising, and Mingw32 compiler support on Windows. It looks like psutil is gradually getting to a point where the addition of new functionalities is becoming rarer, so it is good that support for new platforms happens now, when the API is mature and stable. Future development in this direction may also include Android and (hopefully) iOS support. Now that would be really awesome to have! =)

    Stay tuned.
