Blog posts for tags/announce

  1. FreeBSD process environ and resource limits

    psutil 5.7.3 is out. This release adds two features that were previously missing on BSD platforms: the ability to get the process environment (all BSDs) and to get or set process resource limits (FreeBSD only), similar to what can already be done on Linux.

    Process environ

    Quite simply:

    >>> import psutil, pprint, os
    >>> pid = os.getpid()
    >>> pprint.pprint(psutil.Process(pid).environ())
    {'BLOCKSIZE': 'K',
     'EDITOR': 'vi',
     'GROUP': 'vagrant',
     'HOME': '/home/vagrant',
     'HOST': 'freebsd',
     'HOSTTYPE': 'FreeBSD',
     'LOGNAME': 'vagrant',
     'MACHTYPE': 'x86_64',
     'MAIL': '/var/mail/vagrant',
     'OSTYPE': 'FreeBSD',
     'PAGER': 'less',
     'PATH': '/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/home/vagrant/bin',
     'PWD': '/home/vagrant/psutil',
     'REMOTEHOST': '10.0.2.2',
     'SHELL': '/bin/csh',
     'SHLVL': '1',
     'SSH_CLIENT': '10.0.2.2 58102 22',
     'SSH_CONNECTION': '10.0.2.2 58102 10.0.2.15 22',
     'SSH_TTY': '/dev/pts/0',
     'TERM': 'xterm-256color',
     'USER': 'vagrant',
     'VENDOR': 'amd'}
    

    This feature was already available on every other platform; BSD was the exception. It was contributed by Armin Gruner in PR-1800 and supports all BSD variants (FreeBSD, NetBSD and OpenBSD). BSD systems expose this information via the kvm_getenvv() function from libkvm (a library call, not a syscall), but there are subtle differences between BSD variants which made this integration particularly thorny. This is how it's done:

    PyObject *
    psutil_proc_environ(PyObject *self, PyObject *args) {
        int i, cnt = -1;
        long pid;
        char *s, **envs, errbuf[_POSIX2_LINE_MAX];
        PyObject *py_value=NULL, *py_retdict=NULL;
        kvm_t *kd;
    #ifdef PSUTIL_NETBSD
        struct kinfo_proc2 *p;
    #else
        struct kinfo_proc *p;
    #endif
    
        if (!PyArg_ParseTuple(args, "l", &pid))
            return NULL;
    
    #if defined(PSUTIL_FREEBSD)
        kd = kvm_openfiles(NULL, "/dev/null", NULL, 0, errbuf);
    #else
        kd = kvm_openfiles(NULL, NULL, NULL, KVM_NO_FILES, errbuf);
    #endif
        if (!kd) {
            convert_kvm_err("kvm_openfiles", errbuf);
            return NULL;
        }
    
        py_retdict = PyDict_New();
        if (!py_retdict)
            goto error;
    
    #if defined(PSUTIL_FREEBSD)
        p = kvm_getprocs(kd, KERN_PROC_PID, pid, &cnt);
    #elif defined(PSUTIL_OPENBSD)
        p = kvm_getprocs(kd, KERN_PROC_PID, pid, sizeof(*p), &cnt);
    #elif defined(PSUTIL_NETBSD)
        p = kvm_getproc2(kd, KERN_PROC_PID, pid, sizeof(*p), &cnt);
    #endif
        if (!p) {
            NoSuchProcess("kvm_getprocs");
            goto error;
        }
        if (cnt <= 0) {
            NoSuchProcess(cnt < 0 ? kvm_geterr(kd) : "kvm_getprocs: cnt==0");
            goto error;
        }
    
        // On *BSD kernels there are a few kernel-only system processes without an
        // environment (see e.g. "procstat -e 0 | 1 | 2 ..." on FreeBSD).
        // Some system processes have no stats attached at all
        // (they are marked with P_SYSTEM).
        // On FreeBSD, it's possible that the process is swapped or paged out,
        // in which case there is no access to the environ stored in the
        // process' user area.
        // On NetBSD, we cannot call kvm_getenvv2() for a zombie process.
        // To keep the unit test suite happy, return an empty environment.
    #if defined(PSUTIL_FREEBSD)
    #if (defined(__FreeBSD_version) && __FreeBSD_version >= 700000)
        if (!((p)->ki_flag & P_INMEM) || ((p)->ki_flag & P_SYSTEM)) {
    #else
        if ((p)->ki_flag & P_SYSTEM) {
    #endif
    #elif defined(PSUTIL_NETBSD)
        if ((p)->p_stat == SZOMB) {
    #elif defined(PSUTIL_OPENBSD)
        if ((p)->p_flag & P_SYSTEM) {
    #endif
            kvm_close(kd);
            return py_retdict;
        }
    
    #if defined(PSUTIL_NETBSD)
        envs = kvm_getenvv2(kd, p, 0);
    #else
        envs = kvm_getenvv(kd, p, 0);
    #endif
        if (!envs) {
            // Map to "psutil" general high-level exceptions
            switch (errno) {
                case 0:
                    // Process has cleared its environment, return an empty one
                    kvm_close(kd);
                    return py_retdict;
                case EPERM:
                    AccessDenied("kvm_getenvv");
                    break;
                case ESRCH:
                    NoSuchProcess("kvm_getenvv");
                    break;
    #if defined(PSUTIL_FREEBSD)
                case ENOMEM:
                    // Unfortunately, under FreeBSD kvm_getenvv() returns
                    // failure for certain processes ( e.g. try
                    // "sudo procstat -e <pid of your XOrg server>".)
                    // Map the error condition to 'AccessDenied'.
                    sprintf(errbuf,
                            "kvm_getenvv(pid=%ld, ki_uid=%d): errno=ENOMEM",
                            pid, p->ki_uid);
                    AccessDenied(errbuf);
                    break;
    #endif
                default:
                    sprintf(errbuf, "kvm_getenvv(pid=%ld)", pid);
                    PyErr_SetFromOSErrnoWithSyscall(errbuf);
                    break;
            }
            goto error;
        }
    
        for (i = 0; envs[i] != NULL; i++) {
            s = strchr(envs[i], '=');
            if (!s)
                continue;
            *s++ = 0;
            py_value = PyUnicode_DecodeFSDefault(s);
            if (!py_value)
                goto error;
            if (PyDict_SetItemString(py_retdict, envs[i], py_value)) {
                goto error;
            }
            Py_DECREF(py_value);
        }
    
        kvm_close(kd);
        return py_retdict;
    
    error:
        Py_XDECREF(py_value);
        Py_XDECREF(py_retdict);
        kvm_close(kd);
        return NULL;
    }
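
    The last loop of the C function above splits each "KEY=VALUE" string on its first '=' and skips entries that have none. A minimal Python sketch of that same parsing step follows; the name parse_environ and the sample data are hypothetical, used only for illustration, and are not part of psutil:

```python
def parse_environ(entries):
    """Turn ["KEY=VALUE", ...] into a dict, mirroring the C loop."""
    env = {}
    for entry in entries:
        # Split on the first '=' only: values may themselves contain '='.
        key, sep, value = entry.partition("=")
        if not sep:
            # No '=' at all: the C code skips such entries, so do the same.
            continue
        env[key] = value
    return env


# Hypothetical sample input, shaped like what kvm_getenvv() hands back.
sample = ["HOME=/home/vagrant", "garbage", "OPTS=a=b"]
print(parse_environ(sample))
```

    Splitting on the first '=' only matters because a value may legitimately contain further '=' characters.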
    

    Process resource limits

    There's a Linux-only syscall named prlimit which lets you get or set resource limits on a per-process basis. I found out very recently that this can be emulated on FreeBSD via sysctl + KERN_PROC_RLIMIT. Here's the PR and here's the relevant part. An example of how to get and set resource limits:

    >>> import psutil, os
    >>> pid = os.getpid()
    >>> p = psutil.Process(pid)
    >>> p.rlimit(psutil.RLIMIT_NOFILE, (128, 128))   # process can open max 128 file descriptors
    >>> p.rlimit(psutil.RLIMIT_FSIZE, (1024, 1024))  # can create files no bigger than 1024 bytes
    >>> p.rlimit(psutil.RLIMIT_FSIZE)                # get
    (1024, 1024)
    >>>
    

    Relevant C code:

    PyObject *
    psutil_proc_getrlimit(PyObject *self, PyObject *args) {
        pid_t pid;
        int ret;
        int resource;
        size_t len;
        int name[5];
        struct rlimit rlp;
    
        if (! PyArg_ParseTuple(args, _Py_PARSE_PID "i", &pid, &resource))
            return NULL;
    
        name[0] = CTL_KERN;
        name[1] = KERN_PROC;
        name[2] = KERN_PROC_RLIMIT;
        name[3] = pid;
        name[4] = resource;
        len = sizeof(rlp);
    
        ret = sysctl(name, 5, &rlp, &len, NULL, 0);
        if (ret == -1)
            return PyErr_SetFromErrno(PyExc_OSError);
    
    #if defined(HAVE_LONG_LONG)
        return Py_BuildValue("LL",
                             (PY_LONG_LONG) rlp.rlim_cur,
                             (PY_LONG_LONG) rlp.rlim_max);
    #else
        return Py_BuildValue("ll",
                             (long) rlp.rlim_cur,
                             (long) rlp.rlim_max);
    #endif
    }
    
    
    PyObject *
    psutil_proc_setrlimit(PyObject *self, PyObject *args) {
        pid_t pid;
        int ret;
        int resource;
        int name[5];
        struct rlimit new;
        struct rlimit *newp = NULL;
        PyObject *py_soft = NULL;
        PyObject *py_hard = NULL;
    
        if (! PyArg_ParseTuple(
                args, _Py_PARSE_PID "iOO", &pid, &resource, &py_soft, &py_hard))
            return NULL;
    
        name[0] = CTL_KERN;
        name[1] = KERN_PROC;
        name[2] = KERN_PROC_RLIMIT;
        name[3] = pid;
        name[4] = resource;
    
    #if defined(HAVE_LONG_LONG)
        new.rlim_cur = PyLong_AsLongLong(py_soft);
        if (new.rlim_cur == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
        new.rlim_max = PyLong_AsLongLong(py_hard);
        if (new.rlim_max == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
    #else
        new.rlim_cur = PyLong_AsLong(py_soft);
        if (new.rlim_cur == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
        new.rlim_max = PyLong_AsLong(py_hard);
        if (new.rlim_max == (rlim_t) - 1 && PyErr_Occurred())
            return NULL;
    #endif
        newp = &new;
        ret = sysctl(name, 5, NULL, 0, newp, sizeof(*newp));
        if (ret == -1)
            return PyErr_SetFromErrno(PyExc_OSError);
        Py_RETURN_NONE;
    }
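
    Both functions above deal in (soft, hard) pairs, the same shape Python's standard resource module uses for the calling process. As a rough sketch of the get-then-set pattern, here it is with the stdlib module only. This is an analogue for illustration, not how psutil implements rlimit(), and the helper name set_soft_limit is hypothetical:

```python
import resource


def set_soft_limit(res, new_soft):
    """Change only the soft limit of `res` for the current process."""
    _, hard = resource.getrlimit(res)
    unlimited = resource.RLIM_INFINITY
    # The soft limit may never exceed the hard limit.
    if hard != unlimited and (new_soft == unlimited or new_soft > hard):
        raise ValueError("soft limit %r exceeds hard limit %r" % (new_soft, hard))
    resource.setrlimit(res, (new_soft, hard))
    return resource.getrlimit(res)


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# Re-applying the current soft limit is a harmless no-op.
print(set_soft_limit(resource.RLIMIT_NOFILE, soft))
```

    Reading the hard limit first and passing it back unchanged avoids lowering it by accident, which an unprivileged process cannot undo.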
    

    It turns out some of this can probably also be emulated on Windows via SetInformationJobObject and QueryInformationJobObject (see ticket), so that is something I'm looking forward to experimenting with.

  2. New Pelican website

    Hello there. Let me present my new blog / personal website! This is something I've been wanting to do for a very long time, since the old blog hosted at https://grodola.blogspot.com/ was... well, too old. =) This new site is based on Pelican, a static website generator similar to Jekyll. Unlike Jekyll, it uses Python instead of Ruby, and that's why I chose it. It's minimal, straight to the point, and I feel I'm in control of things. This is what Pelican gave me out of the box:

    • blog functionality
    • ability to write content by using reStructuredText
    • RSS & Atom feed
    • easy integration with GitHub pages
    • ability to add comments via Disqus

    To this I added a mailing list (I used FeedBurner), so that users can subscribe and receive an email every time I publish a new blog post. As you can see, the website is very simple, but it's exactly what I wanted (minimalism). As for the domain name, I opted for gmpy.dev, mostly because I know my name is hard to type and pronounce for non-English speakers. And also because I couldn't come up with a better name. ;)

    Git-based workflow

    The main reason I blogged so rarely over the years is that blogger.com gave me no way to edit content in reStructuredText or Markdown, and had no Git integration. This made me lazy. With Pelican + GitHub Pages the workflow to create and publish new content is very straightforward. I use 2 branches: gh-pages, which holds the source code of this website, and master, which is where the generated HTML content lives and is served from by GitHub Pages. This is what I do when I have to create a new blog post:

    • under gh-pages branch I create a new file, e.g. content/blog/2020/new-blog-post.rst:
    New blog post
    #############
    
    :date: 2020-06-26
    :tags: announce, python
    
    Hello world!
    
    • commit it:
    git add content/blog/2020/new-blog-post.rst
    git commit -am "new blog post"
    git push
    
    • publish it:
    make github
    

    Within a minute or so, GitHub will automatically serve gmpy.dev with the updated content. And this is why I think I will start blogging more often. =) The core of Pelican is pelicanconf.py, which lets you customize a lot of things while remaining independent of the theme. I still ended up modifying the default theme though, writing a customized "archives" view and editing the CSS to make the site look better on mobile phones. All in all, I am very satisfied with Pelican, and I'd gladly recommend it to anyone who doesn't need dynamic content.
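
    To give an idea of what pelicanconf.py looks like, here is a minimal sketch. The setting names are standard Pelican settings, but every value is a placeholder chosen for illustration, not the configuration actually used for this site:

```python
# pelicanconf.py (illustrative values only, not this site's real config)
AUTHOR = "Jane Doe"                    # placeholder author name
SITENAME = "Example blog"              # placeholder site title
SITEURL = ""                           # left empty for local development
PATH = "content"                       # directory holding the .rst sources
TIMEZONE = "Europe/Rome"
DEFAULT_LANG = "en"
FEED_ALL_ATOM = "feeds/all.atom.xml"   # site-wide Atom feed
FEED_ALL_RSS = "feeds/all.rss.xml"     # site-wide RSS feed
DEFAULT_PAGINATION = 10                # articles per index page
DISQUS_SITENAME = "example-blog"       # hypothetical Disqus shortname
```

    Since the file is plain Python, settings can also be computed or derived from environment variables if needed.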

    About me

    I spent most of last year (2019) in China, dating my girlfriend, working remotely from a shared office space in Shenzhen, and I even taught some Python to a class of Chinese folks with no previous programming experience. The course covered the basics of the language + basic filesystem operations, and the most difficult thing to explain was indentation. I guess that shows the difference between knowing a language and knowing how to teach it.

    I got back to Italy in December 2019, just before the pandemic broke out. Because of my connections with China, I knew about the incoming pandemic sooner than the rest of my friends, which for a while (until the lockdown) made them think I was crazy. =)

    Since I knew I would be stuck at home for a long time, I bought quite a nice acoustic guitar (a Taylor) after many years, and resumed playing (and singing). I learned a bunch of new songs, mostly by Queen, including Bohemian Rhapsody, which is something I've been wanting to do since forever.

    I also spent some time working on a personal project that I'm keeping private for the moment, something to speed up file copies, which follows the experiments I made in BPO-33671. It's still very beta, but I managed to make file copies around 170% faster compared to the cp command on Linux, which is pretty impressive (and I think I can push it even further). I will blog about that once I have something more solid / stable. Most likely it'll become my next OSS project, even though I have mixed feelings about that, since the amount of time I'm devoting to psutil is already a lot.

    Speaking of which, today I'm also releasing psutil 5.7.1, which adds support for Windows Nano.

    I guess this is all. Cheers and subscribe!
