Blog posts for tags/new-api

  1. Detecting memory leaks in C extensions with psutil and psleak

    Memory leaks in Python are usually straightforward to diagnose. Just look at RSS, track Python object counts, follow reference graphs, etc. But leaks inside C extension modules are another story. Traditional memory metrics such as RSS and VMS fail to reveal them because Python's memory allocator (pymalloc) sits above the platform's native heap. If something in an extension calls malloc() without a corresponding free(), that memory often won't show up in RSS / VMS. You have a leak, and you don't know it.

    psutil 7.2.0 introduces two new APIs for C heap introspection, designed specifically to catch these kinds of native leaks. They give you a window directly into the underlying platform allocator (e.g. glibc's malloc), letting you track how much memory the C layer actually allocates. If your RSS is flat but your C heap usage climbs, you now have a way to see it.

    Why native heap introspection matters

    Many Python projects rely on C extensions: psutil, NumPy, pandas, PIL, lxml, psycopg, PyTorch, custom in-house modules, etc. And even CPython itself, which implements many of its standard library modules in C. If any of these components mishandle memory at the C level, you get a leak that doesn't show up in:

    • Python reference counts (sys.getrefcount).
    • tracemalloc module.
    • Python's gc stats.
    • RSS, VMS or USS due to allocator caching, especially for small objects. This can happen, for example, when you forget to Py_DECREF a Python object.

    psutil's new functions let you query the allocator (e.g. glibc) directly, returning low-level metrics from the platform's native heap.

    heap_info(): direct allocator statistics

    psutil.heap_info() exposes the following metrics:

    • heap_used: total number of bytes currently allocated via malloc() (small allocations).
    • mmap_used: total number of bytes currently allocated via mmap() or via large malloc() allocations.
    • heap_count: (Windows only) number of private heaps created via HeapCreate().

    Example:

    >>> import psutil
    >>> psutil.heap_info()
    pheap(heap_used=5177792, mmap_used=819200)
    

    Reference for what contributes to each field:

    Platform       | Allocation type                                                          | Field affected
    UNIX / Windows | small malloc() ≤128 KB without free()                                    | heap_used
    UNIX / Windows | large malloc() >128 KB without free(), or mmap() without munmap() (UNIX) | mmap_used
    Windows        | HeapAlloc() without HeapFree()                                           | heap_used
    Windows        | VirtualAlloc() without VirtualFree()                                     | mmap_used
    Windows        | HeapCreate() without HeapDestroy()                                       | heap_count

    heap_trim(): returning unused heap memory

    psutil.heap_trim() provides a cross-platform way to request that the underlying allocator free any unused memory it's holding in the heap (typically small malloc() allocations).

    In practice, modern allocators rarely comply, so this is not a general-purpose memory-reduction tool and won't meaningfully shrink RSS in real programs. Its primary value is in leak detection tools. Calling psutil.heap_trim() before taking measurements helps reduce allocator noise, giving you a cleaner baseline so that changes in heap_used come from the code you're testing, not from internal allocator caching or fragmentation.

    Real-world use: finding a C extension leak

    The workflow is simple:

    1. Take a baseline snapshot of the heap.
    2. Call the C extension hundreds of times.
    3. Take another snapshot.
    4. Compare.

    import psutil
    
    psutil.heap_trim()  # reduce noise
    
    before = psutil.heap_info()
    for _ in range(200):
        my_cext_function()
    after = psutil.heap_info()
    
    print("delta heap_used =", after.heap_used - before.heap_used)
    print("delta mmap_used =", after.mmap_used - before.mmap_used)
    

    If heap_used or mmap_used increases consistently across runs, you have most likely found a native leak.

    To reduce false positives, repeat the test multiple times, increasing the number of calls on each retry. This approach helps distinguish real leaks from random noise or transient allocations.
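
    The retry strategy can be sketched as follows. This is only an illustration, not psleak's actual code: grows_consistently() and its parameters are hypothetical names, and measure is assumed to be any zero-argument callable that returns the current memory usage in bytes.

```python
def grows_consistently(fun, measure, retries=5, calls=200, step=100):
    """Return True only if measure() rises on every single retry.

    Hypothetical helper: `fun` is the function under test and `measure`
    returns the current memory usage in bytes.
    """
    for attempt in range(retries):
        before = measure()
        for _ in range(calls):
            fun()
        if measure() - before <= 0:
            # No growth on this run: earlier growth was probably noise.
            return False
        calls += step  # use more calls on the next retry
    return True
```

    With psutil 7.2.0 or later, one would typically call psutil.heap_trim() first and pass measure=lambda: psutil.heap_info().heap_used. A real tool would probably also want a tolerance threshold, since very small deltas can still be noise.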

    A new tool: psleak

    The strategy described above is exactly what I implemented in a new PyPI package, which I called psleak. It runs the target function repeatedly, trims the allocator before each run, and tracks differences across retries. Memory that grows consistently after several runs is flagged as a leak.

    A minimal test suite looks like this:

    from psleak import MemoryLeakTestCase
    
    class TestLeaks(MemoryLeakTestCase):
        def test_fun(self):
            self.execute(some_c_function)
    

    If the function leaks memory, the test will fail with a descriptive exception:

    psleak.MemoryLeakError: memory kept increasing after 10 runs
    Run # 1: heap=+388160  | uss=+356352  | rss=+327680  | (calls= 200, avg/call=+1940)
    Run # 2: heap=+584848  | uss=+614400  | rss=+491520  | (calls= 300, avg/call=+1949)
    Run # 3: heap=+778320  | uss=+782336  | rss=+819200  | (calls= 400, avg/call=+1945)
    Run # 4: heap=+970512  | uss=+1032192 | rss=+1146880 | (calls= 500, avg/call=+1941)
    Run # 5: heap=+1169024 | uss=+1171456 | rss=+1146880 | (calls= 600, avg/call=+1948)
    Run # 6: heap=+1357360 | uss=+1413120 | rss=+1310720 | (calls= 700, avg/call=+1939)
    Run # 7: heap=+1552336 | uss=+1634304 | rss=+1638400 | (calls= 800, avg/call=+1940)
    Run # 8: heap=+1752032 | uss=+1781760 | rss=+1802240 | (calls= 900, avg/call=+1946)
    Run # 9: heap=+1945056 | uss=+2031616 | rss=+2129920 | (calls=1000, avg/call=+1945)
    Run #10: heap=+2140624 | uss=+2179072 | rss=+2293760 | (calls=1100, avg/call=+1946)
    

    psleak is now part of the psutil test suite, where it exercises all psutil APIs (see test_memleaks.py), making it a de facto regression-testing tool.

    It's worth noting that without inspecting heap metrics, missing calls in the C code such as Py_CLEAR and Py_DECREF often go unnoticed, because they don't affect RSS, VMS, and USS. I confirmed this by commenting them out. Monitoring the heap is therefore essential to reliably detect memory leaks in Python C extensions.

    Under the hood

    For those interested in how this is implemented on each platform:

    • Linux: uses glibc's mallinfo2() to report uordblks (heap allocations) and hblkhd (mmap-backed blocks).
    • Windows: enumerates heaps and aggregates HeapAlloc / VirtualAlloc usage.
    • macOS: uses malloc zone statistics.
    • BSD: uses jemalloc's arena and stats interfaces.

    References

    • psleak, the new memory leak testing framework.
    • PR-2692, the implementation.
    • #1275, the original proposal from 8 years earlier.

  2. System load average on Windows in Python

    psutil 5.6.2 is out. It implements an emulation of os.getloadavg() on Windows, kindly contributed by Ammar Askar, who originally implemented it for CPython's test suite.

    This idea has been floating around for quite a while. The first proposal dates back to 2010, when psutil was still hosted on Google Code, and it popped up multiple times over the years. There's a bunch of info online mentioning the pieces you'd theoretically use (the so-called System Processor Queue Length), but I couldn't find any real implementation. A quick search suggests there's real demand for this, but very few tools provide it natively (the only ones I could find are sFlowTrend and Zabbix). So I'm glad this finally landed in psutil / Python.

    Other improvements and bugfixes in psutil 5.6.2

    The full list is in the changelog. A couple worth mentioning:

    • #1476: ability to set a process's high I/O priority on Windows.
    • #1458: colorized test output. Nobody will use this directly, but it's nice and I'm porting it to other projects I maintain (e.g. pyftpdlib). Good candidate for a small PyPI module that could also include the unittest extensions I've been re-implementing piece by piece:
      • #1478: re-running failed tests.
      • display test timings / durations. This is something I'm also contributing to CPython: BPO-4080 and PR-12271.

    About me

    I'm currently in China (Shenzhen) for a mix of vacation and work, and I will likely take a break from Open Source for a while (about 2.5 months), during which I'll also go to the Philippines and Japan.


  3. Announcing psutil 5.6.0

    psutil 5.6.0 is out. Highlights: a new Process.parents() method, several important Windows improvements, and the removal of Process.memory_maps() on macOS.

    Process parents()

    The new method returns the parents of a process as a list of Process instances. If no parents are known, an empty list is returned.

    >>> import psutil
    >>> p = psutil.Process(5312)
    >>> p.parents()
    [psutil.Process(pid=4699, name='bash', started='09:06:44'),
     psutil.Process(pid=4689, name='gnome-terminal-server', started='09:06:44'),
     psutil.Process(pid=1, name='systemd', started='05:56:55')]
    

    Nothing fundamentally new here, since this is a convenience wrapper around Process.parent(), but it's still nice to have it built in. It pairs well with Process.children() when working with process trees. The idea was proposed by Ghislain Le Meur.
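
    A minimal sketch of what such a wrapper might look like, assuming only that parent() returns None when no parent is known. walk_parents() is a hypothetical name, and psutil's real implementation may differ:

```python
def walk_parents(proc):
    """Collect ancestors by following parent() until it returns None.

    Hypothetical helper: works on any object exposing a parent() method,
    such as a psutil.Process instance.
    """
    ancestors = []
    parent = proc.parent()
    while parent is not None:
        ancestors.append(parent)  # closest ancestor first
        parent = parent.parent()
    return ancestors
```

    The result is ordered from the closest parent to the most distant one, matching the ordering in the example output above.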

    Windows

    Certain Windows APIs that need to be dynamically loaded from DLLs are now loaded only once at startup, instead of on every function call. This makes some operations 50% to 100% faster; see benchmarks in PR-1422.

    Process.suspend() and Process.resume() previously iterated over all process threads via CreateToolhelp32Snapshot(), which was unorthodox and broke when the process had been suspended by Process Hacker. They now call the undocumented NtSuspendProcess() / NtResumeProcess() NT APIs, same as Process Hacker and Sysinternals tools. Discussed in #1379, implemented in PR-1435.

    SE DEBUG (SeDebugPrivilege) is a privilege bit that psutil sets on the Python process at startup so that it can query processes owned by other users (Administrator, Local System), which means fewer AccessDenied exceptions for low-PID processes. The code setting it had presumably been broken for years and is now finally fixed in PR-1429.

    Removal of Process.memory_maps() on macOS

    Process.memory_maps() is gone on macOS (#1291). The underlying Apple API would randomly raise EINVAL or segfault the host process, and no amount of reverse-engineering produced a safe fix. So I removed it. This is covered in a separate post.

    Improved exceptions

    One problem that affected psutil maintenance over the years was receiving bug reports whose tracebacks did not indicate which syscall had actually failed. This was especially painful on Windows, where a single routine may invoke multiple Windows APIs. Now the OSError (or WindowsError) exception includes the syscall from which the error originated. See PR-1428.

    Other changes

    See the changelog.

  4. Improved process_iter()

    This is part of the psutil 5.3.0 release (see the changelog for the full list of changes).

    The old pattern

    Iterating over processes and collecting attributes requires more boilerplate than it should. A process returned by psutil.process_iter() may disappear before you access it, or require elevated privileges, so every lookup has to be guarded with a try / except:

    >>> import psutil
    >>> for proc in psutil.process_iter():
    ...     try:
    ...         pinfo = proc.as_dict(attrs=['pid', 'name'])
    ...     except (psutil.NoSuchProcess, psutil.AccessDenied):
    ...         pass
    ...     else:
    ...         print(pinfo)
    ...
    {'pid': 1, 'name': 'systemd'}
    {'pid': 2, 'name': 'kthreadd'}
    {'pid': 3, 'name': 'ksoftirqd/0'}
    

    The try / except is not decorative. It is necessary to handle the race condition: a process can exit or become inaccessible between being listed and being queried.

    The new pattern

    5.3.0 adds attrs and ad_value parameters to psutil.process_iter(). With these, the loop body becomes:

    >>> import psutil
    >>> for proc in psutil.process_iter(attrs=['pid', 'name']):
    ...     print(proc.info)
    ...
    {'pid': 1, 'name': 'systemd'}
    {'pid': 2, 'name': 'kthreadd'}
    {'pid': 3, 'name': 'ksoftirqd/0'}
    

    Internally, process_iter() attaches an info dict to the Process instance. The attributes are pre-fetched in one shot. Processes that disappear during iteration are silently skipped, and attributes that would raise AccessDenied get assigned ad_value, which defaults to None:

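    import psutil
    # Read from p.info: ad_value does not apply to direct calls like p.name().
    for p in psutil.process_iter(['name', 'username'], ad_value="N/A"):
        print(p.info['name'], p.info['username'])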
    

    Performance

    Beyond the syntactic win, the new form is also faster than calling individual methods in a loop. process_iter(attrs=[...]) is equivalent to using Process.oneshot() on each process (see Making psutil twice as fast for how that works): attributes that share a syscall or a /proc file are fetched together instead of being re-read on every method call.

    Comprehensions

    With the exception boilerplate out of the way, comprehensions finally work cleanly. E.g. getting processes owned by the current user can be written as:

    >>> import getpass
    >>> from pprint import pprint as pp
    >>> pp([(p.pid, p.info['name']) for p in psutil.process_iter(attrs=['name', 'username']) if p.info['username'] == getpass.getuser()])
    [(16832, 'bash'),
     (19772, 'ssh'),
     (20492, 'python')]
    
  5. Sensors: temperatures, battery, CPU frequency

    psutil 5.1.0 is out. This release introduces new APIs to retrieve hardware temperatures, battery status, and CPU frequency information.

    Temperatures

    You can now retrieve hardware temperatures (PR-962). This is currently available on Linux only.

    • On Windows it's hard to do in a hardware-agnostic way. I ran into 3 WMI-based approaches, none of which worked with my hardware, so I gave up.
    • On macOS it seems relatively easy, but my virtualized macOS box doesn't support sensors, so I gave up for lack of hardware. If someone wants to give it a try, be my guest.

    >>> import psutil
    >>> psutil.sensors_temperatures()
    {'acpitz': [shwtemp(label='', current=47.0, high=103.0, critical=103.0)],
     'asus': [shwtemp(label='', current=47.0, high=None, critical=None)],
     'coretemp': [shwtemp(label='Physical id 0', current=52.0, high=100.0, critical=100.0),
                  shwtemp(label='Core 0', current=45.0, high=100.0, critical=100.0),
                  shwtemp(label='Core 1', current=52.0, high=100.0, critical=100.0),
                  shwtemp(label='Core 2', current=45.0, high=100.0, critical=100.0),
                  shwtemp(label='Core 3', current=47.0, high=100.0, critical=100.0)]}
    

    Battery status

    Battery status information is now available on Linux, Windows and FreeBSD (PR-963).

    >>> import psutil
    >>>
    >>> def secs2hours(secs):
    ...     mm, ss = divmod(secs, 60)
    ...     hh, mm = divmod(mm, 60)
    ...     return "%d:%02d:%02d" % (hh, mm, ss)
    ...
    >>> battery = psutil.sensors_battery()
    >>> battery
    sbattery(percent=93, secsleft=16628, power_plugged=False)
    >>> print("charge = %s%%, time left = %s" % (battery.percent, secs2hours(battery.secsleft)))
    charge = 93%, time left = 4:37:08
    

    CPU frequency

    Available on Linux, Windows and macOS (PR-952). Only Linux reports the real-time value (always changing); other platforms return the nominal "fixed" value.

    >>> import psutil
    >>> psutil.cpu_freq()
    scpufreq(current=931.42925, min=800.0, max=3500.0)
    >>> psutil.cpu_freq(percpu=True)
    [scpufreq(current=2394.945, min=800.0, max=3500.0),
     scpufreq(current=2236.812, min=800.0, max=3500.0),
     scpufreq(current=1703.609, min=800.0, max=3500.0),
     scpufreq(current=1754.289, min=800.0, max=3500.0)]
    

    What CPU a process is on

    The new Process.cpu_num() method tells you which CPU a process is currently running on (PR-954). It is somewhat related to Process.cpu_affinity(). It's interesting for visualizing how the OS scheduler keeps evenly reassigning processes across CPUs (see the cpu_distribution.py script).
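
    As a rough illustration of that idea, the sketch below counts how many processes sit on each CPU, using Process.cpu_num(). The cpu_distribution() helper is a hypothetical name and is far simpler than the cpu_distribution.py script; it accepts any iterable of objects exposing cpu_num():

```python
from collections import Counter


def cpu_distribution(procs):
    """Map each CPU number to how many of `procs` are currently on it.

    Hypothetical helper: `procs` is any iterable of objects with a
    cpu_num() method, such as psutil.Process instances.
    """
    return Counter(p.cpu_num() for p in procs)
```

    On a platform that supports cpu_num(), this could be called as cpu_distribution(psutil.process_iter()). A more robust version would also handle psutil.NoSuchProcess for processes that exit during iteration.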

    CPU affinity

    A new shorthand is available to set affinity against all eligible CPUs:

    psutil.Process().cpu_affinity([])
    

    This was added because on Linux (#956) it is not always possible to set affinity against all CPUs directly. It is equivalent to:

    psutil.Process().cpu_affinity(list(range(psutil.cpu_count())))
    

    Other bug fixes

    See the full list in the changelog.
