uuid.get_node is failing to find a stable identifier #587

Open
zanieb opened this issue Apr 16, 2025 · 17 comments
Labels
bug Something isn't working

Comments

@zanieb
Member

zanieb commented Apr 16, 2025

❯ uvx python -c "import uuid; print(uuid.getnode())"
87776899200804
❯ uvx python -c "import uuid; print(uuid.getnode())"
16554352112436
❯ uvx --no-managed-python python -c "import uuid; print(uuid.getnode())"
178803724683704
❯ uvx --no-managed-python python -c "import uuid; print(uuid.getnode())"
178803724683704

https://docs.python.org/3/library/uuid.html#uuid.getnode

... If all attempts to obtain the hardware address fail, we choose a random 48-bit number with the multicast bit (least significant bit of the first octet) set to 1 as recommended in RFC 4122.
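The symptom can also be checked programmatically. As a minimal sketch, the script below runs `uuid.getnode()` in a few fresh interpreters and compares the results. The helper names `node_in_fresh_interpreter` and `looks_random` are hypothetical. The multicast-bit test follows the documented fallback quoted above, so a set bit only suggests, and does not prove, that no hardware address was found.

```python
import subprocess
import sys


def node_in_fresh_interpreter(python=sys.executable):
    """Return uuid.getnode() as reported by a newly started interpreter."""
    out = subprocess.run(
        [python, "-c", "import uuid; print(uuid.getnode())"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return int(out.strip())


def looks_random(node):
    """True if the multicast bit (LSB of the first octet) is set."""
    return bool((node >> 40) & 1)


if __name__ == "__main__":
    nodes = {node_in_fresh_interpreter() for _ in range(3)}
    print("stable across processes:", len(nodes) == 1)
    for node in sorted(nodes):
        print(f"{node:012x} random-looking: {looks_random(node)}")
```

Passing a different interpreter path as `python` makes it possible to compare two builds from one script.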

@zanieb zanieb added the bug Something isn't working label Apr 16, 2025
@zanieb
Member Author

zanieb commented Apr 16, 2025

Digging into this with a debugger...

> /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/lib/python3.13/uuid.py(585)_unix_getnode()
-> uuid_time, _ = _generate_time_safe()
(Pdb) n
> /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/lib/python3.13/uuid.py(586)_unix_getnode()
-> return UUID(bytes=uuid_time).node
(Pdb) uuid_time
b'\xd0%\xd5,\x1a\xce\x11\xf0\x91v\x956~\xda\xd8u'
(Pdb) _generate_time_safe()
(b'/8\xf3\xbe\x1a\xcf\x11\xf0\x8b\xd3il\x92\xc8$ ', -1)
(Pdb) _generate_time_safe()
(b'/\xfc\x1bn\x1a\xcf\x11\xf0\x8b\xd3il\x92\xc8$ ', 0)
(Pdb) _generate_time_safe()
(b'0\x82b\xd2\x1a\xcf\x11\xf0\x8b\xd3il\x92\xc8$ ', 0)

> /opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/uuid.py(585)_unix_getnode()
-> uuid_time, _ = _generate_time_safe()
(Pdb) n
> /opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/uuid.py(586)_unix_getnode()
-> return UUID(bytes=uuid_time).node
(Pdb) uuid_time
b'\xe9\xdc\xc0\xd4\x1a\xce\x11\xf0\x81\xe0\xa2\x9e\xfc~\x89\xb8'
(Pdb) _generate_time_safe()
(b'Lq\xbc\x18\x1a\xcf\x11\xf0\x88\x80\xa2\x9e\xfc~\x89\xb8', None)
(Pdb) _generate_time_safe()
(b'M\x8aU$\x1a\xcf\x11\xf0\x88\x80\xa2\x9e\xfc~\x89\xb8', None)
(Pdb) _generate_time_safe()
(b'N\x87\xab\xde\x1a\xcf\x11\xf0\x88\x80\xa2\x9e\xfc~\x89\xb8', None)
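To make the two sessions easier to compare, the node can be pulled out of the raw bytes by hand. This sketch reuses byte strings copied from the debugger output above; the variable names and the `node_of` helper are made up for illustration.

```python
import uuid

# First value seen inside _unix_getnode() in each session above.
managed_first = b'\xd0%\xd5,\x1a\xce\x11\xf0\x91v\x956~\xda\xd8u'
homebrew_first = b'\xe9\xdc\xc0\xd4\x1a\xce\x11\xf0\x81\xe0\xa2\x9e\xfc~\x89\xb8'
# A later call from each session.
managed_later = b'/8\xf3\xbe\x1a\xcf\x11\xf0\x8b\xd3il\x92\xc8$ '
homebrew_later = b'Lq\xbc\x18\x1a\xcf\x11\xf0\x88\x80\xa2\x9e\xfc~\x89\xb8'


def node_of(raw):
    """The node is the last 48 bits (6 bytes) of the 16-byte UUID."""
    return uuid.UUID(bytes=raw).node


for label, raw in [
    ("managed first", managed_first),
    ("managed later", managed_later),
    ("homebrew first", homebrew_first),
    ("homebrew later", homebrew_later),
]:
    node = node_of(raw)
    multicast = (node >> 40) & 1
    print(f"{label:15} node={node:012x} multicast_bit={multicast}")
```

The Homebrew node is identical in both calls and equals 178803724683704, the value printed by `--no-managed-python` in the first comment. The managed build's node differs between the two calls shown and has the multicast bit set in both.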

Then looking at the CPython source to try to discern the difference here...

class SafeUUID:
    safe = 0
    unsafe = -1
    unknown = None

static PyObject *
py_uuid_generate_time_safe(PyObject *Py_UNUSED(context),
                           PyObject *Py_UNUSED(ignored))
{
    uuid_t uuid;
#ifdef HAVE_UUID_GENERATE_TIME_SAFE
    int res;

    res = uuid_generate_time_safe(uuid);
    return Py_BuildValue("y#i", (const char *) uuid, sizeof(uuid), res);
#elif defined(HAVE_UUID_CREATE)
    uint32_t status;
    uuid_create(&uuid, &status);
# if defined(HAVE_UUID_ENC_BE)
    unsigned char buf[sizeof(uuid)];
    uuid_enc_be(buf, &uuid);
    return Py_BuildValue("y#i", buf, sizeof(uuid), (int) status);
# else
    return Py_BuildValue("y#i", (const char *) &uuid, sizeof(uuid), (int) status);
# endif /* HAVE_UUID_CREATE */
#else /* HAVE_UUID_GENERATE_TIME_SAFE */
    uuid_generate_time(uuid);
    return Py_BuildValue("y#O", (const char *) uuid, sizeof(uuid), Py_None);
#endif /* HAVE_UUID_GENERATE_TIME_SAFE */
}
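The second element of the tuples in the debugger output is this status value. As a rough sketch, it can be mapped back onto `uuid.SafeUUID`. `describe_status` is a hypothetical helper, and falling back to `unknown` for unexpected codes is an assumption made for this example.

```python
import uuid


def describe_status(status):
    """Map a raw status from the C helper onto uuid.SafeUUID."""
    try:
        return uuid.SafeUUID(status)
    except ValueError:
        # Assumption for this sketch: treat unexpected codes as unknown.
        return uuid.SafeUUID.unknown


for status in (0, -1, None):
    print(repr(status), "->", describe_status(status).name)
```

Read this way, the `None` in the Homebrew session corresponds to the final `#else` branch, while the `-1` and `0` values in the managed build come from `uuid_generate_time_safe`.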

@zanieb
Member Author

zanieb commented Apr 16, 2025

It seems like HAVE_UUID_GENERATE_TIME_SAFE is set in our build but not in the Homebrew one (which would explain the None status it returns). I'm surprised that using uuid_generate_time_safe would break getnode, though. Perhaps the problem is deeper, or there's a problem with CPython's code? I don't quite understand how generate_time_safe is a valid way to get an identifier for the node in the first place.

@indygreg
Collaborator

indygreg commented Apr 16, 2025

This feels like a CPython bug/quirk. The code prefers obtaining a random time for the node value on Linux if a C function is available. It then falls back to trying to find an actual hardware value.

Feels like the ordering is wrong. But this may have been a choice to favor performance or security or something. But it isn't documented inline in uuid.py.

@zanieb
Member Author

zanieb commented Apr 16, 2025

Ah, they extract the node from the UUID:

def _unix_getnode():
    """Get the hardware address on Unix using the _uuid extension module."""
    if _generate_time_safe:
        uuid_time, _ = _generate_time_safe()
        return UUID(bytes=uuid_time).node
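For discussion only, here is one stricter way a node could be selected: try each source in turn and skip any candidate with the multicast bit set, since the documentation reserves that bit for the random fallback. `pick_node` and the stand-in getters are hypothetical, and this is not how CPython's `getnode()` is written.

```python
def pick_node(getters):
    """Return the first plausible hardware node, or None if none qualifies."""
    for getter in getters:
        try:
            node = getter()
        except Exception:
            continue  # a failing source should not abort the search
        if node is None or not 0 <= node < (1 << 48):
            continue
        if (node >> 40) & 1:
            continue  # multicast bit set: probably not a real MAC
        return node
    return None


# Stand-ins for real sources; the values are the nodes decoded from the
# debugger output earlier in this thread.
def libuuid_like():
    return 0x95367EDAD875  # multicast bit set


def ifconfig_like():
    return 0xA29EFC7E89B8


print(hex(pick_node([libuuid_like, ifconfig_like])))
```

The trade-off is that a machine whose only available identifier has the multicast bit set would get `None` here, so a caller would still need its own fallback.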

@sfc-gh-tteixeira

sfc-gh-tteixeira commented Apr 17, 2025

From my limited understanding of py_uuid_generate_time_safe(), the difference between python-build-standalone and the binaries I get from Brew or Python.org is the value of HAVE_UUID_GENERATE_TIME_SAFE / HAVE_UUID_CREATE at compile time.

So could this issue be addressed by setting the value of those directives the same way Brew and Python.org do?

@zanieb
Member Author

zanieb commented Apr 17, 2025

As in, we have HAVE_UUID_GENERATE_TIME_SAFE and they do not? It seems wrong to say HAVE_UUID_GENERATE_TIME_SAFE is not available if it is?

@sfc-gh-tteixeira

I'm not sure I understand the question, but it seems to me that the issue here is that python-build-standalone says the libraries whose presence governs the value of HAVE_UUID_GENERATE_TIME_SAFE / HAVE_UUID_CREATE are not available during build.

Now, do we know if they indeed aren't available? And would there be some way to make them available?

Again, my understanding here is quite limited, so these may well be stupid questions :D

@sfc-gh-tteixeira

One more data point: I just tried installing Python 3.13 with Pyenv and it behaves the same as Brew and Python.org. That is to say, this uuid.getnode issue does not reproduce there either.

So the appropriate libraries that govern HAVE_UUID_GENERATE_TIME_SAFE / HAVE_UUID_CREATE do appear to exist locally in some form (perhaps pyenv brings them in?)

@zanieb
Member Author

zanieb commented Apr 17, 2025

I think you have it backwards. Here's what I'm saying:

❯ uvx --managed-python python -m sysconfig | grep UUID_GEN
	HAVE_UUID_GENERATE_TIME_SAFE = "1"
❯ uvx --no-managed-python python -m sysconfig | grep UUID_GEN
	HAVE_UUID_GENERATE_TIME_SAFE = "0"

Our libuuid has generate_time_safe, so CPython uses it and the behavior of getnode changes. I do not think it could be correct for us to say we don't have generate_time_safe to force fallback to the other behavior.

If it's not desirable for getnode to use generate_time_safe, then that should be changed in CPython. Or, if generate_time_safe isn't doing the "right" thing in our version of libuuid, then we need to do more exploration to understand why.
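The same check can be made from inside Python. Note that the debugger sessions earlier show `_generate_time_safe()` being called in both builds, so the flag tells you which C function backs the helper, not whether the helper exists. In this sketch `uuid_build_info` is a hypothetical name, and `uuid._generate_time_safe` is a private attribute that may change between CPython versions.

```python
import sysconfig
import uuid


def uuid_build_info():
    """Report how the interpreter's _uuid helper was configured."""
    return {
        # None means the variable is not recorded for this build.
        "HAVE_UUID_GENERATE_TIME_SAFE": sysconfig.get_config_var(
            "HAVE_UUID_GENERATE_TIME_SAFE"
        ),
        # Private attribute; assumed to be None or absent when the
        # _uuid extension module could not be imported.
        "has_generate_time_safe": (
            getattr(uuid, "_generate_time_safe", None) is not None
        ),
    }


print(uuid_build_info())
```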

@zanieb
Member Author

zanieb commented Apr 17, 2025

For reference, here's the libuuid we're using:

"uuid": {
    "url": "https://sourceforge.net/projects/libuuid/files/libuuid-1.0.3.tar.gz",
    "size": 318256,
    "sha256": "46af3275291091009ad7f1b899de3d0cea0252737550e7919d17237997db5644",
    "version": "1.0.3",
    "library_names": ["uuid"],
    "licenses": ["BSD-3-Clause"],
    "license_file": "LICENSE.libuuid.txt",
},

For Homebrew Python, HAVE_UUID_H is not set and I don't see a dependency in the formula — I think they just don't provide libuuid at all? They do have HAVE_UUID_UUID_H which refers to uuid/uuid.h (I do not understand the difference yet). python/cpython#85077 (comment) might be relevant? Looking into that next.

@sfc-gh-tteixeira

Ahh, that is so odd. Thanks for clarifying.

@zanieb
Member Author

zanieb commented Apr 17, 2025

Looking at a Linux container... HAVE_UUID_GENERATE_TIME_SAFE is set there but we have different behavior. I'll need to look at how they're linking libuuid.

❯ docker run -it python:latest /bin/bash
root@284a617ff8d3:/# python -m sysconfig | grep UUID
	HAVE_UUID_CREATE = "0"
	HAVE_UUID_ENC_BE = "0"
	HAVE_UUID_GENERATE_TIME_SAFE = "1"
	HAVE_UUID_H = "1"
	HAVE_UUID_UUID_H = "0"
	MODULE__UUID_CFLAGS = "-I/usr/include/uuid"
	MODULE__UUID_LDFLAGS = "-luuid"
	MODULE__UUID_STATE = "yes"
root@284a617ff8d3:/# python -c "import uuid; print(uuid.getnode())"
2485723387650
root@284a617ff8d3:/# python -c "import uuid; print(uuid.getnode())"
2485723387650
❯ docker run -it --rm ghcr.io/astral-sh/uv:0.6.12-bookworm-slim /bin/bash
root@5034bceb1c24:/# uvx --managed-python python -m sysconfig | grep UUID
	HAVE_UUID_CREATE = "0"
	HAVE_UUID_ENC_BE = "0"
	HAVE_UUID_GENERATE_TIME_SAFE = "1"
	HAVE_UUID_H = "0"
	HAVE_UUID_UUID_H = "1"
	MODULE__UUID_STATE = ""
root@5034bceb1c24:/# uvx --managed-python python -c "import uuid; print(uuid.getnode())"
16834508421030
root@5034bceb1c24:/# uvx --managed-python python -c "import uuid; print(uuid.getnode())"
113544309182920
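When comparing several environments like this, it can help to capture everything in one machine-readable snapshot and diff the results. The following is a small sketch; `snapshot` is a hypothetical helper and the JSON layout is arbitrary.

```python
import json
import sysconfig
import uuid


def snapshot():
    """Collect UUID-related build settings and the current getnode() value."""
    config = {
        name: value
        for name, value in sysconfig.get_config_vars().items()
        if "UUID" in name
    }
    return {"config": config, "getnode": uuid.getnode()}


print(json.dumps(snapshot(), indent=2, sort_keys=True, default=str))
```

Running it twice in the same environment also shows whether `getnode` is stable there, because each run is a fresh process.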

@zanieb
Member Author

zanieb commented Apr 17, 2025

Per https://packages.debian.org/bookworm/libpython3.11-stdlib it looks like they're using https://packages.debian.org/bookworm/libuuid1, which presumably behaves differently.

@zanieb
Member Author

zanieb commented Apr 17, 2025

It still seems vaguely wrong to depend on this to extract the MAC address, as the man page only says

The uuid_generate_time() function forces the use of the alternative algorithm which uses the current time and the local ethernet MAC address (if available).

@indygreg
Collaborator

Version 1 UUIDs encode the time and MAC address. I think what CPython is doing here is deferring to the external implementation of UUID so they don't have to think about resolving a MAC. This is reasonable behavior IMO.
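A short illustration of that encoding, using only the public `uuid` API. The node value is made up for the example; passing `node` explicitly means no hardware lookup is involved.

```python
import uuid

mac = 0x001122334455  # made-up node for illustration
u = uuid.uuid1(node=mac, clock_seq=0)

print(u)
print("version:", u.version)
print("node:   ", f"{u.node:012x}")
print("time:   ", u.time)  # 100-ns intervals since 1582-10-15
```

The last group of the printed UUID is the node, which is why reading `.node` from a freshly generated version 1 UUID recovers whatever identifier the generator used.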

@zanieb
Member Author

zanieb commented Apr 17, 2025

I briefly looked at switching to the util-linux implementation of libuuid (at https://www.kernel.org/pub/linux/utils/util-linux/) but it fails to build on macOS (I did not try Linux). I'm hesitant to go deep on making it work.

Version 1 UUIDs encode the time and MAC address. I think what CPython is doing here is deferring to the external implementation of UUID so they don't have to think about resolving a MAC.

Except the documentation does not guarantee that the MAC address is used? If that was reliable, I imagine this would be working. Otherwise, yeah — I agree it seems nice to rely on that instead of the complicated fallback methods they have.

@sfc-gh-tteixeira

The more I look at this, the more it looks like a Python bug, so I filed one: python/cpython#132710.

But I still wonder whether python-build-standalone could build Python without libuuid, to make sure it behaves the same as python.org binaries. Is that a possibility?

sfc-gh-tteixeira added a commit to streamlit/streamlit that referenced this issue Apr 21, 2025
…#11138)

## Describe your changes

Rename the new `stableRandomMachineId` to `machineIdV4` to make it clear
(1) we can't rely on `machineIdV3`, and (2) `machineIdV4` is the right
one to use.


[Originally](b44df19)
I avoided doing this because the goal is to keep both IDs side by side,
since that will help debug things or even roll back in the future. But
even if we name it `machineIdV4` nothing precludes us from keeping them
side by side.

In the meantime, we're also following bug reports
astral-sh/python-build-standalone#587 and
python/cpython#132710 , which might
rehabilitate `machineIdV3`.

## GitHub Issue Link (if applicable)

n/a

## Testing Plan

- Explanation of why no additional tests are needed: this PR just
renames variables.
- ~~Unit Tests (JS and/or Python)~~
- ~~E2E Tests~~
- ~~Any manual testing needed?~~

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.