@glyph You've got a typo in the second code block: the arguments to the import are backwards.

@jamesh correction much appreciated

@jamesh Fix should be live now.

@glyph 1. I rarely use data classes because of habits, and because without mypyc, they're slower (I choose other slow poisons though...).

But the main reason I sometimes voluntarily don't use a data class (while I'd like to) is that they're not flexible enough. Consider deprecations: how do you rename a parameter? You'd have to write __init__ again right? How do you handle *args and **kwargs?

Definitely not an expert so there might be ways, happy to learn!

@pawamoy

from what I can tell:

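```
from dataclasses import InitVar, dataclass, field

@dataclass
class StoresDouble:
    # InitVar: accepted by __init__ and passed to __post_init__, not stored as a field
    original_value: InitVar[int]
    value: int = field(init=False)  # not an __init__ parameter

    def __post_init__(self, original_value):
        self.value = 2 * original_value
```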

@pawamoy I'm not proposing eliminating __init__ from the language; dataclasses explicitly have init=False for cases where you really want to do something fancy. I'm just saying it's a bad default, and since __init__ works fine _with_ dataclasses, there's no reason to not have the rest of the behavior.
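As a rough illustration of the renaming question above, here is a minimal sketch of a dataclass that keeps the generated `__repr__` and `__eq__` but supplies its own `__init__` to accept a deprecated keyword. The class `Point` and the old parameter name `horizontal` are made up for the example; nothing in the thread prescribes this exact shape.

```python
import warnings
from dataclasses import dataclass


@dataclass(init=False)  # keep generated __repr__/__eq__, write __init__ by hand
class Point:
    x: int
    y: int

    def __init__(self, x=None, y=0, *, horizontal=None):
        # `horizontal` is a hypothetical old name for `x`, kept for compatibility.
        if horizontal is not None:
            warnings.warn(
                "'horizontal' is deprecated, use 'x'",
                DeprecationWarning,
                stacklevel=2,
            )
            x = horizontal
        if x is None:
            raise TypeError("Point() missing required argument 'x'")
        self.x = x
        self.y = y
```

With this in place, `Point(horizontal=1, y=2)` emits a `DeprecationWarning` and compares equal to `Point(1, 2)`.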

@glyph I mean that’s _one_ way to end a friendship… 💔

@hynek I'd be a lot happier if dataclasses copied attrs's features more precisely (and if there is serious ensuing discussion, I'll be advocating for a second look at some of those, particularly "private" variable handling), but "defining classes" is one of those places where there's a fair amount of benefit to following the language standard even if it's suboptimal.

@hynek /me waves his arms vaguely in the direction of pretty much every Scheme implementation

@glyph I'm all for this and am carrying water on Hacker News (because it's fun). This is all given that this will likely never happen until a hypothetical Python 4 and will be someone else's problem to upgrade :)

@zzzeek I think this will really be one of the easier migrations; it can go as fast or as slow as the community wants to dictate and it can live side-by-side for a really long time if necessary.

@glyph The new syntax could be called `struct`.

struct Point(x: int, y: int): pass
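For comparison, the standard library can already build such a class in one expression, which is roughly what a one-line syntax like this would spell. A small sketch using `dataclasses.make_dataclass`, reusing the `Point` name from the post above:

```python
from dataclasses import make_dataclass

# Builds a dataclass named Point with two annotated fields, x and y.
Point = make_dataclass("Point", [("x", int), ("y", int)])

p = Point(1, 2)
```

The result behaves like a hand-written `@dataclass`: `p` has a generated `__init__`, `__repr__`, and `__eq__`.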

@carlmjohnson I specifically avoided that one because I didn’t want to use more obscure abbreviations with overloaded meanings :-). Although I knew someone would probably suggest it …

@glyph @carlmjohnson can't because of the stdlib `struct` module. For historical reasons we'll go with Pascal's `record`, that surely won't confuse anyone 😬😅

@glyph You asked why not all classes should be dataclasses: My first thought was about defaults: The __eq__ default of the dataclass (which says that it's equal when the values are equal) is unlike how normal classes work and unexpected.

And this becomes a bigger problem when you use @property to set values.

@mborus in my ideal world this would raise an exception since this class doesn’t have a __c attribute

@mborus how "unexpected"? this is what the distinction between `is` and `==` is for.
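To make the difference under discussion concrete, here is a small sketch comparing a plain class, a default dataclass, and a dataclass created with `eq=False`. The class names are invented for illustration.

```python
from dataclasses import dataclass


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


@dataclass
class DataPoint:
    x: int
    y: int


@dataclass(eq=False)  # opt out of the generated __eq__
class IdentityPoint:
    x: int
    y: int


# A plain class inherits object.__eq__, which compares by identity.
plain_equal = PlainPoint(1, 2) == PlainPoint(1, 2)
# A default dataclass compares field by field.
data_equal = DataPoint(1, 2) == DataPoint(1, 2)
# eq=False restores identity comparison for a dataclass.
identity_equal = IdentityPoint(1, 2) == IdentityPoint(1, 2)
```

Here `plain_equal` and `identity_equal` are `False` while `data_equal` is `True`; in all three cases `is` still compares identity.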

@glyph You have a point there, I should use "is" instead of "==" to compare for equal

Unexpected just because "==" usually worked like "is" unless it's defined differently.

@mborus `==` is supposed to mean "value equality" whereas `is` is supposed to mean "mutability equality". built-in types follow this convention, so why are user-defined types confusingly different?

>>> x = [1, 2, 3]
>>> y = [1, 2, 3]
>>> a = {1: 2, 3: 4}
>>> b = {1: 2, 3: 4}
>>> print(x == y)
True
>>> x.append(1)
>>> print(x == y)
False
>>> print(a == b)
True
>>> b[5] = 6
>>> print(a == b)
False

@glyph Good question.

I don't know why this was chosen. If I had to guess, using "is" as the default for "__eq__" is easy to implement and guarantees that "==" always returns a result.

I always treat "==" as "whatever __eq__ says or defaults to" and changed it for some of my own classes.

To check I looked up the docs (docs.python.org/release/3.11.2) and "==" is just defined as "equal" (with no concrete definition what that actually means and the class default of "is" is mentioned)

@glyph one thing I've been dealing with and is not handled well by dataclasses: init signatures that don't cleanly match the fields.

You can hand-roll the whole init, but then you lose immutability by default. I had to add it back with custom code.

Having private fields exposed as properties, with the init setting the private fields, also gets messier.

I still try to use dataclasses because of the repr, sort, etc., but they do make those 2 use cases more complicated than the default.

@tonnydourado frozen defaults to False so I don't know what you mean by "immutability by default". If you hand-roll the whole init, the behavior is still the same.

@glyph I mean if you hand roll the init, you can't use frozen=True at all, because you have to mutate self in the init.

@tonnydourado oh, hmmmm. perhaps a design decision that needs revisiting, given that then init=False, frozen=True is sort of mutually incoherent unless you do gymnastics like this

from dataclasses import dataclass

@dataclass(frozen=True, init=False)
class froz:
    a: int
    b: str

    def __init__(self):
        # frozen=True blocks normal assignment, so bypass the generated __setattr__
        super().__setattr__("a", 7)
        super().__setattr__("b", "lol")

print(froz())

@glyph there's another thing that intersects with it: a hand-rolled init is never overwritten, even if you don't set init=False. In fact, even if you set init=True, the init already present in the class will not be overwritten.

@glyph which means you can't just crosscheck frozen and init, for instance, to raise an error.
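A short sketch of the behaviour described in the two posts above, as I understand it from the `dataclasses` documentation: an `__init__` already present in the class body is left alone even when `init=True` is passed, and combining such an `__init__` with `frozen=True` fails on the first ordinary assignment. The class names are invented for the example.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(init=True)
class Keeps:
    a: int

    def __init__(self):
        # This hand-written __init__ is kept; no __init__(self, a) is generated.
        self.a = 42


@dataclass(frozen=True)
class FrozenKeeps:
    a: int

    def __init__(self):
        # Plain assignment goes through the generated frozen __setattr__.
        self.a = 42


kept = Keeps()

try:
    FrozenKeeps()
    frozen_failed = False
except FrozenInstanceError:
    frozen_failed = True
```

`Keeps(1)` raises `TypeError` because the only `__init__` takes no arguments, and `frozen_failed` ends up `True`.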

@tonnydourado maybe there needs to be a different hook? __pre_frozen_init__ or something? this is an ugly edge case

@glyph I feel like this is why god created __new__, but for some reason we never use it?

@glyph It most definitely does!

* Mixins
* Behavioural collections
* Dynamically constructed classes

Dataclasses are a special case, and most definitely should remain so.

@glyph Also, dataclasses are a comedy, as currently implemented. attrs is not just a progenitor, it's miles better.

@b11c you think mixins, behavioral collections, and dynamically constructed classes are *not* special cases?

@glyph Doesn't matter. If dataclasses are the default, then there needs to be some support for other use cases. But I can see a new "dataclass" keyword working.

@b11c are mixins even compatible with type annotations? and do dynamically constructed classes even use `class` statements? I don't really understand this criticism.

@glyph 1. Of course they are.
2. Generally no, but can be constructed from different types of classes, with and without attributes.
3. This is not a criticism, I'm just thinking about the common usages.
4. On reflection, I'm actually absolutely certain that dataclasses are a special case and not the other way around.

@b11c

1. huh! I guess they do, you… put a bunch of `@abstractmethod`s on the mixin to say what it needs? I'll have to go fix up some legacy code to do that, at least, where I can't get rid of mixins entirely

2. what is the problem with constructing them from dataclasses, if they are constructed with regular classes?

3. if I'm saying "dataclasses are not a special case" and you are saying "dataclasses are a special case" it feels like definitionally a criticism?

4. agree to disagree 🙂

@b11c I love me a good protocol, but where would you put it in a Mixin?

@glyph Option 1: In the mixin's inheritance tree, like in any other case.

Option 2: Nowhere. That's the beauty of protocols, you don't have to explicitly state it for a class to implement it.

@b11c mixins inherently require things from `self`, so if you do option 2 you get a bunch of type errors when you attempt to use it. but TIL inheriting a protocol in a semi-concrete class is functionally identical to declaring an `@abstractmethod` for all members of that protocol!

@b11c (I don't love the fact that you don't get type errors on any of this stuff until instantiation time, I feel like I really want to know what's concrete and what's abstract explicitly in the implementation. but it's a minor quibble.)
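A minimal runtime sketch of "option 1", with the requirement spelled out explicitly through `@abstractmethod` on the protocol member. All names here (`HasName`, `GreeterMixin`, `User`, `Incomplete`) are hypothetical; how a particular type checker treats protocol members without the decorator is not shown.

```python
from abc import abstractmethod
from typing import Protocol


class HasName(Protocol):
    @abstractmethod
    def name(self) -> str: ...


class GreeterMixin(HasName):
    # The mixin states what it needs from `self` by inheriting the protocol.
    def greet(self) -> str:
        return f"hello, {self.name()}"


class User(GreeterMixin):
    def name(self) -> str:
        return "glyph"


class Incomplete(GreeterMixin):
    # Defining this class is fine; the missing name() only matters later.
    pass


greeting = User().greet()

try:
    Incomplete()
    failed_at_instantiation = False
except TypeError:
    failed_at_instantiation = True
```

This also shows the quibble in the last post: `Incomplete` is accepted at class-definition time and only fails when someone tries to instantiate it.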

@glyph 2. Dataclasses are a subset of all classes, basically a boilerplate. If they are the default, there are more "special cases" than otherwise.

@glyph 3. It sounds like disagreement. Criticism is personal.

@b11c I'm using it in the sense of "media criticism"; one can criticize an idea and not a person. but fair enough, that is definitely "sense 2" in the dictionary and not obvious when used intransitively

@glyph I think I would prefer the split some other languages already have, between classes and some kind of "record" type whose primary purpose is to be a data-carrying object (which is what I always used tuples for in Python).

At least, if I'm reading it correctly, the things that would benefit most from your proposal are things that primarily are being used as data-carrying objects: primarily-behavior-carrying or mixed-behavior-and-data-carrying objects tend not to be reducible to that kind of simple definition syntax and will usually want to do enough custom stuff that you're writing an __init__() anyway.

@ubernostrum I see this pattern in other OO-ish languages and I really don't get it. Python's an OO language, a container for data *is* an object, that's how you express a collection of attributes. like the only way this would make sense to me is if we added some form of *truly* private variables to classes, where methods exist in a closure with all fields 'nonlocal' by default like some Scheme object models

@ubernostrum re: writing an __init__ anyway, I'm slowly coming to regard __init__ as a bit of an antipattern. __init__ should never do anything but set attributes, if you have custom construction behavior, it makes more sense to have a custom classmethod factory, which can also do things like raising an exception before allocating the object itself as __new__ would, or return new subtypes in some cases to allow for easier API evolution
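One way to read that suggestion, sketched under my own assumptions: the generated `__init__` only stores attributes, and parsing or validation lives in a classmethod that can raise before any instance is created. The `Point` class and its `parse` method are illustrative names, not anything from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int

    @classmethod
    def parse(cls, text: str) -> "Point":
        # Validate first; no Point object exists yet if this raises.
        parts = text.split(",")
        if len(parts) != 2:
            raise ValueError(f"expected 'x,y', got {text!r}")
        return cls(int(parts[0]), int(parts[1]))


origin_offset = Point.parse("1, 2")
```

Because construction goes through `cls(...)`, this also works with `frozen=True`, with no need to mutate `self` in a hand-written `__init__`.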

@ubernostrum this is probably its own blog post

@glyph @ubernostrum
Even though this is all hard for me to follow from my more novice perspective... I did just start using @hynek attrs library, and started thinking about all of this inadvertently. Actually using the library helped dissuade me from hooking into the initialization in the first place, and I found that a classmethod works better for my purposes.

Anyway, thanks for the discussion.

Here's the bit from attrs docs that helped me:

attrs.org/en/stable/init.html#

Initialization - attrs 23.1.0 documentation (www.attrs.org)

@glyph Whether it's spelled __init__() or some other way, the fact that I'd be writing custom "build and return an object" logic is what matters to me.

As to the data-carrying versus behavior-carrying distinction, I think in other languages it's more up-front because they enforce a flavor of OO where everything must explicitly be a class or class-like object. But I do still see the distinction in Python code -- it's just that the data-carrying object historically was a tuple.

Speaking for myself, I'm unlikely to use dataclasses or attrs, just because I would first reach for a tuple for that use case. Or if I were going to make it be a class with type-hinted members I'd do it with a library like Pydantic or msgspec that will derive further behavior like validation/serialization for me rather than "just" initialization/comparison. And I'd do it only for objects that are data carriers; for objects that are also carrying behavior I'd use a "standard" (non-dataclass, non-attrs, non-Pydantic, etc.) class.

@glyph I don't have much to add to the discussion that isn't already in these replies, with one exception: you should _definitely_ post this to Discourse 😁

@SnoopJ believe it or not you’re the first person to say this to me :-)