Proper abstractions in Python: not so easy

Jeroen Bos
19 min read · Feb 16


A deep dive into Python, its type system, Composition vs Inheritance & Protocols


There’s this thing that’s bugging me, and I’ll use this medium to figure out how I feel about it. I apologize if this gets long, but depth has no shortcuts.

Coding in Python often frustrates me. That’s the motivation. It tells me something’s wrong. Let’s see if I can put it to words, and learn something.

First of all, I should introduce a little of my background.

I’m a C# developer at heart. In adopting the pythonic way, which is different for sure, I had to let go of many tendencies and instead find the pythonic way of doing things.

My view on Python has gone through a few phases:

  • When I was first dabbling with it, I was aware that I had to keep an open mind for different perspectives. Learning it was, for one, frustrating
  • For a long time I thought python had some real beauty going for it
  • Other times I was struck with melancholy by the lack of rigidity. I’m sure many people with various backgrounds can relate.
  • Sometimes I thought I understood the pythonic way, but then somebody would still point out how my style was actually oozing my C# background. Sometimes I knew it and was stubborn; sometimes I didn’t even know it.

We’ll see how well-versed I actually am in Python now. Quite a bit, but there’s always more to learn. The new growing sense of frustration inside me is a good indication of that!

At the moment of writing I don’t know what the issue is. Most likely it’s caused by my own misunderstanding. It could be in the way I and my team develop in Python, as our conventions have established themselves rather ad hoc. Lastly and least likely, it may be with Python itself.

Let’s dive in.

Composition vs inheritance

We’ve all been told that we should prefer composition over inheritance, but do we all know why?

In generic terms we say that composition is more versatile and more testable, but we’re going to get to the bottom of this, because I am having my doubts.

Let’s discuss this with a simple example of composition in Python:

class Element:
    def my_func(self) -> str:
        return "an element"

class Composition:
    def __init__(self, element):
        self.element = element

    def my_func(self) -> str:
        # demonstrate (contrived) composition using the element:
        return "a composition with " + self.element.my_func()

# simulate usage of the types above:
def f(obj):
    print(obj.my_func())

element = Element()
f(element)  # prints "an element"
# proof that the element argument can be substituted for a composition
f(Composition(element))  # prints "a composition with an element"

Technically this is composition, because the class Composition is composed of an Element. But that on its own doesn’t make it useful: it just makes it a different class that happens to take an Element. What makes it useful is that this Composition can be called as though it were an Element. That’s because the Composition and Element classes both have the same interface.

The interface is obviously a single function called my_func taking only the self argument and returning a string.

In that sense an instance of Element is substitutable by an instance of Composition, as hinted by the commented line, and thus follows the Liskov Substitution Principle.

So that’s a rather interesting observation, that in software engineering, the term composition implies not only composition in the regular sense, but also that the object implements the contract of the elemental types it consists of. That’s what makes them substitutable.

In fact, composition in the regular sense without the substitutability also exists in software engineering, but that’s a different kind of composition. It more commonly goes under the name of tuple/data structure/wrapper/…, depending on context.
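To make that distinction concrete, here’s a minimal sketch (ElementWrapper is a hypothetical name): composition in the plain data-structure/wrapper sense, where the wrapper holds an Element but does not share its interface, and is therefore not substitutable for one.

```python
from dataclasses import dataclass

class Element:
    def my_func(self) -> str:
        return "an element"

# A plain wrapper: it *contains* an Element, but exposes no my_func itself,
# so it cannot stand in where an Element is expected.
@dataclass
class ElementWrapper:
    element: Element
    label: str

wrapper = ElementWrapper(Element(), label="just data")
print(wrapper.label)               # "just data"
print(wrapper.element.my_func())   # reaching inside explicitly: "an element"
```

This is the "tuple/data structure/wrapper" kind of composition: useful, but without substitutability.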


Let’s be a bit more technical and get our definitions straight: what are the requirements to qualify for “composition”?

  • The behavior of the composing object should be able to differ from the elemental object and other kinds of compositions.
    - That’s a rather trivial requirement. Its abstract expression might make it seem like it’s quite something, but it really isn’t.
    - Just to be clear, “kind” here doesn’t refer to “type” but to something more abstract: any objects groupable by some distinguishing feature, i.e. their “kind”.
  • Objects abide by the Liskov Substitution Principle (both at runtime and design-time).
    - That implies that the behavioral difference should be encapsulated: a caller shouldn’t need to care about what kind of object it is to get the object to run the appropriate code; that should be abstracted away.
I would say that the above example satisfies these requirements and therefore is composition. ✅. Let’s go home, we’re done… Or are we?

The type system

I’m going to change the example only ever so slightly:

class Element:
    def my_func(self) -> str:
        return "an element"

class Composition:
-   def __init__(self, element):
+   def __init__(self, element: Element):
        self.element = element

    def my_func(self) -> str:
        return "a composition with " + self.element.my_func()

# simulate usage of the types above:
-def f(obj):
+def f(obj: Element):
    print(obj.my_func())

element = Element()
# substitute a composition, just like before:
element = Composition(element)  # mypy complains here

We’ve made the code make use of Python’s type system, and suddenly at design-time the type checker mypy is complaining:

error: Incompatible types in assignment (expression has type "Composition", variable has type "Element")

That’s clearly unacceptable. You could alleviate this by typing the parameter as Element | Composition, but that’s no tenable strategy, as the full set of possible types could be indeterminable.

So although at runtime nothing has changed, this code — merely by adding type hints! — is not abiding by the LSP at design-time anymore. What would we need to do to make mypy happy? What do we need to get composition both at runtime and design-time? Or perhaps the most pressing question is: why does mypy insist that this is wrong even though the type-unannotated version is absolutely fine?

That is the central question of this essay.

The solutions

There are 3 ways of solving this. The first two I will just mention and dismiss quickly here, and then we dive deeply into the other, which will turn out to be no solution, but there’s so much to learn from asking why not.

  • Inheritance
  • Magic methods (decorators, metaclasses and the like)
  • Protocols


Inheritance

Deriving Composition from Element would indeed make the type checker happy. That would however obviously and completely defeat the purpose of the adage “composition vs inheritance”. If we need to do composition with inheritance, then that’s not really composition now, is it? Or at least so the adage seems to claim (the inclusion of this odd “solution” could be taken as foreshadowing 😉).

Magic methods

Can we achieve composition without inheritance through decorators?
Yes, as evidenced by this example:

class Element:
    kind = "Element"
    def my_func(self):
        return "an element"

# imagine a decorator/factory that builds a composing class behind the scenes:
Composition = create_composition(Element)  # implementation intentionally left out

The implementation of create_composition is explicitly left out, but you can imagine that it achieves composition. There are several possible implementations:

  • creating a type hierarchy behind the scenes
  • differentiate based on reference equality, e.g. if it were a doubleton type
  • hiding the design-time typing complexities plaguing regular composition since the advent of the type system by sweeping those parts off to a library

We can surmise that, yes, you can achieve composition without inheritance through decorators. But it involves so much magic behind the scenes that it simply doesn’t qualify as a general-use pattern exhibiting composition. Call it “complicated composition”, if you will: composition that a novice Python programmer would never be able to invent nor understand, composition that any unfamiliar reader would need to read multiple times to understand, composition that is not widely in use, composition that is so complicated it’s probably an overengineered solution to the problem. In that category, composition without inheritance can be achieved in many other ways, like with various other magic methods, metaclasses, or introspection.

On top of that, composition was supposed to be more versatile compared to inheritance, but whatever implementation the decorator or any other magic chooses, you will be heavily tied to its internal design decisions, i.e. less versatile.

In short, both the implementation of the magic as well as its usage are too complicated and cumbersome. I’ve mentioned this category of solution. Now let’s move on.


Protocols

By and large, the remainder of this essay will work towards and discuss Protocols. Although in the end they will turn out not to be exactly the desired approach to composition, we can’t dismiss the adage without examining this option meticulously.

The reasoning will go along the following lines:

  • mypy is just touting Python’s design, so the issue in fact lies not within mypy; it’s solely about Python
  • Some of the history of Python comes into play
  • We’ll discuss the parts of Python that have friction with each other: nominal typing vs structural typing
  • Python has offered a solution to that friction, through a concept called Protocol
  • We’ll discuss Protocols in detail, including
    - showing that you can implement composition through Protocols
    - and why we don’t hear of them often and why we don’t use them
  • We’ll conclude that those reasons for not using them are related to why it’s not a good use case for composition

So. Why does the topic of Protocols even pop up? It is a journey, but we start with the question

why does mypy insist that the type-annotated example is wrong even though unannotated it's absolutely fine?

Getting mypy out of the equation

Type hints are part of Python, and mypy diligently checks the rules.

To discuss type hints further, we will need to define these terms:

  • nominal typing
  • structural typing
  • dynamic typing

These are not easy topics: even the mypy homepage says that dynamic typing is equivalent to “duck” typing, whereas they probably meant that structural typing is equivalent to “duck” typing. Maybe definitions differ. I don’t know. But it’s obviously complicated.

Nominal typing and structural type systems are opposites (in the pragmatic sense that they’re virtually the only two of their kind, barring some academic languages, and therefore “the opposite” means “not that one but the other”, but not in any hard sense).

Dynamically typed is opposite to statically typed (languages).

  • By a dynamically typed language we mean that verification of types happens at runtime rather than design-time.
  • Statically typed (the opposite) thus means that types are checked at design-time (interchangeable with compile-time) and not at runtime.
  • A nominal type system is one where an instance of a type T is said to be of type U, say, if and only if T is U or T derives from U (“derives”, as in inheritance).
  • A structural type system is one where an instance of a type T is said to be of type U if T has all attributes and methods that U has, i.e. it has the same interface (and possibly more). Such a relationship is also stated as: T is a structural subtype/subclass of U.

Those definitions are quite dense if you haven’t encountered them before. That’s okay. Just keep referring back to them.

“Dynamic vs static” is orthogonal to “structural vs nominal”: you can design languages in each of the four quadrants. Nominal and dynamic really don’t complement each other well, so no such mainstream languages exist, but the other combinations exist in the mainstream.

One last generic and dense comment: nominal subtyping is a subset of structural subtyping. That is, types that are nominal subtypes of another are then also structural subtypes of the other (but not necessarily the reverse). So nominal subtyping is stricter than structural subtyping.

Let’s take the conversation back to Python.

Python is dynamic
Python has always been a dynamically typed language. Consider this example:

def f(s):
    return len(s)

f(0)  # TypeError: object of type 'int' has no len()


The above was always fine, until you ran it. Historically (read: Python 2, but actually Python < 3.5), Python didn’t have type hints, but that doesn’t mean Python didn’t have types! It had types all along; they just weren’t made explicit in source code. The above example fails because the type int doesn’t have a __len__ attribute, implying there were types (at runtime) all along. That’s what’s meant in the definition: “verification of types happens at runtime”. So just because there’s no mention of types in the source code doesn’t mean they don’t exist.

Even with the addition of type annotations in Python 3.5, Python is technically still 100% dynamically typed, as running a type checker is completely optional. But type annotations — when checked — are considered to constitute a static type system, because the checking happens at design-time rather than runtime. There’s nothing wrong here: you can have both static and dynamic type systems.

And, just to rule out this corner case, inspecting type annotations to get different behavior is also still dynamic typing.
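A minimal sketch of that corner case: reading the annotations via typing.get_type_hints and branching on them happens entirely at runtime, so it is still dynamic typing, even though the information originates from static hints.

```python
from typing import get_type_hints

def f(x: int) -> str:
    return str(x)

# The annotations are ordinary runtime data; acting on them is dynamic typing.
hints = get_type_hints(f)
print(hints)  # a dict mapping parameter names (and 'return') to types

if hints["x"] is int:
    print("dispatching on the annotation at runtime")
```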

Python is structural
Besides being a dynamically typed language, Python has also always been a structurally typed language. Structural typing means duck typing. It means that instances do not have to conform to the type of the symbol (=part of source code) that represents them, rather having the interface suffices. In less abstract language for example, arguments do not have to be of the parameter’s type; as long as all signatures are accounted for and compatible, arguments of different types are perfectly accepted. The same goes for assignments, etc.

We will note that mypy by default uses nominal subtyping for checking the flow of types:

# import class Element from the first example
def f(obj: Element):
    print(obj.my_func())

class NominalSubclass(Element):
    pass

f(NominalSubclass())  # no error

class StructuralSubclass:
    def my_func(self) -> str:
        return "a structural subtype"

f(StructuralSubclass())  # error!

But again, it’s not really mypy‘s choice: it’s just diligently following the rules of Python. If mypy is just following the rules of the language, and T and U are structural subtypes, thereby also following the rules of the language, why should there be an error?

We really must direct the proverbial magnifying glass (and my initial gut feeling’s guess) away from mypy: it’s because Python’s rules for type hinting state that they are to be interpreted as nominal types.

But didn’t we just mention that Python is a structural language? But the type hints follow a nominal type system? That can’t be right, now can it? On the face of it that sounds like the rules are inconsistent or applied inconsistently.

So what is it? Nominal or structural?

The language seems to be fundamentally structurally typed, but the type hinting system seems to be purely nominally typed.

Recall that nominal subtyping is a subset of structural subtyping. Therefore the type hints restrict the use of the language. In view of the fact that historically the type hints came later, this feels rather bolted on.

Now that sounds like we’re homing in on a problem. Do you agree with me that this smells? Having boiled down the problem to such a concise formulation, I feel a bit stupefied to be honest.

Let’s make doubly sure there even is a problem by looking at it from another point of view: that of other languages. Some languages are purely nominally typed, which raises the question: why would Python being (restricted to) purely nominally typed be a problem if other languages have exactly that and are doing just fine?

Well, first off, given that duck typing was part of Python from the start, it heavily influenced its design and that of its early and most used libraries. Cutting it out at a later stage not only goes against the design of the language, but against much of its ecosystem as well. Second, other languages have been designed with other features to lift these limitations in other ways. It’s having neither of these feature sets that’s the problem. Given the above, it’s probably unsurprising to learn that the restriction (to nominal typing) at a later stage leads to adverse consequences. Fair enough, Python is a versatile language and therefore it takes a long time to feel them, but we feel them eventually nonetheless.

The cheap way out
One way out — but it’s a cheap one — is to realize that type hinting is called type hinting because the type annotations don’t enforce anything; they just hint. If they were actually enforced, then the language would truly be confined, but they’re not enforced, so it isn’t restricted.
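A tiny demonstration of that point: the call below violates the annotation, yet runs fine at runtime; only an external checker like mypy would flag it.

```python
def shout(message: str) -> str:
    return str(message).upper()

# Annotated as str, called with an int: the interpreter doesn't care.
print(shout(42))  # prints "42" — runs without error; mypy would complain
print(shout("hi"))  # prints "HI"
```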

So that’s one way out, but leaves much to be desired.

The next way out is admitting that the type hinting system in fact does support structural typing; it’s just a little more involved. It involves a concept called a Protocol.

Summary so far

So far we haven’t found a way to do composition without inheritance or magic. In our search, we described some aspects of the language and how it deals with types, and in doing so we’ve stumbled on something that appears to be a problem:

the type system restricts using nominal typing only, which is against a fundamental pythonic principle.

But our team has been doing so, heedlessly! Now we have 2 problems! We have neither composition, nor clarity on whether to continue developing nominal-style or to embrace pythonic structuralism.

And I hope I’ve convinced you that this truly was a problem: in Python versions [3, 3.8) the type hints really didn’t cover the fundamentals of the language. Only with Python 3.8 (released quite recently, if you think about it) did this missing structural part finally get covered in the type system. They achieved that through the introduction of a concept called Protocol.


In short, by deriving your class from a Protocol, you tag your type to be interpreted as structurally typed, not nominally. Protocols were introduced in Python 3.8 with PEP 544.

Use Protocols for composition?
In this section I’d like to note that it is in fact possible to get composition without inheritance through Protocols. Just like in our very first example that was not type-annotated, the code just worked. Then type hints came along and the type checker started hindering. But now we can satisfy the type checker and omit inheritance/nominal subtypes:

+from typing import Protocol

-class Element:
-    def my_func(self) -> str:
-        return "an element"
+class Element(Protocol):
+    def my_func(self) -> str:
+        ...

class Composition:
    def __init__(self, element: Element):
        self.element = element

    def my_func(self) -> str:
        return "a composition with " + self.element.my_func()

# simulate usage of the types above:
def f(obj: Element):
    print(obj.my_func())

-element = Element()
+element: Element = ...  # type: ignore # imagine getting an Element from somewhere
f(Composition(element))  # accepted by mypy: Composition is a structural subtype of Element

The Liskov Substitution Principle is satisfied both at runtime and design-time! So Protocols can solve our problem! We have composition without inheritance!

The upsides are that

  • usages of Element are substitutable by usages of Composition, both at runtime and design-time
  • the obj can be typed to Element, benefitting from type checking

The downsides are manifold though, but I will get to them in a dedicated section, under “Protocol difficulties”.

Now I have the following question:

Why don’t we use Protocols? Are they too advanced? Discouraged? Unnecessary? Ignored out of sheer convention?

I’ll argue that there’s something to be said for all of those, and that it explains why Protocols aren’t in mainstream use now, or at the very least, not in our team.

From September 2015, with the introduction of type hints, until the introduction of Protocols in October 2019 with Python 3.8, it seems that the static type system truly didn’t fully cover the fundamental Pythonic way of development. That would at least partially explain why it’s not in use so much today: people coped without before. I’m just conjecturing here, but it seems plausible that conventions established in that 4 year gap period are still lingering today. Furthermore, PEP 3107 with “function annotations” (in a retrospective nutshell: type hints with undefined semantics) had been available since Python 3.0 in 2008!

Just to be clear, I’m not saying that the incomplete covering of the type system was unsound design. I’m fully aware of the rationale (emphasis mine):

“…opening up Python code to easier static analysis…”

And let’s not forget that there pre-existed a whole unannotated ecosystem at the time. That limits the feasibility of certain designs that would have been feasible from a clean slate. Let’s continue assuming that this was the best design at the time.

Protocols are an unconventional concept, as the language has been fine (and evolved) without them for a long time.

Protocols were introduced only in Python 3.8. That may explain why adoption isn’t widespread yet: it’s relatively new. It also raises the question of whether developers are even aware of their existence. After all, not everyone is interested in keeping up to date on their favorite type system’s features. I’m not even sure people have a favorite type system 😅

By the way, whether you’re aware of it or not, you have most likely been using the feature already. Common examples of protocol types from the standard library are Callable, Iterable, Iterator and Awaitable. mypy doesn’t complain in those cases. You don’t have to think about it (in the cases above) because those Protocols have already been implemented by the standard library, and generally this abstraction does not leak. That you can roll into using this concept mostly seamlessly is amazing, and quite the achievement of the feature. But I would like to contrast the usage of these protocol types with designing and creating protocol types yourself.

Protocols are a relatively new and thus unadopted concept.

That’s something to have in mind. But let’s look onwards for reasons why we’re not using Protocols, specifically, technical reasons, which requires we explain Protocols first.

Protocol difficulties
My claim is that the type Protocol is more of an advanced concept. The downside is that, until the entire team is effortlessly competent with it, mental capacity is expended each time the feature is used (i.e. written or read).

When dealing with a Protocol, some of that mental capacity is expended on awareness of things like:

  • the difference between structural typing and nominal typing:
    - keeping track which is adhering to which typing system
    - isinstance checks on protocols do not work by default (partially achievable with runtime_checkable, but it’s complicated)
  • peculiar behavior of Protocols compared to most of Python:
    - types can subtype Protocol explicitly or implicitly
    - implicit implementers of Protocols can’t reuse implementation from a superclass (well, the idea of reusing parts of your supertype is a nominal typing thing anyway).
    - Protocols exhibit much of the same behavior as abstract base classes, but not exactly
    - Protocols cannot be instantiated (this follows from the previous statement)
    - attributes must be declared on the type (like dataclasses); not assigned in __init__: in fact, those are errors
  • in other areas:
    - the IDE can’t aid in refactoring when a Protocol is implemented implicitly (fortunately the type checker should spot these)
    - new cases of mypy situations and errors
    - let alone advanced usages like the ability to emulate intersection types

The developer needn’t know all of these in detail, but they certainly add cognitive load. It’s mitigated by the fact/mindset that Protocols have no runtime semantics beyond the ways ABCs do (barring inspection, obviously), and they can therefore simply be thought of as a static-only feature. Still. Maybe this all is not too bad for you. In my opinion it’s a sizable cognitive step. Manageable, but sizable. But I build on experience with type systems in other languages. If this were your first language, I could very well imagine that grokking this is going to take a while.
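To make the isinstance point from the list above concrete, here is a minimal sketch of runtime_checkable (HasMyFunc, Duck and Brick are hypothetical names), including its main caveat:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class HasMyFunc(Protocol):
    def my_func(self) -> str: ...

class Duck:
    def my_func(self) -> str:
        return "quack"

class Brick:
    pass

print(isinstance(Duck(), HasMyFunc))   # True — the method name is present
print(isinstance(Brick(), HasMyFunc))  # False
# Caveat: isinstance only checks for the *presence* of my_func,
# not its signature or return type. That's part of why it's complicated.
```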

In short, Protocols are an advanced concept.

The official guideline in the form of PEP 544 literally discourages the use of duck typing:

“We still slightly prefer nominal subtyping over structural subtyping in the static typing world.”

We could go into a long digression as to why that is the official preference, but we’ll leave that to other sources to explain. To be honest, I don’t know. I’ll quickly give my best speculation, namely that the guideline is for non-experts only. One of the distinguishing characteristics of an expert is that they know when to adhere to a guideline, and when not to. They are expected to know the intricacies that lead to the guideline in the first place, thereby knowing when it applies and when it doesn’t. In this sense, as the advancedness of this feature requires expertise, the guideline is for non-experts only.

We’ll just surmise that Protocols are officially (slightly) discouraged.

When to use Protocols?
To be experts in the language means that we must know when to overrule the discouragement and go ahead anyway. And at the moment I myself do not even know.

My team and I have been developing strictly nominally typed. It might be that some of the frustration we’ve been experiencing is because we’ve been hammering on a screw. A screw amidst many nails, but a screw nonetheless?

Maybe it’s time to ask the question: when should you define your own Protocol?

  • In comparison to inheritance, Protocol implementers don’t need to explicitly derive from their signature definition. If that’s a requirement somehow, then Protocols might be your only option. I can hardly come up with scenarios though. Maybe there’s a use case in dynamically loaded modules? C interop? The best one I’ve come up with is preventing circular imports, which is still not the best argument.
  • Protocols don’t require implementation of a whole abstract base class, but just the parts of the signature you’re interested in, e.g. in testing (with mocks).
  • We will also note that Protocols come in 2 varieties as defined by PEP 544: Data Protocols and Non-Data Protocols. Data Protocols have only data attributes; Non-Data Protocols have only methods.
    - A Data Protocol can come up when you want to think about your objects from an equality-comparison point of view: if your objects should be equal merely by their attributes’ values being equal, then that’s a good use case for a Protocol. Although it can also be solved with dataclasses. TypedDicts come to mind, but those aren’t Protocols; they’re special-cased.
  • I’m sure I’m missing scenarios here.
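The testing/mocking bullet above can be sketched as follows (Fetcher, greet and FakeFetcher are hypothetical names): the code under test only needs one method, so a narrow Protocol lets a minimal fake stand in without inheriting from any abstract base class.

```python
from typing import Protocol

class Fetcher(Protocol):
    # Only the slice of the interface the code under test actually needs.
    def fetch(self, key: str) -> str: ...

def greet(fetcher: Fetcher, user_id: str) -> str:
    return "hello " + fetcher.fetch(user_id)

# In tests, a minimal fake suffices; it implements the Protocol implicitly.
class FakeFetcher:
    def fetch(self, key: str) -> str:
        return "alice"

print(greet(FakeFetcher(), "42"))  # prints "hello alice"
```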

We will circle back to our primary question by asking: should we use Protocols to do composition without inheritance?

Except for the first scenario above, the rest isn’t particularly suitable for achieving composition.

Protocols are officially not encouraged, which is already an indication that they shouldn’t be used for daily matters such as composition. On top of that, there’s the whole slew of Protocol difficulties previously outlined. If anything, the primary purpose of a Protocol is to signal what kind of object it is. That it is to be thought of as a structural type rather than a nominal one is the main intent of deriving from a Protocol. It is to adopt a different mindset. And yes, the type checker switches mindset with you. However, I must say that in order to attain composition we don’t really care in which type-system mindset we achieve it, as long as it’s achieved.

If you have a reason (none of which I can come up with) that prevents you from deriving the composed class from the elemental class, then Protocols are your only bet.

Other than that, it seems composition in Python doesn’t exist without inheritance, short of giving up on substitutability.

Why inheritance might not be a bad thing in Python

It’s worth noting here that the main argument against inheritance (your mileage may vary) in the discussion versus composition is that most languages only allow deriving from one class and therefore this limits the malleability of your type: you would only be able to do composition through inheritance in one class hierarchy, whereas with proper composition your class could participate in many. Other languages often have interfaces to accommodate for that limitation. However, that’s not an issue at all in Python, as it supports multiple inheritance, and coincidentally therefore doesn’t need the separate concept of an interface. So you might just deem this composition with inheritance unproblematic and call it a day: derive Composition from Element and call it composition, as it doesn’t (really) hinder your type’s malleability.
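A small sketch of that point (Readable, Closable and Stream are hypothetical names): with multiple inheritance, one class can participate in several hierarchies at once, which blunts the classic single-inheritance argument against composition via inheritance.

```python
class Readable:
    def read(self) -> str:
        return "data"

class Closable:
    def close(self) -> None:
        self.closed = True

# One class, two hierarchies: usable wherever a Readable *or* a Closable is expected.
class Stream(Readable, Closable):
    pass

s = Stream()
print(s.read())  # prints "data"
s.close()
print(isinstance(s, Readable), isinstance(s, Closable))  # prints "True True"
```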

From another language’s point of view, the statement that composition in Python doesn’t exist without inheritance could be unsurprising: composition in other languages is implemented using the concept of an interface (in the sense like in C#/Java/TS). The closest concept in Python is the abstract class, but implementation of abstract classes falls into the category of “inheritance”. In a sense, the concept of “implementing interfaces”, a concept parallel to inheritance, is in Python folded into inheritance. So that there’s then composition implementations only through inheritance could be no surprise.

By the way, adhering to nominal typing with design and development makes the code feel more and more like that of other languages.


What a long ride it has been.

We investigated an innocuous-looking question, but have uncovered quite a number of wrinkles. Until recently, the discussion of composition vs inheritance was moot: there was no composition without inheritance.

The type hinting system put the dagger through composition. We’ve learned about Protocols. That is the new and only non-magical way of attaining pure composition without inheritance, but it’s beset by many disadvantages: advanced, discouraged, unconventional and relatively unadopted. It is quite striking that the usage of a fundamental principle of the language (structural typing) has been relegated to the “advanced” concepts.

Given that the pattern of composition will not go away anytime soon, we have three options:

  • embrace inheritance
  • employ Protocols more
  • or maybe we’re just being zealots, and 95% type-annotated is enough

But most of all we have learned that a ubiquitous adage — however ubiquitous in other languages — doesn’t automatically carry over into Python. We have to keep thinking.