Interface and its implementation in Python

June 25, 2022 by Maria

By an interface I understand an object (in a general sense) that abstracts out the operations common to some concrete objects. In fact, those concrete objects may be defined by “having certain operations”. An example of such an interface from mathematics is a vector space V over a field K, which has:

  • vector addition
  • scalar multiplication.

Among concrete instances of a vector space, we have a coordinate space and the space of all continuous functions. The interface definition may be used both ways: to construct an object that is a vector space, and to check whether an existing object fulfills the criteria and is therefore a vector space.

Writing software usually involves abstracting out some parts of the code to a common interface. Following the Real Python article, we will explore different implementations of interfaces in Python and see how they relate to the interface defined above.

Good interface criteria

In Haskell, the concept of an interface is expressed with typeclasses. One simple example is the Ord typeclass:

class Eq a => Ord a where
  compare :: a -> a -> Ordering
  (<) :: a -> a -> Bool
  (<=) :: a -> a -> Bool 
  ...

A type MyType is an instance of Ord if it is an instance of Eq and defines at least the compare or the <= method (the rest may be derived from that comparison and equality).

Criteria

Having a mathematical example and a functional example of an interface, we explore how this concept may be implemented in Python by looking at four implementations. To compare them and set a baseline for further search, we define a set of properties that a good interface should have:

  1. Show a way to create a concrete object:
       • what methods should be defined
       • what the inputs and outputs of those methods are
       • what the properties of the required methods are (e.g. logs or raises specific errors, has an inverse)
  2. Allow for a check whether an object belongs to the interface:
       • check if the required methods are defined
       • check the inputs and outputs of the required methods
  3. Be readable.

To set a level of expectations, Haskell’s typeclasses score on all three criteria. (1) The definition of a typeclass states the required methods together with their input and output types. Properties of the required methods are imposed by constraints on the methods’ parameters (for example, being an instance of a different class), and side effects are not allowed. (2) is given “for free”: if a function argument is not an instance of the required class, a compilation error is raised. (3) The implementation is readable, for example when writing an instance of Ord:

data MyType = Cat | Dog deriving (Eq)
instance Ord MyType where
  (<=) Dog Cat = True
  (<=) Cat Dog = False
  (<=) Dog Dog = True
  (<=) Cat Cat = True

Checking Cat < Dog gives False.
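For readers more at home in Python, a loose analogue of deriving the remaining comparisons from a single one is functools.total_ordering from the standard library. The sketch below mirrors MyType under the assumption that an Enum called Pet (a hypothetical name, not taken from the article's code) stands in for the Haskell data type. It only illustrates the "define one operator, derive the rest" idea and is not a full equivalent of a typeclass.

```python
from enum import Enum
from functools import total_ordering


@total_ordering
class Pet(Enum):
    """Hypothetical Python counterpart of the Haskell MyType."""

    DOG = 1
    CAT = 2

    def __le__(self, other):
        # Only <= is written by hand; total_ordering derives <, > and >=
        # from it together with the equality that Enum already provides.
        if not isinstance(other, Pet):
            return NotImplemented
        return self.value <= other.value


print(Pet.CAT < Pet.DOG)  # False, as in the Haskell example
print(Pet.DOG < Pet.CAT)  # True
```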

Going back to Python, the authors of the Real Python article follow the example of a data parser, which has two methods: load data and extract text. The concrete parser implementation may be an XML, PDF, JSON or sound parser, etc. We construct an example of a JSON parser following the interface and discuss the pros and cons of each approach. The code to experiment with is on GITHUB. The rest of the blog post explores Python implementations of the interface. The examples are gathered from code I have seen and from research on good practices in OOP, and in Python in particular.

Informal interface

In this approach, a base class plays the role of an interface. It contains methods without implementation, which should be filled in by a concrete class. The relation between the concrete class and the interface is based on inheritance.

class InformalParserInterface:
    def load_data_source(self, path: str, file_name: str) -> str:
        """Load in the file for extracting text."""
        raise NotImplementedError

    def extract_text(self, full_file_name: str) -> dict:
        """Extract text from the currently loaded file."""
        raise NotImplementedError

When we want to construct a new parser (for PDF documents, XML or JSON), we create a new class that inherits from InformalParserInterface. If a concrete class does not implement a method, NotImplementedError is raised when that method is called. An example of a concrete implementation is a JSON parser:

import json
import os

class JSONParser(InformalParserInterface):
    def load_data_source(self, path: str, file_name: str) -> str:
        # Read and parse the JSON file located at path/file_name.
        with open(os.path.join(path, file_name)) as f:
            data = json.load(f)
        return data

    def extract_text(self, full_file_path: str) -> dict:
        # Left unimplemented in this example.
        pass

This approach does not prevent us from implementing a subclass that lacks a required method. Such a subclass will still be recognized as a subclass by issubclass(). This can be fixed by using metaclasses and overriding __instancecheck__, but that is both controversial and verbose.
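The weakness can be seen in a minimal sketch. IncompleteParser is a hypothetical class invented for this illustration: it omits extract_text, yet nothing complains until that method is actually called.

```python
class InformalParserInterface:
    def load_data_source(self, path: str, file_name: str) -> str:
        """Load in the file for extracting text."""
        raise NotImplementedError

    def extract_text(self, full_file_name: str) -> dict:
        """Extract text from the currently loaded file."""
        raise NotImplementedError


class IncompleteParser(InformalParserInterface):
    """Hypothetical subclass that forgets to implement extract_text."""

    def load_data_source(self, path: str, file_name: str) -> str:
        return f"{path}/{file_name}"


# The class definition and the instantiation both succeed.
parser = IncompleteParser()
print(issubclass(IncompleteParser, InformalParserInterface))  # True

# The gap only surfaces when the missing method is called.
try:
    parser.extract_text("data.json")
    call_failed = False
except NotImplementedError:
    call_failed = True
print(call_failed)  # True
```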

Another unwanted behaviour is that InformalParserInterface appears in the method resolution order (MRO) of every concrete class. Not knowing the concrete class makes debugging harder.
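The method resolution order can be inspected directly. In this small sketch the class bodies are left empty on purpose, since only the inheritance chain matters here.

```python
class InformalParserInterface:
    """Stand-in for the informal interface; methods omitted for brevity."""


class JSONParser(InformalParserInterface):
    """Stand-in for the concrete parser; methods omitted for brevity."""


# The interface class sits between the concrete class and object.
mro_names = [cls.__name__ for cls in JSONParser.__mro__]
print(mro_names)  # ['JSONParser', 'InformalParserInterface', 'object']
```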

Is it a good interface?

According to the criteria we adopted - no.

Although this implementation is readable and roughly shows a way to create a concrete object, we cannot be sure whether an object belongs to this interface without some further modifications of the interface.

ABC interface

The interface is implemented as an abstract base class (ABC); ABCs also support virtual subclasses. Quoting PEP-3119:

ABCs are simply Python classes that are added into an object’s inheritance tree to signal certain features of that object to an external inspector. Tests are done using isinstance(), and the presence of a particular ABC means that the test has passed.

from abc import ABC, abstractmethod

class FormalParserInterface(ABC):

    @abstractmethod
    def load_data_source(self, path: str, file_name: str):
        """Load in the data set"""
        raise NotImplementedError

    @abstractmethod
    def extract_text(self, full_file_path: str):
        """Extract text from the data set"""
        raise NotImplementedError

On the bright side, this approach raises a TypeError when a class that inherits from FormalParserInterface but does not define all the methods is instantiated. That moves a potential runtime error earlier in the program run. Type checkers in IDEs and mypy also highlight the problem when such a mistake is made.
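A minimal sketch of that behaviour follows. IncompleteParser is again a hypothetical class made up for the illustration, and the exact wording of the error message depends on the Python version, so it is printed rather than quoted.

```python
from abc import ABC, abstractmethod


class FormalParserInterface(ABC):
    @abstractmethod
    def load_data_source(self, path: str, file_name: str):
        """Load in the data set"""

    @abstractmethod
    def extract_text(self, full_file_path: str):
        """Extract text from the data set"""


class IncompleteParser(FormalParserInterface):
    """Hypothetical subclass that forgets to implement extract_text."""

    def load_data_source(self, path: str, file_name: str):
        return f"{path}/{file_name}"


# Defining the class is still allowed, but creating an instance is not.
try:
    IncompleteParser()
    instantiated = True
except TypeError as error:
    instantiated = False
    print(f"TypeError: {error}")
```

Compared with the informal interface, the failure happens at construction time rather than at the first call of the missing method.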

The interface is used via inheritance. Nevertheless, there is a way to check whether a class is a concrete implementation of the interface even if it does not explicitly inherit from it: overriding __subclasshook__. That allows issubclass() to identify whether a class may be considered a subclass.

Is it a good interface?
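One possible way to write such a hook is sketched below. StandaloneParser and UnrelatedClass are hypothetical classes, and the hook only checks that the method names exist and are callable; it says nothing about their signatures.

```python
from abc import ABC, abstractmethod


class FormalParserInterface(ABC):
    @abstractmethod
    def load_data_source(self, path: str, file_name: str):
        """Load in the data set"""

    @abstractmethod
    def extract_text(self, full_file_path: str):
        """Extract text from the data set"""

    @classmethod
    def __subclasshook__(cls, subclass):
        # Only customise the check for the interface itself, so that
        # classes derived from it fall back to the default behaviour.
        if cls is FormalParserInterface:
            required = ("load_data_source", "extract_text")
            if all(callable(getattr(subclass, name, None)) for name in required):
                return True
        return NotImplemented


class StandaloneParser:
    """Hypothetical parser that does not inherit from the interface."""

    def load_data_source(self, path: str, file_name: str):
        return f"{path}/{file_name}"

    def extract_text(self, full_file_path: str):
        return {"text": ""}


class UnrelatedClass:
    """Hypothetical class without the required methods."""


print(issubclass(StandaloneParser, FormalParserInterface))    # True
print(isinstance(StandaloneParser(), FormalParserInterface))  # True
print(issubclass(UnrelatedClass, FormalParserInterface))      # False
```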

For Python - yes. It scores well in all three categories, provided that the class methods have type hints and comments that hint at the concrete implementation. Still, hints are not as good as strictly defined properties.

Protocol interface

An even simpler and more Pythonic take on duck typing is the Protocol (see PEP-0544).

from typing import List, Protocol

# Person is assumed to be defined elsewhere in the code.
class ProtocolPeopleParserInterface(Protocol):

    def load_data_source(self, file_path: str) -> str:
        ...

    def extract_persons(self, file_path: str) -> List[Person]:
        ...

A concrete class that fits the protocol:

class JSONParser:
    def load_data_source(self, file_path: str) -> str:
        # Return the raw file contents; the signature matches the protocol.
        with open(file_path) as f:
            return f.read()

    def extract_persons(self, file_path: str) -> List[Person]:
        # Parsing into Person objects is left out of this example.
        pass

Protocols are useful when we use external libraries and cannot make a concrete class inherit from our interface.

ArjanCodes offers a nice comparison between ABC and Protocol.
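The external-library case can be sketched as follows. Person, ThirdPartyParser and count_persons are hypothetical names introduced for this illustration; ThirdPartyParser only pretends to come from another package.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Person:
    """Hypothetical stand-in for the Person type used in the article."""

    name: str


class ProtocolPeopleParserInterface(Protocol):
    def load_data_source(self, file_path: str) -> str:
        ...

    def extract_persons(self, file_path: str) -> List[Person]:
        ...


class ThirdPartyParser:
    """Pretend this class lives in an external library: it neither
    inherits from the protocol nor knows that it exists."""

    def load_data_source(self, file_path: str) -> str:
        return f"contents of {file_path}"

    def extract_persons(self, file_path: str) -> List[Person]:
        return [Person(name="Ada")]


def count_persons(parser: ProtocolPeopleParserInterface, file_path: str) -> int:
    # A static type checker accepts any object whose methods match the
    # protocol; no inheritance or registration is needed.
    parser.load_data_source(file_path)
    return len(parser.extract_persons(file_path))


print(count_persons(ThirdPartyParser(), "people.json"))  # 1
```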

Is it a good interface?

Yes and no. It shows how to create a concrete object and in some cases (like the use of external libraries) may simplify or even enable type hints. Protocols are also more readable and elegant than the duck typing shown in the first example. Nevertheless, they fail at checking whether an object is an instance of a protocol. This may be fixed by adding the runtime_checkable decorator; see the documentation.
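A minimal sketch of the runtime_checkable decorator follows, using a hypothetical one-method protocol SupportsLoad and three made-up classes. Note the limitation visible in the last line: the runtime check only looks for the presence of the method, not for its signature or return type.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsLoad(Protocol):
    """Hypothetical one-method protocol used only for this sketch."""

    def load_data_source(self, file_path: str) -> str:
        ...


class TextLoader:
    def load_data_source(self, file_path: str) -> str:
        return f"contents of {file_path}"


class WrongSignature:
    # Same method name, but a different signature and return type.
    def load_data_source(self) -> int:
        return 0


class NoLoader:
    pass


print(isinstance(TextLoader(), SupportsLoad))      # True
print(isinstance(NoLoader(), SupportsLoad))        # False
# The runtime check only looks for the method name, so this passes too.
print(isinstance(WrongSignature(), SupportsLoad))  # True
```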

Pydantic interface

In this approach, a concrete implementation of the interface is an instance of the FunctionalPeopleParser class. It follows logic similar to a Haskell typeclass. Treating functions as first-class citizens, we want them to be fields of the FunctionalPeopleParser class.

from typing import Callable, List

from pydantic import BaseModel

class FunctionalPeopleParser(BaseModel):
    load_data_source: Callable[[str], str]
    extract_persons: Callable[[str], List[Person]]

    def get_persons(self, file_path: str) -> List[Person]:
        data: str = self.load_data_source(file_path)
        persons = self.extract_persons(data)
        return persons

The types of the class fields (callables) may be made even more specific using Protocol from the typing module.
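A callback protocol is one way to do that: a Protocol with a __call__ method describes a callable including its parameter names. The sketch below is an assumption-laden illustration, not the article's code: it uses a plain dataclass (ParserFunctions) as a stand-in for the pydantic model so that it runs with the standard library only, and Person, LoadDataSource, ExtractPersons, load_names and split_names are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Person:
    """Hypothetical stand-in for the article's Person type."""

    name: str


class LoadDataSource(Protocol):
    """Callback protocol: a callable taking file_path and returning str."""

    def __call__(self, file_path: str) -> str:
        ...


class ExtractPersons(Protocol):
    """Callback protocol: a callable taking data and returning persons."""

    def __call__(self, data: str) -> List[Person]:
        ...


@dataclass
class ParserFunctions:
    """Dataclass stand-in for the pydantic model (an assumption of this sketch)."""

    load_data_source: LoadDataSource
    extract_persons: ExtractPersons

    def get_persons(self, file_path: str) -> List[Person]:
        data = self.load_data_source(file_path)
        return self.extract_persons(data)


# The functions are defined outside the class and passed in as fields.
def load_names(file_path: str) -> str:
    return "Ada,Grace"


def split_names(data: str) -> List[Person]:
    return [Person(name=name) for name in data.split(",")]


parser = ParserFunctions(load_data_source=load_names, extract_persons=split_names)
print([person.name for person in parser.get_persons("people.csv")])  # ['Ada', 'Grace']
```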

Is it a good interface?

No. This experiment failed due to the way mypy checks fields that are Callable. Even though it is impossible to create an instance of FunctionalPeopleParser without the load_data_source and extract_persons fields, it is hard to ensure that they have the proper types. See the related GitHub issue. Another problem is the verbose implementation of a concrete typeclass: the functions for the fields must be defined outside the class.

Summary

Python was not designed to be strict, and making it so with the pydantic-mypy-functions combo makes the code hard to implement and to read. Following the Zen of Python (LINK):

If the implementation is hard to explain, it’s a bad idea.

If the implementation is easy to explain, it may be a good idea.

The conclusion is that we can get furthest and still be Pythonic by using abstract base classes from the abc module, as this is not much more verbose than the informal duck-typing approach, and it makes errors surface earlier.

There should be one– and preferably only one –obvious way to do it.

Although that way may not be obvious at first unless you’re Dutch.

Anyway, it is a good idea to abstract out parts of the code that share the same properties. When writing in Python, it is a good idea to use abc or Protocol.