How to ensure data type correctness in Python

Python is a dynamically typed language that is well known for its polymorphism: the same operation behaves differently with objects of different data types. This is very useful, but in some cases it can lead to unexpected behaviour in your code. Imagine that you have sent a request and received a response. You expected two numbers, a=1 and b=2, and were going to calculate their sum (a+b=3). Instead, you received the same numbers as strings (a="1" and b="2"), so a+b="12". The code runs, but the answer is incorrect. What can be done about it? You should make sure that the variables have the expected data types and, if they do not, convert them.

Let’s look at this case in the form of a function that takes two arguments and returns their sum:

    def summation(a, b):
        return a + b

First of all, you can declare which data types are expected by adding type hints:

    def summation(a: int, b: int) -> int:
        return a + b

This syntax does not raise an error when arguments of the wrong type are passed, because Python does not check type hints at runtime.

It is still good practice, because it makes it easier for you, and for anyone else who reuses your code, to understand what the function takes and returns.
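
To illustrate that the hints are only metadata, here is a minimal sketch that calls the annotated function with strings. It assumes nothing beyond the standard interpreter.

```python
def summation(a: int, b: int) -> int:
    return a + b


# The annotations are stored on the function object but never enforced.
print(summation.__annotations__)

# Strings pass straight through and are concatenated instead of added.
result = summation("1", "2")
print(repr(result))  # '12'
```

A static checker such as mypy would flag this call, but the interpreter itself runs it without complaint.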

Now we can move on to converting data types. The simplest solution is the following:

    def summation(a: int, b: int) -> int:
        return int(a) + int(b)

On the one hand, if a and b have data types that can be converted to integers without any problems (e.g., the strings from the example above), it will work. But if at least one argument is a string that can’t be parsed as an integer, you’ll get a ValueError.

On the other hand, if the arguments are floats, the function won’t raise any error, but the resulting sum will be wrong, because “int” discards the fractional part of a float (it truncates toward zero).
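
Both outcomes can be reproduced with a short script. This is only a sketch of the behaviour described above; the sample inputs are my own.

```python
def summation(a: int, b: int) -> int:
    return int(a) + int(b)


# Numeric strings convert cleanly.
print(summation("1", "2"))  # 3

# A string that is not a number raises ValueError inside int().
try:
    summation("one", 2)
except ValueError as exc:
    print(f"ValueError: {exc}")

# Floats are accepted silently, but their fractional parts are dropped.
print(summation(1.9, 2.9))  # 3, although the real sum is 4.8
print(int(-1.9))            # -1, i.e. truncation toward zero
```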

So it’s not a perfect solution. First of all, let’s try to make the error message clearer.

    def summation(a: int, b: int) -> int:
        # Accept only real ints; report the actual types otherwise.
        if type(a) is int and type(b) is int:
            return a + b
        raise ValueError(
            f"a is {type(a)} and b is {type(b)}, while int is expected for both"
        )

Now any deviation from the required data types raises a ValueError with a message that shows which variable caused the problem.

At the same time, we’ve lost the ability to do the calculation when a and b are strings that can easily be converted to integers.
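
To see both effects side by side, here is a small sketch. The `describe` helper is hypothetical and exists only to capture the error text; the sample inputs are my own.

```python
def summation(a: int, b: int) -> int:
    if type(a) is int and type(b) is int:
        return a + b
    raise ValueError(
        f"a is {type(a)} and b is {type(b)}, while int is expected for both"
    )


def describe(a, b):
    """Hypothetical helper: return the sum, or the error text if validation fails."""
    try:
        return summation(a, b)
    except ValueError as exc:
        return f"ValueError: {exc}"


print(describe(1, 2))      # 3
print(describe(1.5, 2))    # the message names float as the type of a
print(describe("1", "2"))  # convertible strings are rejected as well
```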

Moreover, even for such a simple function we had to add code that distracts from its main purpose and makes it less transparent. To keep the code clear and still verify data types, we can use the pydantic library. It can be installed with “pip install pydantic” and is used to validate data types and, where possible, convert values to the required types.

First, we need to create a class that declares which data types are expected, and then use that class in our function.

    from pydantic import BaseModel, conint

    class MyModel(BaseModel):
        a: conint(strict=True)
        b: conint(strict=True)

    def summation(a: int, b: int) -> int:
        model = MyModel(a=a, b=b)
        return model.a + model.b

Note that I use “conint(strict=True)” instead of “int”. In pydantic 1.x, a plain “int” or “conint(strict=False)” still converts floats to integers by dropping the fractional part, so the problem described above would remain.

When validation fails, pydantic raises an error that lists every variable that caused it. This makes problems related to data types easy to track down.

But again, we had to add extra code for the validation. Pydantic can do better: it provides a decorator that reads the type hints in the function signature and automatically turns them into a model.

    from pydantic import validate_arguments

    @validate_arguments
    def summation(a: int, b: int) -> int:
        return a + b

To get an error for float inputs, declare the types as follows:

    from pydantic import conint, validate_arguments

    @validate_arguments
    def summation(a: conint(strict=True), b: conint(strict=True)) -> int:
        return a + b
            


Finally, let me say a few more words about the pydantic library.

The library lets you build whole systems of nested classes and supports recursive validation of data types such as List[int]. Moreover, its developers state that it is much faster than its competitors. Detailed documentation is available at https://pydantic-docs.helpmanual.io/
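
To give a feel for the nested classes and List[int] validation mentioned above, here is a minimal sketch. The `Item` and `Order` models and their fields are hypothetical examples of mine, not taken from the pydantic documentation.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    quantities: List[int]


class Order(BaseModel):
    order_id: int
    entries: List[Item]  # nested models are validated recursively


# Plain dicts are turned into Item objects, numeric strings into ints.
order = Order(order_id="7", entries=[{"name": "bolt", "quantities": ["1", 2]}])
print(order.entries[0].quantities)  # [1, 2]

bad_location = None
try:
    Order(order_id=7, entries=[{"name": "nut", "quantities": [1, "many"]}])
except ValidationError as exc:
    # "loc" is the path to the offending value inside the nested structure.
    bad_location = tuple(exc.errors()[0]["loc"])

print(bad_location)  # ('entries', 0, 'quantities', 1)
```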

