estepanov_coder Feb 2 2022 at 07:47

Stop losing clients! Or how a developer can test a website, by the example of PVS-Studio. Part 1


A website with bugs can be a real pain in the neck for a business. Just one 404 or 500 error can cost the company an obscene amount of money and hurt its reputation. But there is a way to avoid this: website testing. That is what this article is about. After reading it, you will know how to test code in Django, how to create your "own website tester", and much more. Welcome to the article.

How do you feel when you are writing tests?

How would you answer this question? I would say that I enjoy writing them. Every developer has their own opinion about tests. Personally, I really love the process. Writing tests helps me not only write more reliable code, but also understand my own and other people's programs better. And the cherry on top is that feeling when all tests go green. At this point, my perfectionism scale reaches its peak.

Sometimes when testing, I get sucked into the process as if I'm playing Half-Life. I start to spend all my working time and free time on this process. Of course, over time, I get tired of tests, and then I have to take a break. After the break, I can become a no-lifer again for a few weeks, as if Valve released a new episode. If you're the same as me, then you know what I'm talking about. Enough talk, let's get down to business!

Backend testing

We build our website on Django, so the code examples are for this framework.

Before starting, I invite you to read the list of recommendations that structure the process of writing tests and make it more comfortable. I made the list on the basis of my personal experience and other developers' tips.

the test files are stored in the tests folder inside the app;
model tests, view tests and form tests are located in test_models.py, test_views.py and test_forms.py respectively;
the test method name starts with the test_ prefix (e.g., test_get_sum or test_status_code);
the name of the class that contains tests has the following form: TestedEntityTests (e.g., TrialTests or FeedbackFileTests).

Testing of models

Let's create the my_app application and fill in the models.py file with the following code:

from django.db import models


class Trial(models.Model):
    """Simple user trial model"""

    email = models.EmailField(
        verbose_name='Email',
        max_length=256,
        unique=False,
    )

    def __str__(self):
        return str(self.email)

    class Meta:
        verbose_name = 'Trial'
        verbose_name_plural = 'Trials'

This model is a simplified version of our Trial model. Here's what we can check with this model:

The verbose_name parameter of the email field – "Email".
The max_length parameter of the email field – 256.
The unique parameter of the email field – False.
The __str__ method returns the email parameter value.
The verbose_name parameter of the model – "Trial".
The verbose_name_plural parameter of the model – "Trials".

I have heard from some programmers that testing models is a waste of time. My experience suggests otherwise. Here is a simple example. For the email field, we set a maximum length of 256 characters (in accordance with RFC 2821). It is easy to delete the last digit by accident, which turns 256 into 25. If that happens, a user with the email my_super_long_email@gmail.com (29 characters) will get an error and won't be able to request a trial. This means that the company will lose a prospective client. Of course, you can write additional validation, but it's better to be sure that the program works correctly without it.
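The arithmetic behind this scenario is easy to check in plain Python. The sketch below only compares lengths: fits is a hypothetical helper introduced for illustration, not part of Django, and the real check happens in Django's form and database layers.

```python
EMAIL = 'my_super_long_email@gmail.com'


def fits(value: str, max_length: int) -> bool:
    """Hypothetical helper: True if value is not longer than max_length."""
    return len(value) <= max_length


intended_limit = 256
# Dropping the last digit of 256 by accident leaves 25.
damaged_limit = int(str(intended_limit)[:-1])

email_length = len(EMAIL)
accepted_before = fits(EMAIL, intended_limit)
accepted_after = fits(EMAIL, damaged_limit)
```

With the intended limit the address is accepted; with the damaged one it is rejected, which is exactly the regression the max_length test is meant to catch.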

Let's move on to the tests and first decide where they will be located. You can write all the tests in one file — tests.py (Django adds this file when you create the application). Or you can follow the recommendations above and sort them.

If you like the second option more, delete tests.py. Then create the tests folder with an empty __init__.py file. This file marks the folder as a Python package, so the test runner can find the tests inside it. Let's add 3 more files to the same folder: test_forms.py, test_models.py, and test_views.py. The tests folder will then contain four files: __init__.py, test_forms.py, test_models.py, and test_views.py.

Let's open the test_models.py file and add the following code to it:

from django.test import TestCase

from my_app.models import Trial


class TrialTests(TestCase):
    """Tests for Trial model"""

    def test_verbose_name(self):
        pass

    def test_max_length(self):
        pass

    def test_unique(self):
        pass

    def test_str_method(self):
        pass

    def test_model_verbose_name(self):
        pass

    def test_model_verbose_name_plural(self):
        pass

Django has a special django.test module for testing. One of the most important classes of this module is TestCase. It is the class that allows you to write tests: we just need to inherit our class from TestCase.

All our tests are methods of the TrialTests class. The tests don't check anything yet, but it won't be for long. Each of the methods will test one condition from the list above. Let's figure out how to run the tests. To run all tests of your website at once, enter this command in the console:

python manage.py test

To run tests of a specific class, for example, TrialTests, write:

python manage.py test my_app.tests.test_models.TrialTests

Any of these commands will run our 6 tests. Select one of them, enter it into the console, press Enter. We will get something like this:

The output shows that 6 tests were checked in 0.001 seconds. "OK" at the end of the output indicates their successful execution. 

Now let's write real tests. To write them, we need to access the parameters of a Trial model object, so we need to create one. Here it's important to know that Django uses a separate, clean database for tests. The database is created before the tests run and deleted after they finish. That is what the first and last lines of the test output report ("Creating test database..." and "Destroying test database..."). If for some reason the database cannot be deleted, Django tells you about it, and you need to delete it manually.

To work with this database, you can use 3 methods:

setUp: executed before each test;
tearDown: executed after each test;
setUpTestData: executed once, before all tests of a particular class.
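The order of these hooks can be observed without a database. The sketch below uses only the standard unittest module, which Django's TestCase extends. Since setUpTestData is Django-specific (Django calls it once per class from setUpClass), setUpClass stands in for it here. The names LifecycleDemo, calls and run_demo are made up for the illustration.

```python
import unittest

calls = []  # records the order in which the hooks and tests run


class LifecycleDemo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Stand-in for Django's setUpTestData: runs once per class.
        calls.append('setUpClass')

    def setUp(self):
        calls.append('setUp')  # runs before every test method

    def tearDown(self):
        calls.append('tearDown')  # runs after every test method

    def test_a(self):
        calls.append('test_a')

    def test_b(self):
        calls.append('test_b')


def run_demo():
    """Run LifecycleDemo and return the recorded call order and the result."""
    calls.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifecycleDemo)
    result = unittest.TestResult()
    suite.run(result)
    return list(calls), result
```

Calling run_demo() shows one class-level call followed by a setUp and tearDown pair around each test. This is why data that every test only reads is cheaper to create at class level.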

Let's use the latter. Since it is a method of the class, let's add the appropriate decorator. Inside, we create an object of the Trial class and get the email field from it. We will use the field in the tests.

class TrialTests(TestCase):
    """Tests for Trial model"""

    @classmethod
    def setUpTestData(cls):
        """Set up the database before running tests of the class"""

        cls.trial = Trial.objects.create(
            email='test@gmail.com'
        )
        cls.email_field = cls.trial._meta.get_field('email')

Now, when running tests of the TrialTests class, a trial object is created in the new database. After the run, the object is deleted.

Let's write the test of the verbose_name parameter.

def test_verbose_name(self):
    """The verbose_name parameter test"""

    real_verbose_name = getattr(self.email_field, 'verbose_name')
    expected_verbose_name = 'Email'

    self.assertEqual(real_verbose_name, expected_verbose_name)

From the email_field field we extract the value of the verbose_name parameter. Then we apply the assertEqual method from the TestCase class. The method compares two parameters - the real and expected values of verbose_name. If the values are equal, the test runs successfully. Otherwise, it fails.

Let's write the same tests for the max_length and unique parameters.

def test_max_length(self):
    """The max_length parameter test"""

    real_max_length = getattr(self.email_field, 'max_length')

    self.assertEqual(real_max_length, 256)

def test_unique(self):
    """The unique parameter test"""

    real_unique = getattr(self.email_field, 'unique')

    self.assertEqual(real_unique, False)

It's the same as with verbose_name.

By the way, in the unique parameter test, we check that the value is False. The assertFalse method makes this easier. Let's rewrite the code of this test.

def test_unique(self):
    """The unique parameter test"""

    real_unique = getattr(self.email_field, 'unique')

    self.assertFalse(real_unique)

The code is shorter and more readable. By the way, Django has many such helpful assertions.
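Most of these assertions come from unittest.TestCase, which Django's TestCase inherits from, so they can be tried without a Django project. A small sketch follows; the class name AssertionExamples and the sample values are chosen only for illustration.

```python
import unittest


class AssertionExamples(unittest.TestCase):
    """A few assertion methods inherited from unittest.TestCase."""

    def test_common_assertions(self):
        email = 'test@gmail.com'
        unique = False

        self.assertFalse(unique)                   # value is falsy
        self.assertTrue(email.endswith('.com'))    # value is truthy
        self.assertIn('@', email)                  # membership check
        self.assertLessEqual(len(email), 256)      # numeric comparison
        self.assertIsInstance(email, str)          # type check

        # The block must raise ValueError, otherwise the test fails.
        with self.assertRaises(ValueError):
            _, domain = 'a@@b'.split('@')


def run_assertion_examples() -> unittest.TestResult:
    """Run the example class and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

A specific assertion usually gives a clearer failure message than assertEqual(x, False) or a bare assert.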

Now let's check the string representation of the object.

def test_str_method(self):
    """The __str__ method test"""

    self.assertEqual(str(self.trial), str(self.trial.email))

That one's easy. We check that the string representation of the object equals its email.

And the last thing is the tests of the model's own options:

def test_model_verbose_name(self):
    """The test of the verbose_name field of the Trial model"""

    self.assertEqual(Trial._meta.verbose_name, 'Trial')

def test_model_verbose_name_plural(self):
    """The test of the verbose_name_plural fields of the Trial model"""

    self.assertEqual(Trial._meta.verbose_name_plural, 'Trials')

We access the options of the Trial model through _meta and compare their values with the expected ones.

If you run the tests now, they will run successfully, as before. Well, that's no fun! Let's break something. Let the verbose_name parameter of the Trial model become our victim. Open the model's code and change the value of this field from "Trial" to "Something else". Let's run the tests.

As you can see, one of the tests failed. Django tells us about the failure and that the real value of the field ("Something else") doesn't equal the expected value ("Trial").

Mixins - the helpful guys

Model tests are homogeneous. So, when you have a lot of entities, testing them is not the most pleasant routine. I tried to simplify this process somewhat with mixins. My method is not perfect, and I do not insist on using it. However, you may find it useful.

I think you noticed that the tests for the verbose_name, max_length, and unique parameters duplicate some code. We get the value of a field's parameter and compare it with the expected one, and we do so in all three tests. That means you can write one function that does all the work.

def run_field_parameter_test(
        model, self_,
        field_and_parameter_value: dict,
        parameter_name: str) -> None:
    """Test field’s parameter value"""

    for instance in model.objects.all():
        # Example 1: field = "email"; expected_value = 256.
        # Example 2: field = "email"; expected_value = "Email".
        for field, expected_value in field_and_parameter_value.items():
            parameter_real_value = getattr(
                instance._meta.get_field(field), parameter_name
            )

            self_.assertEqual(parameter_real_value, expected_value)

Let's figure out what parameters we use. I think it's clear why we need model. Then comes self_, which we need only to call the assertEqual method. Since self is the conventional name for the instance inside methods, we add _ to avoid confusion. field_and_parameter_value is a dictionary that maps a field to the expected value of the field's parameter. For example, if we check the max_length parameter, we can pass email and 256 to this variable. If we check verbose_name, then we pass email and "Email". parameter_name is the parameter being tested: max_length, verbose_name, etc.

Now let's turn to the code. First, we get all the objects of the model and go through them. Next, we go through the dictionary that contains fields and expected parameter values. After that, we get the real parameter values by referring to the object. And then we compare them with the expected values. The code is very similar to the one previously written in tests. Only now it's all in one function. By the way, if the function name had started with the test_ prefix, Django would have considered this function the real test and would have tried to run it along with the others.

Let's write mixins. Each parameter should have its own mixin. For example, let's take the verbose_name and max_length parameters.

class TestVerboseNameMixin:
    """Mixin to check verbose_name"""

    def run_verbose_name_test(self, model):
        """Function that tests verbose_name"""

        run_field_parameter_test(
            model, self, self.field_and_verbose_name, 'verbose_name'
        )


class TestMaxLengthMixin:
    """Mixin to check max_length"""

    def run_max_length_test(self, model):
        """Function that tests max_length"""

        run_field_parameter_test(
            model, self, self.field_and_max_length, 'max_length'
        )

We create the necessary method. In this method, we call our single function with the corresponding parameters. self.field_and_verbose_name and self.field_and_max_length come from the class that inherits from the mixin. Namely, they are set in the setUpTestData method of the TrialTests class.

@classmethod 
def setUpTestData(cls): 
    # ...
    cls.field_and_verbose_name = {
        'email': 'Email',
    }

    cls.field_and_max_length = {
        'email': 256,
    }

Let's inherit the TrialTests class from our mixins.

class TrialTests(TestCase, TestVerboseNameMixin, TestMaxLengthMixin):
    # ...

If you have a lot of mixins, you can combine them. For example, combine them into a tuple and unpack it when inheriting.

MIXINS_SET = (
    TestVerboseNameMixin, TestMaxLengthMixin,
)


class TrialTests(TestCase, *MIXINS_SET):
    # ...

Now we can rewrite our tests:

def test_verbose_name(self):
    """The verbose_name parameter test"""

    self.run_verbose_name_test(Trial)

def test_max_length(self):
    """The max_length parameter test"""

    self.run_max_length_test(Trial)

When you have a lot of tests for different models, this method turns out to be very useful.
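A different way to remove the same duplication is a table of expected values checked in one loop. This is a swapped-in technique, not the mixin approach above. To stay runnable without a Django project, the sketch uses a SimpleNamespace as a stand-in for the field object that Trial._meta.get_field('email') would return. FAKE_FIELDS, get_field and EXPECTED_PARAMETERS are hypothetical names. It relies on subTest, which the "Testing logic" section below explains.

```python
import unittest
from types import SimpleNamespace

# Stand-in for Django field objects; in a real project you would call
# Trial._meta.get_field(name) instead of looking up this dictionary.
FAKE_FIELDS = {
    'email': SimpleNamespace(verbose_name='Email', max_length=256, unique=False),
}


def get_field(name: str):
    """Hypothetical replacement for Model._meta.get_field."""
    return FAKE_FIELDS[name]


# (field name, parameter name, expected value)
EXPECTED_PARAMETERS = (
    ('email', 'verbose_name', 'Email'),
    ('email', 'max_length', 256),
    ('email', 'unique', False),
)


class FieldParameterTests(unittest.TestCase):
    def test_field_parameters(self):
        for field, parameter, expected in EXPECTED_PARAMETERS:
            # Each row is reported separately if it fails.
            with self.subTest(field=field, parameter=parameter):
                real = getattr(get_field(field), parameter)
                self.assertEqual(real, expected)


def run_field_parameter_tests() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FieldParameterTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Compared with mixins, the table keeps all expectations for a model in one place, at the cost of reporting them as a single test method.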

Testing logic

Let's test the code from the views.py file. For example, let's take the function that gets a domain from an email.

def get_domain(email: str) -> str:
    """Return email's domain"""

    try:
        _, domain = email.split('@')
    except ValueError:
        domain = ''

    return domain

This is what the test of the function might look like:

from django.test import TestCase

from my_app.views import get_domain


EMAIL_AND_DOMAIN = {
    'test1@gmail.com': 'gmail.com',
    'test2@wrong_email': 'wrong_email',
    'test3@mail.ru': 'mail.ru',
    'test4@@wrong_email.com': '',
}


class FunctionsTests(TestCase):
    """Test class for views"""

    def test_get_domain(self):
        """Test get_domain function"""

        for email, expected_domain in EMAIL_AND_DOMAIN.items():
            real_domain = get_domain(email)

            self.assertEqual(real_domain, expected_domain)

The constant stores emails and their expected domains. In the test, we go through the emails. With the help of the function under test, we get the domain and compare it with the expected one.

Now let's talk a little about one useful construction. Let me change our emails. For example, let's change test1@gmail.com to test1@habr.com, and test2@wrong_email to test2@habr. Time to run the tests.

They failed as expected. But why do we see that only one email is incorrect, even though we changed two? By default, a test method stops at the first failed assertion: assertEqual raises an exception, so the rest of the loop never runs, as if break were called inside it. This can hardly please you, especially if your tests take a long time. Luckily, there is a solution: the with self.subTest() construction. It goes right after the loop declaration. Let's add it to our test:

# ...
for email, expected_domain in EMAIL_AND_DOMAIN.items():
    with self.subTest(f'{email=}'):
        real_domain = get_domain(email)

        self.assertEqual(real_domain, expected_domain)

In the parentheses of the subTest method, we specify the message that we want to output when the test fails. In our case, this is the email being tested.

Now, if any test fails, Django will save a report about the failure and continue running. And after the run is completed, Django will display information on every test that failed.
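The difference can be measured with the standard unittest module alone. In the sketch below, two of the three expectations are deliberately wrong, and the same loop is run with and without subTest. build_case and count_failures are hypothetical helpers written for this demonstration.

```python
import unittest


def get_domain(email: str) -> str:
    """Return email's domain (same logic as the function under test)."""
    try:
        _, domain = email.split('@')
    except ValueError:
        domain = ''
    return domain


# The first two expectations are deliberately wrong.
EMAIL_AND_DOMAIN = {
    'test1@habr.com': 'gmail.com',
    'test2@habr': 'wrong_email',
    'test3@mail.ru': 'mail.ru',
}


def build_case(use_subtest: bool):
    """Create a test class whose loop optionally wraps each check in subTest."""

    class DomainTests(unittest.TestCase):
        def test_get_domain(self):
            for email, expected in EMAIL_AND_DOMAIN.items():
                if use_subtest:
                    with self.subTest(email=email):
                        self.assertEqual(get_domain(email), expected)
                else:
                    self.assertEqual(get_domain(email), expected)

    return DomainTests


def count_failures(use_subtest: bool) -> int:
    """Run the generated test class and count the reported failures."""
    case = build_case(use_subtest)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.failures)
```

Without subTest the run reports one failure and never reaches the second wrong pair. With subTest both wrong pairs are reported.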

Let's look at the test of another function. When we get a promo code from a user, we transform it into a more convenient form – we remove the "#" characters and spaces. To do this, we have the get_correct_promo function:

def get_correct_promo(promo: str) -> str:
    """Get promo without # and whitespaces"""

    return promo.replace('#', '').replace(' ', '')

This is what the function test might look like:

from django.test import TestCase

from my_app.views import get_correct_promo


PROMO_CODES = {
    '#sast': 'sast',
    '#beauty#': 'beauty',
    '#test test2': 'testtest2',
    'test1 test2 test3': 'test1test2test3',
}


class FunctionsTests(TestCase):
    """Test class for views"""

    def test_get_correct_promo(self):
        """Test get_correct_promo function"""

        for incorrect_promo, correct_promo in PROMO_CODES.items():
            real_promo = get_correct_promo(incorrect_promo)

            self.assertEqual(real_promo, correct_promo)

The constant stores incorrect and correct promo codes. In the test, we go through the promo codes. After that, we compare the promo code obtained with the get_correct_promo function and the correct promo code.

Views testing is probably the simplest of this triad of tests. In this kind of testing, we simply call the function we need. Then we check that the value the function returns matches the expected one. By the way, when creating constants with data for testing, I recommend that you come up with as many different values as possible. This way you will increase the chances that your test catches a real bug.
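As an illustration of "many different values", here are a few edge cases for get_domain that the constant above does not cover. The expected values were worked out from the function's logic. EDGE_CASES and find_mismatches are names made up for this sketch.

```python
def get_domain(email: str) -> str:
    """Return email's domain (copied from the function under test)."""
    try:
        _, domain = email.split('@')
    except ValueError:
        domain = ''
    return domain


EDGE_CASES = {
    '': '',                             # empty input: nothing to unpack
    'no_at_sign': '',                   # no "@" at all
    '@gmail.com': 'gmail.com',          # empty local part is still accepted
    'user@': '',                        # empty domain
    ' user@gmail.com ': 'gmail.com ',   # whitespace is not stripped
}


def find_mismatches(cases: dict) -> dict:
    """Return {email: (actual, expected)} for every case that disagrees."""
    return {
        email: (get_domain(email), expected)
        for email, expected in cases.items()
        if get_domain(email) != expected
    }
```

Cases like the trailing space are useful precisely because they document behavior you might not want, which is a prompt to decide whether the function should strip its input.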

Form tests

Form tests are similar to model tests. In form tests, we can also check fields and methods.

Let's create a Trial model form:

from django import forms

from my_app.models import Trial


class TrialForm(forms.ModelForm):
    """Form of Trial model"""

    class Meta:
        model = Trial
        exclude = ()

This is what the form test might look like:

from django.test import TestCase

from my_app.forms import TrialForm


class TrialFormTests(TestCase):
    """Tests for TrialForm form"""

    def test_field_labels(self):
        """Test field's labels"""

        form = TrialForm()
        email_label = form.fields['email'].label

        self.assertEqual(email_label, 'Email')

In the test, we create an object of our form and compare the label of the field with the expected one. This is how you can write form tests. But we hardly use them. There is a more effective way to test forms. That's sort of what the second part of the article is about.

How to create your "own website tester"

So, you tested the backend of your website. But suddenly, you noticed the 404 error on one of the pages. The tests you wrote did not find this error. These tests also won't help, for example, when searching for dead links on pages. Such tests are simply not designed for bugs of this kind. But then how to catch these bugs? In this case, we need tests that simulate user actions. You can use django.test.Client, but it allows you to run tests only on the website server itself. It's not always convenient. So, let's turn to the Python requests library.

These tests usually turn out to be voluminous. It's better to put them in a separate file (or files), for example, test_requests.py.

Checking status codes

To check the page status code, you need to:

Go to the website page;
Get the status code of the website page;
Check that the status code is 200.

The requests library has many useful methods. The head method covers the first two steps: we will use it to send a HEAD request to the website pages. Let's import this method.

from requests import head

We only need to pass the URL to the method to get a response with all the necessary information about the page. And from this information, you can extract the status code:

response = head('<page url>')
print(response.status_code)

Now let's move on to test writing. Create the necessary constants: the website domain and the relative paths of the website pages. For simplicity, let's take the domain of only the English website version.

DOMAIN = 'https://pvs-studio.com/en/'

PAGES = (
    '',
    'address/',
    'pvs-studio/',
    'pvs-studio/download/',
    # ...
)

# A tuple, not a generator: several tests iterate over PAGES.
PAGES = tuple(DOMAIN + page for page in PAGES)

Of course, ideally, it is better to take the relative paths of pages from the database. But if there is no such possibility — you can use a tuple.
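One thing to watch when building PAGES: a comprehension written in plain parentheses is a generator expression, which can be iterated only once. If more than one test loops over the constant, every loop after the first sees nothing and passes without checking anything. A short sketch of the difference follows; RELATIVE_PAGES is a hypothetical name used to keep the example separate from the constants above.

```python
DOMAIN = 'https://pvs-studio.com/en/'
RELATIVE_PAGES = ('', 'address/', 'pvs-studio/')

# Generator expression: produces values lazily and is exhausted after one pass.
pages_generator = (DOMAIN + page for page in RELATIVE_PAGES)
first_pass = list(pages_generator)
second_pass = list(pages_generator)  # nothing is left to iterate

# Tuple: built once, can be iterated by any number of tests.
pages_tuple = tuple(DOMAIN + page for page in RELATIVE_PAGES)
tuple_first_pass = list(pages_tuple)
tuple_second_pass = list(pages_tuple)
```

Here second_pass is empty while both passes over the tuple return the same three URLs, so a tuple (or list) is the safer choice for a module-level constant.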

Let's add the PagesTests class together with the test_status_code test:

from django.test import TestCase


class PagesTests(TestCase):
    """Tests for pages"""

    def test_status_code(self):
        """Test status code for pages"""

        for page in PAGES:
            with self.subTest(f'{page=}'):
                response = head(page)  # (1)

                self.assertEqual(response.status_code, 200)  # (2) and (3)

In the test, we send the HEAD request to each page and save the response. After that, we check whether the page status code is equal to 200.

Checking links on pages

Here's how to check the links on a page:

Send the GET request to the page and get the page content;
Use a regular expression to get all the links from the content;
Go through each link and check that the link status code is 200.

To search for links, let's use the findall function of the re module. To send a GET request, let's use the get method of the same requests library. We will also still need the head method.

from re import findall

from requests import get, head

Next, let's move on to the variables. For this test, we need the PAGES constant declared earlier, and the variable with a regular expression for the link.

LINK_REGULAR_EXPRESSION = r'<a[^>]* href="([^"]*)"'

And finally, let's write the test itself.

def test_links(self):
    """Test links on all site pages"""

    valid_links = set()

    for page in PAGES:
        page_content = get(page).text  # (1)
        page_links = set(  # (2)
            findall(LINK_REGULAR_EXPRESSION, page_content)
        )

        for link in page_links:
            if link in valid_links:
                continue

            with self.subTest(f'{link=} | {page=}'):
                response = head(link, allow_redirects=True)

                if response.status_code == 200:
                    valid_links.add(link)

                self.assertEqual(response.status_code, 200) # (3)

We send the GET request to each page and extract the page content from the received response. Next, we use the regular expression and the findall function to get all the links located on the page. We put these links into a set to remove duplicates. The last stage is a familiar scenario: we go through all the links, send a HEAD request to each of them, and check the status code. If a link redirects, the allow_redirects parameter determines whether the request follows the redirect. For head, its default value is False, so we set it to True. We also add valid links to a set so that we don't request them again later.
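The regular expression above only matches href values in double quotes. As a hedged alternative (a swapped-in technique, not what the test above uses), the standard library's html.parser can collect links regardless of quoting or attribute order. LinkCollector and extract_links are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            # Skip anchors without href and empty href values.
            if name == 'href' and value:
                self.links.add(value)


def extract_links(html: str) -> set:
    """Return the set of link targets found in the given HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

In the test, extract_links(page_content) could take the place of the findall call, provided page_content is text rather than bytes.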

By the way, sometimes you can find relative links on the page, for example, "/ru/pvs-studio/faq/". A browser resolves such links against the website's address, while the test does not. As a result, the request fails.

To avoid this issue, let's create a function:

SITE_URL = 'https://pvs-studio.com'

def get_full_link(link: str) -> str:
    """Return link with site’s url"""

    if not link.startswith('http'):
        link = SITE_URL + link

    return link

If the received link is relative, the function adds the URL of the website to this link. Now in the test, when we receive a link, we pass it through this function:

# ...
for link in page_links:
    link = get_full_link(link)
# ...
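The startswith('http') check treats every other link as site-relative, including ones like mailto: addresses or paths relative to the current page. A possible refinement, sketched here with the standard urllib.parse module, resolves each link against the page it was found on and skips non-HTTP schemes. resolve_link and is_http_link are hypothetical helpers, not part of the code above.

```python
from urllib.parse import urljoin, urlparse


def resolve_link(link: str, page_url: str) -> str:
    """Resolve a link the way a browser would, relative to its page."""
    return urljoin(page_url, link)


def is_http_link(link: str) -> bool:
    """True only for links that can be checked with an HTTP request."""
    return urlparse(link).scheme in ('http', 'https')


PAGE = 'https://pvs-studio.com/en/pvs-studio/'

root_relative = resolve_link('/ru/pvs-studio/faq/', PAGE)
page_relative = resolve_link('download/', PAGE)
absolute = resolve_link('https://example.com/x', PAGE)
mail = resolve_link('mailto:support@example.com', PAGE)
```

In the link test, a link would then be requested only when is_http_link(resolved) is true, so mail and similar links are skipped instead of producing false failures.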

There are situations when the test does not show the real status code of the page. It is usually either 403 or 404. For example, head may return the 404 status code for a page that opens normally in a browser. This happens because some websites don't want to give page data to robots. To avoid this, use the get method and, for greater confidence in the test, add a User-Agent header.

from requests import get, head

# link is the URL of a page that rejects robot-like requests
head_response = head(link)
print(head_response.status_code) # 404

get_response = get(link, headers={'User-Agent': 'Mozilla/5.0'})
print(get_response.status_code) # 200

Redirect Tests

Another kind of test where you can use requests is the redirect test. To check a redirect, we need to:

Follow the link and get the response;
Compare the response URL with the expected one.

So, we need two URLs. The first is the redirect link that the user clicks on. The second is the URL of the page the visitor ends up on. As in the example with status codes, it's better to get these URLs from the database. If this is not possible, I recommend using a dictionary.

REDIRECT_URLS = {
    '/ru/m/0008/': '/ru/docs/',
    '/en/articles/': '/en/blog/posts/',
    '/ru/d/full/': '/ru/docs/manual/full/',
}

Recall the SITE_URL variable created earlier.

SITE_URL = 'https://pvs-studio.com'

Now, let's write the test.

def test_redirects(self):
    """Test the correctness of the redirect"""

    for link, page_url in REDIRECT_URLS.items():
        with self.subTest(f'{link=} | {page_url=}'):
            page_response = head(
                SITE_URL + link, allow_redirects=True
            ) # (1)

            expected_page_url = SITE_URL + page_url

            self.assertEqual(page_response.url, expected_page_url) # (2)

First, we send the HEAD request using the link. At the same time, we allow the usage of redirects. From the received response, we take the URL of the page and compare it with the expected one.

The requests library allows you to perform many different website tests. The main methods for tests, as you may have noticed, are head and get. There are other methods too, and they can also be useful. It all depends on your tasks.

Conclusion

So, now you know how to write tests for the backend and how to create your "own website tester". We will talk about form testing, JS, page translation testing and so on in the next parts of the article. Do you have any comments or feedback? Write them down below or message me on Instagram. Thank you for reading this article, and see you soon!
