“[Separation of Concerns], even if not perfectly possible, is yet the only available technique for effective ordering of one’s thoughts, that I know of.” - Edsger W. Dijkstra http://deviq.com/separation-of-concerns/

Background

The project under test is a cinema ticket booking system. Users can issue queries about the schedules of upcoming movie showtimes. The system models include:

Cinema¹: The geographic places you would go to watch movies
Theater²: These are the little rooms inside each cinema
Movie: “Nativity”, “Star Wars”, “Passion of the Christ”…
Schedule: aka, showtimes

The focus of our interest should be the schedules.

The ingredients

PyTest

A Python unit testing facility which features:

Fixture dependency injection
Isolated
Composable
Plus unittest compatibility

See this slide for advanced features of PyTest. Also, I would recommend this site if you are really into testing, especially Python <3.

In my case, I use pytest’s fixture dependency injection to pass a Flask test client and a dataset into each test method.


class TestQuerySchedules():
    def test_query_by_movie_title(self, client, dataset_saigon_weekend):
        response = client.get('/api/query')

YAML

If you have heard of JSON, then you should take a look at YAML³. It is much friendlier to read than JSON, yet by no means less expressive. That makes it much easier to maintain, especially once you have thousands of lines of it.

The following fragment of YAML presents a list of movies, each of which has a code, a title and a status:


movies:
-   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-1
    title: The Fox
    status: 2
-   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-3
    title: Star Wars
    status: 2

No curly braces, no double quotes whatsoever! And it also looks very Pythonic <3. FYI, Google App Engine uses .yaml files for application configuration.
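To see what such a fragment turns into on the Python side, here is a minimal sketch that loads it with PyYAML (assuming the package is installed). The variable names are mine, chosen only for illustration.

```python
import yaml  # provided by the PyYAML package

MOVIES_YAML = """
movies:
-   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-1
    title: The Fox
    status: 2
-   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-3
    title: Star Wars
    status: 2
"""

# safe_load only builds plain Python types (dict, list, str, int, ...).
data = yaml.safe_load(MOVIES_YAML)

# The top-level key maps to a list of dicts, one per movie.
titles = [movie["title"] for movie in data["movies"]]
first_movie = data["movies"][0]

print(titles)
print(first_movie)
```

Note that the codes stay strings while the statuses come back as integers, so the parsed dicts can be handed to a factory or model constructor as keyword arguments without further conversion.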

Factory Boy

Initially I used Factory Boy to replace the need for file-based fixtures. I really enjoy the concepts behind building test fixtures with factories:

Using custom sequence to generate unique yet meaningful values
Faker to generate human friendly fields
Built-in integration with SQLAlchemy, Google Datastore, Django…
Fixture dependency support with SubFactory


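from factory import SubFactory
from factory.alchemy import SQLAlchemyModelFactory

# db, Schedule, Movie, TheaterFactory and MovieFactory come from the project under test.
class ScheduleFactory(SQLAlchemyModelFactory):
    class Meta:
        model = Schedule
        sqlalchemy_session = db.session

    theater = SubFactory(TheaterFactory)
    movie = SubFactory(MovieFactory, status=Movie.STATUS_PUBLISHED)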

While testing features, we do not really care about a field’s exact value; we care more about whether the values make sense together. For example, given a fixture with the full name “Elton John”, we would expect:

This is a person
This person’s email is “elton.john@gmail.com”
This person’s job is “singer”
He works at a company named “Rocket Music Entertainment Group”

Factory Boy fills in meaningful default values for fields unless you override them with your own.

You can find more about Factory Boy and its inner workings here
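The idea of defaults that stay consistent with each other, yet remain overridable, can be sketched in plain Python. To be clear, this is not Factory Boy's API: `make_person` is a hypothetical stand-in, and "Jane Doe" is a made-up override. Factory Boy offers declarations such as `Sequence` and `LazyAttribute` for the same purpose.

```python
import itertools

# Hypothetical stand-in for a factory; NOT Factory Boy's API.
_sequence = itertools.count(1)


def make_person(**overrides):
    """Build a person dict whose default fields are consistent with each other."""
    person = {
        "id": next(_sequence),  # unique per call, like a sequence
        "full_name": "Elton John",
        "job": "singer",
        "company": "Rocket Music Entertainment Group",
    }
    person.update(overrides)
    # Derive the email from the (possibly overridden) name unless given explicitly.
    if "email" not in overrides:
        local_part = person["full_name"].lower().replace(" ", ".")
        person["email"] = f"{local_part}@gmail.com"
    return person


default_person = make_person()
custom_person = make_person(full_name="Jane Doe", job="pianist")

print(default_person["email"])
print(custom_person["email"])
```

Overriding only the name still yields a matching email, which is the kind of "logical" fixture value described above.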

Docstring

In Python, docstrings are string literals placed right beneath a class/method/function definition and quoted by triple quotation marks. The purpose of a docstring is to describe the class/method/function it belongs to.


def demo():
    """
    Demo is short for demonstration
    """
    pass

It is very nice of Python <3 that it lets you access this piece of information out of the box, through the object’s __doc__ attribute.
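As a small sketch of that access, using only the standard library: `__doc__` returns the docstring as stored by the interpreter, while `inspect.getdoc` returns a cleaned-up version with the indentation and surrounding blank lines removed.

```python
import inspect


def demo():
    """
    Demo is short for demonstration
    """
    pass


# As stored by the interpreter; whether source indentation is kept
# depends on the Python version.
raw = demo.__doc__

# Indentation and leading/trailing blank lines removed.
clean = inspect.getdoc(demo)

print(repr(raw))
print(repr(clean))
```

One caveat: running Python with the `-OO` flag strips docstrings, in which case `__doc__` is `None`, so any test data kept in a docstring depends on not using that flag.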

And yes, you can pass this block of text to PyYAML, which completes the big picture of our unit testing ingredients: PyTest, Factory Boy, YAML and docstrings.

See this thread on Stack Overflow

See more on docstrings, PEP8 and PEP257.

The mix

You need to pip install PyYAML as a dependency of your project.

Now, in order to test our showtime query features, we really need a lot of data. Unlike the other CRUD operations, ad-hoc queries need a manageable, well-controlled dataset, so that we can verify that a given combination of filter conditions returns the correct subset of data while the data still satisfies the integrity constraints enforced by the DBMS.

In other words, you have to fake the data consistently, and fake a lot of it. I set a few acceptance criteria for our testing setup:

Manageable
Well controlled
Large dataset

Imagine that all of that can be achieved with the following chunk of text. You can skip it, though. Just know that:

Three movies are created
Three cinemas are created, each has three theaters
Six schedules are created, three of which are approved
All are managed under a well-known name dataset_saigon_weekend
All are visible under one Python file test_schedule_api.py

in a readable format, with ~133 lines of code plus data:

~100 lines of data in YAML format
~34 LOC of code. Let’s call it overhead


from datetime import datetime

import pytest
import yaml


@pytest.fixture(scope='function')
def dataset_saigon_weekend(request, db):
    """
    movies:
    -   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-1
        title: The Red Fox
        status: 2
    -   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-2
        title: Kramus
        status: 2
    -   code: 10.5240/0067-DEFB-A9F6-DD23-70DA-3
        title: Star Wars
        status: 2
    cinemas:
    -   name: Galaxy Tan Binh
        group: Galaxy
        prefix: GALAXYTB
        district: Tân Bình
        city: Ho Chi Minh
        country: Vietnam
        status: 2
        theaters:
        -   code: GALAXYTB-T000001
            name: Theater One
        -   code: GALAXYTB-T000002
            name: Theater Two
        -   code: GALAXYTB-T000003
            name: Theater Three
    -   name: Galaxy Nguyen Trai
        group: Galaxy
        prefix: GALAXYNT
        district: Quan 1
        city: Ho Chi Minh
        country: Vietnam
        status: 2
        theaters:
        -   code: GALAXYNT-T000001
            name: Theater One
        -   code: GALAXYNT-T000002
            name: Theater Two
        -   code: GALAXYNT-T000003
            name: Theater Three
    -   name: Lotte Cong Hoa
        group: Lotte
        prefix: LOTTCONG
        district: Tân Bình
        city: Ho Chi Minh
        country: Vietnam
        status: 2
        theaters:
        -   code: LOTTCONG-T000001
            name: Theater One
        -   code: LOTTCONG-T000002
            name: Theater Two
        -   code: LOTTCONG-T000003
            name: Theater Three
    schedules:
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-1
        theater_code: GALAXYTB-T000001
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 2
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-2
        theater_code: GALAXYTB-T000002
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 2
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-3
        theater_code: GALAXYTB-T000003
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 2
    # The same movies are not published at Galaxy Nguyen Trai (GalaxyNT)
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-1
        theater_code: GALAXYNT-T000001
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 1
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-2
        theater_code: GALAXYNT-T000002
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 1
    -   movie_code: 10.5240/0067-DEFB-A9F6-DD23-70DA-3
        theater_code: GALAXYNT-T000003
        start_at: 2016-01-01 09:00:00 UTC
        end_at: 2016-01-01 10:30:00 UTC
        status: 1
    """
    from tests.fixtures.simplefactories import (CinemaFactory,
                                                TheaterFactory,
                                                MovieFactory,
                                                ScheduleFactory)

    # Assumption: the step that defines `dataset` was elided in the original;
    # here the fixture's own docstring is parsed with PyYAML. This relies on
    # the module-level name still exposing the docstring through __doc__.
    dataset = yaml.safe_load(dataset_saigon_weekend.__doc__)

    for cinema in dataset['cinemas']:
        # Create the cinema first, without its nested theaters.
        inserted = CinemaFactory(**{key: value for key, value in cinema.items()
                                    if key != 'theaters'})
        for theater in cinema['theaters']:
            theater['cinema'] = inserted
            TheaterFactory(**theater)

    for movie in dataset['movies']:
        MovieFactory(**movie)

    for schedule in dataset['schedules']:
        # DATETIME_FORMAT is a module-level constant (not shown here).
        schedule['start_at'] = datetime.strptime(schedule['start_at'],
                                                 DATETIME_FORMAT)
        schedule['end_at'] = datetime.strptime(schedule['end_at'],
                                               DATETIME_FORMAT)
        ScheduleFactory(**schedule)

Imagine how you would achieve the same goals otherwise. Keep in mind that with this setup, the ~34 LOC stay the same no matter how much more data we load into the dataset.
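The fixture above relies on a DATETIME_FORMAT constant that is not shown. As a sketch, here is one format string that matches values such as `2016-01-01 09:00:00 UTC`. The value below is my assumption; the real constant in the project may differ.

```python
from datetime import datetime, timezone

# Assumed value: the article does not show the real constant.
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S %Z"

start_at = datetime.strptime("2016-01-01 09:00:00 UTC", DATETIME_FORMAT)
end_at = datetime.strptime("2016-01-01 10:30:00 UTC", DATETIME_FORMAT)

# %Z only consumes the zone name; the result is a naive datetime.
# Attach UTC explicitly if the models expect timezone-aware values.
start_at_utc = start_at.replace(tzinfo=timezone.utc)

duration = end_at - start_at

print(start_at, end_at, duration)
```

Because the trailing `UTC` keeps YAML from treating these values as timestamps, they arrive as plain strings, which is why the fixture has to convert them itself.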

Conclusion

Since I started with “Separation of Concerns”, let me recap likewise: I have observed a few concerns while doing unit-testing:

Manageability concern
Controllability concern
Scalability concern
Readability concern

One must treat these as mutually orthogonal vectors. Unit testing is a must; the only question is how to keep our own sanity while maintaining the test cases. Keep the concerns separate, so that a change along one vector does not mess with the others.

My special thanks to:

Holger Krekel (@hpk42) and pytest-dev team
Raphaël Barrois, Mark Sandstrom for Factory Boy
Kirill Simonov for PyYAML (249kB of awesomeness)
Guido van Rossum for the snake, I mean Python <3

This blog article is a part of an upcoming series: Building thebox: A cinema ticket booking system

¹ For the sake of giving real-world object names while maintaining readers’ sanity, let’s say a cinema is the house and…
² …theaters are the rooms inside.
³ http://www.yaml.org/spec/1.2/spec.html